Question

Examine classification using logistic regression. In the R console, type mtcars. The dataset mtcars is a built-in dataset in R. It comprises fuel consumption and 10 aspects of automobile design and performance for 32 automobiles. Using only the variables am (0 = automatic, 1 = manual) and mpg, your task is to fit a logistic regression model. Complete the following steps using R.

  1. Create a scatter plot of am vs. mpg. Describe the relationship and explain why a simple linear regression model may not be suitable.
  2. Using the variables am and mpg, fit a logistic regression model. Use the function glm().
  3. Write the estimated logistic regression model using the results from part 2.
  4. Suppose a car has 16 mpg. How would you classify the transmission: automatic or manual? Explain and show how you classified the transmission.

Solutions

Expert Solution

We first load the data and plot am against mpg. The plot shows that am is not continuous: it takes only the values 0 and 1, so the points lie in two horizontal bands, with manual cars (am = 1) tending to have higher mpg. A simple linear regression is not suitable for such a response, because it assumes a continuous outcome and can produce fitted values below 0 or above 1, which cannot be interpreted as probabilities. Logistic regression instead models the probability of belonging to each category, which is what is needed to classify a car as manual or automatic.
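A minimal sketch of step 1 in base R follows. The overlaid least-squares line and the variable name linear_fit are illustrative additions, not part of the original solution; they are included only to show how a linear fit behaves with a 0/1 response.

```r
# mtcars ships with base R (datasets package), so nothing needs to be installed.
data(mtcars)

# Scatter plot: am is binary (0 = automatic, 1 = manual), mpg is continuous.
plot(mtcars$mpg, mtcars$am,
     xlab = "Miles per gallon (mpg)",
     ylab = "Transmission (am: 0 = automatic, 1 = manual)",
     main = "am vs. mpg",
     pch  = 19)

# Overlay a simple linear regression line for comparison (illustrative only).
linear_fit <- lm(am ~ mpg, data = mtcars)
abline(linear_fit, lty = 2)

# The fitted values of a linear model are not constrained to [0, 1],
# so they cannot be read as probabilities.
print(range(fitted(linear_fit)))
```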

The logistic regression model is fitted with glm() using family = binomial, and its complete summary, including the estimated intercept and the coefficient of mpg, can be viewed with summary().
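One way steps 2 and 3 might be carried out is sketched below. The estimated model has the form log(p / (1 - p)) = b0 + b1 * mpg, where p is the probability that am = 1 (manual). The object names logit_fit and b are hypothetical, chosen for this example.

```r
data(mtcars)

# Fit a logistic regression of am on mpg; family = binomial selects the logit link.
logit_fit <- glm(am ~ mpg, data = mtcars, family = binomial)

# Full summary: coefficient estimates, standard errors, z values, and deviances.
print(summary(logit_fit))

# Extract the estimates and print the fitted model on the log-odds scale.
b <- coef(logit_fit)
cat(sprintf("log(p / (1 - p)) = %.4f + %.4f * mpg\n",
            b[["(Intercept)"]], b[["mpg"]]))
```

A positive coefficient for mpg means that the estimated odds of a manual transmission increase as mpg increases.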

When the model is evaluated at mpg = 16, the predicted probability that the car has a manual transmission is well below 0.5. Using a cutoff of 0.5, the car is therefore classified as automatic.
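The classification in step 4 can be reproduced along the following lines. This is a sketch that refits the model so it runs on its own; the names new_car, p_manual, p_by_hand, and predicted_class are hypothetical, and the 0.5 cutoff is the one stated above.

```r
data(mtcars)
logit_fit <- glm(am ~ mpg, data = mtcars, family = binomial)

# Predicted probability of a manual transmission (am = 1) at mpg = 16.
new_car  <- data.frame(mpg = 16)
p_manual <- unname(predict(logit_fit, newdata = new_car, type = "response"))

# The same probability computed by hand from the coefficients:
# eta = b0 + b1 * 16, then p = 1 / (1 + exp(-eta)).
eta       <- sum(coef(logit_fit) * c(1, 16))
p_by_hand <- 1 / (1 + exp(-eta))

# Classify with a 0.5 cutoff.
predicted_class <- if (p_manual >= 0.5) "manual" else "automatic"

cat("P(manual | mpg = 16):", round(p_manual, 4), "\n")
cat("Predicted transmission:", predicted_class, "\n")
```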

Note: am and mpg may not be the best combination of variables for an accurate prediction. Better predictions may be obtained by choosing predictors that are more strongly associated with the transmission type.

