Question 2 The rest of the questions deal with the Motor Trend Car and Sport data from 1974.
# It is a famous dataset called mtcars that comes built in to R. Use the line of code below
# to familiarize yourself with it.
head(mtcars)
##                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
## Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
Question 2a
# how many observations are there in this data set?
Question 2b
# plot a histogram showing the frequencies of the "cyl" column
# as always, make sure the plot is properly labeled.
Question 2c
# which car has the highest "qsec"?
# which car has the highest "mpg"?
Question 2d The next two questions are great practice for your final project!
# plot a scatter plot of mpg vs qsec. Are the variables correlated? If so, are they
# negatively correlated or positively correlated?
Question 2e
# plot a scatter plot of mpg vs disp. Are the variables correlated? If so, are they
# negatively correlated or positively correlated?
R code with comments (all statements starting with # are comments and can be removed)
#Question 2a
# how many observations are there in this data set?
n<-nrow(mtcars)
sprintf('Number of observations in this data set is %g',n)
## [1] "Number of observations in this data set is 32"
#Question 2b
# plot a histogram showing the frequencies of the "cyl" column
# as always, make sure the plot is properly labeled.
hist(mtcars$cyl,main='Histogram of number of cylinders',xlab='# of Cylinders')
#get this plot
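Because "cyl" only takes a few distinct values, hist() groups them into numeric bins that may not line up neatly with each cylinder count. A possible alternative, offered only as a sketch and not as part of the original answer, is to count the values with table() and draw one bar per count with barplot(). The name cyl_counts is introduced here for illustration.

```r
# Count how many cars have each distinct number of cylinders
cyl_counts <- table(mtcars$cyl)
print(cyl_counts)

# Draw one labeled bar per distinct cylinder count
barplot(cyl_counts,
        main = "Number of cars by cylinder count",
        xlab = "# of Cylinders",
        ylab = "Frequency")
```

The bar heights carry the same frequencies the histogram is meant to show, with each bar labeled by its exact cylinder count.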
#Question 2c
# which car has the highest "qsec"?
cname<-rownames(mtcars)[which.max(mtcars$qsec)]
sprintf('The car that has the highest "qsec" is %s',cname)
# which car has the highest "mpg"?
cname<-rownames(mtcars)[which.max(mtcars$mpg)]
sprintf('The car that has the highest "mpg" is %s',cname)
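## [1] "The car that has the highest \"qsec\" is Merc 230"
## [1] "The car that has the highest \"mpg\" is Toyota Corolla"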
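Note that which.max() returns only the first position of the maximum, so a tie would go unnoticed. As a quick check, here is a small sketch that sorts the cars with order() and lists the top few values of each column. The names top_qsec and top_mpg are introduced here for illustration.

```r
# Sort cars by qsec, largest first, keeping the car names as row names
top_qsec <- mtcars[order(mtcars$qsec, decreasing = TRUE), "qsec", drop = FALSE]
head(top_qsec, 3)

# Same idea for mpg
top_mpg <- mtcars[order(mtcars$mpg, decreasing = TRUE), "mpg", drop = FALSE]
head(top_mpg, 3)
```

If the first two rows of either listing showed the same value, more than one car would share the maximum.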
#Question 2d The next two questions are great practice for your final project!
# plot a scatter plot of mpg vs qsec.
plot(mtcars$qsec,mtcars$mpg,xlab="qsec",ylab="mpg",main="mpg vs qsec")
#get this plot
Are the variables correlated?
Yes, the two variables are correlated: the scatter plot shows an approximately linear relationship between qsec and mpg.
As the value of qsec (the time taken to cover 1/4 mile) increases, mpg tends to increase (slower cars tend to have higher mileage).
That means they are positively correlated.
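The visual impression can be backed up with a number. The sketch below, which is not part of the original answer, computes the Pearson correlation coefficient with cor() and overlays a least-squares line on the scatter plot with abline(lm(...)). The name r_qsec is introduced here for illustration.

```r
# Pearson correlation between quarter-mile time and fuel economy
r_qsec <- cor(mtcars$qsec, mtcars$mpg)
sprintf('Correlation between qsec and mpg is %.2f', r_qsec)

# Redraw the scatter plot and add a fitted straight line
plot(mtcars$qsec, mtcars$mpg, xlab = "qsec", ylab = "mpg", main = "mpg vs qsec")
abline(lm(mpg ~ qsec, data = mtcars))
```

A coefficient above 0 and an upward-sloping line both indicate a positive correlation. The closer the coefficient is to 1, the tighter the points sit around the line.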
# Question 2e
# plot a scatter plot of mpg vs disp.
plot(mtcars$disp,mtcars$mpg,xlab="disp",ylab="mpg",main="mpg vs disp")
#get this plot
Are the variables correlated?
Yes, the two variables are correlated: the scatter plot shows an approximately linear relationship between disp and mpg.
As the value of disp (the engine displacement volume) increases, mpg decreases (larger engines have lower mileage).
That means they are negatively correlated.
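To compare the two relationships side by side, one option is a small correlation matrix over the three variables used in Questions 2d and 2e. This is only a sketch added for illustration. The names vars, cor_matrix, and r_disp are not from the original answer.

```r
# Correlation matrix for the three variables used in Questions 2d and 2e
vars <- c("mpg", "qsec", "disp")
cor_matrix <- cor(mtcars[, vars])
print(round(cor_matrix, 2))

# Pull out the mpg-disp entry; a negative sign means mpg falls as disp rises
r_disp <- cor_matrix["mpg", "disp"]
sprintf('Correlation between disp and mpg is %.2f', r_disp)
```

The sign of each off-diagonal entry gives the direction of the relationship, and its absolute value indicates how strong the linear association is.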