We will use a data set from the “fpp” package for this question. You need to complete all the initial setup shown in class before starting this problem (i.e., install the fpp package, then load it in your script file with the library command).
Data set = “fuel” : Fuel economy data on 2009 vehicles in the US.
R programme:
install.packages("fpp")
library(fpp)
df <- data.frame(fuel)
# a) Scatter plot
plot(df$Highway, df$Carbon, col = "blue", xlab = "Highway", ylab = "Carbon") # X = Highway, Y = Carbon
# b) Regression of Y = Carbon on X = Highway
reg1 <- lm(Carbon ~ Highway, data = df)
summary(reg1) # Coefficients, p-values and R-squared of the fitted line
# c) Scatter plot with the fitted regression line
plot(df$Highway, df$Carbon)
# Add the regression line from part b) to the scatter plot
abline(reg1, col = "red")
# d) Residual plots
par(mfrow = c(2, 2)) # Split the plotting panel into a 2 x 2 grid
plot(reg1) # Draw the four diagnostic plots for the fitted model
a) The scatter plot shows how the X variable (Highway) and the Y variable (Carbon) are jointly distributed. It shows a negative relationship: as Highway increases, carbon emission decreases.
b)
The regression line is: Carbon = 15.143511 - 0.258675*Highway
The slope is statistically significant (p-value < 0.05), and the fit of the model (R-squared) is 0.8589.
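To make the fitted line concrete, the sketch below plugs a value of Highway into the equation from part b). The coefficients are copied from the answer above; the helper name predict_carbon and the input value 30 are chosen only for illustration and are not part of the original solution.

```r
# Coefficients copied from the regression line reported in part b)
intercept <- 15.143511
slope     <- -0.258675

# Hypothetical helper (illustration only): predicted Carbon for a given Highway value
predict_carbon <- function(highway) intercept + slope * highway

predict_carbon(30)                       # 15.143511 - 0.258675 * 30 = 7.383261
predict_carbon(31) - predict_carbon(30)  # change per one-unit increase in Highway = slope
```

When the fitted model is available, coef(reg1) returns the same two coefficients at full precision.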
c) Regression line in Scatter plot:
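d) Residual plots: if the linear model is adequate, the residuals should scatter randomly around zero and should not follow any trend against the fitted values.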
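One way to check the "no trend" condition directly is to plot the residuals against the fitted values and overlay a smoother. This is a minimal sketch, assuming the fpp package is installed; the object names fit and res are chosen here for illustration.

```r
library(fpp)                              # provides the fuel data set
df  <- data.frame(fuel)
fit <- lm(Carbon ~ Highway, data = df)    # same model as in part b)

res <- resid(fit)                         # observed Carbon minus fitted Carbon
plot(fitted(fit), res, xlab = "Fitted Carbon", ylab = "Residual")
abline(h = 0, lty = 2)                    # reference line at zero
lines(lowess(fitted(fit), res), col = "red")  # smoother: roughly flat if there is no trend
```

A red curve that bends clearly away from the dashed zero line would suggest that a straight line does not fully capture the relationship.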