Use the R script to answer the following questions: (write down your answers in the R script with ##)
(1). Import FarmSize.csv into RStudio. Use the correct function to build a linear regression model predicting the average size of a farm from the number of farms. Give the model a name (e.g. FarmSize_Model). Call the model name to inspect the intercept and slope of the regression model. Verify the answers against your manual calculation.
(2). Use the correct function to generate the residuals from the model for the 12 examples in the dataset. Create a residual plot, with the independent variable on the x axis and the residual on the y axis.
(3). Use the correct function to inspect SSE, Se and r². Write down the values for these measures. Verify the answers against your manual calculation.
(4). Use the correct function to inspect the result of the hypothesis test on the slope. What is the t value for the slope test? What is the p value? What is the statistical decision?
Year | NumberofFarms | AverageSize |
1950 | 5.65 | 213 |
1955 | 4.65 | 258 |
1960 | 3.96 | 297 |
1965 | 3.36 | 340 |
1970 | 2.95 | 374 |
1975 | 2.52 | 420 |
1980 | 2.44 | 426 |
1985 | 2.29 | 441 |
1990 | 2.15 | 460 |
1995 | 2.07 | 469 |
2000 | 2.17 | 434 |
2005 | 2.1 | 444 |
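The transcript that follows types the two columns in by hand rather than importing the file. For the import step in question (1), a sketch along these lines should work. It assumes FarmSize.csv uses the three column names shown in the table. The temporary file is only a stand-in so the example runs on its own; in practice `csv_path` would point at the real file.

```r
# Stand-in for the real FarmSize.csv: the table values written to a temp file.
farm_table <- data.frame(
  Year = seq(1950, 2005, by = 5),
  NumberofFarms = c(5.65, 4.65, 3.96, 3.36, 2.95, 2.52,
                    2.44, 2.29, 2.15, 2.07, 2.17, 2.1),
  AverageSize = c(213, 258, 297, 340, 374, 420,
                  426, 441, 460, 469, 434, 444)
)
csv_path <- file.path(tempdir(), "FarmSize.csv")  # hypothetical location
write.csv(farm_table, csv_path, row.names = FALSE)

# Import the file and fit the model using the data frame columns.
FarmSize <- read.csv(csv_path)
FarmSize_Model <- lm(AverageSize ~ NumberofFarms, data = FarmSize)
FarmSize_Model
```

Passing `data = FarmSize` keeps the variables inside the data frame instead of creating loose vectors in the workspace.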
> NumberofFarms=c(5.65,4.65,3.96,3.36,2.95,2.52,2.44,2.29,2.15,2.07,2.17,2.1)
> AverageSize=c(213 ,258, 297, 340, 374, 420, 426, 441, 460, 469, 434, 444)
> FarmSize_Model=lm(AverageSize~NumberofFarms)
> FarmSize_Model
Call:
lm(formula = AverageSize ~ NumberofFarms)
Coefficients:
(Intercept) NumberofFarms
600.19 -72.33
>
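Question (1) also asks for a manual check of the coefficients. One way to do that in R, sketched here with the usual least-squares formulas (slope = Sxy / Sxx, intercept = mean(y) - slope * mean(x)), is shown next. The names `Sxy`, `Sxx`, `b0` and `b1` are illustrative and not part of the original script.

```r
NumberofFarms <- c(5.65, 4.65, 3.96, 3.36, 2.95, 2.52,
                   2.44, 2.29, 2.15, 2.07, 2.17, 2.1)
AverageSize <- c(213, 258, 297, 340, 374, 420,
                 426, 441, 460, 469, 434, 444)

x_bar <- mean(NumberofFarms)
y_bar <- mean(AverageSize)

# Sum of cross-products and sum of squares of x about their means.
Sxy <- sum((NumberofFarms - x_bar) * (AverageSize - y_bar))
Sxx <- sum((NumberofFarms - x_bar)^2)

b1 <- Sxy / Sxx            # slope
b0 <- y_bar - b1 * x_bar   # intercept

c(intercept = b0, slope = b1)
```

The two values should agree with the `(Intercept)` and `NumberofFarms` entries printed by the model.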
> res=FarmSize_Model$residuals
> cat("the residuals are:", res)
the residuals are: 21.46762 -5.860465 -16.76685 -17.1637 -12.81821 2.080709 2.294462 6.445249 15.31932 18.53307 -9.234121 -4.297088
> plot(x = NumberofFarms,y=res)
>
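As a cross-check on the residuals, the sketch below uses the extractor function `resid()` (which returns the same values as `FarmSize_Model$residuals`), recomputes them as observed minus fitted, and labels the plot axes. The dashed zero line and the axis labels are additions for readability, not part of the original script.

```r
NumberofFarms <- c(5.65, 4.65, 3.96, 3.36, 2.95, 2.52,
                   2.44, 2.29, 2.15, 2.07, 2.17, 2.1)
AverageSize <- c(213, 258, 297, 340, 374, 420,
                 426, 441, 460, 469, 434, 444)
FarmSize_Model <- lm(AverageSize ~ NumberofFarms)

res <- resid(FarmSize_Model)

# Manual version: residual = observed value - fitted value.
manual_res <- AverageSize - fitted(FarmSize_Model)

# Residual plot: independent variable on x, residual on y.
plot(NumberofFarms, res,
     xlab = "NumberofFarms", ylab = "Residual",
     main = "Residuals vs. NumberofFarms")
abline(h = 0, lty = 2)  # reference line at zero
```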
> anova(FarmSize_Model)
Analysis of Variance Table
Response: AverageSize
Df Sum Sq Mean Sq F value Pr(>F)
NumberofFarms 1 78078 78078 396.69 2.235e-09 ***
Residuals 10 1968 197
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
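In the ANOVA table, the `Sum Sq` entry of the `Residuals` row (1968) is the SSE. For the manual check in question (3), a sketch using the standard definitions Se = sqrt(SSE / (n - 2)) and r² = 1 - SSE / SST could look like this. The variable names are illustrative.

```r
NumberofFarms <- c(5.65, 4.65, 3.96, 3.36, 2.95, 2.52,
                   2.44, 2.29, 2.15, 2.07, 2.17, 2.1)
AverageSize <- c(213, 258, 297, 340, 374, 420,
                 426, 441, 460, 469, 434, 444)
FarmSize_Model <- lm(AverageSize ~ NumberofFarms)

n <- length(AverageSize)

# Sum of squared errors (residual sum of squares).
SSE <- sum(resid(FarmSize_Model)^2)

# Standard error of the estimate, with n - 2 degrees of freedom.
Se <- sqrt(SSE / (n - 2))

# Coefficient of determination from the total sum of squares.
SST <- sum((AverageSize - mean(AverageSize))^2)
r2 <- 1 - SSE / SST

c(SSE = SSE, Se = Se, r2 = r2)
```

These should match the `Residuals` row of `anova()` and the `Residual standard error` and `Multiple R-squared` lines of `summary()`.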
> summary(FarmSize_Model)
Call:
lm(formula = AverageSize ~ NumberofFarms)
Residuals:
Min 1Q Median 3Q Max
-17.164 -10.130 -1.108 8.664 21.468
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 600.186 11.711 51.25 1.93e-13 ***
NumberofFarms -72.328 3.631 -19.92 2.24e-09 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 14.03 on 10 degrees of freedom
Multiple R-squared: 0.9754, Adjusted R-squared: 0.973
F-statistic: 396.7 on 1 and 10 DF, p-value: 2.235e-09
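t value: -19.92 (the NumberofFarms row of the coefficient table)
p value: 2.24e-09
The p value is less than 0.05, so we reject the null hypothesis that the slope is zero: the independent variable (NumberofFarms) is a significant predictor of AverageSize.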
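The slope test can also be reproduced by hand. The sketch below assumes the usual simple-regression formulas: the standard error of the slope is Se / sqrt(Sxx), the t statistic is slope / standard error, and the two-sided p value comes from a t distribution with n - 2 degrees of freedom. The names `se_b1`, `t_value` and `p_value` are illustrative.

```r
NumberofFarms <- c(5.65, 4.65, 3.96, 3.36, 2.95, 2.52,
                   2.44, 2.29, 2.15, 2.07, 2.17, 2.1)
AverageSize <- c(213, 258, 297, 340, 374, 420,
                 426, 441, 460, 469, 434, 444)
FarmSize_Model <- lm(AverageSize ~ NumberofFarms)

n <- length(AverageSize)
b1 <- coef(FarmSize_Model)[["NumberofFarms"]]

# Standard error of the estimate and of the slope.
Se <- sqrt(sum(resid(FarmSize_Model)^2) / (n - 2))
Sxx <- sum((NumberofFarms - mean(NumberofFarms))^2)
se_b1 <- Se / sqrt(Sxx)

# t statistic for H0: slope = 0, and its two-sided p value.
t_value <- b1 / se_b1
p_value <- 2 * pt(-abs(t_value), df = n - 2)

c(t = t_value, p = p_value)
```

The same numbers can be read directly with `coef(summary(FarmSize_Model))`, which returns the coefficient table as a matrix.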