The amount of beef produced on private farms in a particular country has changed over time. This table gives approximate values for the amount of beef produced for particular years in that country.
Years since 1989 | Amount of beef produced (million pounds; nearest hundredth) |
0 | 108.35 |
5 | 101.95 |
9 | 96.85 |
12 | 90.15 |
15 | 83.05 |
19 | 79.75 |
22 | 76.25 |
27 | 73.45 |
1. With linear regression, what is the value of the coefficient of determination (give the entire decimal)?
2. With quadratic regression, what is the value of the coefficient of determination (give the entire decimal)?
3. When making predictions that are within the original domain and range values of data from the table, which predictions would be 'best': when using linear regression OR when using quadratic regression?
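I ran these regressions in R on the following data. Note that the sixth amount was entered as 79.95, whereas the table in the question gives 79.75, so all output below reflects 79.95.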
data
years.since.1989 Amount.of.beef.produced
1 0 108.35
2 5 101.95
3 9 96.85
4 12 90.15
5 15 83.05
6 19 79.95
7 22 76.25
8 27 73.45
1).
#fitting linear
model_lm =
lm(data$Amount.of.beef.produced~data$years.since.1989,data=data)
summary(model_lm)
Call:
lm(formula = data$Amount.of.beef.produced ~
data$years.since.1989,
data = data)
Residuals:
Min 1Q Median 3Q Max
-3.7855 -0.9760 -0.1049 1.3085 3.3224
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 107.72049 1.58349 68.03 6.79e-10 ***
data$years.since.1989 -1.39233 0.09894 -14.07 8.04e-06 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 2.35 on 6 degrees of freedom
Multiple R-squared: 0.9706, Adjusted R-squared: 0.9657
F-statistic: 198 on 1 and 6 DF, p-value: 8.038e-06
For the linear model, the coefficient of determination is Multiple R-squared = 0.9706 (Adjusted R-squared = 0.9657).
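The question asks for the entire decimal, while summary() prints only four significant digits. A minimal sketch of how the unrounded value could be read off, assuming the data exactly as entered above (sixth value 79.95); the data frame is rebuilt here only so the snippet runs on its own, and the name r2_lin is introduced for illustration:

```r
# Data as entered in this answer (assumption: sixth value is 79.95).
data <- data.frame(
  years.since.1989 = c(0, 5, 9, 12, 15, 19, 22, 27),
  Amount.of.beef.produced = c(108.35, 101.95, 96.85, 90.15,
                              83.05, 79.95, 76.25, 73.45)
)

# Same linear fit as above, written with column names and data = data.
model_lm <- lm(Amount.of.beef.produced ~ years.since.1989, data = data)

# summary.lm stores the coefficient of determination unrounded.
r2_lin <- summary(model_lm)$r.squared
print(r2_lin, digits = 10)
```

Writing the formula with bare column names gives the same fit as the data$ form, and it also lets predict() work with new data later.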
2).
#fitting quadratic model
model_quad =
lm(data$Amount.of.beef.produced~poly(data$years.since.1989,2,raw=TRUE))
summary(model_quad)
Call:
lm(formula = data$Amount.of.beef.produced ~
poly(data$years.since.1989,
2, raw = TRUE))
Residuals:
1 2 3 4 5 6 7 8
-1.4918 1.1566 2.6126 0.4314 -2.4911 -0.5521 -0.8711 1.2054
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 109.84182 1.73475 63.319 1.86e-08 ***
poly(data$years.since.1989, 2, raw = TRUE)1 -1.90450 0.28298 -6.730 0.0011 **
poly(data$years.since.1989, 2, raw = TRUE)2 0.01896 0.01002 1.893 0.1170
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 1.965 on 5 degrees of freedom
Multiple R-squared: 0.9829, Adjusted R-squared: 0.976
F-statistic: 143.4 on 2 and 5 DF, p-value: 3.843e-05
For the quadratic model, the coefficient of determination is Multiple R-squared = 0.9829 (Adjusted R-squared = 0.976).
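As with the linear fit, the printed summary rounds the value. A sketch of reading the unrounded quadratic R-squared and adjusted R-squared, again assuming the data as entered in this answer (sixth value 79.95); the name s_quad is introduced here for illustration:

```r
# Data as entered in this answer (assumption: sixth value is 79.95).
data <- data.frame(
  years.since.1989 = c(0, 5, 9, 12, 15, 19, 22, 27),
  Amount.of.beef.produced = c(108.35, 101.95, 96.85, 90.15,
                              83.05, 79.95, 76.25, 73.45)
)

# Same quadratic fit as above: raw (non-orthogonal) polynomial of degree 2.
model_quad <- lm(Amount.of.beef.produced ~ poly(years.since.1989, 2, raw = TRUE),
                 data = data)

# Unrounded R-squared and adjusted R-squared from the summary object.
s_quad <- summary(model_quad)
print(s_quad$r.squared, digits = 10)
print(s_quad$adj.r.squared, digits = 10)
```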
3).
Since the adjusted R-squared for the quadratic model is greater than that for the linear model (0.976 > 0.9657), quadratic regression gives the better fitted values for predictions within the original domain and range.
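The comparison uses adjusted R-squared rather than plain R-squared because adding a term can never lower R-squared, so the quadratic fit would always look at least as good. For reference, the standard adjustment for n observations and p predictors (the intercept is not counted) can be checked against the reported values, using n = 8 and the rounded R-squared figures from the summaries:

```latex
\bar{R}^2 = 1 - (1 - R^2)\,\frac{n - 1}{n - p - 1}

\text{Linear } (p = 1):\quad
\bar{R}^2 \approx 1 - (1 - 0.9706)\cdot\frac{7}{6} \approx 0.9657

\text{Quadratic } (p = 2):\quad
\bar{R}^2 \approx 1 - (1 - 0.9829)\cdot\frac{7}{5} \approx 0.976
```

The denominators 6 and 5 match the residual degrees of freedom shown in the two summaries.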
model_lm$fitted.values
1 2 3 4 5 6 7 8
107.72049 100.75885 95.18953 91.01254 86.83555 81.26623 77.08924 70.12759
model_quad$fitted.values
1 2 3 4 5 6 7 8
109.84182 100.79341 94.23736 89.71855 85.54108 80.50210 77.12109 72.24460
As can be seen, the predictions of the quadratic model are closer overall to the original data.
plot(data$years.since.1989,data$Amount.of.beef.produced)
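To make "closer to the original data" concrete, one option is to compare the residual sums of squares and overlay both fitted curves on the scatter plot. This is only a sketch under the same assumption about the data (sixth value 79.95); the names rss_lin, rss_quad and grid, and the axis labels, are introduced here for illustration:

```r
# Data as entered in this answer (assumption: sixth value is 79.95).
data <- data.frame(
  years.since.1989 = c(0, 5, 9, 12, 15, 19, 22, 27),
  Amount.of.beef.produced = c(108.35, 101.95, 96.85, 90.15,
                              83.05, 79.95, 76.25, 73.45)
)
model_lm <- lm(Amount.of.beef.produced ~ years.since.1989, data = data)
model_quad <- lm(Amount.of.beef.produced ~ poly(years.since.1989, 2, raw = TRUE),
                 data = data)

# Residual sum of squares: smaller means the fitted values sit closer to the data.
rss_lin <- sum(residuals(model_lm)^2)
rss_quad <- sum(residuals(model_quad)^2)
print(c(linear = rss_lin, quadratic = rss_quad))

# Evaluate both models on a grid inside the original domain (0 to 27 years).
grid <- data.frame(years.since.1989 = seq(0, 27, by = 0.5))
plot(data$years.since.1989, data$Amount.of.beef.produced,
     xlab = "Years since 1989", ylab = "Beef produced (million pounds)")
lines(grid$years.since.1989, predict(model_lm, newdata = grid), lty = 2)
lines(grid$years.since.1989, predict(model_quad, newdata = grid), lty = 1)
legend("topright", legend = c("linear", "quadratic"), lty = c(2, 1))
```

The grid is restricted to the observed range because the question only concerns predictions within the original domain.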