Answer these questions using RStudio.
What is the intercept parameter (2dp) for the regression equation of height (y) versus age (x) using the pine_growth.csv data?
What is the total amount of variation explained by the regression model (SSr) of height (y) versus age (x) using the pine_growth.csv data?
What is the residual error sum of squares (SSe) of a regression model of height (y) versus age (x) using the pine_growth.csv data?
Use your regression equation fitted to the pine_growth.csv data to predict the height (1 dp) of pine trees at 21.8 years.
pine_growth.csv
age height
7.15576656 18.72849431
8.129775638 35.65477778
16.72602077 40.84925353
22.1494314 61.16668286
24.98842629 63.88029087
10.84800989 38.8948113
16.21167696 50.89108137
21.62648568 58.84822993
27.37032481 58.21976844
8.781800332 38.78419079
15.73044358 47.08995523
18.44298031 47.03292591
24.55722077 57.78373702
8.803465152 19.78892211
14.69364295 38.47185166
19.76821622 52.85251645
25.73732976 54.03867394
9.557438738 19.52645519
13.38045294 50.71181575
21.18576248 50.32075486
25.14524863 66.45636737
11.05895289 35.09854872
15.90885155 46.20593357
21.16517367 59.1335112
24.74997389 53.78095766
10.76039501 31.87024304
17.24982036 38.43437782
20.67469663 42.75693936
24.9187064 56.50248257
9.456849174 27.94407215
17.31785455 44.56102526
19.4035485 43.26457601
26.94362485 69.07639749
10.18743471 30.07622452
14.82076927 32.81933238
19.41960951 45.6665625
26.22703124 54.85557574
11.19384358 37.97390848
13.43821684 44.08033847
20.28160916 60.39976036
26.66076684 66.07282947
10.75153533 27.48342127
15.10263364 30.26795241
19.30307787 47.03971079
26.1097938 57.30252533
7.334193884 16.02310563
7.780802556 29.81303216
13.55907716 35.60521517
18.98254692 45.67647822
27.2336165 49.5956103
8.64498995 22.56785223
16.72757792 40.15677105
18.76667604 39.37013143
26.03969617 62.91864072
11.89543801 16.12208671
14.86010339 33.6554498
18.36353229 50.36578767
25.28054491 65.29231743
1) The intercept is 10.25 (2 dp), from the estimate 10.2548.
2) SSr = 8448.8 (the "x" sum of squares in the ANOVA table). Equivalently, R-squared = 0.7712, so 77.12% of the total variability in height is explained by the regression model.
3) SSe (residual error sum of squares) = 2506.3.
4) At x = 21.8, predicted y = 10.255 + 1.946 × 21.8 = 52.6778, so the predicted height is 52.7 (1 dp); see the predict() check below.
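As a check on answer 4, predict() gives the same value from the fitted model. This is a minimal sketch; it assumes the fit object created in the code further down, and new_age is just an illustrative name:
new_age <- data.frame(x = 21.8)             # new observation at 21.8 years
round(predict(fit, newdata = new_age), 1)   # about 52.7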
R code I used and the results:
pine <- read.csv("pine_growth.csv")   # load the data
y <- pine$height; x <- pine$age
fit <- lm(y ~ x)
fit
Call:
lm(formula = y ~ x)
Coefficients:
(Intercept) x
10.255 1.946
> summary(fit)
Call:
lm(formula = y ~ x)
Residuals:
Min 1Q Median 3Q Max
-17.2883 -5.1722 -0.5533 5.6190 14.4112
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 10.2548 2.6176 3.918 0.000246 ***
x 1.9464 0.1417 13.740 < 2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 6.69 on 56 degrees of freedom
Multiple R-squared: 0.7712, Adjusted R-squared: 0.7671
F-statistic: 188.8 on 1 and 56 DF, p-value: < 2.2e-16
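To report the intercept (and slope) to 2 dp directly, the coefficients can be rounded; a small sketch using the same fit object:
round(coef(fit), 2)   # (Intercept) 10.25, x 1.95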
anova(fit)
Analysis of Variance Table
Response: y
Df Sum Sq Mean Sq F value Pr(>F)
x 1 8448.8 8448.8 188.78 < 2.2e-16 ***
Residuals 56 2506.3 44.8
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
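For reference, SSr and SSe can also be pulled straight from the ANOVA table in code; a minimal sketch with the same fit object (aov_tab, ssr, sse are illustrative names):
aov_tab <- anova(fit)
ssr <- aov_tab["x", "Sum Sq"]           # regression sum of squares, 8448.8
sse <- aov_tab["Residuals", "Sum Sq"]   # residual error sum of squares, 2506.3
c(SSr = ssr, SSe = sse)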