The following dataset gives the maximum adult height of 15 individual men and, for each man, the number of years it took him to reach that height.
Years to reach max height | Max height (inches) |
20 | 67 |
21 | 60 |
25 | 74 |
18 | 74 |
19 | 73 |
17 | 70 |
24 | 69 |
14 | 69 |
17 | 63 |
18 | 70 |
25 | 59 |
23 | 67 |
22 | 65 |
20 | 69 |
19 | 68 |
Treat the independent variable as “max height”, and the dependent variable as “years to reach max height”. Draw an accurate scatterplot of the data on grid paper (the next page is grid paper). Make sure your sketch is very neat, and choose scales for the x- and y-axes that fit the page you are using.
What is the linear regression equation? Is the correlation positive, negative, or 0? Plot the linear regression equation accurately amongst the points in your scatterplot. Do NOT guess where the line goes! To do this, you will need to use your Math 98 skills for plotting a line y=a+bx … see Khan Academy if you need a refresher.
Use your equation to make a prediction for the years it would take for a man to grow to a height of 71 inches. Write down the steps and be very clear on your reasoning.
Are there any outliers in this data set? If so, what are they, and how did you decide they were outliers? If not, explain how you know there are no outliers.
What is the correlation coefficient for this data? Is it significant? Use a hypothesis test to show why you think it is significant or not. Be sure to state the hypotheses, distribution for the test, test-statistic, and the p-value.
Also, describe what the p-value means for this test.
Describe why you reject or fail to reject the null hypothesis.
Fill in the following blank:
For a 1 inch increase in height, the best fit line predicts a ____________ year increase in time to grow to that height.
REGRESSION
SUMMARY OUTPUT
Regression Statistics
Multiple R | 0.223969286 |
R Square | 0.050162241 |
Adjusted R Square | -0.022902202 |
Standard Error | 3.240647457 |
Observations | 15 |
ANOVA
Source | df | SS | MS | F | Significance F |
Regression | 1 | 7.20998613 | 7.20998613 | 0.686547917 | 0.422293439 |
Residual | 13 | 136.5233472 | 10.50179594 | | |
Total | 14 | 143.7333333 | | | |
Coefficients
Term | Coefficient | Standard Error | t Stat | P-value | Lower 95% |
Intercept | 30.8534 | 12.9649 | 2.3798 | 0.0333 | 2.8444 |
max height | -0.1581 | 0.1908 | -0.8286 | 0.4223 | -0.5704 |
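The coefficients in this output can be cross-checked by hand with the least-squares formulas. The following is a minimal sketch, not the spreadsheet's own procedure; the function name `least_squares` and the variable names are illustrative, and the data are copied from the table in the question.

```python
# Data copied from the question: years to reach max height (y) and max height in inches (x).
years = [20, 21, 25, 18, 19, 17, 24, 14, 17, 18, 25, 23, 22, 20, 19]
height = [67, 60, 74, 74, 73, 70, 69, 69, 63, 70, 59, 67, 65, 69, 68]


def least_squares(x, y):
    """Return (intercept, slope, r) for the fitted line y = intercept + slope * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sums of squares and cross-products about the means.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r = sxy / (sxx * syy) ** 0.5
    return intercept, slope, r


# Height is the independent variable, years the dependent variable.
intercept, slope, r = least_squares(height, years)
print(f"intercept = {intercept:.4f}, slope = {slope:.4f}, r = {r:.4f}, r^2 = {r * r:.4f}")
```

Note that r carries the sign of the slope, whereas the "Multiple R" entry in the output reports only its magnitude.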
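The linear regression equation is ŷ = 30.8534 - 0.1581x, where x is max height in inches and ŷ is the predicted years to reach max height. The slope is negative, so the correlation is negative.
For a height of 71 inches: ŷ = 30.8534 - 0.1581 × 71 = 30.8534 - 11.2251 = 19.6283 years.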
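The prediction step, and the two points needed to draw the line on the scatterplot, can be sketched as below. This assumes the rounded coefficients from the regression output; `predict_years` is an illustrative name, and the endpoints 59 and 74 are the smallest and largest heights in the data set.

```python
# Coefficients taken from the regression output, rounded to 4 decimals.
INTERCEPT = 30.8534
SLOPE = -0.1581


def predict_years(height_in):
    """Predicted years to reach max height for a given max height in inches."""
    return INTERCEPT + SLOPE * height_in


# Prediction asked for in the question: a man who grows to 71 inches.
print("71 in ->", round(predict_years(71), 4), "years")

# Two points on the line, at the smallest and largest observed heights.
# Plot both on the grid paper and join them with a ruler.
for h in (59, 74):
    print(f"({h}, {predict_years(h):.2f})")
```

Because 71 inches lies inside the observed range of heights (59 to 74), this prediction is an interpolation rather than an extrapolation.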
The predicted values, residuals, and standardized residuals for each observation are:
Observation | Predicted years | Residuals | Standard Residuals |
1 | 20.26 | -0.26 | -0.08 |
2 | 21.37 | -0.37 | -0.12 |
3 | 19.15 | 5.85 | 1.87 |
4 | 19.15 | -1.15 | -0.37 |
5 | 19.31 | -0.31 | -0.10 |
6 | 19.79 | -2.79 | -0.89 |
7 | 19.94 | 4.06 | 1.30 |
8 | 19.94 | -5.94 | -1.90 |
9 | 20.89 | -3.89 | -1.25 |
10 | 19.79 | -1.79 | -0.57 |
11 | 21.52 | 3.48 | 1.11 |
12 | 20.26 | 2.74 | 0.88 |
13 | 20.58 | 1.42 | 0.46 |
14 | 19.94 | 0.06 | 0.02 |
15 | 20.10 | -1.10 | -0.35 |
We see that all standardized residuals have magnitude less than 2,
hence there are no outliers.
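The outlier check can be reproduced with a short script. This is a sketch under two assumptions: that each standardized residual in the table is the raw residual divided by the sample standard deviation of the residuals (n - 1 in the denominator), which matches the tabulated values, and that the cutoff for an outlier is a magnitude of 2 or more, as used in the answer.

```python
# Data copied from the question.
years = [20, 21, 25, 18, 19, 17, 24, 14, 17, 18, 25, 23, 22, 20, 19]
height = [67, 60, 74, 74, 73, 70, 69, 69, 63, 70, 59, 67, 65, 69, 68]

# Rounded coefficients from the regression output.
INTERCEPT, SLOPE = 30.8534, -0.1581

# Residual = observed years minus predicted years.
residuals = [y - (INTERCEPT + SLOPE * x) for x, y in zip(height, years)]

# Sample standard deviation of the residuals (assumed convention, n - 1 denominator).
n = len(residuals)
mean_res = sum(residuals) / n
sd_res = (sum((e - mean_res) ** 2 for e in residuals) / (n - 1)) ** 0.5

standardized = [e / sd_res for e in residuals]

# Flag any observation whose standardized residual has magnitude 2 or more.
flagged = [(i + 1, round(z, 2)) for i, z in enumerate(standardized) if abs(z) >= 2]

for i, (e, z) in enumerate(zip(residuals, standardized), start=1):
    print(f"{i:2d}  residual = {e:6.2f}  standardized = {z:5.2f}")
print("Outliers:", flagged if flagged else "none")
```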
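The correlation coefficient is r = -0.2240.
Hypotheses: H0: ρ = 0 (no linear correlation) versus Ha: ρ ≠ 0.
Distribution for the test: Student's t with n - 2 = 13 degrees of freedom.
Test statistic: t = -0.8286, with p-value = 0.4223.
The p-value means that, if H0 were true, there would be about a 42% chance of seeing a sample correlation at least this far from zero.
The correlation is not significant, since the p-value of the slope = 0.4223 > 0.05 (alpha),
so we fail to reject the null hypothesis.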
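The test statistic and p-value can be recovered from r and n alone. The sketch below uses only the standard library: it computes t = r·sqrt(n - 2)/sqrt(1 - r²) and then approximates the two-sided p-value by integrating the Student's t density with Simpson's rule. The numerical integration is a stand-in chosen for illustration, not the method the spreadsheet uses, and the function names are hypothetical.

```python
import math

# r has the magnitude of "Multiple R" and the sign of the slope (negative).
r = -0.223969286
n = 15
df = n - 2  # degrees of freedom for the test of H0: rho = 0

# Test statistic for the correlation coefficient.
t_stat = r * math.sqrt(df) / math.sqrt(1 - r * r)


def t_pdf(t, nu):
    """Density of Student's t distribution with nu degrees of freedom."""
    coef = math.gamma((nu + 1) / 2) / (math.sqrt(nu * math.pi) * math.gamma(nu / 2))
    return coef * (1 + t * t / nu) ** (-(nu + 1) / 2)


def two_sided_p(t, nu, steps=2000):
    """Two-sided p-value via Simpson's rule on [0, |t|]; steps must be even."""
    b = abs(t)
    h = b / steps
    total = t_pdf(0.0, nu) + t_pdf(b, nu)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, nu)
    central_area = 2 * (total * h / 3)  # area between -|t| and +|t|
    return 1 - central_area


p_value = two_sided_p(t_stat, df)
alpha = 0.05
print(f"t = {t_stat:.4f}, p-value = {p_value:.4f}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```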
For a 1 inch increase in height, the best fit line predicts a -0.1581 year increase in time, that is, a decrease of about 0.1581 years.