A research center is investigating the relationship between the height and age of children who are between 5 and 9 years old. To do this, a sample of 15 children is selected, and the data are given below.
Age (in years) | Height (inches) |
7 | 47.3 |
8 | 48.8 |
5 | 41.3 |
8 | 50.4 |
8 | 51 |
7 | 47.1 |
7 | 46.9 |
7 | 48 |
9 | 51.2 |
8 | 51.2 |
5 | 40.3 |
8 | 48.9 |
6 | 45.2 |
5 | 41.9 |
8 | 49.6 |
a. Develop a scatter chart with age as the independent variable. What does the scatter chart indicate about the relationship between the height and age of children?
b. Use the data to develop an estimated regression equation that could be used to estimate the height based on the age. What is the estimated regression model? Round the results to 3 decimal places.
c. How much of the variation in the sample values of height does the model estimated in part (b) explain?
d. What is the predicted value of Y (height in inches) for a 6-year-old? Develop a 95% confidence interval for the predicted value.
e. Interpret the standard error for this problem.
f. State the hypothesis for the significance F and test the hypothesis.
g. State the hypothesis for the significance of the independent variable and test the hypothesis.
a) Scatterplot:
The scatter chart indicates a strong positive, approximately linear relationship between age and height: older children in the sample tend to be taller.
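The direction and strength of the relationship can also be checked numerically with the sample correlation coefficient. The following is a minimal sketch in plain Python; the function name `pearson_r` and the variable names are illustrative and not part of the original solution.

```python
import math

# Sample data from the problem statement (age in years, height in inches).
ages = [7, 8, 5, 8, 8, 7, 7, 7, 9, 8, 5, 8, 6, 5, 8]
heights = [47.3, 48.8, 41.3, 50.4, 51, 47.1, 46.9, 48,
           51.2, 51.2, 40.3, 48.9, 45.2, 41.9, 49.6]


def pearson_r(x, y):
    """Sample correlation r = SSxy / sqrt(SSxx * SSyy)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    ss_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_xx = sum((a - mean_x) ** 2 for a in x)
    ss_yy = sum((b - mean_y) ** 2 for b in y)
    return ss_xy / math.sqrt(ss_xx * ss_yy)


r = pearson_r(ages, heights)
print(f"r = {r:.4f}")
```

A value of r close to +1 is consistent with the upward, nearly straight pattern described for the scatter chart.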
b)
Age, X | Height, Y | XY | X² | Y² |
7 | 47.3 | 331.1 | 49 | 2237.29 |
8 | 48.8 | 390.4 | 64 | 2381.44 |
5 | 41.3 | 206.5 | 25 | 1705.69 |
8 | 50.4 | 403.2 | 64 | 2540.16 |
8 | 51 | 408 | 64 | 2601 |
7 | 47.1 | 329.7 | 49 | 2218.41 |
7 | 46.9 | 328.3 | 49 | 2199.61 |
7 | 48 | 336 | 49 | 2304 |
9 | 51.2 | 460.8 | 81 | 2621.44 |
8 | 51.2 | 409.6 | 64 | 2621.44 |
5 | 40.3 | 201.5 | 25 | 1624.09 |
8 | 48.9 | 391.2 | 64 | 2391.21 |
6 | 45.2 | 271.2 | 36 | 2043.04 |
5 | 41.9 | 209.5 | 25 | 1755.61 |
8 | 49.6 | 396.8 | 64 | 2460.16 |
Ʃx = | Ʃy = | Ʃxy = | Ʃx² = | Ʃy² = |
106 | 709.1 | 5073.8 | 772 | 33704.59 |
Sample size, n = | 15 |
x̅ = Ʃx/n = 106/15 = | 7.06666667 |
y̅ = Ʃy/n = 709.1/15 = | 47.2733333 |
SSxx = Ʃx² - (Ʃx)²/n = 772 - (106)²/15 = | 22.9333333 |
SSyy = Ʃy² - (Ʃy)²/n = 33704.59 - (709.1)²/15 = | 183.069333 |
SSxy = Ʃxy - (Ʃx)(Ʃy)/n = 5073.8 - (106)(709.1)/15 = | 62.8266667 |
Slope, b = SSxy/SSxx = 62.82667/22.93333 = 2.73953488
y-intercept, a = y̅ -b* x̅ = 47.27333 - (2.73953)*7.06667 = 27.9139535
Regression equation:
ŷ = 27.914 + 2.740x
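The hand calculation above can be reproduced from the raw data. This is a minimal sketch in plain Python that follows the same shortcut formulas for SSxx and SSxy; the variable names are illustrative.

```python
# Sample data from the problem statement (age in years, height in inches).
ages = [7, 8, 5, 8, 8, 7, 7, 7, 9, 8, 5, 8, 6, 5, 8]
heights = [47.3, 48.8, 41.3, 50.4, 51, 47.1, 46.9, 48,
           51.2, 51.2, 40.3, 48.9, 45.2, 41.9, 49.6]

n = len(ages)
sum_x = sum(ages)
sum_y = sum(heights)
sum_xy = sum(x * y for x, y in zip(ages, heights))
sum_x2 = sum(x * x for x in ages)

# Shortcut formulas used in the table above.
ss_xx = sum_x2 - sum_x ** 2 / n
ss_xy = sum_xy - sum_x * sum_y / n

slope = ss_xy / ss_xx                        # b = SSxy / SSxx
intercept = sum_y / n - slope * sum_x / n    # a = y-bar - b * x-bar

print(f"y_hat = {intercept:.3f} + {slope:.3f} x")
```

Rounded to 3 decimal places, the printed coefficients should match the regression equation stated above.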
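c) Coefficient of determination, R² = SSxy²/(SSxx × SSyy) = (62.82667)²/(22.93333 × 183.06933) = 0.9402
About 94.0% of the variation in the sample values of height is explained by the model. The slope indicates that each additional year of age increases the predicted height by 2.740 inches.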
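Part (c) asks how much of the variation in height the model explains, which is measured by the coefficient of determination R² = SSxy²/(SSxx × SSyy). The following minimal sketch in plain Python computes it from the raw data; the variable names are illustrative.

```python
# Sample data from the problem statement (age in years, height in inches).
ages = [7, 8, 5, 8, 8, 7, 7, 7, 9, 8, 5, 8, 6, 5, 8]
heights = [47.3, 48.8, 41.3, 50.4, 51, 47.1, 46.9, 48,
           51.2, 51.2, 40.3, 48.9, 45.2, 41.9, 49.6]

n = len(ages)
mean_x = sum(ages) / n
mean_y = sum(heights) / n

ss_xx = sum((x - mean_x) ** 2 for x in ages)
ss_yy = sum((y - mean_y) ** 2 for y in heights)   # total variation in height
ss_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ages, heights))

ssr = ss_xy ** 2 / ss_xx      # variation explained by the regression
r_squared = ssr / ss_yy       # share of total variation explained

print(f"R^2 = {r_squared:.4f}")
```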
d) Predicted value of y at x = 6:
ŷ = 27.914 + (2.7395) * 6 = 44.3512
Significance level, α = 0.05
Critical value, t_c = T.INV.2T(0.05, 13) = 2.1604
Sum of squared errors, SSE = SSyy - SSxy²/SSxx = 183.06933 - (62.82667)²/22.93333 = 10.9534884
Standard error, se = √(SSE/(n-2)) = √(10.95349/(15-2)) = 0.91792
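95% Prediction interval:
ŷ ± t_c × se × √(1 + 1/n + (x₀ - x̅)²/SSxx) = 44.3512 ± 2.1604 × 0.91792 × √(1 + 1/15 + (6 - 7.06667)²/22.93333) = 44.3512 ± 2.0952
Lower limit = 42.2560, upper limit = 46.4464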
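The interval at x = 6 can be checked numerically. Because the question says "confidence interval" while the solution is labeled "prediction interval", this minimal sketch computes both margins: one for the mean height of 6-year-olds and a wider one for an individual 6-year-old. The critical value 2.1604 is copied from the solution above rather than recomputed, and the variable names are illustrative.

```python
import math

# Sample data from the problem statement (age in years, height in inches).
ages = [7, 8, 5, 8, 8, 7, 7, 7, 9, 8, 5, 8, 6, 5, 8]
heights = [47.3, 48.8, 41.3, 50.4, 51, 47.1, 46.9, 48,
           51.2, 51.2, 40.3, 48.9, 45.2, 41.9, 49.6]

n = len(ages)
mean_x = sum(ages) / n
mean_y = sum(heights) / n
ss_xx = sum((x - mean_x) ** 2 for x in ages)
ss_yy = sum((y - mean_y) ** 2 for y in heights)
ss_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ages, heights))

slope = ss_xy / ss_xx
intercept = mean_y - slope * mean_x
sse = ss_yy - ss_xy ** 2 / ss_xx
se = math.sqrt(sse / (n - 2))          # standard error of the estimate

T_CRIT = 2.1604  # taken from the solution: T.INV.2T(0.05, 13)
x0 = 6
y_hat = intercept + slope * x0

# Common term: 1/n + (x0 - x-bar)^2 / SSxx
h = 1 / n + (x0 - mean_x) ** 2 / ss_xx
ci_margin = T_CRIT * se * math.sqrt(h)       # mean height at age 6
pi_margin = T_CRIT * se * math.sqrt(1 + h)   # individual child at age 6

print(f"y_hat = {y_hat:.4f}")
print(f"95% CI for mean: ({y_hat - ci_margin:.4f}, {y_hat + ci_margin:.4f})")
print(f"95% PI for individual: ({y_hat - pi_margin:.4f}, {y_hat + pi_margin:.4f})")
```

The prediction interval is always wider than the confidence interval for the mean because it also accounts for the scatter of individual children around the regression line.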
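e) Standard error, se = √(SSE/(n-2)) = √(10.95349/(15-2)) = 0.91792
Interpretation: the observed heights differ from the heights predicted by the regression equation by about 0.918 inches on average.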
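f) Null and alternative hypothesis:
Ho: β₁ = 0 (the model is not significant)
H1: β₁ ≠ 0 (the model is significant)
With a single independent variable, the F test of overall significance involves only the slope β₁; the intercept β₀ is not part of the hypothesis.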
SSR = SSxy²/SSxx = 172.1158
F =(SSR/1) / (SSE/(n-2)) = (172.1158) / (10.95349/13) = 204.273
P-value = F.DIST.RT(204.273, 1, 13) = 0.0000
p-value < α, so reject the null hypothesis. The regression model is significant.
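The F statistic can be reproduced from the raw data. This minimal sketch in plain Python uses no statistics library, so instead of a p-value it compares F with a critical value. That critical value is an assumption of the sketch: it is obtained by squaring the t critical value 2.1604 quoted earlier in the solution, which works because an F distribution with (1, 13) degrees of freedom is the square of a t distribution with 13 degrees of freedom.

```python
# Sample data from the problem statement (age in years, height in inches).
ages = [7, 8, 5, 8, 8, 7, 7, 7, 9, 8, 5, 8, 6, 5, 8]
heights = [47.3, 48.8, 41.3, 50.4, 51, 47.1, 46.9, 48,
           51.2, 51.2, 40.3, 48.9, 45.2, 41.9, 49.6]

n = len(ages)
mean_x = sum(ages) / n
mean_y = sum(heights) / n
ss_xx = sum((x - mean_x) ** 2 for x in ages)
ss_yy = sum((y - mean_y) ** 2 for y in heights)
ss_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ages, heights))

ssr = ss_xy ** 2 / ss_xx          # regression sum of squares (1 df)
sse = ss_yy - ssr                 # error sum of squares (n - 2 df)
f_stat = (ssr / 1) / (sse / (n - 2))

# Assumed critical value: square of the t critical value from the solution.
F_CRIT = 2.1604 ** 2              # about 4.667 for alpha = 0.05, df = (1, 13)

print(f"F = {f_stat:.3f}, critical value = {F_CRIT:.3f}")
print("Reject Ho" if f_stat > F_CRIT else "Fail to reject Ho")
```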
g) Null and alternative hypothesis:
Ho: β₁ = 0
Ha: β₁ ≠ 0
n=15
α = 0.05
Slope, b = 2.73953
Test statistic:
t = b/(se/√SSxx) = 2.73953/(0.91792/√22.93333) = 14.2924
df = n-2 = 13
p-value = T.DIST.2T(ABS(14.2924), 13) = 0.0000
Conclusion:
p-value < α, so reject the null hypothesis. Age is a significant predictor of height.
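The t statistic for the slope can be checked in the same way. This is a minimal sketch in plain Python; the critical value 2.1604 is copied from the solution above, and the variable names are illustrative.

```python
import math

# Sample data from the problem statement (age in years, height in inches).
ages = [7, 8, 5, 8, 8, 7, 7, 7, 9, 8, 5, 8, 6, 5, 8]
heights = [47.3, 48.8, 41.3, 50.4, 51, 47.1, 46.9, 48,
           51.2, 51.2, 40.3, 48.9, 45.2, 41.9, 49.6]

n = len(ages)
mean_x = sum(ages) / n
mean_y = sum(heights) / n
ss_xx = sum((x - mean_x) ** 2 for x in ages)
ss_yy = sum((y - mean_y) ** 2 for y in heights)
ss_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ages, heights))

slope = ss_xy / ss_xx
sse = ss_yy - ss_xy ** 2 / ss_xx
se = math.sqrt(sse / (n - 2))            # standard error of the estimate
se_slope = se / math.sqrt(ss_xx)         # standard error of the slope
t_stat = slope / se_slope

T_CRIT = 2.1604  # taken from the solution: T.INV.2T(0.05, 13)

print(f"t = {t_stat:.4f}, critical value = {T_CRIT}")
print("Reject Ho" if abs(t_stat) > T_CRIT else "Fail to reject Ho")
```

In simple linear regression the square of this t statistic equals the F statistic from part (f), so the two tests always lead to the same conclusion.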