A researcher studied the correlation between physical characteristics of sisters and brothers. Here are data on the heights (in inches) of 11 adult brother-sister pairs.
Brother | 71 | 69 | 66 | 67 | 70 | 71 | 70 | 73 | 72 | 65 | 66 |
Sister | 69 | 63 | 65 | 63 | 65 | 62 | 65 | 64 | 66 | 59 | 62 |
a) Find the correlation and the equation of the least squares line for predicting sister’s height from brother’s height.
b) Carlos is 72 inches tall. Predict the height of his sister.
c) Based on the scatterplot and correlation r, do you expect your prediction to be very accurate? Why?
d) Is there evidence of a significant linear association between physical characteristics of sisters and brothers (at the α = 0.05 level)?
e) What is the meaning of R^2 in this example? Describe it so that a non-statistician can understand.
a)
The following table shows the calculations:
Pair | Brother, X | Sister, Y | X^2 | Y^2 | XY |
1 | 71 | 69 | 5041 | 4761 | 4899 |
2 | 69 | 63 | 4761 | 3969 | 4347 |
3 | 66 | 65 | 4356 | 4225 | 4290 |
4 | 67 | 63 | 4489 | 3969 | 4221 |
5 | 70 | 65 | 4900 | 4225 | 4550 |
6 | 71 | 62 | 5041 | 3844 | 4402 |
7 | 70 | 65 | 4900 | 4225 | 4550 |
8 | 73 | 64 | 5329 | 4096 | 4672 |
9 | 72 | 66 | 5184 | 4356 | 4752 |
10 | 65 | 59 | 4225 | 3481 | 3835 |
11 | 66 | 62 | 4356 | 3844 | 4092 |
Total | 760 | 703 | 52582 | 44995 | 48610 |
Sample size: n = 11
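Now,
SSxx = ΣX^2 - (ΣX)^2/n = 52582 - (760)^2/11 = 72.9091
SSyy = ΣY^2 - (ΣY)^2/n = 44995 - (703)^2/11 = 66.9091
SSxy = ΣXY - (ΣX)(ΣY)/n = 48610 - (760)(703)/11 = 39.0909
The coefficient of correlation is:
r = SSxy / sqrt(SSxx * SSyy) = 39.0909 / sqrt(72.9091 * 66.9091) = 0.5597
The slope of the regression equation is
b = SSxy / SSxx = 39.0909 / 72.9091 = 0.5362
and the intercept of the equation will be
a = (ΣY - b * ΣX)/n = (703 - 0.5362 * 760)/11 = 26.8625
So the regression equation will be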
y'=26.8625+0.5362x
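The hand calculation for part a) can be cross-checked with a few lines of Python. This is only a sketch using the standard library; the function name `least_squares` and the variable names are illustrative choices, not part of the original solution.

```python
from math import sqrt

# Heights in inches, copied from the data table in the question.
brother = [71, 69, 66, 67, 70, 71, 70, 73, 72, 65, 66]
sister = [69, 63, 65, 63, 65, 62, 65, 64, 66, 59, 62]


def least_squares(x, y):
    """Return (r, slope, intercept) using the sums-of-squares formulas."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    ss_xx = sum(v * v for v in x) - sum_x ** 2 / n
    ss_yy = sum(v * v for v in y) - sum_y ** 2 / n
    ss_xy = sum(a * b for a, b in zip(x, y)) - sum_x * sum_y / n
    r = ss_xy / sqrt(ss_xx * ss_yy)
    slope = ss_xy / ss_xx
    intercept = (sum_y - slope * sum_x) / n
    return r, slope, intercept


r, slope, intercept = least_squares(brother, sister)
print(f"r = {r:.4f}, slope = {slope:.4f}, intercept = {intercept:.4f}")
```

Because the code keeps the slope unrounded, its intercept comes out near 26.865 rather than 26.8625; the small difference arises from rounding the slope to 0.5362 before computing the intercept by hand, and it does not change the prediction to one decimal place.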
b)
The predicted value for x = 72 is
y'=26.8625+0.5362 *72 = 65.4689
Answer: 65.5 inches
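As a quick check of the arithmetic in part b), the fitted line can be wrapped in a small function. The name `predict_sister_height` is hypothetical, and the default coefficients are the rounded values from the regression equation above.

```python
def predict_sister_height(brother_height, intercept=26.8625, slope=0.5362):
    """Predicted sister height (inches) from the line y' = intercept + slope * x."""
    return intercept + slope * brother_height


carlos_height = 72  # inches, as given in the question
prediction = predict_sister_height(carlos_height)
print(f"Predicted height of Carlos's sister: {prediction:.1f} inches")
```

Carlos's height of 72 inches lies inside the range of brothers' heights in the data (65 to 73 inches), so the line is not being extrapolated here.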
c)
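[Scatter plot of sister's height (Y) against brother's height (X)]
The scatter plot and the correlation coefficient (r ≈ 0.56) show a positive, moderate linear relationship between the variables. Because the relationship is only moderate, the points scatter widely around the line, so the prediction is not expected to be very accurate.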
d)
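Hypotheses are:
H0: ρ = 0 (there is no linear association between brothers' and sisters' heights)
Ha: ρ ≠ 0 (there is a linear association)
Degrees of freedom: df = n - 2 = 9
The critical value of r at the 0.05 level (two-tailed) with df = 9 is 0.602.
Since r = 0.5597 < 0.602, the relationship between the variables is not significant.
That is, there is not sufficient evidence of a significant linear association between physical characteristics (here, heights) of sisters and brothers at the 0.05 level.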
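The same decision can be reached with the equivalent t test for a correlation coefficient. The sketch below assumes the two-tailed critical value t = 2.262 for df = 9 from a standard t table; the variable names are illustrative.

```python
from math import sqrt

# Heights in inches, copied from the data table in the question.
brother = [71, 69, 66, 67, 70, 71, 70, 73, 72, 65, 66]
sister = [69, 63, 65, 63, 65, 62, 65, 64, 66, 59, 62]

n = len(brother)
mean_x = sum(brother) / n
mean_y = sum(sister) / n
ss_xx = sum((x - mean_x) ** 2 for x in brother)
ss_yy = sum((y - mean_y) ** 2 for y in sister)
ss_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(brother, sister))
r = ss_xy / sqrt(ss_xx * ss_yy)

df = n - 2
# Test statistic for H0: rho = 0.
t_stat = r * sqrt(df) / sqrt(1 - r ** 2)

# Assumed table value: two-tailed critical t at alpha = 0.05 with df = 9.
t_crit = 2.262

# The same cutoff expressed on the r scale (matches the 0.602 used above).
r_crit = t_crit / sqrt(t_crit ** 2 + df)

print(f"t = {t_stat:.3f}, critical t = {t_crit}")
print(f"r = {r:.4f}, critical r = {r_crit:.3f}")
print("significant" if abs(t_stat) > t_crit else "not significant")
```

Comparing t with its critical value and comparing r with 0.602 are two forms of the same test, so they always lead to the same conclusion.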
e)
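The R-square is:
R^2 = r^2 = (0.5597)^2 = 0.3133
That is, 31.33% of the variation in the dependent variable, sister's height, is explained by the independent variable, brother's height. In plain terms, knowing a brother's height accounts for a little under a third of the differences in height among the sisters; the rest is due to other factors.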
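To make the "share of variation explained" reading concrete, R^2 can also be computed directly as 1 - SSE/SST, where SST is the total variation in sisters' heights and SSE is what remains after using the fitted line. This is a sketch with illustrative variable names, not part of the original solution.

```python
# Heights in inches, copied from the data table in the question.
brother = [71, 69, 66, 67, 70, 71, 70, 73, 72, 65, 66]
sister = [69, 63, 65, 63, 65, 62, 65, 64, 66, 59, 62]

n = len(brother)
mean_x = sum(brother) / n
mean_y = sum(sister) / n

# Least squares fit of sister's height on brother's height.
ss_xx = sum((x - mean_x) ** 2 for x in brother)
ss_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(brother, sister))
slope = ss_xy / ss_xx
intercept = mean_y - slope * mean_x
fitted = [intercept + slope * x for x in brother]

# Total variation, and the variation left unexplained by the line.
sst = sum((y - mean_y) ** 2 for y in sister)
sse = sum((y - f) ** 2 for y, f in zip(sister, fitted))
r_squared = 1 - sse / sst

print(f"SST = {sst:.2f}, SSE = {sse:.2f}, R^2 = {r_squared:.4f}")
```

Computed without intermediate rounding, R^2 is about 0.3132; the 31.33% quoted above comes from squaring the rounded value r = 0.5597.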