A student wonders if people of similar heights tend to date each other. She measures herself, her roommate and the women in the adjoining rooms; then she measures the next man each woman dates. Here are the data (in inches):
Women | 66 | 64 | 66 | 65 | 70 | 65 | 60 | 70 | 72 | 63 |
Men | 72 | 68 | 70 | 68 | 71 | 65 | 64 | 66 | 70 | 69 |
A. What is the least squares regression line of male height on female height? Graph it on a scatterplot. Make sure all parts are appropriately labeled.
B. Use your results from part A to predict the height of Jill’s next date if she is 68 inches tall.
C. What is the correlation of the data? What does the correlation describe?
D. Are there any influential outliers in the data set?
Women, X | Men, Y | XY | X² | Y² |
66 | 72 | 4752 | 4356 | 5184 |
64 | 68 | 4352 | 4096 | 4624 |
66 | 70 | 4620 | 4356 | 4900 |
65 | 68 | 4420 | 4225 | 4624 |
70 | 71 | 4970 | 4900 | 5041 |
65 | 65 | 4225 | 4225 | 4225 |
60 | 64 | 3840 | 3600 | 4096 |
70 | 66 | 4620 | 4900 | 4356 |
72 | 70 | 5040 | 5184 | 4900 |
63 | 69 | 4347 | 3969 | 4761 |
Ʃx = | Ʃy = | Ʃxy = | Ʃx² = | Ʃy² = |
661 | 683 | 45186 | 43811 | 46711 |
Sample size, n = | 10 |
x̅ = Ʃx/n = 661/10 = | 66.1 |
y̅ = Ʃy/n = 683/10 = | 68.3 |
SSxx = Ʃx² - (Ʃx)²/n = 43811 - (661)²/10 = | 118.9 |
SSyy = Ʃy² - (Ʃy)²/n = 46711 - (683)²/10 = | 62.1 |
SSxy = Ʃxy - (Ʃx)(Ʃy)/n = 45186 - (661)(683)/10 = | 39.7 |
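The column sums and the three sums of squares above can be checked with a few lines of Python. This is only a sketch of the same shortcut formulas; the variable names (`women`, `men`, `ss_xx`, and so on) are chosen here for illustration.

```python
# Heights in inches, copied from the data table above.
women = [66, 64, 66, 65, 70, 65, 60, 70, 72, 63]
men = [72, 68, 70, 68, 71, 65, 64, 66, 70, 69]

n = len(women)
sum_x = sum(women)
sum_y = sum(men)
sum_xy = sum(x * y for x, y in zip(women, men))
sum_x2 = sum(x * x for x in women)
sum_y2 = sum(y * y for y in men)

# Shortcut formulas: SSxx = Σx² - (Σx)²/n, and similarly for SSyy and SSxy.
ss_xx = sum_x2 - sum_x ** 2 / n
ss_yy = sum_y2 - sum_y ** 2 / n
ss_xy = sum_xy - sum_x * sum_y / n

print(round(ss_xx, 1), round(ss_yy, 1), round(ss_xy, 1))  # 118.9 62.1 39.7
```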
A) Slope, b = SSxy/SSxx = 39.7/118.9 = 0.33389403
y-intercept, a = y̅ -b* x̅ = 68.3 - (0.33389)*66.1 = 46.2296047
Least squares regression line:
ŷ = 46.2296 + (0.3339) x
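Scatterplot: (figure not included; plot men's height, y, against women's height, x, both in inches, and draw the line ŷ = 46.2296 + 0.3339x through the points)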
B) Predicted value of y at x = 68
ŷ = 46.2296 + (0.3339) * 68 = 68.93 inches
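As a cross-check on parts A and B, the slope, intercept, and prediction can be recomputed from the raw data. The sketch below uses the deviation form of the sums of squares, which is algebraically the same as the shortcut form used above; the helper name `fit_line` is made up for this example.

```python
# Heights in inches, copied from the data table above.
women = [66, 64, 66, 65, 70, 65, 60, 70, 72, 63]
men = [72, 68, 70, 68, 71, 65, 64, 66, 70, 69]


def fit_line(xs, ys):
    """Return (intercept, slope) of the least squares line of ys on xs."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    ss_xx = sum((x - x_bar) ** 2 for x in xs)
    ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = ss_xy / ss_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


intercept, slope = fit_line(women, men)
predicted = intercept + slope * 68  # Jill is 68 inches tall

print(f"y-hat = {intercept:.4f} + {slope:.4f} x")
print(f"predicted height at x = 68: {predicted:.2f} inches")
```

Since the data are in inches, the prediction is in inches as well.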
C) Correlation coefficient, r = SSxy/√(SSxx*SSyy) = 39.7/√(118.9*62.1) = 0.4620
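The correlation describes the direction and strength of the linear relationship between the two variables. Here r = 0.4620 indicates a weak positive linear relationship between a woman's height and the height of the man she dates: taller women tend to date somewhat taller men, but with a lot of scatter.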
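The correlation in part C can be reproduced the same way. This is a minimal sketch with variable names chosen for illustration; it also prints r², the share of the variation in men's heights accounted for by the fitted line, which the answer above does not use.

```python
import math

# Heights in inches, copied from the data table above.
women = [66, 64, 66, 65, 70, 65, 60, 70, 72, 63]
men = [72, 68, 70, 68, 71, 65, 64, 66, 70, 69]

n = len(women)
x_bar = sum(women) / n
y_bar = sum(men) / n

ss_xx = sum((x - x_bar) ** 2 for x in women)
ss_yy = sum((y - y_bar) ** 2 for y in men)
ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(women, men))

# r = SSxy / sqrt(SSxx * SSyy)
r = ss_xy / math.sqrt(ss_xx * ss_yy)

print(f"r = {r:.4f}")
print(f"r squared = {r * r:.4f}")
```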
D) Yes, there are influential outliers in the data set.
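One way to examine part D is to compute each point's residual, leverage, and Cook's distance for the fitted line. The sketch below does this by hand; the function name `diagnostics` is hypothetical, and the cutoffs mentioned in the comments (leverage above 2p/n, Cook's distance above 4/n) are common rules of thumb assumed here, not criteria taken from the original answer.

```python
# Heights in inches, copied from the data table above.
women = [66, 64, 66, 65, 70, 65, 60, 70, 72, 63]
men = [72, 68, 70, 68, 71, 65, 64, 66, 70, 69]


def diagnostics(xs, ys):
    """Return (x, y, residual, leverage, cooks_d) for each point of a simple linear fit."""
    n = len(xs)
    p = 2  # fitted parameters: intercept and slope
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    ss_xx = sum((x - x_bar) ** 2 for x in xs)
    ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = ss_xy / ss_xx
    intercept = y_bar - slope * x_bar

    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    mse = sum(e * e for e in residuals) / (n - p)  # residual mean square

    rows = []
    for x, y, e in zip(xs, ys, residuals):
        leverage = 1 / n + (x - x_bar) ** 2 / ss_xx
        cooks_d = (e * e) / (p * mse) * leverage / (1 - leverage) ** 2
        rows.append((x, y, e, leverage, cooks_d))
    return rows


rows = diagnostics(women, men)

# Assumed rule-of-thumb cutoffs: leverage > 2p/n = 0.4, Cook's distance > 4/n = 0.4.
for x, y, e, h, d in rows:
    print(f"({x}, {y})  residual={e:+.2f}  leverage={h:.3f}  cooks_d={d:.3f}")
```

On these data the point with the largest leverage is the 60-inch woman, (60, 64), at about 0.41. Whether any point counts as influential depends on the cutoff chosen, so the printed values are best read alongside the scatterplot.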