Age (in years) | # of Free throws made (out of 100) |
20 | 30 |
22 | 36 |
26 | 28 |
28 | 20 |
33 | 25 |
33 | 15 |
38 | 10 |
42 | 25 |
49 | 8 |
54 | 15 |
55 | 22 |
55 | 18 |
57 | 35 |
60 | 12 |
Provide the equation of the linear regression line (3 pts):
Y = a + bX
Sum of x is 572
Sum of y is 299
Mean of x is 40.8571
Mean of y is 21.3571
Sum of squares is 2575.7143
Sum of products is -669.2857
Provide the coefficient of correlation (2 pts):
How many free throws would you expect a 42-year-old to make (2 pts):
What is the residual for the 42-year-old we sampled (2 pts):
Solution:
The regression equation has the form
Y = a + bX
where
Y is the dependent variable, i.e. the number of free throws made
X is the independent variable, i.e. age in years
a is the Y-intercept of the line
b is the slope of the regression line
Slope of regression line can be calculated as
Slope = (n*Σ(Xi*Yi) - (ΣXi)*(ΣYi)) / (n*Σ(Xi^2) - (ΣXi)^2)
The sums are taken from the last row of the following table (n = 14):
Age(X) | No. of Free throws made(Y) | X^2 | Y^2 | XY |
20 | 30 | 400 | 900 | 600 |
22 | 36 | 484 | 1296 | 792 |
26 | 28 | 676 | 784 | 728 |
28 | 20 | 784 | 400 | 560 |
33 | 25 | 1089 | 625 | 825 |
33 | 15 | 1089 | 225 | 495 |
38 | 10 | 1444 | 100 | 380 |
42 | 25 | 1764 | 625 | 1050 |
49 | 8 | 2401 | 64 | 392 |
54 | 15 | 2916 | 225 | 810 |
55 | 22 | 3025 | 484 | 1210 |
55 | 18 | 3025 | 324 | 990 |
57 | 35 | 3249 | 1225 | 1995 |
60 | 12 | 3600 | 144 | 720 |
Sum: 572 | 299 | 25946 | 7421 | 11547 |
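The column totals in the last row of the table can be re-derived with a few lines of Python. This is only a checking sketch; the variable names (`ages`, `made`, `sum_x`, and so on) are illustrative and do not come from the original question.

```python
# Ages (X) and free throws made out of 100 (Y), copied from the table.
ages = [20, 22, 26, 28, 33, 33, 38, 42, 49, 54, 55, 55, 57, 60]
made = [30, 36, 28, 20, 25, 15, 10, 25, 8, 15, 22, 18, 35, 12]

n = len(ages)                                    # number of players
sum_x = sum(ages)                                # sum of X
sum_y = sum(made)                                # sum of Y
sum_x2 = sum(x * x for x in ages)                # sum of X^2
sum_y2 = sum(y * y for y in made)                # sum of Y^2
sum_xy = sum(x * y for x, y in zip(ages, made))  # sum of X*Y

print(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy)
# prints: 14 572 299 25946 7421 11547
```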
Slope = ((14*11547) - (572*299))/((14*25946) - (572*572)) =
-9370/36060 = -0.2598, or about -0.26
Intercept of regression line a = (ΣYi - Slope*ΣXi)/n
= (299 - (-0.2598*572))/14 = 31.97
(using the unrounded slope -9370/36060 = -0.2598)
So regression equation is
Y = 31.97 - 0.26*X
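As a rough check of the slope and intercept, the sketch below plugs the table totals into the same formulas. It also compares the slope with the ratio of the "sum of products" to the "sum of squares" given in the question, on the assumption that those two figures are the mean-centered sums for X*Y and X^2. All variable names are illustrative.

```python
# Totals taken from the last row of the table.
n = 14
sum_x, sum_y = 572, 299
sum_x2, sum_xy = 25946, 11547

# Least-squares slope and intercept from the raw sums.
slope = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
intercept = (sum_y - slope * sum_x) / n

# Cross-check (assumption: the question's "sum of products" and
# "sum of squares" are the centered sums Sxy and Sxx).
slope_check = -669.2857 / 2575.7143

print(round(slope, 4), round(intercept, 2), round(slope_check, 4))
# prints: -0.2598 31.97 -0.2598
```

Both routes give the same slope, which suggests the sums supplied in the question are consistent with the table.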
Solution(b)
Coefficient of correlation can be calculated as
r = (n*Σ(Xi*Yi) - (ΣXi)*(ΣYi)) / sqrt((n*Σ(Xi^2) - (ΣXi)^2) * (n*Σ(Yi^2) - (ΣYi)^2))
= ((14*11547) - (572*299)) / sqrt(((14*25946) - (572*572)) * ((14*7421) - (299*299)))
= -9370/sqrt(36060*14493) = -0.41
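The correlation coefficient can also be checked directly from the data. The sketch below uses the mean-centered form of the formula, which is algebraically equivalent to the raw-sum form used in this solution; the names `s_xy`, `s_xx` and `s_yy` are illustrative.

```python
import math

# Ages (X) and free throws made out of 100 (Y), copied from the table.
ages = [20, 22, 26, 28, 33, 33, 38, 42, 49, 54, 55, 55, 57, 60]
made = [30, 36, 28, 20, 25, 15, 10, 25, 8, 15, 22, 18, 35, 12]

mean_x = sum(ages) / len(ages)
mean_y = sum(made) / len(made)

# Centered sums of products and squares.
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ages, made))
s_xx = sum((x - mean_x) ** 2 for x in ages)
s_yy = sum((y - mean_y) ** 2 for y in made)

r = s_xy / math.sqrt(s_xx * s_yy)
print(round(r, 2))  # prints: -0.41
```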
Solution(c)
From the regression equation Y = 31.97 - 0.26*X, the expected number of
free throws at X = 42 is
Y = 31.97 - 0.26*42 = 21.05, which rounds to 21 free throws.
Solution(d)
Residual for the 42-year-old = No. of free throws from actual data -
No. of free throws predicted from regression equation = 25 - 21.05
= 3.95, or about 4
The positive residual means the sampled 42-year-old made about 4 more
free throws than the regression line predicts.
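The prediction and residual for the 42-year-old can be reproduced with the rounded coefficients from the regression equation. This sketch assumes the common convention residual = observed - predicted; with the opposite order of subtraction only the sign changes. The function name `predict_free_throws` is hypothetical.

```python
# Rounded coefficients from the regression equation Y = 31.97 - 0.26*X.
INTERCEPT = 31.97
SLOPE = -0.26

def predict_free_throws(age):
    """Predicted free throws made (out of 100) for a given age."""
    return INTERCEPT + SLOPE * age

observed = 25                        # the sampled 42-year-old made 25
predicted = predict_free_throws(42)
residual = observed - predicted      # convention: observed - predicted

print(f"{predicted:.2f} {residual:.2f}")
# prints: 21.05 3.95
```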