(please provide detailed solution with formula and figures)
The researcher considers using regression analysis to establish a linear relationship between the two variables – hours worked per week and yearly income.
a) What is the dependent variable and independent variable for this analysis? Why?
b) Use an appropriate plot to investigate the relationship between the two variables. Display the plot. On the same plot, fit a linear trend line including the equation and the coefficient of determination R².
c) Estimate a simple linear regression model and present the estimated linear equation. Display the regression summary table and interpret the intercept and slope coefficient estimates of the linear model.
d) Display and interpret the value of the coefficient of determination, R-squared (R²).
Hours Per Week | Yearly Income ('000's) |
18 | 43.8 |
13 | 44.5 |
18 | 44.8 |
25.5 | 46.0 |
11.5 | 41.2 |
18 | 43.3 |
16 | 43.6 |
27 | 46.2 |
27.5 | 46.8 |
30.5 | 48.2 |
24.5 | 49.3 |
32.5 | 53.8 |
25 | 53.9 |
23.5 | 54.2 |
30.5 | 50.5 |
27.5 | 51.2 |
28 | 51.5 |
26 | 52.6 |
25.5 | 52.8 |
26.5 | 52.9 |
33 | 49.5 |
15 | 49.8 |
27.5 | 50.3 |
36 | 54.3 |
27 | 55.1 |
34.5 | 55.3 |
39 | 61.7 |
37 | 62.3 |
31.5 | 63.4 |
37 | 63.7 |
24.5 | 55.5 |
28 | 55.6 |
19 | 55.7 |
38.5 | 58.2 |
37.5 | 58.3 |
18.5 | 58.4 |
32 | 59.2 |
35 | 59.3 |
36 | 59.4 |
39 | 60.5 |
24.5 | 56.7 |
26 | 57.8 |
38 | 63.8 |
44.5 | 64.2 |
34.5 | 55.8 |
34.5 | 56.2 |
40 | 64.3 |
41.5 | 64.5 |
34.5 | 64.7 |
42.3 | 66.1 |
34.5 | 72.3 |
28 | 73.2 |
38 | 74.2 |
31.5 | 68.5 |
36 | 69.7 |
37.5 | 71.2 |
22 | 66.3 |
33.5 | 66.5 |
37 | 66.7 |
43.5 | 74.8 |
20 | 62.0 |
35 | 57.3 |
24 | 55.3 |
20 | 56.1 |
41 | 61.5 |
a)
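Hours Per Week: this is the independent (explanatory) variable, x, because it is used to explain or predict income and is not itself determined by income.
Yearly Income ('000's): this is the dependent (response) variable, y, because income is expected to change in response to the number of hours worked.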
.................
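b)
The appropriate plot is a scatter plot with Hours Per Week on the horizontal axis and Yearly Income ('000's) on the vertical axis. The points show a moderate positive linear pattern. The fitted linear trend line is Ŷ = 36.8847 + 0.6843*x, with
R² = (SSxy)²/(SSxx · SSyy) = 0.4435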
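A minimal sketch of how the plot and trend line for part b) could be produced in Python. The data lists are copied from the table in the question; the function names `fit_line` and `plot_scatter_with_trend` and the output file name `scatter.png` are hypothetical, and the plotting part assumes matplotlib is installed.

```python
# Data copied from the question's table (65 observations).
hours = [18, 13, 18, 25.5, 11.5, 18, 16, 27, 27.5, 30.5, 24.5, 32.5, 25,
         23.5, 30.5, 27.5, 28, 26, 25.5, 26.5, 33, 15, 27.5, 36, 27, 34.5,
         39, 37, 31.5, 37, 24.5, 28, 19, 38.5, 37.5, 18.5, 32, 35, 36, 39,
         24.5, 26, 38, 44.5, 34.5, 34.5, 40, 41.5, 34.5, 42.3, 34.5, 28, 38,
         31.5, 36, 37.5, 22, 33.5, 37, 43.5, 20, 35, 24, 20, 41]
income = [43.8, 44.5, 44.8, 46.0, 41.2, 43.3, 43.6, 46.2, 46.8, 48.2, 49.3,
          53.8, 53.9, 54.2, 50.5, 51.2, 51.5, 52.6, 52.8, 52.9, 49.5, 49.8,
          50.3, 54.3, 55.1, 55.3, 61.7, 62.3, 63.4, 63.7, 55.5, 55.6, 55.7,
          58.2, 58.3, 58.4, 59.2, 59.3, 59.4, 60.5, 56.7, 57.8, 63.8, 64.2,
          55.8, 56.2, 64.3, 64.5, 64.7, 66.1, 72.3, 73.2, 74.2, 68.5, 69.7,
          71.2, 66.3, 66.5, 66.7, 74.8, 62.0, 57.3, 55.3, 56.1, 61.5]


def fit_line(x, y):
    """Least-squares fit; returns (intercept, slope, r_squared)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_yy = sum((yi - y_bar) ** 2 for yi in y)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = ss_xy / ss_xx
    intercept = y_bar - slope * x_bar
    r_squared = ss_xy ** 2 / (ss_xx * ss_yy)
    return intercept, slope, r_squared


def plot_scatter_with_trend(x, y, path="scatter.png"):
    """Scatter plot with the fitted line; requires matplotlib (assumed installed)."""
    import matplotlib
    matplotlib.use("Agg")  # write to a file, no display needed
    import matplotlib.pyplot as plt

    b0, b1, r2 = fit_line(x, y)
    fig, ax = plt.subplots()
    ax.scatter(x, y)
    ends = [min(x), max(x)]
    ax.plot(ends, [b0 + b1 * v for v in ends], color="red",
            label=f"y = {b1:.4f}x + {b0:.4f}, R² = {r2:.4f}")
    ax.set_xlabel("Hours Per Week")
    ax.set_ylabel("Yearly Income ('000s)")
    ax.legend()
    fig.savefig(path)
    return path


b0, b1, r2 = fit_line(hours, income)
print(f"y = {b1:.4f}x + {b0:.4f}, R^2 = {r2:.4f}")
# To draw the figure: plot_scatter_with_trend(hours, income)
```

The fit is kept separate from the plotting so the numbers can be checked against the regression output without a plotting library.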
........
c)
Quantity | ΣX | ΣY | SSxx = Σ(x-x̅)² | SSyy = Σ(y-ȳ)² | SSxy = Σ(x-x̅)(y-ȳ) |
total sum | 1941.8 | 3726.3 | 4142.505538 | 4374.0 | 2834.77 |
mean | 29.87 | 57.33 | | | |
sample size, n = 65
here, x̅ = Σx / n = 29.87,
ȳ = Σy / n = 57.33
SSxx = Σ(x-x̅)² = 4142.5055
SSxy = Σ(x-x̅)(y-ȳ) = 2834.77
estimated slope, β1 = SSxy / SSxx = 2834.77 / 4142.5055 = 0.6843
intercept, β0 = ȳ - β1 * x̅ = 36.8847
so, the regression line is Ŷ = 36.8847 + 0.6843*x
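The slope and intercept above can be re-checked from the summary sums alone. This is a small sketch using only the totals quoted in the table (n, ΣX, ΣY, SSxx, SSxy); the variable names are illustrative.

```python
# Summary values taken from the table above.
n = 65
sum_x, sum_y = 1941.8, 3726.3
ss_xx = 4142.505538   # Σ(x - x̅)²
ss_xy = 2834.77       # Σ(x - x̅)(y - ȳ)

x_bar = sum_x / n     # mean hours per week
y_bar = sum_y / n     # mean yearly income ('000s)

slope = ss_xy / ss_xx                # β1 = SSxy / SSxx
intercept = y_bar - slope * x_bar    # β0 = ȳ - β1 * x̅

print(f"x_bar = {x_bar:.2f}, y_bar = {y_bar:.2f}")
print(f"slope = {slope:.4f}, intercept = {intercept:.4f}")
```

The intercept is computed with the unrounded means; plugging in the rounded values 57.33 and 29.87 gives a slightly different last digit.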
.................
Regression Statistics | |
Multiple R | 0.6660 |
R Square | 0.4435 |
Adjusted R Square | 0.4347 |
Standard Error | 6.2159 |
Observations | 65 |
Term | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% |
Intercept | 36.8847 | 2.9863 | 12.3511 | 0.0000 | 30.9169 | 42.8524 |
X | 0.6843 | 0.0966 | 7.0857 | 0.0000 | 0.4913 | 0.8773 |
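intercept: when hours worked per week is 0, the predicted yearly income is 36.8847 thousand. Since x = 0 lies outside the observed range of hours (11.5 to 44.5), this value is an extrapolation and should be interpreted with caution.
slope: for each additional hour worked per week, the predicted yearly income increases by 0.6843 thousand on average. The slope is statistically significant (t = 7.0857, p-value < 0.0001).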
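The standard errors and t statistics in the summary table follow from the same sums. The sketch below uses the textbook simple-regression formulas (s = √(SSE/(n-2)), SE(β1) = s/√SSxx, SE(β0) = s·√(1/n + x̅²/SSxx)); that the spreadsheet output was produced this way is an assumption, and the variable names are illustrative.

```python
import math

# Summary values taken from the answer's table.
n = 65
x_bar = 1941.8 / n
y_bar = 3726.3 / n
ss_xx, ss_yy, ss_xy = 4142.505538, 4374.0, 2834.77

slope = ss_xy / ss_xx
intercept = y_bar - slope * x_bar

# Residual sum of squares and standard error of the regression.
sse = ss_yy - slope * ss_xy
std_error = math.sqrt(sse / (n - 2))

# Standard errors of the coefficient estimates.
se_slope = std_error / math.sqrt(ss_xx)
se_intercept = std_error * math.sqrt(1 / n + x_bar ** 2 / ss_xx)

# t statistics for H0: coefficient = 0.
t_slope = slope / se_slope
t_intercept = intercept / se_intercept

print(f"Standard Error = {std_error:.4f}")
print(f"Intercept: SE = {se_intercept:.4f}, t = {t_intercept:.4f}")
print(f"Slope:     SE = {se_slope:.4f}, t = {t_slope:.4f}")
```

These values should agree with the Standard Error and t Stat columns up to rounding of the inputs.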
..........
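d)
R² = (SSxy)²/(SSxx · SSyy) = (2834.77)²/(4142.5055 × 4374.0) = 0.4435
About 44.35% of the variation in yearly income is explained by its linear relationship with hours worked per week; the remaining 55.65% is due to factors not included in the model.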
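As a quick check, R², Multiple R and Adjusted R² can be recomputed from the sums of squares in the answer. The adjusted R² formula used here, 1 - (1 - R²)(n - 1)/(n - 2), is the standard one for a single predictor; the variable names are illustrative.

```python
import math

# Sums of squares and sample size from the answer.
n = 65
ss_xx, ss_yy, ss_xy = 4142.505538, 4374.0, 2834.77

r_squared = ss_xy ** 2 / (ss_xx * ss_yy)    # coefficient of determination
multiple_r = math.sqrt(r_squared)           # |correlation| between x and y
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - 2)

print(f"R Square          = {r_squared:.4f}")
print(f"Multiple R        = {multiple_r:.4f}")
print(f"Adjusted R Square = {adj_r_squared:.4f}")
print(f"Unexplained share = {1 - r_squared:.2%}")
```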