Suppose data were collected on the number of customers who visited a grocery store on randomly selected days before and after the governor of the state declared a lockdown due to COVID-19. A sample of 6 days before the lockdown was chosen, as well as 6 randomly chosen days after the lockdown was in place. The number of shoppers each day was as follows:
Before lockdown | After lockdown |
100 | 60 |
110 | 50 |
115 | 70 |
120 | 90 |
145 | 40 |
130 | 50 |
These are interval/ratio data: each value is a count of shoppers on a given day, which is a numeric measurement with a true zero.
statistic | x | y | SSxx = Σ(x-x̅)² | SSyy = Σ(y-ȳ)² | SSxy = Σ(x-x̅)(y-ȳ) |
total sum | 720.00 | 360.00 | 1250.00 | 1600.00 | -550.00 |
mean | 120.00 | 60.00 | | | |
Sample size, n = 6
Here, x̅ = Σx / n = 120.000
ȳ = Σy / n = 60.000
SSxx = Σ(x-x̅)² = 1250.0000
SSxy= Σ(x-x̅)(y-ȳ) = -550.0
Estimated slope, β1 = SSxy / SSxx = -550 / 1250 = -0.4400
Intercept, β0 = ȳ - β1 * x̅ = 60 - (-0.44) * 120 = 112.8000
Regression line: Ŷ = 112.800 + (-0.440) * x
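The slope and intercept can be checked with a few lines of Python. This is only a minimal sketch using the twelve values from the table; the variable names (before, after, ss_xx, and so on) are illustrative and not part of the original answer.

```python
# Shopper counts copied from the data table.
before = [100, 110, 115, 120, 145, 130]  # x: days before the lockdown
after = [60, 50, 70, 90, 40, 50]         # y: days after the lockdown

n = len(before)
x_bar = sum(before) / n
y_bar = sum(after) / n

# Corrected sum of squares for x and sum of cross-products.
ss_xx = sum((x - x_bar) ** 2 for x in before)
ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(before, after))

# Least-squares estimates of the slope and intercept.
slope = ss_xy / ss_xx
intercept = y_bar - slope * x_bar

print(f"x_bar = {x_bar:.3f}, y_bar = {y_bar:.3f}")
print(f"SSxx = {ss_xx:.1f}, SSxy = {ss_xy:.1f}")
print(f"Y-hat = {intercept:.3f} + ({slope:.3f}) * x")
```

The printed values should agree with the hand calculation, up to rounding.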
SSE = (SSxx * SSyy - SSxy²) / SSxx = (1250 * 1600 - (-550)²) / 1250 = 1358.0000
Std error, Se = √(SSE / (n-2)) = √(1358 / 4) = 18.4255
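As a cross-check, SSE can also be computed directly as the sum of squared residuals and compared with the shortcut formula. The sketch below assumes the same data as the table; the names sse_shortcut, sse_direct and s_e are made up for illustration.

```python
import math

before = [100, 110, 115, 120, 145, 130]  # x values
after = [60, 50, 70, 90, 40, 50]         # y values

n = len(before)
x_bar = sum(before) / n
y_bar = sum(after) / n

ss_xx = sum((x - x_bar) ** 2 for x in before)
ss_yy = sum((y - y_bar) ** 2 for y in after)
ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(before, after))

slope = ss_xy / ss_xx
intercept = y_bar - slope * x_bar

# Shortcut formula: SSE = (SSxx * SSyy - SSxy^2) / SSxx
sse_shortcut = (ss_xx * ss_yy - ss_xy ** 2) / ss_xx

# Direct definition: sum of squared residuals about the fitted line.
sse_direct = sum((y - (intercept + slope * x)) ** 2
                 for x, y in zip(before, after))

# Standard error of the estimate, with n - 2 degrees of freedom.
s_e = math.sqrt(sse_shortcut / (n - 2))

print(f"SSE (shortcut) = {sse_shortcut:.4f}")
print(f"SSE (residuals) = {sse_direct:.4f}")
print(f"Se = {s_e:.4f}")
```

Both routes should give the same SSE apart from floating-point noise, which is a quick way to catch an arithmetic slip in the hand calculation.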
ANOVA table
variation | SS |
regression | 242.00 |
error | 1358.00 |
total | 1600.00 |
Eta square = SS regression / SS total = 242.00 / 1600.00 = 0.1513
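The partition of the total variation and the eta square ratio can be reproduced as follows. This is a sketch under the same assumptions as the calculation above (total SS taken as SSyy, regression SS taken as SSxy² / SSxx); the variable names are illustrative.

```python
before = [100, 110, 115, 120, 145, 130]  # x values
after = [60, 50, 70, 90, 40, 50]         # y values

n = len(before)
x_bar = sum(before) / n
y_bar = sum(after) / n

ss_xx = sum((x - x_bar) ** 2 for x in before)
ss_yy = sum((y - y_bar) ** 2 for y in after)
ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(before, after))

# Partition of the total variation in y.
ss_total = ss_yy
ss_regression = ss_xy ** 2 / ss_xx
ss_error = ss_total - ss_regression

# Share of the total variation accounted for by the regression.
eta_squared = ss_regression / ss_total

print(f"regression SS = {ss_regression:.2f}")
print(f"error SS      = {ss_error:.2f}")
print(f"total SS      = {ss_total:.2f}")
print(f"eta squared   = {eta_squared:.5f}")
```

With a single predictor, this ratio is the same quantity as the squared correlation coefficient r².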