Consider the following data for 15 subjects with two predictors. The dependent variable, MARK, is the total score for a subject on an examination. The first predictor, COMP, is the score for the subject on a so-called compulsory paper. The other predictor, CERTIF, is the score for the subject on a previous exam.
Student | MARK | COMP | CERTIF
1 | 476 | 111 | 68
2 | 457 | 92 | 46
3 | 540 | 90 | 50
4 | 551 | 107 | 59
5 | 575 | 98 | 50
6 | 698 | 150 | 66
7 | 545 | 118 | 54
8 | 574 | 110 | 51
9 | 645 | 117 | 59
10 | 556 | 94 | 97
11 | 634 | 130 | 57
12 | 637 | 118 | 51
13 | 390 | 91 | 44
14 | 562 | 118 | 61
15 | 560 | 109 | 66
a. Run a stepwise regression on the dataset.
b. Does CERTIF add anything to predicting MARK, above and beyond that of COMP?
c. Write out the prediction equation.
d. A statistician wishes to know the sample size needed in a multiple regression study. She has four predictors and can tolerate at most a .10 drop-off in predictive power. But she wants this to be the case with .95 probability. From previous related research, the estimated squared population multiple correlation is .62. How many subjects are needed?
a. Run a stepwise regression on the dataset.
Load the data into Excel.
Go to Data > MegaStat.
Select the option Correlation/Regression and go to Regression.
Select COMP and CERTIF as the independent variable(s), x.
Select MARK as the dependent variable, y.
Click OK.
The output, with both predictors entered in the model, is as follows:
R² | 0.583
Adjusted R² | 0.514
R | 0.764
Std. Error | 54.463
n | 15
k | 2
Dep. Var. | MARK

ANOVA table
Source | SS | df | MS | F | p-value
Regression | 49,791.1563 | 2 | 24,895.5782 | 8.39 | .0052
Residual | 35,594.8437 | 12 | 2,966.2370 | |
Total | 85,386.0000 | 14 | | |

Regression output
variables | coefficients | std. error | t (df=12) | p-value | 95% CI lower | 95% CI upper
Intercept | 124.0641 | | | | |
COMP | 3.5120 | 0.8975 | 3.913 | .0021 | 1.5566 | 5.4674
CERTIF | 0.8346 | 1.1346 | 0.736 | .4761 | -1.6375 | 3.3068
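The regression output can be cross-checked without Excel. The following is a minimal pure-Python sketch, assuming the data table in the question is transcribed correctly. The helper names (`mean`, `cross`, `fit_two_predictors`) are illustrative and are not part of MegaStat. It solves the least-squares normal equations for two predictors directly.

```python
# Data transcribed from the table in the question.
mark = [476, 457, 540, 551, 575, 698, 545, 574, 645, 556, 634, 637, 390, 562, 560]
comp = [111, 92, 90, 107, 98, 150, 118, 110, 117, 94, 130, 118, 91, 118, 109]
certif = [68, 46, 50, 59, 50, 66, 54, 51, 59, 97, 57, 51, 44, 61, 66]


def mean(values):
    return sum(values) / len(values)


def cross(u, v):
    """Sum of cross-products of deviations from the means."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))


def fit_two_predictors(y, x1, x2):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2 (hypothetical helper)."""
    s11, s22, s12 = cross(x1, x1), cross(x2, x2), cross(x1, x2)
    s1y, s2y, syy = cross(x1, y), cross(x2, y), cross(y, y)
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    b0 = mean(y) - b1 * mean(x1) - b2 * mean(x2)
    n, k = len(y), 2
    ss_reg = b1 * s1y + b2 * s2y          # regression sum of squares
    ss_res = syy - ss_reg                 # residual sum of squares
    f_stat = (ss_reg / k) / (ss_res / (n - k - 1))
    return {"b0": b0, "b1": b1, "b2": b2, "r2": ss_reg / syy, "f": f_stat}


fit = fit_two_predictors(mark, comp, certif)
for name, value in fit.items():
    print(f"{name}: {value:.4f}")
```

The printed intercept, slopes, R² and F should agree with the reported output to the displayed precision.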
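b. Does CERTIF add anything to predicting MARK, above and beyond that of COMP?
No. With COMP already in the model, the CERTIF coefficient of 0.8346 is not statistically significant (t = 0.736, p = .4761), and its 95% confidence interval (-1.6375 to 3.3068) includes zero. CERTIF therefore does not add significantly to the prediction of MARK above and beyond COMP.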
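One way to examine the incremental contribution of CERTIF is to compare the fit with COMP alone against the fit with both predictors. The sketch below assumes the data table in the question and uses illustrative variable names. It computes the change in R² and the partial F statistic for CERTIF, whose square root should match the t value reported for CERTIF in the output.

```python
# Data transcribed from the table in the question.
mark = [476, 457, 540, 551, 575, 698, 545, 574, 645, 556, 634, 637, 390, 562, 560]
comp = [111, 92, 90, 107, 98, 150, 118, 110, 117, 94, 130, 118, 91, 118, 109]
certif = [68, 46, 50, 59, 50, 66, 54, 51, 59, 97, 57, 51, 44, 61, 66]


def cross(u, v):
    """Sum of cross-products of deviations from the means."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))


n = len(mark)
s11, s22, s12 = cross(comp, comp), cross(certif, certif), cross(comp, certif)
s1y, s2y, syy = cross(comp, mark), cross(certif, mark), cross(mark, mark)

# Regression sum of squares with COMP as the only predictor.
ss_reg_comp = s1y ** 2 / s11

# Regression sum of squares with COMP and CERTIF together.
det = s11 * s22 - s12 ** 2
b1 = (s22 * s1y - s12 * s2y) / det
b2 = (s11 * s2y - s12 * s1y) / det
ss_reg_full = b1 * s1y + b2 * s2y

# Residual mean square of the full model (n - 3 degrees of freedom).
ms_res_full = (syy - ss_reg_full) / (n - 3)

r2_change = (ss_reg_full - ss_reg_comp) / syy            # gain in R² from CERTIF
partial_f = (ss_reg_full - ss_reg_comp) / ms_res_full    # 1 numerator df

print(f"R² with COMP only: {ss_reg_comp / syy:.4f}")
print(f"R² change from adding CERTIF: {r2_change:.4f}")
print(f"Partial F for CERTIF: {partial_f:.3f} (t = {partial_f ** 0.5:.3f})")
```

A small R² change together with a partial F well below conventional critical values indicates that CERTIF contributes little once COMP is in the model.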
c. Write out the prediction equation.
The prediction equation is:
MARK = 124.0641 + 3.5120*COMP + 0.8346*CERTIF
In words:
Total examination score = 124.0641 + 3.5120*(compulsory paper score) + 0.8346*(previous exam score)
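As a brief illustration, the prediction equation can be applied to a single subject. The sketch below uses the rounded coefficients from the output; the function name `predict_mark` is hypothetical. It predicts MARK for student 1 (COMP = 111, CERTIF = 68) and compares the prediction with the observed score of 476.

```python
def predict_mark(comp, certif):
    """Predicted MARK from the fitted equation (rounded coefficients)."""
    return 124.0641 + 3.5120 * comp + 0.8346 * certif


predicted = predict_mark(111, 68)   # student 1
residual = 476 - predicted          # observed minus predicted

print(f"Predicted MARK: {predicted:.2f}")
print(f"Residual: {residual:.2f}")
```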
d. A statistician wishes to know the sample size needed in a multiple regression study. She has four predictors and can tolerate at most a .10 drop-off in predictive power. But she wants this to be the case with .95 probability. From previous related research, the estimated squared population multiple correlation is .62. How many subjects are needed?
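This solution uses the sample-size formula for estimating a proportion, treating the squared multiple correlation as p and the tolerated drop-off as the margin of error E. Note that this formula does not take the number of predictors into account, so the result is only a rough approximation:
n = p(1 - p)(z/E)^2
We are given:
p = 0.62 and E = 0.10
For .95 probability, the two-sided critical value is z = 1.96.
Subjects needed = 0.62(1 - 0.62)(1.96/0.10)^2
= 0.2356*(19.6)^2
= 0.2356*384.16
= 90.5, which rounds up to 91
Therefore, we need at least 91 subjects.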
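The arithmetic above can be reproduced in a few lines. This sketch only mirrors the formula used in this solution; the function name `proportion_sample_size` is hypothetical, and it is not a cross-validation shrinkage calculation.

```python
import math


def proportion_sample_size(p, margin, z=1.96):
    """n = p(1 - p)(z / margin)^2, rounded up to the next whole subject."""
    return math.ceil(p * (1 - p) * (z / margin) ** 2)


n_needed = proportion_sample_size(0.62, 0.10)
print(n_needed)
```

Rounding up rather than to the nearest integer ensures the stated margin is not exceeded.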