Administrators of a computer system are gathering data to try to explain the number of interruptions in their network. For a sample of 22 days, they measured the number of interruptions per day and the daily usage (measured by the average number of users of the system per hour of that day). Fit an appropriate regression model to predict the number of interruptions based on the usage. Assess the fit of the model. Formally assess whether usage is a significant predictor of mean interruptions, providing numerical justification (test statistic and P-value) for your conclusion. Carefully interpret what the estimated model tells you about how the expected number of interruptions changes as the daily usage changes. Predict the expected number of interruptions for a day that has 150 users per hour on average, using a point estimate and a 95% interval.
DATA four;
INPUT interruptions usage;
cards;
0 104.2
2 124.6
5 176.3
6 169.3
1 104.6
2 115.8
3 127.8
6 179.4
8 210.5
4 126.7
0 100.5
1 119.5
1 123.8
0 106.4
4 156.7
3 148.2
5 156.2
6 167.3
8 198.2
2 124.6
3 145.9
4 156.2
;
run;
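As a cross-check on the model the question asks for, the least-squares fit can be sketched in plain Python with interruptions as the response and usage as the predictor. This is a minimal sketch under stated assumptions rather than the procedure used in this answer: the variable names are illustrative, only the standard library is used, and the P-value is left out because the standard library has no t-distribution.

```python
import math

# (interruptions, usage) pairs copied from the data step above
data = [
    (0, 104.2), (2, 124.6), (5, 176.3), (6, 169.3), (1, 104.6), (2, 115.8),
    (3, 127.8), (6, 179.4), (8, 210.5), (4, 126.7), (0, 100.5), (1, 119.5),
    (1, 123.8), (0, 106.4), (4, 156.7), (3, 148.2), (5, 156.2), (6, 167.3),
    (8, 198.2), (2, 124.6), (3, 145.9), (4, 156.2),
]

y = [row[0] for row in data]  # response: interruptions per day
x = [row[1] for row in data]  # predictor: average users per hour
n = len(data)

x_bar = sum(x) / n
y_bar = sum(y) / n

# Corrected sums of squares and cross-products
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Least-squares estimates for: interruptions = intercept + slope * usage
slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Fit assessment and the t statistic for H0: slope = 0
sse = syy - slope * sxy             # residual sum of squares
r_squared = 1 - sse / syy
s = math.sqrt(sse / (n - 2))        # residual standard error
se_slope = s / math.sqrt(sxx)
t_stat = slope / se_slope           # compare with a t distribution on n - 2 df

print(f"intercept = {intercept:.4f}, slope = {slope:.5f}")
print(f"R^2 = {r_squared:.4f}, s = {s:.4f}, t = {t_stat:.4f} on {n - 2} df")
```

In a simple linear regression the slope's t statistic and R^2 do not depend on which variable is treated as the response, but the intercept, slope, and residual standard error do.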
Using Excel > Add-ins > PHStat > Regression, we have:
Regression Analysis
Regression Statistics
Multiple R | 0.957817698
R Square | 0.917414742
Adjusted R Square | 0.913285479
Standard Error | 9.278932038
Observations | 22
ANOVA
Source | df | SS | MS | F | Significance F
Regression | 1 | 19128.8634 | 19128.8634 | 222.1739715 | 2.70383E-12
Residual | 20 | 1721.971595 | 86.09857976 | |
Total | 21 | 20850.835 | | |
Term | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95%
Intercept | 101.5836195 | 3.402697427 | 29.85385026 | 4.61625E-18 | 94.485717 | 108.6815219
usage | 12.2683834 | 0.82307754 | 14.90550138 | 2.70383E-12 | 10.55147374 | 13.98529307
Confidence Interval Estimate
Data
X Value | 150
Confidence Level | 95%
Intermediate Calculations
Sample Size | 22
Degrees of Freedom | 20
t Value | 2.085963
XBar, Sample Mean of X | 3.363636
Sum of Squared Differences from XBar | 127.0909
Standard Error of the Estimate | 9.278932
h Statistic | 169.2332
Predicted Y (YHat) | 1941.841
For Average Y
Interval Half Width | 251.7952
Confidence Interval Lower Limit | 1690.046
Confidence Interval Upper Limit | 2193.636
For Individual Response Y
Interval Half Width | 252.538
Prediction Interval Lower Limit | 1689.303
Prediction Interval Upper Limit | 2194.379
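Let y = interruptions (the response) and x = usage (the predictor).
The PHStat output above was produced with the two variables reversed: its "XBar" of 3.3636 is the mean number of interruptions per day, and its intercept of 101.58 is on the usage scale, so its coefficients describe usage as a function of interruptions. In a simple linear regression, R Square, the t statistic for the slope, and its P-value are the same in either direction, but the coefficients and the interval at X = 150 are not.
Refitting the same data with interruptions as the response gives the regression model y = -7.3185 + 0.0748 x.
Fit: R Square = 0.9174, so usage explains about 91.7% of the variation in daily interruptions, and the residual standard error is about 0.72 interruptions.
For the predictor usage, the test statistic is t = 14.9055 on 20 degrees of freedom, and the P-value is 2.70E-12.
Since the P-value is less than 0.05, we conclude that usage is a significant predictor of mean interruptions.
The estimated model tells us that each additional user per hour (averaged over the day) is associated with an increase of about 0.0748 in the expected number of interruptions, that is, roughly 0.75 more interruptions for every 10 additional users. The negative intercept has no practical meaning, because usage near 0 is far outside the observed range (about 100 to 210 users per hour).
For a day that averages 150 users per hour, the point estimate of the expected number of interruptions is -7.3185 + 0.074779(150) ≈ 3.90, and the 95% confidence interval for the mean is approximately (3.57, 4.23).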
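The interval calculation for a day with 150 users per hour can be sketched in Python, with interruptions as the response and usage as the predictor. This is an illustrative sketch, not the PHStat procedure itself: the names `T_CRIT` and `X_NEW` are assumptions, and the critical value 2.085963 is the t value for 20 degrees of freedom reported in the PHStat output. The interval for the mean response uses the half-width t * s * sqrt(h) with h = 1/n + (x0 - x_bar)^2 / Sxx; the interval for a single day uses sqrt(1 + h) instead.

```python
import math

# (interruptions, usage) pairs copied from the data step
data = [
    (0, 104.2), (2, 124.6), (5, 176.3), (6, 169.3), (1, 104.6), (2, 115.8),
    (3, 127.8), (6, 179.4), (8, 210.5), (4, 126.7), (0, 100.5), (1, 119.5),
    (1, 123.8), (0, 106.4), (4, 156.7), (3, 148.2), (5, 156.2), (6, 167.3),
    (8, 198.2), (2, 124.6), (3, 145.9), (4, 156.2),
]

T_CRIT = 2.085963  # two-sided 95% t value on 20 df, taken from the PHStat output
X_NEW = 150.0      # usage level at which to predict (users per hour)

y = [row[0] for row in data]  # response: interruptions per day
x = [row[1] for row in data]  # predictor: average users per hour
n = len(data)

x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Least-squares fit of interruptions on usage
slope = sxy / sxx
intercept = y_bar - slope * x_bar
s = math.sqrt((syy - slope * sxy) / (n - 2))  # residual standard error

# Point estimate and the h statistic at X_NEW
y_hat = intercept + slope * X_NEW
h = 1 / n + (X_NEW - x_bar) ** 2 / sxx

# 95% confidence interval for the mean response at X_NEW
ci_half = T_CRIT * s * math.sqrt(h)
ci = (y_hat - ci_half, y_hat + ci_half)

# 95% prediction interval for a single day's count at X_NEW
pi_half = T_CRIT * s * math.sqrt(1 + h)
pi = (y_hat - pi_half, y_hat + pi_half)

print(f"point estimate: {y_hat:.3f}")
print(f"95% CI for the mean: ({ci[0]:.3f}, {ci[1]:.3f})")
print(f"95% PI for one day:  ({pi[0]:.3f}, {pi[1]:.3f})")
```

Because the question asks for the expected number of interruptions, the confidence interval for the mean is the relevant one; the prediction interval is wider because it also covers day-to-day variation around that mean.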