In: Statistics and Probability
1. Run a simple regression using payroll in thousands to predict workers compensation premiums in thousands with a 99% confidence interval.
a. Please explain to the risk manager how useful this model is. (1 point)
b. Interpret the relationship between the independent variable and the dependent variable in terms of the coefficients. (1 point)
2. Run a multiple regression using both number of payroll in thousands and the indicator variable manufacturing to predict workers compensation premiums.
a. Please explain how useful this model is to the risk manager. (1 point)
b.Write out the estimated regression equation in terms of Y = a+b1X1+b2X2 (1 point)
c. Are both independent variables useful in predicting workers compensation premiums? Please explain why or why not? (2 points)
d. Interpret the relationship between each independent variable and the dependent variable in terms of the coefficients. (2 points)
e. Explain specifically how confident you are with regard to each of the coefficients of the model (using 95% confident interval). (2 points)
3. Run a multiple regression using all three independent variables: payroll, manufacturing, and metropolitan. Are all three variables useful in predicting workers compensation premiums? Please explain why or why not. (1.5 points)
4. Compare all three models - which one is the best? Please explain why. (1 point)
5. Using the best model, predict the manufacturing company’s workers compensation premiums assuming payroll of $850,000 and that the company is a manufacturing company. (1.5 points)
Data:
Company | WC premium in thousands | Payroll in thousands | Manufacturing | Metropolitan |
A | 7 | 380 | 0 | 0 |
B | 7.3 | 410 | 0 | 0 |
C | 7.8 | 443 | 0 | 1 |
D | 8.2 | 480 | 0 | 0 |
E | 8.5 | 520 | 0 | 1 |
F | 9.2 | 566 | 0 | 0 |
G | 9.9 | 616 | 1 | 0 |
H | 10.6 | 672 | 0 | 0 |
I | 11.4 | 733 | 1 | 1 |
J | 12.2 | 802 | 0 | 1 |
K | 12.9 | 878 | 1 | 0 |
L | 13.5 | 963 | 0 | 1 |
M | 14.5 | 1057 | 1 | 1 |
N | 15.6 | 1161 | 1 | 0 |
O | 16.8 | 1277 | 1 | 1 |
P | 17.3 | 1405 | 1 | 0 |
Q | 18.2 | 1548 | 1 | 0 |
R | 19.8 | 1706 | 1 | 1 |
S | 20.5 | 1882 | 1 | 1 |
T | 21.5 | 2077 | 1 | 0 |
U | 23 | 2293 | 1 | 1 |
V | 24.1 | 2534 | 1 | 1 |
1
we enter the data in excel and then goto data > data analysis tab and select regression
the results are
SUMMARY OUTPUT | ||||||||
Regression Statistics | ||||||||
Multiple R | 0.987940576 | |||||||
R Square | 0.976026582 | |||||||
Adjusted R Square | 0.974827911 | |||||||
Standard Error | 0.857386831 | |||||||
Observations | 22 | |||||||
ANOVA | ||||||||
df | SS | MS | F | Significance F | ||||
Regression | 1 | 598.5705 | 598.5705 | 814.2573 | 1.11708E-17 | |||
Residual | 20 | 14.70224 | 0.735112 | |||||
Total | 21 | 613.2727 | ||||||
Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% | Lower 99.0% | Upper 99.0% | |
Intercept | 4.976298951 | 0.367747 | 13.53187 | 1.58E-11 | 4.209192801 | 5.743405 | 3.929935 | 6.022663 |
Payroll in thousands | 0.008208885 | 0.000288 | 28.53519 | 1.12E-17 | 0.007608804 | 0.008809 | 0.00739 | 0.009027 |
useful
we see the r2 value is 0.9760 , this means that the model is able to explain 97.6% variation in the data , which is a very good model
Interpretation
we see the coefficient is 0.0082 for Payroll
this means that for every unit increase in the value of payroll the premium increases by 0.0082 units
Please note that we can answer only 1 question at a time , as per the answering guidelines