The president of a company that manufactures car seats has been concerned about the number and cost of machine breakdowns. The problem is that the machines are old and becoming quite unreliable. However, the cost of replacing them is quite high, and the president is not certain that the cost can be made up in today’s slow economy. To help make a decision about replacement, he gathered data about last month’s repair costs in dollars (y) and the ages in months (x) of the plant’s 20 welding machines, as recorded below:
Age X (months) | Cost Y ($) |
110 | 655.34 |
113 | 753.36 |
114 | 785.04 |
134 | 886.28 |
93 | 685.24 |
141 | 952.32 |
115 | 649.48 |
115 | 677.96 |
115 | 866.9 |
142 | 1052.74 |
96 | 724.84 |
139 | 897.52 |
89 | 670.54 |
93 | 701.88 |
91 | 583.62 |
109 | 935.6 |
138 | 948.96 |
83 | 708.3 |
100 | 840.22 |
137 | 832.08 |
a) Find the estimated regression equation to predict the repair cost of a machine from its age.
b) Interpret the slope in the estimated regression equation.
c) Find and interpret the correlation between the repair cost of a machine and its age.
d) Find the coefficient of determination, and discuss what this statistic tells you.
Ages X | Cost Y | (x-x̅)(y-y̅) | (x-x̅)² | (y-y̅)² |
---|---|---|---|---|
110 | 655.34 | 452.48785 | 11.2225 | 18244.17504 |
113 | 753.36 | 12.96785 | 0.1225 | 1372.776601 |
114 | 785.04 | -3.49115 | 0.4225 | 28.847641 |
134 | 886.28 | 1979.69485 | 426.4225 | 9190.865161 |
93 | 685.24 | 2140.22985 | 414.1225 | 11060.93924 |
141 | 952.32 | 4476.78385 | 764.5225 | 26214.52428 |
115 | 649.48 | -232.53615 | 2.7225 | 19861.54676 |
115 | 677.96 | -185.54415 | 2.7225 | 12645.2274 |
115 | 866.9 | 126.20685 | 2.7225 | 5850.567121 |
142 | 1052.74 | 7515.72585 | 820.8225 | 68816.50424 |
96 | 724.84 | 1137.65685 | 301.0225 | 4299.556041 |
139 | 897.52 | 2747.34585 | 657.9225 | 11472.33788 |
89 | 670.54 | 2918.85885 | 592.9225 | 14369.05664 |
93 | 701.88 | 1801.60585 | 414.1225 | 7837.737961 |
91 | 583.62 | 4621.77885 | 499.5225 | 42762.51768 |
109 | 935.6 | -631.57215 | 18.9225 | 21079.84572 |
138 | 948.96 | 3908.23285 | 607.6225 | 25137.7854 |
83 | 708.3 | 2492.06885 | 921.1225 | 6742.216321 |
100 | 840.22 | -664.95015 | 178.2225 | 2480.936481 |
137 | 832.08 | 985.47185 | 559.3225 | 1736.305561 |
Total: 2267 | 15808.22 | 35599.023 | 7196.55 | 311204.2692 |
Mean: 113.35 | 790.411 | 1779.95115 | 359.8275 | 15560.21346 |
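The column totals in the table can be checked with a few lines of Python. This is only a verification sketch: the variable names (`ages`, `costs`, `s_xy`, `s_xx`, `s_yy`) are chosen here for illustration and do not come from the original solution.

```python
# Data from the problem statement: age in months (x) and repair cost in dollars (y).
ages = [110, 113, 114, 134, 93, 141, 115, 115, 115, 142,
        96, 139, 89, 93, 91, 109, 138, 83, 100, 137]
costs = [655.34, 753.36, 785.04, 886.28, 685.24, 952.32, 649.48, 677.96,
         866.9, 1052.74, 724.84, 897.52, 670.54, 701.88, 583.62, 935.6,
         948.96, 708.3, 840.22, 832.08]

n = len(ages)
x_bar = sum(ages) / n
y_bar = sum(costs) / n

# Sums of cross-products and squared deviations (the "Total" row).
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(ages, costs))
s_xx = sum((x - x_bar) ** 2 for x in ages)
s_yy = sum((y - y_bar) ** 2 for y in costs)

print(f"means:  x = {x_bar:.2f}, y = {y_bar:.3f}")
print(f"totals: S_xy = {s_xy:.3f}, S_xx = {s_xx:.2f}, S_yy = {s_yy:.4f}")
# Dividing each total by n gives the "Mean" row: Cov(X, Y), Var(X), Var(Y).
print(f"means of columns: {s_xy / n:.5f}, {s_xx / n:.4f}, {s_yy / n:.5f}")
```

The Mean row divides by n = 20 rather than n - 1, so those entries are population versions of the covariance and variances. The slope and the correlation are ratios of these quantities, so the choice of divisor does not affect them.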
a) The regression equation is estimated in the form
y = a + bx
where a and b are estimated by
b = Cov(X, Y) / Var(X)
a = y̅ - b x̅
Now, according to the Mean row of the table,
Cov(X, Y) = 1779.95115 Var(X) = 359.8275 Var(Y) = 15560.21346
[ The detailed calculations are shown in the table ]
Therefore b = 1779.95115 / 359.8275 = 4.946679
a = 790.411 - (1779.95115 * 113.35) / 359.8275 = 229.7049
Thus the regression equation becomes
y = 229.7 + 4.95x
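As a cross-check on part (a), the least-squares coefficients can be computed directly from the raw data. The sketch below uses a helper named `fit_line`, which is a name introduced here for illustration. The prediction at 120 months is likewise only an illustrative input, chosen because it lies inside the observed age range of 83 to 142 months.

```python
def fit_line(xs, ys):
    """Return (intercept a, slope b) of the least-squares line y = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b = s_xy / s_xx          # slope
    a = y_bar - b * x_bar    # intercept
    return a, b

ages = [110, 113, 114, 134, 93, 141, 115, 115, 115, 142,
        96, 139, 89, 93, 91, 109, 138, 83, 100, 137]
costs = [655.34, 753.36, 785.04, 886.28, 685.24, 952.32, 649.48, 677.96,
         866.9, 1052.74, 724.84, 897.52, 670.54, 701.88, 583.62, 935.6,
         948.96, 708.3, 840.22, 832.08]

a, b = fit_line(ages, costs)
print(f"a = {a:.4f}, b = {b:.6f}")

# Illustrative prediction for a hypothetical 120-month-old machine.
predicted_cost = a + b * 120
print(f"predicted monthly repair cost at 120 months: {predicted_cost:.2f}")
```

Predictions for ages well outside the observed range would be extrapolations and should be treated with caution.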
b) The slope is 4.95, which means that each additional month of age is associated with an estimated increase of about $4.95 in a machine's monthly repair cost (and each month less with a decrease of the same amount).
c) The correlation coefficient is found by the following formula
r = Cov(X, Y) / √( Var(X) * Var(Y) )
From the calculations done above we get the value of r to be
r = 1779.95115 / √( 359.8275 * 15560.21346 ) = 0.752234
From this value it can be said that x and y are fairly strongly and positively correlated. In other words, there is a high positive correlation between the repair cost and the age of a machine: older machines tend to have higher repair costs, and newer machines tend to have lower ones.
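The correlation in part (c) can be reproduced from the raw data with the standard Pearson formula. The sketch below uses only the standard library, and the variable names are illustrative.

```python
import math

ages = [110, 113, 114, 134, 93, 141, 115, 115, 115, 142,
        96, 139, 89, 93, 91, 109, 138, 83, 100, 137]
costs = [655.34, 753.36, 785.04, 886.28, 685.24, 952.32, 649.48, 677.96,
         866.9, 1052.74, 724.84, 897.52, 670.54, 701.88, 583.62, 935.6,
         948.96, 708.3, 840.22, 832.08]

n = len(ages)
x_bar = sum(ages) / n
y_bar = sum(costs) / n

s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(ages, costs))
s_xx = sum((x - x_bar) ** 2 for x in ages)
s_yy = sum((y - y_bar) ** 2 for y in costs)

# Pearson correlation: the 1/n factors in Cov and Var cancel,
# so the raw sums can be used directly.
r = s_xy / math.sqrt(s_xx * s_yy)
print(f"r = {r:.6f}")
```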
d) The coefficient of determination is equal to r² = (0.752234)² = 0.5659
This means that 56.59% of the total variability in repair cost (Y) can be accounted for by its linear regression on age (X). R² is a statistical measure of how well the fitted regression line (y = 229.7 + 4.95x) explains the data. The remaining 43.41% of the variability in repair cost is not explained by age and is due to other factors or random variation.
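The "share of variability explained" reading can be checked by computing R² from the residuals as 1 - SSE/SST, which for a simple linear regression equals r². The sketch below assumes the coefficients from part (a), a = 229.7049 and b = 4.946679. The names `sse` and `sst` are introduced here for illustration.

```python
ages = [110, 113, 114, 134, 93, 141, 115, 115, 115, 142,
        96, 139, 89, 93, 91, 109, 138, 83, 100, 137]
costs = [655.34, 753.36, 785.04, 886.28, 685.24, 952.32, 649.48, 677.96,
         866.9, 1052.74, 724.84, 897.52, 670.54, 701.88, 583.62, 935.6,
         948.96, 708.3, 840.22, 832.08]

# Coefficients taken from part (a) of the solution.
a, b = 229.7049, 4.946679

y_bar = sum(costs) / len(costs)
predicted = [a + b * x for x in ages]

# SSE: variation left unexplained by the line; SST: total variation in cost.
sse = sum((y - p) ** 2 for y, p in zip(costs, predicted))
sst = sum((y - y_bar) ** 2 for y in costs)

r_squared = 1 - sse / sst
print(f"R^2 = {r_squared:.4f}")
print(f"unexplained share = {sse / sst:.4f}")
```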