a. An experiment was performed on a certain
metal to determine if the strength is a function of heating time
(hours). Results based on 25 metal sheets are given below. Use the
simple linear regression model.
∑X = 50
∑X² = 200
∑Y = 75
∑Y² = 1600
∑XY = 400
Find the estimated y intercept and slope. Write the equation of the
least squares regression line and explain the coefficients.
Estimate Y when X is equal to 4 hours. Also determine the standard
error, the Mean Square Error, the coefficient of determination and
the coefficient of correlation. Check the relation between
correlation coefficient and Coefficient of Determination. Test the
significance of the slope.
b. Consumer Reports provided extensive testing and ratings for more than 100 HDTVs. An overall score, based primarily on picture quality, was developed for each model. In general, a higher overall score indicates better performance. The following (hypothetical) data show the price and overall score for the ten 42-inch plasma televisions (Consumer Report data slightly changed here):
Brand | Price, $ (X) | Score (Y)
Dell | 3800 | 50
Hisense | 2800 | 45
Hitachi | 2700 | 35
JVC | 3000 | 40
LG | 3500 | 45
Maxent | 2000 | 28
Panasonic | 4000 | 57
Phillips | 3200 | 48
Proview | 2000 | 22
Samsung | 3000 | 30
Use the above data to develop an estimated regression equation. Compute the coefficient of determination and the correlation coefficient, and show their relation. Interpret the explanatory power of the model. Estimate the overall score for a 42-inch plasma television with a price of $3600, and perform a significance test for the slope.
a)
∑x = 50
∑y = 75
∑xy = 400
∑x² = 200
∑y² = 1600
Sample size, n = 25
x̅ = ∑x/n = 50/25 = 2
y̅ = ∑y/n = 75/25 = 3
SSxx = ∑x² - (∑x)²/n = 200 - (50)²/25 = 100
SSyy = ∑y² - (∑y)²/n = 1600 - (75)²/25 = 1375
SSxy = ∑xy - (∑x)(∑y)/n = 400 - (50)(75)/25 = 250
Slope, b = SSxy/SSxx = 250/100 = 2.5
y-intercept, a = y̅ -b* x̅ = 3 - (2.5)*2 = -2
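Regression equation:
ŷ = -2 + (2.5) x
Interpretation: the slope b = 2.5 means the estimated strength increases by 2.5 units for each additional hour of heating. The intercept a = -2 is the estimated strength at x = 0 hours; it anchors the line and should not be read literally, since a negative strength is not physical.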
Predicted value of y at x = 4
ŷ = -2 + (2.5) * 4 = 8
Sum of squared errors, SSE = SSyy - SSxy²/SSxx = 1375 - (250)²/100 = 750
Standard error of the estimate, se = √(SSE/(n-2)) = √(750/(25-2)) = 5.71040
Mean square error, MSE = SSE/(n-2) = 750/(25-2) = 32.6087
Correlation coefficient, r = SSxy/√(SSxx*SSyy) = 250/√(100*1375) = 0.6742
Coefficient of determination, r² = (SSxy)²/(SSxx*SSyy) = (250)²/(100*1375) = 0.4545
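Relation: the correlation coefficient is the square root of the coefficient of determination, with the sign of the slope: r = +√r² = √0.4545 = 0.6742 (positive because b = 2.5 > 0).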
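The part (a) arithmetic above can be reproduced with a short Python sketch. It assumes only the summary statistics given in the problem; the variable names (`ss_xx`, `std_error`, and so on) are illustrative choices, not part of the original solution.

```python
import math

# Summary statistics given in the problem (25 metal sheets)
n = 25
sum_x, sum_y = 50, 75
sum_x2, sum_y2, sum_xy = 200, 1600, 400

# Corrected sums of squares and cross-products
ss_xx = sum_x2 - sum_x ** 2 / n
ss_yy = sum_y2 - sum_y ** 2 / n
ss_xy = sum_xy - sum_x * sum_y / n

# Least squares estimates and the prediction at x = 4 hours
slope = ss_xy / ss_xx
intercept = sum_y / n - slope * sum_x / n
y_hat_4 = intercept + slope * 4

# Error measures
sse = ss_yy - ss_xy ** 2 / ss_xx
mse = sse / (n - 2)
std_error = math.sqrt(mse)

# Correlation and determination; r_squared uses the SSE form,
# which is algebraically equal to ss_xy**2 / (ss_xx * ss_yy)
r = ss_xy / math.sqrt(ss_xx * ss_yy)
r_squared = 1 - sse / ss_yy

print(slope, intercept, y_hat_4)
print(round(sse, 4), round(mse, 4), round(std_error, 4))
print(round(r, 4), round(r_squared, 4))
```

Computing r² as 1 - SSE/SSyy rather than by squaring r gives an independent check on the relation between the two coefficients.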
Slope Hypothesis test:
Null and alternative hypothesis:
Ho: β₁ = 0 ; Ha: β₁ ≠ 0
α = 0.05
Slope, b = 2.5
Test statistic:
t = b/(se/√SSxx) = 2.5/(5.7104/√100) = 4.3780
df = n-2 = 23
p-value = T.DIST.2T(ABS(4.378), 23) = 0.0002
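Conclusion:
Since the p-value (0.0002) < α (0.05), reject the null hypothesis. There is sufficient evidence that the slope differs from zero, i.e., that strength is linearly related to heating time.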
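The slope test for part (a) can also be checked without a spreadsheet. The sketch below is a standard-library stand-in for Excel's T.DIST.2T: it integrates the Student's t density numerically with Simpson's rule. The helper names `t_pdf` and `two_tailed_p` are hypothetical, introduced only for this illustration.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t_stat, df, steps=20000):
    """Two-tailed p-value: 1 - 2 * P(0 < T < |t|), via Simpson's rule.

    steps must be even for Simpson's rule.
    """
    upper = abs(t_stat)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_half = total * h / 3
    return 1 - 2 * central_half

# Values from part (a)
n = 25
slope = 2.5
std_error = math.sqrt(750 / (n - 2))   # se = sqrt(SSE / (n - 2))
ss_xx = 100

t_stat = slope / (std_error / math.sqrt(ss_xx))
p_value = two_tailed_p(t_stat, n - 2)
alpha = 0.05

print(round(t_stat, 4), p_value, p_value < alpha)
```

A statistics library would normally supply the t distribution directly; the numerical integration here only keeps the example dependency-free.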
-----------------------------------------------------------------------------------------------------
b)
∑x = 30000
∑y = 400
∑xy = 1259600
∑x² = 94060000
∑y² = 17096
Sample size, n = 10
x̅ = ∑x/n = 30000/10 = 3000
y̅ = ∑y/n = 400/10 = 40
SSxx = ∑x² - (∑x)²/n = 94060000 - (30000)²/10 = 4060000
SSyy = ∑y² - (∑y)²/n = 17096 - (400)²/10 = 1096
SSxy = ∑xy - (∑x)(∑y)/n = 1259600 - (30000)(400)/10 = 59600
Slope, b = SSxy/SSxx = 59600/4060000 = 0.0146798
y-intercept, a = y̅ - b*x̅ = 40 - (0.0146798)*3000 = -4.0394089
Regression equation :
ŷ = -4.0394 + (0.0147) x
Predicted value of y at x = 3600
ŷ = -4.0394 + (0.0146798) * 3600 = 48.8079 (computed with the unrounded slope)
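Correlation coefficient, r = SSxy/√(SSxx*SSyy) = 59600/√(4060000*1096) = 0.8935
Coefficient of determination, r² = (SSxy)²/(SSxx*SSyy) = (59600)²/(4060000*1096) = 0.7983
Relation: r = +√r² = √0.7983 = 0.8935 (positive because the slope is positive).
Explanatory power: about 79.83% of the variation in overall score is explained by the linear relationship with price.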
Sum of squared errors, SSE = SSyy - SSxy²/SSxx = 1096 - (59600)²/4060000 = 221.083744
Standard error of the estimate, se = √(SSE/(n-2)) = √(221.08374/(10-2)) = 5.25694
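For part (b), the sums themselves come from the raw table, so a sketch that starts from the ten data points is a useful check on the whole chain. The list names `prices` and `scores` are illustrative assumptions; the values are those in the question.

```python
import math

# Price (X, dollars) and overall score (Y) for the ten televisions
prices = [3800, 2800, 2700, 3000, 3500, 2000, 4000, 3200, 2000, 3000]
scores = [50, 45, 35, 40, 45, 28, 57, 48, 22, 30]
n = len(prices)

# Raw sums
sum_x, sum_y = sum(prices), sum(scores)
sum_xy = sum(x * y for x, y in zip(prices, scores))
sum_x2 = sum(x * x for x in prices)
sum_y2 = sum(y * y for y in scores)

# Corrected sums of squares and cross-products
ss_xx = sum_x2 - sum_x ** 2 / n
ss_yy = sum_y2 - sum_y ** 2 / n
ss_xy = sum_xy - sum_x * sum_y / n

# Least squares line and the prediction at a price of $3600
slope = ss_xy / ss_xx
intercept = sum_y / n - slope * sum_x / n
y_hat_3600 = intercept + slope * 3600

# Fit quality
r = ss_xy / math.sqrt(ss_xx * ss_yy)
r_squared = r ** 2
sse = ss_yy - ss_xy ** 2 / ss_xx
std_error = math.sqrt(sse / (n - 2))

print(sum_x, sum_y, sum_xy, sum_x2, sum_y2)
print(round(slope, 7), round(intercept, 4), round(y_hat_3600, 4))
print(round(r, 4), round(r_squared, 4), round(sse, 4), round(std_error, 4))
```

Keeping the slope unrounded until the final step matters here: with the slope rounded to 0.0147, the prediction at $3600 drifts to about 48.88 instead of 48.8079.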
Slope Hypothesis test:
Null and alternative hypothesis:
Ho: β₁ = 0 ; Ha: β₁ ≠ 0
α = 0.05
Test statistic:
t = b/(se/√SSxx) = 0.0146798/(5.25694/√4060000) = 5.6266
df = n-2 = 8
p-value = T.DIST.2T(ABS(5.6266), 8) = 0.0005
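Conclusion:
Since the p-value (0.0005) < α (0.05), reject the null hypothesis. There is sufficient evidence that the slope differs from zero, i.e., that overall score is linearly related to price.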
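The same decision for part (b) can be cross-checked with the critical-value form of the test, which is an alternative to the p-value approach used above and not part of the original solution. The sketch assumes the two-tailed 5% critical value for 8 degrees of freedom, 2.306, read from a standard t table.

```python
import math

# Quantities from part (b)
n = 10
ss_xx, ss_yy, ss_xy = 4060000, 1096, 59600

slope = ss_xy / ss_xx
sse = ss_yy - ss_xy ** 2 / ss_xx
std_error = math.sqrt(sse / (n - 2))

# Standard error of the slope and the test statistic
se_slope = std_error / math.sqrt(ss_xx)
t_stat = slope / se_slope

# Assumption: t table value for alpha = 0.05 (two-tailed), df = 8
t_critical = 2.306
reject_null = abs(t_stat) > t_critical

print(round(t_stat, 4), reject_null)
```

Both forms of the test lead to the same conclusion, because |t| exceeding the critical value is equivalent to the p-value falling below α.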