An experiment is performed to study the fatigue performance of a high strength alloy. The number of cycles to crack initiation is measured for twenty specimens over a range of applied pseudo-stress amplitude (PSA) levels. Use the data in the table provided to fit the following three regression models with y = Cycles and x = PSA (note the natural log transform of y for all models):
PSA (x), Cycles (y)
80, 97379
80, 340084
80, 246163
80, 239348
100, 34346
100, 23834
100, 70423
100, 51851
120, 9139
120, 9487
120, 8094
120, 17956
140, 5640
140, 3338
140, 6170
140, 5608
160, 1723
160, 3525
160, 2655
160, 1732
i. A simple linear regression model: ln(y) = β0 + β1∙x.
ii. A quadratic polynomial model: ln(y) = γ0 + γ1∙x + γ2∙x^2.
iii. A simple linear regression model with a logarithm transformation on PSA: ln(y) = δ0 + δ1∙ln(x).
Solution:-
a.(i).
SUMMARY OUTPUT
Regression Statistics
Multiple R | 0.732833479
R Square | 0.537044908
Adjusted R Square | 0.511325181
Standard Error | 68732.05708
Observations | 20
ANOVA
Source | df | SS | MS | F | Significance F
Regression | 1 | 98642240334 | 9.86E+10 | 20.88066 | 0.000237653
Residual | 18 | 85033722059 | 4.72E+09
Total | 19 | 1.83676E+11
Term | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95%
Intercept | 356881.15 | 66991.72252 | 5.327242 | 4.6E-05 | 216136.7636 | 497625.5364
x | -2482.97 | 543.3746216 | -4.56954 | 0.000238 | -3624.557719 | -1341.382281
Since the p-value of the F-test = 0.000237653 < 0.05,
the overall model is significant.
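The summary statistics in the output above follow directly from the ANOVA sums of squares. The following is a minimal sketch of that arithmetic, using only the values printed in the table; the variable names are illustrative and not part of the original solution.

```python
import math

# Values copied from the ANOVA table for model (i) above.
n_obs = 20
df_regression = 1
ss_regression = 98642240334.0
ss_residual = 85033722059.0

# Degrees of freedom and total sum of squares.
df_residual = n_obs - df_regression - 1
ss_total = ss_regression + ss_residual

# Mean squares are sums of squares divided by their degrees of freedom.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

# Overall F statistic, R^2, adjusted R^2 and residual standard error.
f_stat = ms_regression / ms_residual
r_squared = ss_regression / ss_total
adj_r_squared = 1 - (1 - r_squared) * (n_obs - 1) / df_residual
std_error = math.sqrt(ms_residual)

print(f"F = {f_stat:.5f} on ({df_regression}, {df_residual}) df")
print(f"R^2 = {r_squared:.6f}, adjusted R^2 = {adj_r_squared:.6f}")
print(f"standard error = {std_error:.3f}")
```

The p-value itself needs the upper tail of the F distribution with (1, 18) degrees of freedom; if SciPy is available, `scipy.stats.f.sf(f_stat, 1, 18)` is one way to obtain it.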
a.(ii).
SUMMARY OUTPUT
Regression Statistics
Multiple R | 0.8821044
R Square | 0.778108173
Adjusted R Square | 0.752003252
Standard Error | 48963.48877
Observations | 20
ANOVA
Source | df | SS | MS | F | Significance F
Regression | 2 | 1.4292E+11 | 7.15E+10 | 29.80695 | 2.76822E-06
Residual | 17 | 40756194946 | 2.4E+09
Total | 19 | 1.83676E+11
Term | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95%
Intercept | 1312922.293 | 227524.1028 | 5.770476 | 2.26E-05 | 832888.3964 | 1792956.189
x | -19354.28429 | 3944.850497 | -4.90621 | 0.000133 | -27677.19132 | -11031.37725
x^2 | 70.29714286 | 16.35755352 | 4.297534 | 0.000488 | 35.78572163 | 104.8085641
Since the p-value of the F-test = 2.76822E-06 < 0.05,
the overall model is significant.
b. (iii).
Since the p-value corresponding to γ2 (the coefficient of x^2 in model (ii)) = 0.000488 < 0.05,
γ2 is significantly different from zero.
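The test on the x^2 coefficient can also be checked by hand from the coefficient and its standard error. The sketch below recomputes the t statistic and the 95% confidence interval; the critical value 2.1098 (two-sided 5%, 17 degrees of freedom) is an assumption taken from a standard t-table, and the variable names are illustrative.

```python
# Coefficient and standard error for x^2, copied from the model (ii) output.
coef_x2 = 70.29714286
se_x2 = 16.35755352
df_residual = 17  # 20 observations minus 3 estimated coefficients

# Assumption: two-sided 5% critical value of Student's t with 17 df,
# read from a standard t-table (approximately 2.1098).
T_CRIT_17 = 2.1098

# t statistic for H0: coefficient = 0.
t_stat = coef_x2 / se_x2

# 95% confidence interval: estimate +/- critical value * standard error.
ci_lower = coef_x2 - T_CRIT_17 * se_x2
ci_upper = coef_x2 + T_CRIT_17 * se_x2

# Reject H0 at the 5% level when |t| exceeds the critical value,
# which is equivalent to the interval excluding zero.
reject_null = abs(t_stat) > T_CRIT_17

print(f"t = {t_stat:.6f} on {df_residual} df")
print(f"95% CI = ({ci_lower:.4f}, {ci_upper:.4f})")
print(f"reject H0 at 5%: {reject_null}")
```

The interval excluding zero and the p-value falling below 0.05 are two views of the same decision.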
c.
Output of Model (iii)
SUMMARY OUTPUT
Regression Statistics
Multiple R | 0.973049464
R Square | 0.946825259
Adjusted R Square | 0.943871107
Standard Error | 0.398494771
Observations | 20
ANOVA
Source | df | SS | MS | F | Significance F
Regression | 1 | 50.89583163 | 50.89583 | 320.5066 | 6.46066E-13
Residual | 18 | 2.85836549 | 0.158798
Total | 19 | 53.75419712
Term | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95%
Intercept | 40.67886966 | 1.733510441 | 23.46618 | 6E-15 | 37.03689936 | 44.32083995
ln(x) | -6.513561728 | 0.363831296 | -17.9027 | 6.46E-13 | -7.277942916 | -5.74918054
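Because model (iii) is linear in ln(x) and ln(y), it is a power law on the original scale: y = exp(δ0)∙x^δ1. The sketch below shows how the fitted coefficients could be used to predict cycles at a given PSA; the function names are hypothetical, and the back-transformed value is best read as a typical (median-type) prediction rather than a mean, assuming roughly normal errors on the log scale.

```python
import math

# Fitted coefficients copied from the model (iii) output above.
DELTA_0 = 40.67886966   # intercept
DELTA_1 = -6.513561728  # slope on ln(x)


def predicted_ln_cycles(psa):
    """Predicted ln(cycles) at a given PSA level."""
    return DELTA_0 + DELTA_1 * math.log(psa)


def predicted_cycles(psa):
    """Back-transform the log-scale prediction to cycles."""
    return math.exp(predicted_ln_cycles(psa))


# Predictions at the PSA levels used in the experiment.
for psa in (80, 100, 120, 140, 160):
    print(psa, round(predicted_cycles(psa)))
```

With δ1 close to -6.5, the fitted model implies that a 1% increase in PSA reduces the predicted cycles to crack initiation by roughly 6.5%.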
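Coefficient of determination (R^2) for model (i) = 0.537044908
Coefficient of determination (R^2) for model (ii) = 0.778108173
Coefficient of determination (R^2) for model (iii) = 0.946825259
Hence, on these outputs, model (iii) is the best model: it explains 94.68% of the total variation in the response, the highest of the three.
Note, however, that the outputs shown for models (i) and (ii) were fit with untransformed Cycles as the response (their Total SS is 1.83676E+11, versus 53.75419712 for ln(y) in model (iii)), so their R^2 values are not on the same scale as that of model (iii). A like-for-like comparison requires refitting (i) and (ii) with ln(y) as the response.
All three models are statistically significant at the 5% level.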
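The question asks for all three models to be fit with ln(y) as the response. The following is a minimal sketch of such a fit in plain Python, using ordinary least squares through the normal equations. The helpers `solve` and `fit_ols` are illustrative names and are not the method used to produce the outputs above; for less well-conditioned problems a QR-based routine such as `numpy.linalg.lstsq` would be a safer choice.

```python
import math

# Data from the table in the question: (PSA, cycles to crack initiation).
DATA = [
    (80, 97379), (80, 340084), (80, 246163), (80, 239348),
    (100, 34346), (100, 23834), (100, 70423), (100, 51851),
    (120, 9139), (120, 9487), (120, 8094), (120, 17956),
    (140, 5640), (140, 3338), (140, 6170), (140, 5608),
    (160, 1723), (160, 3525), (160, 2655), (160, 1732),
]


def solve(a, b):
    """Solve a square linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    beta = [0.0] * n
    for i in reversed(range(n)):  # back-substitution
        s = sum(m[i][j] * beta[j] for j in range(i + 1, n))
        beta[i] = (m[i][n] - s) / m[i][i]
    return beta


def fit_ols(rows, response):
    """Least squares via the normal equations; returns (coefficients, R^2)."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, response)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(response) / len(response)
    ss_res = sum((y - f) ** 2 for y, f in zip(response, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in response)
    return beta, 1.0 - ss_res / ss_tot


# Natural log of the response, as required for all three models.
ln_y = [math.log(cycles) for _, cycles in DATA]

# Design matrices: an intercept column plus the model's predictor(s).
designs = {
    "i": [[1.0, x] for x, _ in DATA],               # ln(y) on x
    "ii": [[1.0, x, x * x] for x, _ in DATA],       # ln(y) on x and x^2
    "iii": [[1.0, math.log(x)] for x, _ in DATA],   # ln(y) on ln(x)
}

results = {name: fit_ols(rows, ln_y) for name, rows in designs.items()}
for name, (beta, r2) in results.items():
    print(f"model ({name}): coefficients = {beta}, R^2 = {r2:.6f}")
```

Because all three fits now share the same response, their R^2 values can be compared directly; since models (i) and (ii) are nested, adjusted R^2 or the t-test on γ2 is the fairer basis for choosing between them.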