The R output below gives the summary of the multiple regression model for birth weight based on both gestation length and smoking status:
lm(formula = Weight ~ Weeks + SmokingStatus, data = births)
Coefficients:
                     Estimate Std. Error t value Pr(>|t|)
(Intercept)          -1724.42     558.84  -3.086  0.00265 **
Weeks                  130.05      14.52   8.957 2.39e-14 ***
SmokingStatusSmoker   -294.40     135.78  -2.168  0.03260 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 484.6 on 97 degrees of freedom
Multiple R-squared: 0.4636, Adjusted R-squared: 0.4525
F-statistic: 41.92 on 2 and 97 DF, p-value: 7.594e-14
(e) Based on the model output, what is the estimated birth weight for a birth at 35 weeks gestation to a non-smoking mother? [1 mark]
(f) Briefly interpret the value ‘130.05’ in the output. [1 mark]
(g) Why do the residuals have 97 degrees of freedom? [1 mark]
(h) Based on the multiple regression model, is there any evidence of a difference in mean birth weight between smoking and non-smoking mothers? Justify your conclusion with reference to the R output above. [2 marks]
(i) Briefly explain why the conclusion from the multiple regression model might be different to the conclusion from the two-sample t-test in (d). [2 marks]
(e) Based on the model output, what is the estimated birth weight for a birth at 35 weeks gestation to a non-smoking mother? [1 mark]
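The fitted regression equation is
Estimated birth weight = -1724.42 + 130.05*Weeks - 294.40*Smoker,
where Smoker = 1 for a smoking mother and 0 for a non-smoking mother (non-smoker is the baseline level).
For a birth at 35 weeks to a non-smoking mother:
Estimated birth weight = -1724.42 + 130.05*35 - 294.40*0
= 2827.33 (in the units in which Weight was recorded).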
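The arithmetic in (e) can be checked with a short script. This is a minimal sketch in Python that uses only the coefficients reported in the output; the function name predict_weight and its arguments are illustrative and not part of the original analysis.

```python
# Coefficients copied from the R summary output above.
INTERCEPT = -1724.42
B_WEEKS = 130.05
B_SMOKER = -294.40


def predict_weight(weeks, smoker):
    """Estimated mean birth weight; smoker is True/False (non-smoker is the baseline)."""
    return INTERCEPT + B_WEEKS * weeks + B_SMOKER * (1 if smoker else 0)


estimate = predict_weight(35, smoker=False)
print(round(estimate, 2))  # 2827.33
```

In R itself, the same value would come from calling predict() on the fitted lm object with a one-row newdata data frame. The label of the non-smoker level is not shown in the output, so it is not spelled out here.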
(f) Briefly interpret the value ‘130.05’ in the output. [1 mark]
Holding smoking status fixed, each additional week of gestation is associated with an estimated increase of 130.05 in mean birth weight (in the units of Weight).
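One way to see what the Weeks coefficient means is to compare fitted values one week apart within each smoking group. The sketch below assumes only the reported coefficients; the helper name fitted is hypothetical. Because the model has no interaction term, the one-week difference is the same for smokers and non-smokers.

```python
# Coefficients copied from the R summary output above.
INTERCEPT, B_WEEKS, B_SMOKER = -1724.42, 130.05, -294.40


def fitted(weeks, smoker):
    """Fitted mean birth weight; smoker is 0 (baseline) or 1."""
    return INTERCEPT + B_WEEKS * weeks + B_SMOKER * smoker


# One extra week of gestation, smoking status held fixed.
diff_nonsmoker = fitted(36, 0) - fitted(35, 0)
diff_smoker = fitted(36, 1) - fitted(35, 1)
print(round(diff_nonsmoker, 2), round(diff_smoker, 2))  # 130.05 130.05
```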
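(g) Why do the residuals have 97 degrees of freedom? [1 mark]
The residual degrees of freedom equal the sample size minus the number of estimated coefficients. There are n = 100 births and the model estimates 3 coefficients (the intercept, the slope for Weeks and the smoker effect), so residual df = 100 - 3 = 97.
Equivalently, total df = 100 - 1 = 99 and regression df = 2, so residual df = 99 - 2 = 97.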
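The degrees-of-freedom bookkeeping can be cross-checked against the rest of the output. As a rough sketch (variable names are illustrative), the same n and residual df should reproduce the reported adjusted R-squared from the multiple R-squared:

```python
n = 100          # sample size (residual df 97 + 3 estimated coefficients)
n_coef = 3       # intercept, Weeks, SmokingStatusSmoker
r_squared = 0.4636

residual_df = n - n_coef
total_df = n - 1

# Adjusted R-squared rescales the unexplained fraction by total_df / residual_df.
adj_r_squared = 1 - (1 - r_squared) * total_df / residual_df

print(residual_df)               # 97
print(round(adj_r_squared, 4))   # 0.4525
```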
(h) Based on the multiple regression model, is there any evidence of a difference in mean birth weight between smoking and non-smoking mothers? Justify your conclusion with reference to the R output above. [2 marks]
Yes. The coefficient for SmokingStatusSmoker has t = -2.168 on 97 degrees of freedom with p-value 0.0326, which is below the 0.05 significance level, so we reject the null hypothesis that this coefficient is zero. There is evidence of a difference in mean birth weight between smoking and non-smoking mothers: for births of the same gestation length, babies of smoking mothers are estimated to be 294.40 lighter on average.
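The t value and p-value quoted from the SmokingStatusSmoker row can be reproduced from the estimate and its standard error. The following is a sketch only: it integrates the t density numerically with Simpson's rule so that it needs nothing beyond the Python standard library, and the helper names t_pdf and two_sided_p are made up for this illustration. In R the equivalent calculation would be 2 * pt(-abs(t), df).

```python
import math

estimate, std_error, df = -294.40, 135.78, 97  # SmokingStatusSmoker row

t_value = estimate / std_error


def t_pdf(x, v):
    """Density of Student's t distribution with v degrees of freedom."""
    log_c = math.lgamma((v + 1) / 2) - math.lgamma(v / 2) - 0.5 * math.log(v * math.pi)
    return math.exp(log_c - (v + 1) / 2 * math.log(1 + x * x / v))


def two_sided_p(t, v, steps=2000):
    """Two-sided p-value via Simpson's rule on [0, |t|]; steps must be even."""
    b = abs(t)
    h = b / steps
    total = t_pdf(0, v) + t_pdf(b, v)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, v)
    central_area = total * h / 3          # P(0 < T < |t|)
    return 1 - 2 * central_area           # P(|T| > |t|)


p_value = two_sided_p(t_value, df)
print(round(t_value, 3))   # -2.168
print(p_value < 0.05)      # True
```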
(i) Briefly explain why the conclusion from the multiple regression model might be different to the conclusion from the two-sample t-test in (d). [2 marks]
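The two-sample t-test compares mean birth weight between smoking and non-smoking mothers while ignoring gestation length, whereas the multiple regression compares the two groups at the same gestation length. If gestation length differs between the groups, the unadjusted and adjusted differences will not be the same. In addition, Weeks explains much of the variation in birth weight, which reduces the residual standard error, so the adjusted comparison can give a different p-value and hence a different conclusion from the t-test.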