This problem uses the data set Heights from the alr4 package, which contains the heights of n = 1375 pairs of mothers (mheight) and daughters (dheight) in inches. (Solve this problem in R.)
(a) Compute the regression of dheight on mheight, and report the estimates, their standard errors, the value of the coefficient of determination, and the estimate of variance. Write a sentence or two that summarizes the results of these computations.
(b) Obtain a 99% confidence interval for β1 from the data.
(c) Obtain a predicted value and 90% prediction interval for a daughter whose mother is 58 inches tall.
a) R Code with comments
#load the library alr4
library(alr4)
#print some records from the dataset
head(Heights)
#part a)
#regression
fit<-lm(dheight~mheight,data=Heights)
#get the summary information
fit.s<-summary(fit)
#print the results
sprintf('Estimate of the intercept is %.4f',fit.s$coef[1,1])
sprintf('The standard error of the intercept estimate is %.4f',fit.s$coef[1,2])
sprintf('Estimate of the slope is %.4f',fit.s$coef[2,1])
sprintf('The standard error of the slope estimate is %.4f',fit.s$coef[2,2])
sprintf('The value of the coefficient of determination is %.4f',fit.s$r.squared)
sprintf('The value of the estimate of variance is %.4f',fit.s$sigma^2)
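The estimated regression equation has the form: predicted dheight = b0 + 0.5417 × mheight, where b0 is the intercept estimate printed by the code above.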
ans: The slope estimate of 0.5417 is positive, so taller mothers tend to have taller daughters: for each 1-inch increase in the mother's height, the predicted height of the daughter increases by 0.5417 inches. The coefficient of determination of 0.2408 indicates that 24.08% of the variation in daughters' heights is explained by the mothers' heights.
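As a cross-check on the interpretation above: in a simple linear regression with one predictor, the coefficient of determination equals the squared sample correlation between the two variables. The sketch below assumes the alr4 package is installed; the names r and r_squared are illustrative and not part of the original solution.

```r
# Cross-check: in simple linear regression, R^2 equals the squared correlation
library(alr4)

fit <- lm(dheight ~ mheight, data = Heights)

# sample correlation between mother's and daughter's heights
r <- cor(Heights$mheight, Heights$dheight)

# coefficient of determination reported by summary()
r_squared <- summary(fit)$r.squared

# the last two entries should agree up to rounding error
print(c(correlation = r, correlation_squared = r^2, r_squared = r_squared))
```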
b) 99% confidence interval for slope
R code
#part b
ci<-confint(fit,'mheight',level=0.99)
sprintf('A 99%% confidence interval for the slope is [%.4f,%.4f]',ci[1],ci[2])
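For readers who want to see what confint() does here, the interval can be reproduced by hand as estimate ± t-quantile × standard error, with n − 2 residual degrees of freedom. This is a minimal sketch that assumes the alr4 package is installed; est, se, t_crit and manual_ci are illustrative names.

```r
# Reproduce the 99% confidence interval for the slope by hand
library(alr4)

fit <- lm(dheight ~ mheight, data = Heights)
coefs <- coef(summary(fit))

est <- coefs["mheight", "Estimate"]     # slope estimate
se  <- coefs["mheight", "Std. Error"]   # its standard error

# two-sided 99% interval: upper 0.5% point of t with n - 2 degrees of freedom
t_crit <- qt(1 - 0.01 / 2, df = df.residual(fit))

manual_ci <- c(lower = est - t_crit * se, upper = est + t_crit * se)

print(manual_ci)
print(confint(fit, "mheight", level = 0.99))  # should match the line above
```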
c) 90% prediction interval for mheight=58 inches
R code
#part c)
p.i<-predict(fit,newdata=data.frame(mheight=58),interval="prediction",level=.90)
sprintf('The predicted height of the daughter is %.4f inches',p.i[1])
sprintf('The 90%% prediction interval for a daughter is [%.4f,%.4f]',p.i[2],p.i[3])
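The prediction interval can also be rebuilt from the standard simple-regression formula, in which the standard error of prediction is sigma-hat × sqrt(1 + 1/n + (x* − x̄)² / SXX). The sketch below assumes the alr4 package is installed and that Heights has no missing values; x_new, sxx, se_pred and manual_pi are illustrative names.

```r
# Reproduce the 90% prediction interval at mheight = 58 by hand
library(alr4)

fit <- lm(dheight ~ mheight, data = Heights)

x_new <- 58
n     <- nrow(Heights)
x_bar <- mean(Heights$mheight)
sxx   <- sum((Heights$mheight - x_bar)^2)   # sum of squares of the predictor

sigma_hat <- summary(fit)$sigma             # residual standard error
y_hat     <- sum(coef(fit) * c(1, x_new))   # intercept + slope * 58

# standard error of prediction for a single new observation
se_pred <- sigma_hat * sqrt(1 + 1 / n + (x_new - x_bar)^2 / sxx)

# two-sided 90% interval: upper 5% point of t with n - 2 degrees of freedom
t_crit <- qt(1 - 0.10 / 2, df = n - 2)

manual_pi <- c(fit = y_hat,
               lwr = y_hat - t_crit * se_pred,
               upr = y_hat + t_crit * se_pred)

print(manual_pi)
print(predict(fit, newdata = data.frame(mheight = x_new),
              interval = "prediction", level = 0.90))  # should match
```

The leading 1 under the square root is what separates a prediction interval for one new daughter from a confidence interval for the mean height of all daughters of 58-inch mothers; dropping it gives the narrower interval that predict() returns with interval="confidence".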
All code together
------------
#install the package for the first time
#install.packages('alr4')
#load the library alr4
library(alr4)
#print some records from the dataset
head(Heights)
#part a)
#regression
fit<-lm(dheight~mheight,data=Heights)
#get the summary information
fit.s<-summary(fit)
#print the results
sprintf('Estimate of the intercept is %.4f',fit.s$coef[1,1])
sprintf('The standard error of the intercept estimate is %.4f',fit.s$coef[1,2])
sprintf('Estimate of the slope is %.4f',fit.s$coef[2,1])
sprintf('The standard error of the slope estimate is %.4f',fit.s$coef[2,2])
sprintf('The value of the coefficient of determination is %.4f',fit.s$r.squared)
sprintf('The value of the estimate of variance is %.4f',fit.s$sigma^2)
#part b
ci<-confint(fit,'mheight',level=0.99)
sprintf('A 99%% confidence interval for the slope is [%.4f,%.4f]',ci[1],ci[2])
#part c)
p.i<-predict(fit,newdata=data.frame(mheight=58),interval="prediction",level=.90)
sprintf('The predicted height of the daughter is %.4f inches',p.i[1])
sprintf('The 90%% prediction interval for a daughter is [%.4f,%.4f]',p.i[2],p.i[3])
-----------