Exercise physiologists investigated the relationship between lean body mass (in kg) and resting metabolic rate (in calories per day). They measured these values from a sample of 35 males.
Predictor   Coeff     St. Dev   t      p
Constant    264.00    276.90    0.95   0.363
Mass        22.563    6.360     3.55   0.005
S = 144.9   R-Sq = 55.7%   R-Sq(adj) = 51.3%
a) Write the least squares regression line, defining any variables used in the equation.
b) If a male in the study had a residual of -9.4 and a predicted resting metabolic rate of 1798.28, find the actual resting metabolic rate for this man.
c) Interpret the slope of this regression line in context of the data.
d) Name and interpret the coefficient of non-determination in this study. (This is the complement of the coefficient of determination)
e) Suppose that the exercise physiologists decided to modify this investigation by including the effect of exercise. They have 60 available male volunteers for the study and believe that 30 minutes of exercise will have an effect on the variables. They plan to carry out a completely randomized design. Why would a randomized block, blocked for age, be a better experimental design?
f) If the mean resting metabolic rate was 1,922 for the men in the original study with a standard deviation of 39, what is the probability of a male with a resting metabolic rate over 2,000?
g) Using your answer to part f, what is the probability that in a group of 12 men, exactly 2 will have a resting metabolic rate over 2,000? Show the formula without actually doing the calculation.
a)
A regression model has the form 'y=a+bx+e', where y=response variable,
x=predictor variable, a=intercept/constant, b=slope=coefficient of the
predictor, and e=error term.
The least squares regression line is 'yhat=ahat+bhat*x'.
Here, y=resting metabolic rate (calories per day) and x=lean body mass (kg).
ahat=constant= 264.00
bhat=slope=coefficient of Mass= 22.563
Thus, the least squares regression line is
'Predicted Rate = 264.00 + 22.563 * Mass'.
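As a quick check, the fitted line can be written as a small function. This is only a minimal sketch: the name `predict_rate` and the 68 kg input are illustrative assumptions, not part of the original problem. Only the two coefficients come from the regression output.

```python
# Coefficients taken from the regression output in the question.
INTERCEPT = 264.00  # calories per day
SLOPE = 22.563      # calories per day per kg of lean body mass


def predict_rate(mass_kg: float) -> float:
    """Predicted resting metabolic rate (calories/day) for a lean body mass in kg."""
    return INTERCEPT + SLOPE * mass_kg


# Hypothetical input, chosen for illustration only.
example_mass = 68.0
print(f"Predicted rate at {example_mass} kg: {predict_rate(example_mass):.2f} calories per day")
```

The prediction is most trustworthy for masses inside the range observed in the sample. The intercept of 264 corresponds to a lean body mass of 0 kg, so it has no practical interpretation on its own.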
b)
Residual(e)=y-yhat
y=actual value of response and yhat=predicted value of
response
Here, e=-9.4, yhat=1798.28, y=?
y=yhat+e=1798.28+(-9.4)= 1788.88
The actual resting metabolic rate for this man is 1788.88 calories per day.
The negative residual means the regression line overpredicted his rate by 9.4.
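The same arithmetic can be sketched in code. The helper name `actual_from_residual` is an assumption made for illustration. It simply rearranges residual = actual - predicted.

```python
def actual_from_residual(predicted: float, residual: float) -> float:
    """Recover the observed value from residual = actual - predicted."""
    return predicted + residual


# Values given in part b of the question.
actual = actual_from_residual(predicted=1798.28, residual=-9.4)
print(f"Actual rate: {actual:.2f} calories per day")
```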
c)
The slope gives the predicted change in y for a one-unit increase in
x.
Here, it means that for each additional kilogram of lean body mass, the
predicted resting metabolic rate increases by about 22.563 calories per
day, on average. This is an additive change, not a multiplicative factor.
[Since bhat is positive, the rate increases with mass.]
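To illustrate the slope as a change per unit, the sketch below compares predictions at masses that differ by 1 kg and by 10 kg. The masses 60, 61, and 70 kg are hypothetical inputs chosen for illustration, and `predict_rate` is an assumed helper name.

```python
INTERCEPT = 264.00  # calories per day, from the regression output
SLOPE = 22.563      # calories per day per kg, from the regression output


def predict_rate(mass_kg: float) -> float:
    """Predicted resting metabolic rate (calories/day) from the fitted line."""
    return INTERCEPT + SLOPE * mass_kg


# Hypothetical masses: the difference in predictions depends only on the gap in mass.
diff_1kg = predict_rate(61.0) - predict_rate(60.0)
diff_10kg = predict_rate(70.0) - predict_rate(60.0)
print(f"1 kg more lean mass:  {diff_1kg:.3f} calories per day more (predicted)")
print(f"10 kg more lean mass: {diff_10kg:.2f} calories per day more (predicted)")
```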
d)
The coefficient of determination is 'R-Sq', which gives the proportion
of variation in the response explained by the regression model.
Here, R-Sq=55.7%
Its complement, 1 - R-Sq, is the coefficient of non-determination: the proportion of variation that is not explained by the model.
In this case, 44.3% [i.e. 100-55.7] of the variation in resting metabolic rate is not explained by the linear relationship with lean body mass.
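The complement can be computed directly. A minimal sketch, with variable names chosen for illustration:

```python
# R-Sq from the regression output, expressed as a proportion.
r_sq = 0.557

# Coefficient of non-determination: the share of variation left unexplained.
non_determination = 1.0 - r_sq
print(f"Coefficient of non-determination: {non_determination:.1%}")
```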