14. The regression equation and the standard error of estimate
Stewart Fleishman specializes in the psychiatric aspects of symptom management in cancer patients. Pain, depression, and fatigue can appear as single symptoms, in conjunction with one other symptom, or all together in patients with cancer. You are interested in testing a new kind of exercise therapy for the treatment of the simultaneous clustering of fatigue and depression in cancer patients. The following scores represent the decrease in symptom intensity (on a 10-point scale) following the new exercise therapy.
Patient | Fatigue (X) | Depression (Y) |
---|---|---|
A | 0 | 10 |
B | 2 | 6 |
C | 4 | 7 |
D | 6 | 1 |
E | 8 | 2.5 |
Create a scatter plot of these scores on the grid. For each of the five (X, Y) pairs, drag the orange points (square symbol) in the upper-right corner of the diagram to the appropriate location on the grid.
[Scatter plot grid: FATIGUE on the horizontal axis (0 to 10), DEPRESSION on the vertical axis (0 to 10)]
Calculate the means and complete the following table by calculating the deviations from the means for X and Y, the squares of the deviations, and the products of the deviations.
X | Y | X – MX | Y – MY | (X – MX)² | (Y – MY)² | (X – MX)(Y – MY) |
---|---|---|---|---|---|---|
0 | 10 | | | | | |
2 | 6 | -2.00 | 0.70 | 4.00 | 0.49 | -1.40 |
4 | 7 | | | | | |
6 | 1 | 2.00 | -4.30 | 4.00 | 18.49 | -8.60 |
8 | 2.5 | | | | | |
Calculate the sum of the products and the sum of squares for X. SP = ____ and SSX = ____.
Find the regression line for predicting Y given X. The slope of the regression line is ____ and the Y intercept is ____.
Calculate the Pearson correlation coefficient, the predicted variability, and the unpredicted variability. The Pearson correlation is r = ____. The predicted variability is SSregression = ____. The unpredicted variability is SSresidual = ____.
Calculate the standard error of the estimate. The standard error of the estimate is ____.
Suppose you want to predict the depression score for a new patient. The only information given is that this new patient is similar to patients A through E; therefore, your best guess for the new patient’s level of depression is ____. The error associated with this guess (that is, the “standard” amount your guess will be away from the true value) is ____.
Suppose that now you are told the fatigue score for this new patient is 5.5. Now your best guess for the new patient’s level of depression is ____. The error associated with this guess (that is, the “standard” amount your guess will be away from the true value) is ____.
Finally, suppose before estimating the regression equation, you first transform each of the original scores into a z-score. The regression equation you estimate is:
ẑY = ____ zX
Sol:
The scatter plot of the data:
[Scatter plot: Depression (Y) against Fatigue (X) for patients A through E]
The following table shows the calculations:
Patient | X | Y | X-Mx | Y-My | (X-Mx)^2 | (Y-My)^2 | (X-Mx)(Y-My) |
---|---|---|---|---|---|---|---|
A | 0 | 10 | -4 | 4.7 | 16 | 22.09 | -18.8 |
B | 2 | 6 | -2 | 0.7 | 4 | 0.49 | -1.4 |
C | 4 | 7 | 0 | 1.7 | 0 | 2.89 | 0 |
D | 6 | 1 | 2 | -4.3 | 4 | 18.49 | -8.6 |
E | 8 | 2.5 | 4 | -2.8 | 16 | 7.84 | -11.2 |
Total | 20 | 26.5 | | | 40 | 51.8 | -40 |
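The table totals can be double-checked with a few lines of Python. This is only a sketch using the standard library; the variable names (`dev_x`, `ss_x`, `sp`, and so on) are my own labels, not part of the original problem.

```python
# Scores from the problem statement: fatigue (X) and depression (Y).
x = [0, 2, 4, 6, 8]
y = [10, 6, 7, 1, 2.5]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Deviations from the means, one per patient.
dev_x = [xi - mean_x for xi in x]
dev_y = [yi - mean_y for yi in y]

# Sums of squares and sum of products (the "Total" row of the table).
ss_x = sum(d * d for d in dev_x)
ss_y = sum(d * d for d in dev_y)
sp = sum(dx * dy for dx, dy in zip(dev_x, dev_y))

print(f"Mx = {mean_x:.2f}, My = {mean_y:.2f}")
print(f"SSX = {ss_x:.2f}, SSY = {ss_y:.2f}, SP = {sp:.2f}")
```

The printed sums should agree with the Total row above (40, 51.8 and -40).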
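The means are
Mx = ΣX / n = 20 / 5 = 4 and My = ΣY / n = 26.5 / 5 = 5.3
We have
SP = Σ(X-Mx)(Y-My) = -40
and
SSX = Σ(X-Mx)^2 = 40, SSY = Σ(Y-My)^2 = 51.8
The slope is:
b = SP / SSX = -40 / 40 = -1
The intercept:
a = My - b × Mx = 5.3 - (-1)(4) = 9.3
The correlation coefficient is
r = SP / sqrt(SSX × SSY) = -40 / sqrt(40 × 51.8) = -0.8787
The predicted variability is SSregression = r^2 × SSY = 40 and the unpredicted variability is SSresidual = (1 - r^2) × SSY = 11.8.
The standard error of estimate will be
sqrt(SSresidual / (n - 2)) = sqrt(11.8 / 3) = 1.9833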
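For anyone who wants to verify the slope, intercept, correlation and standard error of estimate in one go, here is a minimal sketch. The function name `simple_regression` and the dictionary keys are hypothetical helpers chosen for illustration; the formulas are the standard SP/SS definitions used in this problem, with n - 2 degrees of freedom for the standard error.

```python
import math

def simple_regression(x, y):
    """Least-squares line and standard error of estimate for paired scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    ss_x = sum((xi - mean_x) ** 2 for xi in x)
    ss_y = sum((yi - mean_y) ** 2 for yi in y)
    sp = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    slope = sp / ss_x                        # b = SP / SSX
    intercept = mean_y - slope * mean_x      # a = My - b * Mx
    r = sp / math.sqrt(ss_x * ss_y)          # Pearson correlation
    ss_regression = r ** 2 * ss_y            # predicted variability
    ss_residual = ss_y - ss_regression       # unpredicted variability
    std_error = math.sqrt(ss_residual / (n - 2))
    return {
        "slope": slope,
        "intercept": intercept,
        "r": r,
        "ss_regression": ss_regression,
        "ss_residual": ss_residual,
        "std_error": std_error,
    }

result = simple_regression([0, 2, 4, 6, 8], [10, 6, 7, 1, 2.5])
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```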
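Without the fatigue score, the best estimate for Y is the mean of Y, that is 5.3. The error associated with this guess is the standard deviation of Y, sqrt(SSY / (n - 1)) = sqrt(51.8 / 4) = 3.5986.
The regression equation is
y' = 9.3 - x
so for a fatigue score of x = 5.5 the best estimate is y' = 9.3 - 5.5 = 3.8. The error associated with this guess is the standard error of estimate, 1.9833.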
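The two guesses for the new patient can be compared side by side. The sketch below assumes the usual textbook convention that, without X, the "standard" error of guessing the mean is the sample standard deviation of Y (n - 1 in the denominator), while with X it is the standard error of estimate (n - 2). The slope and intercept are taken from the solution, and the variable names are illustrative only.

```python
import math
import statistics

x = [0, 2, 4, 6, 8]
y = [10, 6, 7, 1, 2.5]
slope, intercept = -1.0, 9.3  # regression line from the solution: y' = 9.3 - x

# Guess with no information about X: the mean of Y, with error = sample SD of Y.
guess_without_x = statistics.mean(y)
error_without_x = statistics.stdev(y)

# Guess once the fatigue score is known: plug X into the regression line.
new_fatigue = 5.5
guess_with_x = intercept + slope * new_fatigue

# Error of that guess: standard error of estimate from the residuals.
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
error_with_x = math.sqrt(sum(e * e for e in residuals) / (len(y) - 2))

print(f"without X: guess = {guess_without_x:.2f}, error = {error_without_x:.4f}")
print(f"with X = {new_fatigue}: guess = {guess_with_x:.2f}, error = {error_with_x:.4f}")
```

Knowing the fatigue score should shrink the error, since the regression line accounts for part of the variability in Y.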
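The standard deviation of X is
sX = sqrt(SSX / (n - 1)) = sqrt(40 / 4) = 3.1623
The formula for the z-score of X is
zX = (X - Mx) / sX = (X - 4) / 3.1623
and likewise zY = (Y - My) / sY with sY = sqrt(51.8 / 4) = 3.5986. For standardized scores the slope equals the Pearson correlation and the intercept is 0, so
the regression equation will be
ẑY = r × zX = -0.8787 zX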
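That the standardized slope equals r can be checked numerically. The following is a rough sketch: `z_scores` is a hypothetical helper, and it uses the sample standard deviation (n - 1), which is an assumption on my part (the slope comes out the same with the population formula, as long as X and Y are treated alike).

```python
import math
import statistics

x = [0, 2, 4, 6, 8]
y = [10, 6, 7, 1, 2.5]

def z_scores(values):
    """Standardize values using the sample mean and sample standard deviation."""
    m = statistics.mean(values)
    s = statistics.stdev(values)
    return [(v - m) / s for v in values]

zx = z_scores(x)
zy = z_scores(y)

# Least-squares slope and intercept for predicting zY from zX.
mean_zx = statistics.mean(zx)
mean_zy = statistics.mean(zy)
sp_z = sum((a - mean_zx) * (b - mean_zy) for a, b in zip(zx, zy))
ss_zx = sum((a - mean_zx) ** 2 for a in zx)
beta = sp_z / ss_zx
intercept_z = mean_zy - beta * mean_zx

# Pearson r computed from the raw scores, for comparison.
mx, my = statistics.mean(x), statistics.mean(y)
sp = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
r = sp / math.sqrt(sum((xi - mx) ** 2 for xi in x) * sum((yi - my) ** 2 for yi in y))

print(f"beta = {beta:.4f}, intercept = {intercept_z:.4f}, r = {r:.4f}")
```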