In a medical study, a doctor is interested in determining which factors affect forced expired volume (FEV) values. The doctor randomly samples 15 patients and records their smoking status and age. Using the data, create the regression output (including the y-intercept and both independent variables), then answer the questions below. Note that smoking status and age are recorded as categorical variables.
Test Statistic | |
P-Value | |
Is the model significant? (Circle your answer) | Yes / No |
FEV | Smoking Status | Age |
0.79 | smoker | < 30 |
0.81 | smoker | 30 - 40 |
0.93 | non-smoker | 30 - 40 |
0.59 | smoker | 40 - 50 |
0.77 | non-smoker | > 50 |
0.61 | smoker | 40 - 50 |
0.95 | non-smoker | < 30 |
0.87 | non-smoker | < 30 |
0.63 | smoker | 40 - 50 |
0.88 | non-smoker | 30 - 40 |
0.61 | smoker | > 50 |
0.64 | smoker | < 30 |
0.67 | smoker | 30 - 40 |
0.87 | non-smoker | 40 - 50 |
0.93 | non-smoker | < 30 |
Let y = Forced Expired Volume (FEV)
x1 = Smoking status
x2 = Age
For calculation purposes, let's code x1 and x2 numerically.
If smoker then x1 = 1, if non-smoker then x1 = 0.
If age < 30 then x2 = 1, if age is 30 - 40 then x2 = 2,
if age is 40 - 50 then x2 = 3, if age > 50 then x2 = 4.
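The coding above can be sketched in Python as follows. This is only an illustration and not part of the original Excel workflow; the names `rows`, `SMOKING_CODE` and `AGE_CODE` are hypothetical.

```python
# (FEV, smoking status, age group) for the 15 sampled patients, as in the table.
rows = [
    (0.79, "smoker", "< 30"),
    (0.81, "smoker", "30 - 40"),
    (0.93, "non-smoker", "30 - 40"),
    (0.59, "smoker", "40 - 50"),
    (0.77, "non-smoker", "> 50"),
    (0.61, "smoker", "40 - 50"),
    (0.95, "non-smoker", "< 30"),
    (0.87, "non-smoker", "< 30"),
    (0.63, "smoker", "40 - 50"),
    (0.88, "non-smoker", "30 - 40"),
    (0.61, "smoker", "> 50"),
    (0.64, "smoker", "< 30"),
    (0.67, "smoker", "30 - 40"),
    (0.87, "non-smoker", "40 - 50"),
    (0.93, "non-smoker", "< 30"),
]

# Numeric codes used in this answer (hypothetical dictionary names).
SMOKING_CODE = {"smoker": 1, "non-smoker": 0}
AGE_CODE = {"< 30": 1, "30 - 40": 2, "40 - 50": 3, "> 50": 4}

y = [fev for fev, _, _ in rows]
x1 = [SMOKING_CODE[status] for _, status, _ in rows]
x2 = [AGE_CODE[age] for _, _, age in rows]

print(x1)
print(x2)
```

Note that coding the age groups as 1 to 4 makes the model treat them as equally spaced steps.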
So the regression equation is:
y = m1x1 + m2x2 + b    ............................ (1)
where m1 and m2 are the coefficients of x1 and x2 respectively and b is the constant (intercept).
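As a cross-check on the Excel output, the coefficients of this equation can be estimated by ordinary least squares in plain Python. The sketch below solves the two-predictor normal equations from centered sums; the helper names `mean` and `centered_sum` are hypothetical, and the lists hold the coded data from the table.

```python
# Coded data: y = FEV, x1 = smoking status (1/0), x2 = age group (1..4).
y = [0.79, 0.81, 0.93, 0.59, 0.77, 0.61, 0.95, 0.87,
     0.63, 0.88, 0.61, 0.64, 0.67, 0.87, 0.93]
x1 = [1, 1, 0, 1, 0, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0]
x2 = [1, 2, 2, 3, 4, 3, 1, 1, 3, 2, 4, 1, 2, 3, 1]


def mean(values):
    return sum(values) / len(values)


def centered_sum(a, b):
    """Sum of (a_i - mean(a)) * (b_i - mean(b))."""
    mean_a, mean_b = mean(a), mean(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))


s11 = centered_sum(x1, x1)
s22 = centered_sum(x2, x2)
s12 = centered_sum(x1, x2)
s1y = centered_sum(x1, y)
s2y = centered_sum(x2, y)

# Solve the 2x2 normal equations by Cramer's rule.
det = s11 * s22 - s12 ** 2
m1 = (s22 * s1y - s12 * s2y) / det
m2 = (s11 * s2y - s12 * s1y) / det
b = mean(y) - m1 * mean(x1) - m2 * mean(x2)

print(f"m1 = {m1:.6f}, m2 = {m2:.6f}, b = {b:.6f}")
```

The printed values should agree with the coefficients reported in the regression output.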
The detailed regression output (see the attached image) was produced with the Regression tool in Excel's Data Analysis pack.
From that output we get the following values:
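Test Statistic | F-value in the above table = 32.28562325 |
P-value | Significance F for the overall F test (2 and 12 degrees of freedom) is approximately 0.000015; the p-value reported for the intercept is 0.0000000000035370710 |
Is the model significant? |
Yes, because the p-value of the F test is far below 0.05 |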
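The F statistic and its p-value can be checked with a short calculation. The sketch below assumes the coefficients reported in the regression output and the coded data from the table. It uses the closed form P(F > f) = (1 + 2f/d2)^(-d2/2), which holds only because the regression has exactly 2 numerator degrees of freedom; it is not a general F-distribution routine.

```python
# Coded data: y = FEV, x1 = smoking status (1/0), x2 = age group (1..4).
y = [0.79, 0.81, 0.93, 0.59, 0.77, 0.61, 0.95, 0.87,
     0.63, 0.88, 0.61, 0.64, 0.67, 0.87, 0.93]
x1 = [1, 1, 0, 1, 0, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0]
x2 = [1, 2, 2, 3, 4, 3, 1, 1, 3, 2, 4, 1, 2, 3, 1]

# Coefficients as reported in the regression output.
m1, m2, b = -0.199808774, -0.045748031, 0.977210349

n = len(y)
k = 2  # number of predictors
y_bar = sum(y) / n

sst = sum((yi - y_bar) ** 2 for yi in y)  # total sum of squares
sse = sum((yi - (m1 * a + m2 * c + b)) ** 2
          for yi, a, c in zip(y, x1, x2))  # residual sum of squares
ssr = sst - sse                            # regression sum of squares

df_reg = k
df_res = n - k - 1
f_stat = (ssr / df_reg) / (sse / df_res)

# Exact upper-tail probability of F(2, df_res); valid only when df_reg == 2.
p_value = (1 + df_reg * f_stat / df_res) ** (-df_res / 2)

print(f"F = {f_stat:.4f}, p-value = {p_value:.2e}")
print("significant at 0.05:", p_value < 0.05)
```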
From the above output, the regression equation is:
y = (-0.199808774)x1 + (-0.045748031)x2 + 0.977210349    .......................(2)
Now we want to predict FEV (y) for a smoker (x1) older than 50 years (x2).
Here x1 = 1 and x2 = 4, using the coding defined in (1) above.
Substituting these values, the regression equation gives:
y = (-0.199808774)*1 + (-0.045748031)*4 + 0.977210349
y = 0.594409, or 0.59 to 2 decimals
Hence the predicted FEV for a smoker older than 50 years = 0.59
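The substitution above can be wrapped in a small helper. This is a minimal sketch that assumes the coefficients from the regression output; the function name `predict_fev` is hypothetical.

```python
# Coefficients as reported in the regression output.
M1 = -0.199808774  # smoking status (x1)
M2 = -0.045748031  # age group code (x2)
B = 0.977210349    # intercept


def predict_fev(smoker: bool, age_code: int) -> float:
    """Predicted FEV for a smoking status and an age group code (1..4)."""
    x1 = 1 if smoker else 0
    return M1 * x1 + M2 * age_code + B


fev = predict_fev(smoker=True, age_code=4)  # smoker, age > 50
print(round(fev, 2))  # 0.59
```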
Now we want to estimate the average FEV for smokers between 30 and 40 years old.
The given data contain 8 smokers in total. Of these 8, 2 patients are between 30 and 40 years old, as shown below:
FEV | Smoking Status | Age |
0.81 | smoker | 30 - 40 |
0.67 | smoker | 30 - 40 |
Hence the average FEV for this group = (0.81 + 0.67)/2 = 0.74
Average FEV for smokers between 30 and 40 years old = 0.74
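The group average can be reproduced by filtering the coded data. The sketch below also evaluates the fitted regression equation for the same group, purely as a comparison; the answer above uses the sample average. Variable names are hypothetical.

```python
# Coded data: y = FEV, x1 = smoking status (1/0), x2 = age group (1..4).
y = [0.79, 0.81, 0.93, 0.59, 0.77, 0.61, 0.95, 0.87,
     0.63, 0.88, 0.61, 0.64, 0.67, 0.87, 0.93]
x1 = [1, 1, 0, 1, 0, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0]
x2 = [1, 2, 2, 3, 4, 3, 1, 1, 3, 2, 4, 1, 2, 3, 1]

# FEV values of smokers (x1 == 1) in the 30 - 40 age group (x2 == 2).
group = [fev for fev, s, a in zip(y, x1, x2) if s == 1 and a == 2]
sample_average = sum(group) / len(group)

# For comparison only: value of the fitted equation at x1 = 1, x2 = 2,
# using the coefficients reported in the regression output.
fitted_value = -0.199808774 * 1 + -0.045748031 * 2 + 0.977210349

print(group)
print(f"sample average = {sample_average:.2f}")
print(f"fitted value   = {fitted_value:.2f}")
```

The two numbers differ because the sample average uses only the two matching patients, while the fitted value draws on all 15 observations through the regression equation.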