Question

  1. FEV (forced expiratory volume) is an index of pulmonary function that measures the volume of air expelled after one second of constant effort. The data fev.csv contains determinations of FEV on 654 children ages 3-19 who were seen in the Childhood Respiratory Disease Study in East Boston, Massachusetts. The variables in the data include:

    ID:      subject ID number
    Age:     age in years
    FEV:     FEV in liters
    Height:  height in inches
    Sex:     Male or Female
    Smoker:  non = nonsmoker, Current = current smoker

    a. Make a boxplot of the FEV for children with age 3-8 years, 9-12 years, and 13 years or above. Does it appear that the FEV is the same for children from these three age groups?

    b. Is FEV the same across the three age groups? Perform a hypothesis test to answer the question. Use α = 0.05.

    c. Rank the three age groups using the multiple comparison approach.

    d. What assumptions are made with regard to the analysis in part b? Check whether these assumptions are violated.

    e. Is FEV more strongly related to sex or smoking status? Carry out appropriate statistical analysis to answer the question.

    f. The investigator is also interested in how height is associated with age. Construct the scatter plot of height against age. What is the relationship between height and age?

    g. Regardless of what you observed in f, fit the regression model with height as the response and age as the independent variable. What is the fitted regression equation?

    h. Test whether there is a positive correlation between age and height. Perform the hypothesis test using α = 0.05.

    i. Is it appropriate to use the above regression model? Why or why not?

Solutions

a. Make a boxplot of the FEV for children with age 3-8 years, 9-12 years, and 13 years or above. Does it appear that the FEV is the same for children from these three age groups?

I used Excel to construct the boxplot of FEV for children aged 3-8 years, 9-12 years, and 13 years or above. First, sort the data by age from youngest to oldest and divide it into the three groups (3-8 years, 9-12 years, and 13 years or above). Then choose Insert > Box & Whisker.

No, FEV does not appear to be the same for children in these three age groups.

Findings from boxplot:

  • The 9-12 age group has a few outliers (shown as dots in the boxplot).
  • The 3-8 and >=13 age groups are slightly skewed toward higher values.
  • All three groups look roughly normal, since the data are spread almost evenly around the median.
  • The medians of the three groups appear to differ.
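
For readers working outside Excel, the grouping step and the quantities a boxplot displays can be sketched in plain Python. This is a minimal sketch, not the Excel procedure itself: the function names `age_group` and `boxplot_summary` and the sample values are hypothetical, the outlier rule is the common 1.5 × IQR convention, and `statistics.quantiles` uses its default "exclusive" method, which may differ slightly from Excel's quartiles.

```python
import statistics

def age_group(age):
    """Assign an age in years to one of the three groups used in part a."""
    if age <= 8:
        return "3-8"
    if age <= 12:
        return "9-12"
    return ">=13"

def boxplot_summary(values):
    """Return the quartiles and the points outside the 1.5 * IQR fences."""
    q1, median, q3 = statistics.quantiles(values, n=4)  # default 'exclusive' method
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = [v for v in values if v < low_fence or v > high_fence]
    return {"q1": q1, "median": median, "q3": q3, "outliers": outliers}

# Hypothetical (age, FEV) pairs for illustration only, NOT taken from fev.csv.
sample = [(5, 1.6), (7, 1.9), (8, 2.1), (10, 2.7),
          (11, 2.9), (12, 3.0), (14, 3.5), (16, 3.9)]

groups = {}
for age, fev in sample:
    groups.setdefault(age_group(age), []).append(fev)

for name, fevs in groups.items():
    print(name, boxplot_summary(fevs))
```

With the real data, the three lists in `groups` would hold the FEV values of the 215, 322, and 117 children in each age group.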

b. Is FEV the same across the three age groups? Perform a hypothesis test to answer the question. Use alpha = 0.05.

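To test whether mean FEV differs among the three age groups, we performed a one-way (single-factor) ANOVA.

The null and alternative hypotheses:

H0: μ(3-8) = μ(9-12) = μ(>=13), i.e. mean FEV is the same in all three age groups
H1: at least one age group has a different mean FEV

Calculate the value of the test statistic:

Excel > Data > Data Analysis (add-in) > Anova: Single Factor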

Anova: Single Factor

SUMMARY
Groups          Count   Sum       Average    Variance
FEV Age 3-8     215     399.519   1.858228   0.176462
FEV Age 9-12    322     903.802   2.806839   0.410088
FEV Age >=13    117     421.133   3.599427   0.633303

ANOVA
Source of Variation   SS         df    MS         F          P-value    F crit
Between Groups        248.0558   2     124.0279   332.4582   3.2E-100   3.00956
Within Groups         242.8641   651   0.373063
Total                 490.9198   653

F(2,651) = 332.4582, p = 3.2E-100

Determine if the results are statistically significant (using the rejection region or p-value approach):

The p-value (3.2E-100) is far below α = 0.05, and the F statistic (332.4582) is far above the critical value of 3.00956, so the result is statistically significant.

State your conclusion in terms of the problem:

Because the ANOVA shows a significant between-group difference, we reject the null hypothesis and conclude that mean FEV differs significantly among the three age groups.
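
The F statistic can be cross-checked without the raw data, because a one-way ANOVA needs only each group's count, mean, and variance. The sketch below recomputes the sums of squares from the values in the SUMMARY block; the function name `anova_from_summary` is hypothetical, and the p-value is not recomputed because that would need an F distribution from a library such as SciPy.

```python
# (count, mean, variance) per age group, copied from the SUMMARY block above.
groups = {
    "3-8": (215, 1.858228, 0.176462),
    "9-12": (322, 2.806839, 0.410088),
    ">=13": (117, 3.599427, 0.633303),
}

def anova_from_summary(summary):
    """One-way ANOVA sums of squares and F from per-group summary statistics."""
    n_total = sum(n for n, _, _ in summary.values())
    k = len(summary)
    grand_mean = sum(n * m for n, m, _ in summary.values()) / n_total
    # Between-group SS: weighted squared distance of each group mean from the grand mean.
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in summary.values())
    # Within-group SS: each sample variance times its degrees of freedom.
    ss_within = sum((n - 1) * v for n, _, v in summary.values())
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, f_stat

ssb, ssw, df_b, df_w, f_stat = anova_from_summary(groups)
print(f"SS between = {ssb:.4f}, SS within = {ssw:.4f}")
print(f"F({df_b},{df_w}) = {f_stat:.2f}")

F_CRIT = 3.00956  # critical value reported in the Excel output
print("reject H0" if f_stat > F_CRIT else "fail to reject H0")
```

Small differences from the Excel figures in the last decimal places are expected, since the inputs here are the rounded means and variances.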

c. Rank the three age groups using the multiple comparison approaches.

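Excel has no built-in function for post hoc tests such as Tukey's, so I performed pairwise t-tests, one for each pair of age groups (the groups differ in size, so these are two-sample rather than paired tests):

3-8 vs 9-12: p = 8.64E-45

9-12 vs >=13: p = 2.47E-23

3-8 vs >=13: p = 2.57E-50

All three pairwise differences are significant, even against a Bonferroni-adjusted threshold of 0.05/3 ≈ 0.0167. Based on the group means (1.86, 2.81, and 3.60 liters), the ranking from lowest to highest FEV is 3-8 < 9-12 < >=13.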
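
As a rough cross-check on the pairwise comparisons, the test statistics can be approximated from the group summaries alone. The sketch below assumes a Welch (unequal-variance) two-sample t statistic and a Bonferroni-adjusted critical value taken from the normal distribution, which is a reasonable approximation at these sample sizes. It is not necessarily the exact test variant behind the p-values listed above, and the helper name `welch_t` is hypothetical.

```python
import math
from itertools import combinations
from statistics import NormalDist

# (count, mean, variance) per age group, from the ANOVA SUMMARY block.
groups = {
    "3-8": (215, 1.858228, 0.176462),
    "9-12": (322, 2.806839, 0.410088),
    ">=13": (117, 3.599427, 0.633303),
}

def welch_t(a, b):
    """Welch two-sample t statistic (mean of b minus mean of a) from summaries."""
    (n1, m1, v1), (n2, m2, v2) = a, b
    standard_error = math.sqrt(v1 / n1 + v2 / n2)
    return (m2 - m1) / standard_error

pairs = list(combinations(groups, 2))
alpha = 0.05
# Bonferroni: divide alpha by the number of comparisons (two-sided test).
z_crit = NormalDist().inv_cdf(1 - alpha / (2 * len(pairs)))

results = {}
for g1, g2 in pairs:
    t = welch_t(groups[g1], groups[g2])
    results[(g1, g2)] = t
    verdict = "significant" if abs(t) > z_crit else "not significant"
    print(f"{g1} vs {g2}: t = {t:.2f} ({verdict} at Bonferroni-adjusted alpha)")

# Rank the groups by mean FEV, lowest first.
ranking = sorted(groups, key=lambda g: groups[g][1])
print(" < ".join(ranking))
```

A dedicated post hoc procedure such as Tukey's HSD would need the raw data or a statistics package; the Bonferroni adjustment is used here only because it can be applied by hand.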

d. What assumptions are made with regard to the analysis in part b? Check whether these assumptions are violated.

There are three main assumptions:

  1. Normality: the dependent variable is normally distributed in each group being compared in the one-way ANOVA.

  2. Homogeneity of variances: the population variances of the groups are equal.

  3. Independence of observations: this is mostly a study design issue, so you need to judge from the study design whether the observations could be dependent.

We carried out a descriptive analysis of FEV in the three age groups. The results are as follows:

                           FEV Age 3-8   FEV Age 9-12   FEV Age >=13
Mean                       1.858228      2.806839       3.59942735
Standard Error             0.028649      0.035687       0.07357204
Median                     1.79          2.756          3.519
Mode                       1.624         2.352          3.297
Standard Deviation         0.420073      0.640381       0.795803284
Sample Variance            0.176462      0.410088       0.633302868
Kurtosis                   -0.0875       0.804245       -0.227353716
Skewness                   0.242282      0.693698       0.411303102
Range                      2.202         3.766          3.595
Minimum                    0.791         1.458          2.198
Maximum                    2.993         5.224          5.793
Sum                        399.519       903.802        421.133
Count                      215           322            117
Confidence Level (95.0%)   0.05647       0.07021        0.145718695

Shapiro-Wilk Test
          FEV Age 3-8   FEV Age 9-12   FEV Age >=13
W-stat    0.988854391   0.97202266     0.977644226
p-value   0.093440645   6.72808E-06    0.047848341
alpha     0.05          0.05           0.05
normal    yes           no             no

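We also performed the Shapiro-Wilk test using the Real Statistics add-in in Excel. It shows that two of the age groups (9-12 and >=13) are not normally distributed at α = 0.05, so the normality assumption is violated for those groups.

The sample variances of the three groups also differ (0.176, 0.410, and 0.633), with the largest more than three times the smallest, which suggests that the homogeneity-of-variance assumption may be violated as well.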
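
The variance comparison can be made more formal with a test of equal variances. The solution above does not run one, so the sketch below is an addition under stated assumptions: it applies Bartlett's test, chosen only because it can be computed from the counts and sample variances already reported. Bartlett's test is sensitive to non-normality, so with two groups failing Shapiro-Wilk its result should be read cautiously; Levene's test would be more robust but needs the raw data. The function name `bartlett_from_summary` is hypothetical.

```python
import math

# (count, sample variance) per age group, from the descriptive statistics.
groups = [(215, 0.176462), (322, 0.410088), (117, 0.633303)]

def bartlett_from_summary(groups):
    """Bartlett's test statistic and degrees of freedom from counts and variances."""
    k = len(groups)
    n_total = sum(n for n, _ in groups)
    pooled = sum((n - 1) * v for n, v in groups) / (n_total - k)
    numerator = (n_total - k) * math.log(pooled) - sum(
        (n - 1) * math.log(v) for n, v in groups
    )
    correction = 1 + (
        sum(1 / (n - 1) for n, _ in groups) - 1 / (n_total - k)
    ) / (3 * (k - 1))
    return numerator / correction, k - 1

stat, df = bartlett_from_summary(groups)

# With exactly 2 degrees of freedom, the chi-square upper tail is exp(-x / 2).
p_value = math.exp(-stat / 2)

variance_ratio = max(v for _, v in groups) / min(v for _, v in groups)
print(f"Bartlett statistic = {stat:.2f} on {df} df, p = {p_value:.2e}")
print(f"largest / smallest variance = {variance_ratio:.2f}")
```

The closed-form p-value shortcut applies only because there are three groups (2 degrees of freedom); any other number of groups needs a chi-square distribution function.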

e. Is FEV more strongly related to sex or smoking status? Carry out appropriate statistical analysis to answer the question.

Using Excel > Add-ins > Real Statistics > Data Analysis > Corr > Correlation test

I carried out correlation tests between FEV and SEX and between FEV and SMOKING.

RESULT of correlation test on FEV and SMOKING

Correlation Coefficients
Pearson    -0.245424571
Spearman   -0.258349236
Kendall    -0.211145277

Pearson's coeff (t test)
Alpha     0.05
Tails     2
corr      -0.245424571
std err   0.037965248
t         -6.464453173
p-value   1.99285E-10
lower     -0.319973478
upper     -0.170875664

Pearson's coeff (Fisher)
Rho       0
Alpha     0.05
Tails     2
corr      -0.245424571
std err   0.039133024
z         -6.392409034
p-value   1.63292E-10
lower     -0.316142406
upper     -0.171994479

RESULT of correlation test on FEV and SEX

Correlation Coefficients
Pearson    -0.20841
Spearman   -0.14364
Kendall    -0.11739

Pearson's coeff (t test)
Alpha     0.05
Tails     2
corr      -0.20841
std err   0.038303
t         -5.44121
p-value   7.5E-08
lower     -0.28363
upper     -0.1332

Pearson's coeff (Fisher)
Rho       0
Alpha     0.05
Tails     2
corr      -0.208414959
std err   0.039133024
z         -5.39671039
p-value   6.78738E-08
lower     -0.28059776
upper     -0.13388797

Conclusion:

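  • Both sex and smoking status are significantly correlated with FEV, and both correlations are small and negative (between -0.1 and -0.3). The sign depends on how the two categories of each variable were coded as numbers.
  • Smoking status is more strongly correlated with FEV than sex is: the Spearman correlation is -0.26 for smoking status versus -0.14 for sex (Pearson: -0.25 versus -0.21).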
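
Correlating FEV with a two-category variable means coding the categories as numbers first; the Pearson coefficient on a 0/1 code is the point-biserial correlation. The sketch below illustrates that step and shows that reversing the coding only flips the sign, so the comparison between sex and smoking status should use absolute values. The data values and the 0/1 coding are hypothetical and are not taken from fev.csv.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical illustration values, NOT rows from fev.csv.
fev = [1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
smoker = ["non", "non", "non", "Current", "non", "Current"]
sex = ["Female", "Male", "Female", "Male", "Female", "Male"]

# Assumed coding: Current = 1, non = 0; Male = 1, Female = 0.
smoker_code = [1 if s == "Current" else 0 for s in smoker]
sex_code = [1 if s == "Male" else 0 for s in sex]

r_smoker = pearson_r(fev, smoker_code)
r_sex = pearson_r(fev, sex_code)

# Reversing the coding (non = 1, Current = 0) flips only the sign.
r_smoker_flipped = pearson_r(fev, [1 - c for c in smoker_code])

stronger = "smoking status" if abs(r_smoker) > abs(r_sex) else "sex"
print(f"r(FEV, smoker) = {r_smoker:.3f}, flipped coding = {r_smoker_flipped:.3f}")
print(f"r(FEV, sex)    = {r_sex:.3f}")
print("more strongly related (in this toy sample):", stronger)
```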

f. The investigator is also interested in how height is associated with age. Construct the scatter plot of height against age. What is the relationship between height and age?

Height and age show a positive, roughly linear relationship, with R square = 0.6272.

g. Regardless of what you observed in f, fit the regression model with height as the response and age as the independent variable. What is the fitted regression equation?

Regression Analysis of height as response and Age as the independent variable

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.791943602
R Square            0.627174669
Adjusted R Square   0.626602851
Standard Error      3.485201717
Observations        654

ANOVA
             df    SS            MS            F          Significance F
Regression   1     13322.52461   13322.52461   1096.808   8.0598E-142
Residual     652   7919.603418   12.14663101
Total        653   21242.12803

            Coefficients   Standard Error   t Stat        P-value    Lower 95%     Upper 95%
Intercept   45.95779737    0.478358112      96.07404205   0          45.01848904   46.89710571
Age         1.529099387    0.046171116      33.1180949    8.1E-142   1.438437365   1.619761409

The fitted regression equation is

height = 45.96 + 1.53*Age

Thus, for every one-year increase in age, predicted height increases by about 1.53 inches. The regression is significant with p < 0.00000001. The coefficient of determination, R square = 0.63, implies that about 63% of the variance in height is explained by age.
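
The fitted equation and two of the reported statistics can be checked directly from the regression output. The sketch below uses the unrounded coefficients from the SUMMARY OUTPUT table; the function name `predict_height` is hypothetical, and predictions are only meaningful within the observed age range of 3-19 years.

```python
# Values copied from the Excel regression output above.
INTERCEPT = 45.95779737
SLOPE = 1.529099387
SLOPE_SE = 0.046171116
SS_REGRESSION = 13322.52461
SS_TOTAL = 21242.12803

def predict_height(age_years):
    """Predicted height in inches for a given age in years."""
    return INTERCEPT + SLOPE * age_years

# R square is the share of the total sum of squares explained by the regression.
r_squared = SS_REGRESSION / SS_TOTAL

# The t statistic for the slope is the coefficient divided by its standard error.
t_slope = SLOPE / SLOPE_SE

print(f"predicted height at age 10: {predict_height(10):.2f} in")
print(f"R square = {r_squared:.4f}")
print(f"t for slope = {t_slope:.2f}")
```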

h. Test whether there is a positive correlation between age and height. Perform the hypothesis test using alpha = 0.05.

Correlation test on height and age

Correlation Coefficients
Pearson    0.791973295
Spearman   0.818796185
Kendall    0.660161965

Pearson's coeff (t test)
Alpha     0.05
Tails     2
corr      0.791973295
std err   0.023929566
t         33.09601623
p-value   0
lower     0.744984849
upper     0.838961742

Pearson's coeff (Fisher)
Rho       0
Alpha     0.05
Tails     2
corr      0.791973295
std err   0.039163022
z         27.450651
p-value   6.8244E-166
lower     0.761521493
upper     0.818936333

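The Pearson correlation coefficient is 0.79, which indicates a strong positive correlation. To test H0: ρ <= 0 against H1: ρ > 0, note that t = 33.10 is positive and the two-tailed p-value is reported as 0, so the one-tailed p-value is also far below alpha = 0.05. We reject H0 and conclude that age and height are positively correlated.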
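
The t statistic for a correlation coefficient follows from r and the sample size alone, so the one-sided test can be sketched by hand. The example below makes two assumptions: it uses Multiple R and Observations from the regression output in part g (r = 0.791943602, n = 654), which differ very slightly from the values in the correlation output above, and it takes the one-sided critical value from the normal distribution as an approximation to the t distribution with 652 degrees of freedom. The function name `corr_t_stat` is hypothetical.

```python
import math
from statistics import NormalDist

def corr_t_stat(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

r = 0.791943602  # Multiple R from the regression output (assumed input)
n = 654          # Observations from the regression output

t_stat = corr_t_stat(r, n)

# One-sided test of H1: rho > 0 at alpha = 0.05 (normal approximation).
alpha = 0.05
crit = NormalDist().inv_cdf(1 - alpha)
reject = t_stat > crit

print(f"t = {t_stat:.2f}, one-sided critical value ~ {crit:.3f}")
print("reject H0: positive correlation" if reject else "fail to reject H0")
```

For simple linear regression this t statistic equals the t Stat of the slope, which is why it matches the Age row of the regression table.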

