2. The data files for Q2 contain measures of depression level: the higher the number, the higher the level of depression. The first file, Medical 1, includes 60 healthy individuals, and the second file, Medical 2, includes 60 individuals with chronic issues. Each file has 20 individuals from each of NY, FL, and NC. The complete data set has 120 observations and you may download the data file.
At the .05 level of significance, use ANOVA to test for any significant difference in depression levels due to health condition (Medical 1, healthy individuals, versus Medical 2, individuals with chronic issues). (Ignore the effect of location, i.e., the three states.) Justify your answers with your hypotheses, test statistic, and critical value.
Hypothesis: _______________________________________________
(Define or describe your hypothesis clearly.)
Test statistics: _____________________________________________
Critical value: _____________________________________________
Conclusion and Interpretation: (Explain why you reject the hypothesis if you reject the H0.)
Columns 1-3: data from Medical 1 (healthy individuals). Columns 4-6: data from Medical 2 (individuals with chronic issues).
Florida | New York | N. Carolina | Florida | New York | N. Carolina
3 | 8 | 12 | 13 | 14 | 12
7 | 11 | 7 | 12 | 9 | 12
7 | 9 | 3 | 17 | 15 | 15
3 | 7 | 5 | 17 | 12 | 18
8 | 8 | 11 | 20 | 16 | 12
8 | 7 | 8 | 21 | 24 | 14
8 | 8 | 4 | 16 | 18 | 17
5 | 4 | 3 | 14 | 14 | 8
5 | 13 | 7 | 13 | 15 | 14
2 | 10 | 8 | 17 | 17 | 16
6 | 6 | 8 | 12 | 20 | 18
2 | 8 | 7 | 9 | 11 | 17
6 | 12 | 3 | 12 | 23 | 19
6 | 8 | 9 | 15 | 19 | 15
9 | 6 | 8 | 16 | 17 | 13
7 | 8 | 12 | 15 | 14 | 14
5 | 5 | 6 | 13 | 9 | 11
4 | 7 | 3 | 10 | 14 | 12
7 | 7 | 8 | 11 | 13 | 13
3 | 10 | 11 | 17 | 30 | 11
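The required results can be obtained with a few R commands. The commands below read the data and stack it into one vector per group: x1 for Medical 1 and x2 for Medical 2. The data above was copied to a text file, with the tabs replaced by whitespace. As the column names in the code suggest, the header is assumed to be written without spaces (Florida, NewYork, N.Carolina), and the repeated Medical 2 columns are read with the suffix _1.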
---------------------------------------------------------------
> library(readr)
> d <- read_table2("textfile")
> x1 <- c(d$Florida,d$NewYork,d$N.Carolina)
> x2 <- c(d$Florida_1,d$NewYork_1,d$N.Carolina_1)
---------------------------------------------------------------
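For readers without the text file, here is a minimal sketch that types the tabled scores straight into R and arranges them in long format, with one score per row plus a two-level health factor. The names `dep`, `score`, `health`, and the `m1_*`/`m2_*` vectors are illustrative choices and not part of the original commands.

```r
# Scores typed in from the table above, 20 per state, read down each column.
m1_fl <- c(3, 7, 7, 3, 8, 8, 8, 5, 5, 2, 6, 2, 6, 6, 9, 7, 5, 4, 7, 3)
m1_ny <- c(8, 11, 9, 7, 8, 7, 8, 4, 13, 10, 6, 8, 12, 8, 6, 8, 5, 7, 7, 10)
m1_nc <- c(12, 7, 3, 5, 11, 8, 4, 3, 7, 8, 8, 7, 3, 9, 8, 12, 6, 3, 8, 11)
m2_fl <- c(13, 12, 17, 17, 20, 21, 16, 14, 13, 17, 12, 9, 12, 15, 16, 15, 13, 10, 11, 17)
m2_ny <- c(14, 9, 15, 12, 16, 24, 18, 14, 15, 17, 20, 11, 23, 19, 17, 14, 9, 14, 13, 30)
m2_nc <- c(12, 12, 15, 18, 12, 14, 17, 8, 14, 16, 18, 17, 19, 15, 13, 14, 11, 12, 13, 11)

# Long format: one row per individual, with health condition as a factor.
dep <- data.frame(
  score  = c(m1_fl, m1_ny, m1_nc, m2_fl, m2_ny, m2_nc),
  health = factor(rep(c("Medical1", "Medical2"), each = 60))
)

# Group sizes and group means, as a sanity check before any test.
table(dep$health)
tapply(dep$score, dep$health, mean)
```

Keeping the group label in its own factor column makes it explicit that health condition, not row position, defines the two samples being compared.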
For a one-way ANOVA on health condition, the 120 scores are stacked into one response vector y, and a two-level factor g marks the group (Medical 1 or Medical 2). The ANOVA table (values rounded) is as below.
---------------------------------------------------------------
> y <- c(x1, x2)
> g <- factor(rep(c("Medical1", "Medical2"), each = 60))
> summary(aov(y ~ g))
             Df Sum Sq Mean Sq F value Pr(>F)
g             1 1912.0 1912.01   171.1 <2e-16
Residuals   118 1318.3   11.17
---------------------------------------------------------------
The group means are 416/60 = 6.93 for Medical 1 and 895/60 = 14.92 for Medical 2. The F statistic does not depend on which group is labelled first, so the result is the same whichever group is taken as control or treatment.
The critical F can be found as below.
---------------------------------------------------------------
> round(qf(0.95, 1, 118), 2)
[1] 3.92
---------------------------------------------------------------
Hypothesis : $H_0: \mu_1 = \mu_2$ and $H_1: \mu_1 \neq \mu_2$, where $\mu_1$ and $\mu_2$ are the mean depression levels of healthy individuals (Medical 1) and individuals with chronic issues (Medical 2).
Test statistic : $F = 171.1$ with $(1, 118)$ degrees of freedom.
Critical value : $F_{0.05;\,1,\,118} = 3.92$.
As the calculated F is greater than the critical F, i.e. $171.1 > 3.92$, we reject the null hypothesis. Hence, the mean depression level differs significantly between the two groups: individuals with chronic issues have a higher mean level (14.92) than healthy individuals (6.93). The p-value of the F statistic is below 2e-16, far smaller than the 0.05 significance level, which leads to the same decision.
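As a cross-check on a one-way layout with health condition as a two-level factor, the F statistic can be rebuilt by hand from the group totals. The sketch below assumes the per-group counts, sums, and sums of squares have been tallied from the table (416 and 3302 for Medical 1, 895 and 14251 for Medical 2). The variable names are illustrative.

```r
# Per-group summaries tallied from the tabled data (assumed inputs).
n  <- c(Medical1 = 60,   Medical2 = 60)     # group sizes
s  <- c(Medical1 = 416,  Medical2 = 895)    # sums of scores
ss <- c(Medical1 = 3302, Medical2 = 14251)  # sums of squared scores

grand_mean <- sum(s) / sum(n)
group_mean <- s / n

# Between-group and within-group sums of squares.
ssb <- sum(n * (group_mean - grand_mean)^2)
ssw <- sum(ss - s^2 / n)

# Degrees of freedom: (groups - 1) and (observations - groups).
df1 <- length(n) - 1
df2 <- sum(n) - length(n)

f_stat <- (ssb / df1) / (ssw / df2)
f_crit <- qf(0.95, df1, df2)
p_val  <- pf(f_stat, df1, df2, lower.tail = FALSE)

c(F = f_stat, F_crit = f_crit, p = p_val)
f_stat > f_crit  # TRUE means H0 is rejected at the 0.05 level
```

With 120 observations in two groups, the residual degrees of freedom come out as 118, which is a quick way to confirm that a fitted model really treats health condition as the grouping factor.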