Write R Markdown code for the following problem.
Independent random samples of companies in the three countries yielded the following data on financial losses (in trillion dollars).
| A | B | C |
|---|---|---|
| 13 | 17 | 11 |
| 10 | 12 | 7 |
| 13 | 18 | 9 |
| 14 | 13 | 13 |
| 15 | 15 | 24 |
(1) Simulate and plot the null distribution of the ANOVA test statistic, including a vertical line corresponding to the observed test statistic value.
(2) Calculate the P-value
(3) Is there a significant difference in the mean profit losses among the three countries? Compare your conclusion with the ordinary one-way ANOVA in the problem above.
(4) Are the residuals normal? Why or why not? Is the ANOVA procedure justified?
R Code:
## Exercise 1
### 1
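```{r}
# Financial losses (trillion dollars) for the three countries
la <- c(13, 10, 13, 14, 15)
lb <- c(17, 12, 18, 13, 15)
lc <- c(11, 7, 9, 13, 24)
loss <- c(la, lb, lc)
# Group labels: five observations per country
company <- factor(rep(c("A", "B", "C"), each = 5))
# Ordinary one-way ANOVA
w <- aov(loss ~ company)
summary(w)
```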
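The observed F statistic is 0.4096. Under the null hypothesis of equal
means, and the usual normality and equal-variance assumptions, the
statistic follows an F distribution with 2 and 12 degrees of freedom.
The plot below shows this theoretical density, with the observed value
marked by a vertical line.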
```{r}
# Theoretical F(2, 12) density with the observed statistic marked
curve(df(x, 2, 12), from = 0, to = 10, col = "red", xlab = "F", ylab = "Density")
abline(v = 0.4096, col = "green")
```
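Part (1) asks for a simulated null distribution, whereas the curve above is the theoretical one. One possible way to simulate it is a permutation approach: shuffle the losses across the three groups many times and recompute F each time. The sketch below assumes that this is an acceptable reading of the question. The helper `f_stat`, the seed, and the number of permutations `n_sim` are illustrative choices, not part of the original answer.

```{r}
set.seed(1)  # arbitrary seed, only for reproducibility

# Data re-entered so that this chunk runs on its own
loss <- c(13, 10, 13, 14, 15, 17, 12, 18, 13, 15, 11, 7, 9, 13, 24)
company <- factor(rep(c("A", "B", "C"), each = 5))

# Hypothetical helper: one-way ANOVA F statistic for response y and groups g
f_stat <- function(y, g) anova(lm(y ~ g))[["F value"]][1]

f_obs <- f_stat(loss, company)

# Permutation null: shuffle the losses across groups and recompute F
n_sim <- 2000  # illustrative number of permutations
f_null <- replicate(n_sim, f_stat(sample(loss), company))

hist(f_null, breaks = 40, freq = FALSE,
     main = "Permutation null distribution of F", xlab = "F")
abline(v = f_obs, col = "green", lwd = 2)

# Permutation p-value: share of shuffled F values at least as large as observed
p_perm <- mean(f_null >= f_obs)
p_perm
```

Because this null distribution is built from the data themselves, it does not rely on the normality assumption, which makes it a natural comparison for the ordinary ANOVA result.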
### 2
The p-value, read from the ANOVA table above, is 0.6729.
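The same tail probability can be computed directly from the F distribution. A minimal sketch, using the rounded statistic 0.4096 reported above (the names `f_obs` and `p_value` are illustrative):

```{r}
# Upper-tail probability of F(2, 12) beyond the observed statistic
f_obs <- 0.4096
p_value <- pf(f_obs, df1 = 2, df2 = 12, lower.tail = FALSE)
round(p_value, 4)
```

The result should agree with the value in the ANOVA table up to rounding.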
### 3
Since the p-value (0.6729) exceeds the 5% significance level, we fail to reject the null hypothesis: there is no significant evidence of a difference in mean losses among the three countries.
### 4
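```{r}
# Residuals from the one-way ANOVA fit
h <- resid(lm(loss ~ company))
hist(h, main = "Residuals", xlab = "Residual")
boxplot(h, main = "Residuals")
# Side-by-side boxplots of the raw losses by country
boxplot(la, lb, lc, names = c("A", "B", "C"))
```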
To check normality, we plot a histogram and a boxplot of the
residuals. The residuals are positively (right) skewed: the
observation 24 in country C gives a residual of 11.2, far larger than
any other. Because the distribution is not symmetric, the normality
assumption does not appear justified. The side-by-side boxplots
compare the variability across countries. The spreads differ markedly
(country A has the least variability and country C the most), and the
distributions are not all symmetric. The usual ANOVA procedure is
therefore not well justified for these data.
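The graphical checks can be supplemented with formal tests. The sketch below is an addition to the original answer, not part of it: it applies the Shapiro-Wilk test to the residuals and Bartlett's test to the group variances, both from base R's `stats` package. With only 15 observations these tests have low power, so a non-significant result would not by itself establish normality or equal variances.

```{r}
# Data re-entered so that this chunk runs on its own
loss <- c(13, 10, 13, 14, 15, 17, 12, 18, 13, 15, 11, 7, 9, 13, 24)
company <- factor(rep(c("A", "B", "C"), each = 5))

# Residuals from the one-way ANOVA fit
h <- resid(lm(loss ~ company))

# Normal Q-Q plot: points far from the line suggest non-normal residuals
qqnorm(h)
qqline(h)

# Shapiro-Wilk test of normality on the residuals
sw <- shapiro.test(h)
sw

# Bartlett's test of equal variances across the three groups
bt <- bartlett.test(loss ~ company)
bt
```

Small p-values from either test would support the conclusion that the ANOVA assumptions are questionable here.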