Cotinine level (ng/ml) was measured in the meconium of newborns of mothers who were active smokers, passive smokers, or nonsmokers. The mothers were consecutive women arriving for delivery at one hospital. The alkaloid cotinine is the main metabolite of nicotine. With a half-life of around 20 hours and detectable for several days after exposure, it is a biomarker for exposure to tobacco smoke.
Cotinine level (ng/ml)
Active Smokers (490, 418, 405, 328, 700, 292, 295, 272, 240, 232)
Passive Smokers ( 254, 219, 287, 257, 271, 282, 148, 273, 350, 293)
Nonsmoker ( 158, 163, 153, 207, 211, 159, 199, 187, 200, 213)
1. Create your own R editor file to store, analyze and graph this data set as called for in the questions below. Save this R program file as smoking Elba.R if your name is Elba, else use your name. I should be able to execute your code to produce the answers and the graph you submitted for this problem.
Descriptive statistics for Cotinine level for these 3 smoking groups. Round-off to appropriate levels.
Smoking group     N    Mean     SD       95% CI of mean (by t-distribution)
Active smokers    10   ______   ______   ______________
Passive smokers   10   ______   ______   ______________
Nonsmokers        10   ______   ______   ______________
2. Perform a one-way ANOVA. Report the p-value to full resolution; round off the other statistics to appropriate levels.
Source        df   SS       MS       F        P
Age group     2    ______   ______   ______   ______
Unexplained   27   ______   ______   ______   ______
3. Perform Kruskal-Wallis test.
P= ______
4. R-square : Among groups= _______%
5. Cohen's D = ___________ (standardized effect size)
6. Using plotmeans() in the gplots package, construct a publication-quality graph of the means with their 95% confidence intervals calculated using the t-distribution.
7. In several sentences suitable for a scientific journal, express the results of the one-way ANOVA and associated analysis, including insight from effect sizes, CIs, multiple comparisons and graphical visualization.
n=10
AS<-c(490,418,405,328,700,292,295,272,240,232)
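# NOTE: the problem statement lists 293 as the last passive-smoker value; 232 is entered here,
# and all of the output below was produced with 232.
PS<-c(254,219,287,257,271,282,148,273,350,232)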
NS<-c(158,163,153,207,211,159,199,187,200,213)
M<-c(mean(AS),mean(PS),mean(NS))
barplot(M,names.arg=c("Active Smokers","Passive Smokers","nonsmokers"),ylim=c(0,400),main=c("Barplot of Mean level of biomarkers for three groups"))
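# Mean, SD and 95% CI of the mean (t-distribution, df = n-1) for each group
t_crit<-qt(0.975,df=n-1)
m_as<-mean(AS)
sd_as<-sd(AS)
me_as<-t_crit*sd_as/sqrt(n)   # margin of error, active smokers
CI_l_as<-m_as-me_as
CI_u_as<-m_as+me_as
m_ps<-mean(PS)
sd_ps<-sd(PS)
me_ps<-t_crit*sd_ps/sqrt(n)   # margin of error, passive smokers
CI_l_ps<-m_ps-me_ps
CI_u_ps<-m_ps+me_ps
m_ns<-mean(NS)
sd_ns<-sd(NS)
me_ns<-t_crit*sd_ns/sqrt(n)   # margin of error, nonsmokers (avoids naming a variable c)
CI_l_ns<-m_ns-me_ns
CI_u_ns<-m_ns+me_ns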
########## ANOVA ######
data<-c(AS,PS,NS)
x<-factor(rep(c("AS","PS","NS"),each=10),levels=c("AS","PS","NS"))   # explicit levels keep the groups in AS, PS, NS order
model<-aov(data~x)
summary(model)
######### Kruskal-Wallis test #########
kruskal.test(data~x)
###### R^2 ##########
summary(lm(data~x))
###### Cohen's D #########
library(effsize)
cohen.d(AS,PS)
cohen.d(AS,NS)
cohen.d(PS,NS)
####### plotmeans() ########
library(gplots)
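# p=0.95 requests 95% confidence intervals; axis labels replace the default variable names
plotmeans(data~x,p=0.95,mean.labels=TRUE,xlab="Smoking group",ylab="Cotinine level (ng/ml)",main="Meanplot of the data along with 95% CI")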
######### Output #######
## Active smokers
> m_as
[1] 367.2
> sd_as
[1] 143.6708
> CI_l_as
[1] 264.4241
> CI_u_as
[1] 469.9759
## Passive smokers
> m_ps
[1] 257.3
> sd_ps
[1] 52.26439
> CI_l_ps
[1] 219.9123
> CI_u_ps
[1] 294.6877
## Non-smokers
> m_ns
[1] 185
> sd_ns
[1] 24.22579
> CI_l_ns
[1] 167.6699
> CI_u_ns
[1] 202.3301
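The three sets of descriptive statistics above repeat the same calculation, so they can also be produced in one pass. The following is a minimal sketch, assuming the data exactly as entered in the code above (with 232 as the last passive-smoker value); `describe()` and `groups` are hypothetical names introduced here for illustration, not part of any package.

```r
# Data as entered in the answer's code (assumption: PS ends in 232, as above)
groups <- list(
  AS = c(490, 418, 405, 328, 700, 292, 295, 272, 240, 232),
  PS = c(254, 219, 287, 257, 271, 282, 148, 273, 350, 232),
  NS = c(158, 163, 153, 207, 211, 159, 199, 187, 200, 213)
)

# describe() is a hypothetical helper: N, mean, SD and t-based CI of the mean
describe <- function(v, level = 0.95) {
  n <- length(v)
  half <- qt(1 - (1 - level) / 2, df = n - 1) * sd(v) / sqrt(n)
  c(N = n, Mean = mean(v), SD = sd(v),
    Lower = mean(v) - half, Upper = mean(v) + half)
}

# One row per group, one column per statistic
desc <- t(sapply(groups, describe))
print(round(desc, 1))

# Cross-check one interval against the one-sample t.test() interval
ci_check <- t.test(groups$AS)$conf.int
```

The `t.test()` cross-check is there because it computes the same t-based interval independently of the hand-written formula.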
### ANOVA table
> summary(model)
            Df Sum Sq Mean Sq F value   Pr(>F)
x            2 168340   84170   10.54 0.000414 ***
Residuals   27 215638    7987
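Question 7 also asks for multiple comparisons, which the ANOVA table by itself does not give. One common choice is Tukey's honest significant difference; the sketch below uses base R's `TukeyHSD()` on the same data as entered in the code above (232 as the last passive-smoker value). The names `cotinine`, `group`, `fit` and `tk` are illustrative assumptions, and Tukey's method is a suggestion here, not something the original answer used.

```r
# Data as entered in the answer's code (assumption: PS ends in 232, as above)
AS <- c(490, 418, 405, 328, 700, 292, 295, 272, 240, 232)
PS <- c(254, 219, 287, 257, 271, 282, 148, 273, 350, 232)
NS <- c(158, 163, 153, 207, 211, 159, 199, 187, 200, 213)

cotinine <- c(AS, PS, NS)
group <- factor(rep(c("AS", "PS", "NS"), each = 10),
                levels = c("AS", "PS", "NS"))

# One-way ANOVA followed by Tukey's pairwise comparisons
fit <- aov(cotinine ~ group)
tk <- TukeyHSD(fit, conf.level = 0.95)

# Rows are pairwise differences; columns are diff, lwr, upr and adjusted p
print(tk$group)
```

Each row gives the difference in means, its simultaneous 95% confidence interval, and the adjusted p-value, which is the information needed to say which pairs of groups differ.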
### Kruskal-Wallis test
Kruskal-Wallis chi-squared = 18.5777, df = 2, p-value = 9.245e-05
### R-square
Multiple R-squared: 0.4384
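The R-squared reported by `lm()` is the among-groups sum of squares divided by the total sum of squares from the ANOVA table, here 168340 / (168340 + 215638). A minimal sketch of that calculation, assuming the data as entered in the code above (232 as the last passive-smoker value); the names `cotinine`, `group`, `ss` and `r2` are illustrative.

```r
# Data as entered in the answer's code (assumption: PS ends in 232, as above)
AS <- c(490, 418, 405, 328, 700, 292, 295, 272, 240, 232)
PS <- c(254, 219, 287, 257, 271, 282, 148, 273, 350, 232)
NS <- c(158, 163, 153, 207, 211, 159, 199, 187, 200, 213)

cotinine <- c(AS, PS, NS)
group <- factor(rep(c("AS", "PS", "NS"), each = 10),
                levels = c("AS", "PS", "NS"))

# Sums of squares from the ANOVA table: among groups first, residual second
ss <- summary(aov(cotinine ~ group))[[1]][["Sum Sq"]]

# R-squared = SS among groups / SS total
r2 <- ss[1] / sum(ss)
print(round(100 * r2, 1))   # as a percentage, as asked in question 4

# Same quantity as reported by lm()
r2_lm <- summary(lm(cotinine ~ group))$r.squared
```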
## Cohen's D
> cohen.d(AS,PS)
Cohen's d
d estimate: 1.016616 (large)
95 percent confidence interval:
inf sup
-0.04246702 2.07569838
> cohen.d(AS,NS)
Cohen's d
d estimate: 1.768508 (large)
95 percent confidence interval:
inf sup
0.5823635 2.9546523
> cohen.d(PS,NS)
Cohen's d
d estimate: 1.774947 (large)
95 percent confidence interval:
inf sup
0.5874927 2.9624005
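The d estimates above can be reproduced by hand as the difference in means divided by the pooled standard deviation, which helps when explaining the effect sizes. This is a minimal sketch using the data as entered in the code above (232 as the last passive-smoker value); `pooled_d()` is a hypothetical helper written for illustration, not a function from the effsize package.

```r
# Data as entered in the answer's code (assumption: PS ends in 232, as above)
AS <- c(490, 418, 405, 328, 700, 292, 295, 272, 240, 232)
PS <- c(254, 219, 287, 257, 271, 282, 148, 273, 350, 232)
NS <- c(158, 163, 153, 207, 211, 159, 199, 187, 200, 213)

# pooled_d() is a hypothetical helper: Cohen's d with a pooled SD
pooled_d <- function(x, y) {
  nx <- length(x)
  ny <- length(y)
  sp <- sqrt(((nx - 1) * var(x) + (ny - 1) * var(y)) / (nx + ny - 2))
  (mean(x) - mean(y)) / sp
}

d_as_ps <- pooled_d(AS, PS)
d_as_ns <- pooled_d(AS, NS)
d_ps_ns <- pooled_d(PS, NS)
print(round(c(AS_PS = d_as_ps, AS_NS = d_as_ns, PS_NS = d_ps_ns), 3))
```

Because the active-smoker SD is much larger than the other two, the pooled SD for the AS comparisons is dominated by that group, which is one reason the AS vs PS estimate is smaller than the PS vs NS estimate despite the larger difference in means.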
### Interpretation
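Mean cotinine level differs significantly among the three smoking groups (one-way ANOVA, F(2, 27) = 10.54, p = 0.000414), and the Kruskal-Wallis test agrees (p = 9.245e-05). Group membership accounts for about 43.8% of the variation (R-squared = 0.4384).
The mean is lowest for NS (185.0 ng/ml, 95% CI 167.7 to 202.3), intermediate for PS (257.3 ng/ml, 95% CI 219.9 to 294.7) and highest for AS (367.2 ng/ml, 95% CI 264.4 to 470.0). The NS interval does not overlap the other two, while the wide AS interval overlaps the PS interval.
All three pairwise Cohen's d estimates are large (1.02 to 1.77), but the 95% CI for the AS vs PS effect size (-0.04 to 2.08) includes zero, so that difference is estimated less precisely than the differences from NS.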