From 1988 through 2013, the yearly numbers of publications in refereed journals by Professor Swartz are as follows: 2,0,0,1,5,1,2,4,2,3,4,2,3,3,2,2,4,2,2,5,1,3,2,6,1,3. In a test of whether his publication rate is 3 papers per year, the p-value = 2P(Xbar <= 2.5) = 2P(t25 <= (2.5 - 3)/(1.5033/sqrt(26))) = 2P(t25 <= -1.696) = 0.102 is obtained and the null hypothesis is not rejected. There are various problems with the assumptions underlying the test procedure. Describe some of the problems.
Sol:
H0: mu = 3
H1: mu != 3
alpha = 0.05
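R code to get the t statistic and p-value:
# yearly publication counts, 1988 through 2013 (n = 26)
no_of_publications <-
c(2,0,0,1,5,1,2,4,2,3,4,2,3,3,2,2,4,2,2,5,1,3,2,6,1,3)
# two-sided one-sample t test of H0: mu = 3
t.test(no_of_publications,mu=3)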
Result:
One Sample t-test
data: no_of_publications
t = -1.6959, df = 25, p-value = 0.1023
alternative hypothesis: true mean is not equal to 3
95 percent confidence interval:
1.892792 3.107208
sample estimates:
mean of x
2.5
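t = -1.6959
p = 0.1023
Since p > 0.05, do not reject H0.
At the 5% level there is not enough evidence that the mean publication rate differs from 3 papers per year. This is a failure to reject H0, not proof that H0 is true.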
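To see where the numbers in the question's p-value formula come from, the t statistic can be rebuilt by hand. This is a minimal sketch in base R; the names n, xbar, s, mu0, t_stat and p_value are chosen here for illustration and are not in the original answer. The results should agree with the t.test output above.

```r
# Publication counts for 1988-2013, copied from the question.
no_of_publications <-
  c(2,0,0,1,5,1,2,4,2,3,4,2,3,3,2,2,4,2,2,5,1,3,2,6,1,3)

n    <- length(no_of_publications)  # number of years observed
xbar <- mean(no_of_publications)    # sample mean
s    <- sd(no_of_publications)      # sample standard deviation
mu0  <- 3                           # hypothesized rate under H0

# t = (xbar - mu0) / (s / sqrt(n)), with n - 1 degrees of freedom
t_stat <- (xbar - mu0) / (s / sqrt(n))

# Two-sided p-value: twice the lower-tail probability of -|t|
p_value <- 2 * pt(-abs(t_stat), df = n - 1)

round(c(mean = xbar, sd = s, t = t_stat, p = p_value), 4)
```

Note that the degrees of freedom are n - 1 = 25, which is why the reference distribution is t25.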
Various problems with the assumptions underlying the test procedure:
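NORMALITY ASSUMPTION:
The t test assumes the yearly numbers of publications come from a normal distribution. If they do, the t test is appropriate.
If the underlying distribution is not normal and the sample size is small, the t test may be unreliable, because the central limit theorem takes some time to kick in (rule of thumb: n > 30 if the distribution is not too skewed; n > 100 if it is really skewed). Here n = 26, and the data are small non-negative integer counts, so they cannot be exactly normal.
The t test is fairly robust to mild departures from normality, so this is a concern rather than a fatal flaw.
If normality is doubtful, a nonparametric alternative for a single sample is the Wilcoxon signed-rank test (the Wilcoxon rank-sum test is for two independent samples).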
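The normality concern can be checked informally in R. The sketch below is one possible approach, not part of the original answer: it runs a Shapiro-Wilk test and a one-sample Wilcoxon signed-rank test against a location of 3. The object names sw and wsr are illustrative, and exact = FALSE is an assumption made here because the counts contain ties and values equal to 3, so an exact p-value is not available.

```r
# Publication counts for 1988-2013, copied from the question.
no_of_publications <-
  c(2,0,0,1,5,1,2,4,2,3,4,2,3,3,2,2,4,2,2,5,1,3,2,6,1,3)

# Frequency table: shows how discrete the data are.
table(no_of_publications)

# Shapiro-Wilk test: a small p-value suggests a departure from normality.
sw <- shapiro.test(no_of_publications)

# One-sample Wilcoxon signed-rank test of H0: location = 3.
# exact = FALSE uses the normal approximation (ties and zeros are present).
wsr <- wilcox.test(no_of_publications, mu = 3, exact = FALSE)

sw$p.value
wsr$p.value
```

With only 26 observations, a normality test has limited power, so a histogram or normal Q-Q plot (for example, qqnorm) is a useful companion to the test.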
INDEPENDENCE (CRITICAL)
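For a test of a single mean, as in this case, the observations within the sample must be independent. Here the observations are consecutive yearly counts for the same person, so they form a time series and may be correlated from year to year or drift over the course of a career, which would violate this assumption.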
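One rough way to probe the independence assumption is to look for serial correlation and a time trend in the yearly counts. The following is a sketch under assumptions made here, not a method from the original answer: it pairs the counts with the years 1988 to 2013, computes the lag-1 correlation, runs a Ljung-Box test, and fits a linear trend. The names years, lag1_cor, lb and trend_slope are illustrative.

```r
# Publication counts for 1988-2013, copied from the question.
no_of_publications <-
  c(2,0,0,1,5,1,2,4,2,3,4,2,3,3,2,2,4,2,2,5,1,3,2,6,1,3)
years <- 1988:2013                   # one count per year
n <- length(no_of_publications)

# Correlation between each year's count and the following year's count.
lag1_cor <- cor(no_of_publications[-n], no_of_publications[-1])

# Ljung-Box test at lag 1: a small p-value suggests serial correlation.
lb <- Box.test(no_of_publications, lag = 1, type = "Ljung-Box")

# Linear trend over time: a slope far from 0 suggests the rate is not constant.
trend <- lm(no_of_publications ~ years)
trend_slope <- coef(trend)[["years"]]

c(lag1_cor = lag1_cor, ljung_box_p = lb$p.value, slope = trend_slope)
```

Evidence of autocorrelation or a trend would mean the standard error s/sqrt(n) used in the t test is not appropriate, so the reported p-value could be misleading.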