In R,
Part 1. Learn to understand the significance level α in hypothesis testing.
a) Generate a matrix “ss” with 1000 rows and 10 columns. The elements of “ss” are random samples from standard normal distribution.
b) Run the following lines:
mytest <- function(x) {
return(t.test(x,mu=0)$p.value)
}
mytest(rnorm(100))
Note that when you pass a vector to the function mytest, you get the p-value for the one-sample t-test H0 : µ = 0 vs Ha : µ ≠ 0.
c) Conduct the one-sample t-test H0 : µ = 0 vs Ha : µ ≠ 0 for each row of “ss”. (Hint: use either apply() or a for loop, together with the function mytest().)
d) For the 1000 tests you conducted in c), what is the proportion of rejections if the significance level is α = 0.05? How about α = 0.1 or 0.01?
Part 2. Let’s start from R built-in dataset “sleep” (You may access it by running sleep in R or running data(sleep) in R).
a) Load dataset sleep and open the description file.
b) Draw histogram for column “extra”. Comment on the shape of the histogram.
c) Define two vectors “x” and “y” as follows:
x<-sleep$extra[1:10]
y<-sleep$extra[11:20]
d) Conduct a two-sample t-test to check whether the means of x and y are significantly different. Make sure to state your hypotheses, test statistic, p-value, decision rule and conclusion for the test.
e) Define a vector “z” as follows:
z<-x-y
f) Conduct one-sample t-test to check whether the mean of z is significantly different from 0.
The required R code is provided below.
___________________________________________________________________________
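##Part 1a : Create 'ss' matrix (1000 rows, 10 columns of standard normal samples)
## Note: a loop over ss[i,j] with i in 1:10 and j in 1:1000 would index past
## the 10 columns, so the matrix is filled in one call instead.
ss=matrix(rnorm(1000*10),nrow=1000,ncol=10)
print(dim(ss))  ## check the dimensions: 1000 rows, 10 columns
print(head(ss)) ## show the first few rows only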
##Part 1b : Run the following lines:
mytest <- function(x) {
return(t.test(x,mu=0)$p.value)
}
mytest(rnorm(100))
##Part 1c : Conduct one sample t-test H0 : µ = 0 vs Ha : µ ≠ 0 for each row of “ss”
## vector to store all 1000 p-values (preallocated to its full length)
pval=numeric(1000)
for(i in 1:1000){
pval[i]=mytest(ss[i,]) ## p-value for row i
}
print(pval)
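The hint in Part 1c also mentions apply(). A minimal sketch of that alternative is given below; it builds its own demonstration matrix so that it runs on its own, and the names ss_demo and pval_demo, as well as the seed, are assumptions added for illustration only.

```r
# Sketch: one p-value per row using apply() instead of a for loop.
set.seed(1)  # assumption: seed added only so the demo is reproducible
ss_demo <- matrix(rnorm(1000 * 10), nrow = 1000, ncol = 10)

mytest <- function(x) {
  return(t.test(x, mu = 0)$p.value)
}

# MARGIN = 1 means "apply the function to each row"
pval_demo <- apply(ss_demo, 1, mytest)
print(length(pval_demo))  # one p-value per row
```

The result should match what the for loop produces on the same matrix; apply() simply removes the need to preallocate and index the result vector.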
##Part 1d : what is the ratio of rejection if the significance level α = 0.05? How about α = 0.1 or 0.01?
## mean() of a logical vector gives the proportion of TRUE values
ratio05 = mean(pval<0.05) ## α = 0.05
ratio10 = mean(pval<0.1)  ## α = 0.1
ratio01 = mean(pval<0.01) ## α = 0.01
print(ratio05)
print(ratio10)
print(ratio01)
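The point of Part 1 is that, when H0 is true, the proportion of rejections should be close to α, because α is the probability of a Type I error. A minimal sketch that checks this for all three levels at once is shown below; the seed and the names n_tests, pvals, alphas and reject_rate are assumptions introduced for illustration.

```r
# Sketch: rejection rate under a true null hypothesis for several alpha levels.
set.seed(42)  # assumption: seed added only for reproducibility
n_tests <- 1000

# Each replicate draws 10 standard normal values and tests H0: mu = 0
pvals <- replicate(n_tests, t.test(rnorm(10), mu = 0)$p.value)

alphas <- c(0.01, 0.05, 0.1)
# Proportion of p-values below each alpha
reject_rate <- sapply(alphas, function(a) mean(pvals < a))
names(reject_rate) <- alphas
print(reject_rate)
```

The printed proportions vary from run to run but should stay near 0.01, 0.05 and 0.1 respectively.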
___________________________________________________________________________
##Part 2a : Load dataset sleep and open the description file
data(sleep)
?sleep ## opens the help page describing the dataset
d=sleep
##Part 2b : Draw histogram for column “extra” and comment
hist(d$extra) ## This gives the required histogram
##Conclusion : The distribution is roughly mound-shaped but mildly right-skewed:
##most values lie between -2 and 2, with a longer tail extending past 5.
##Part 2c : Define two vectors “x” and “y”
x<-sleep$extra[1:10]
y<-sleep$extra[11:20]
##Part 2d : Conduct 2 sample t-test for x and y.
t.test(x,y)
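Part 2d also asks for the hypotheses, test statistic, p-value, decision rule and conclusion. The hypotheses are H0 : µx = µy vs Ha : µx ≠ µy. A minimal sketch of how the remaining pieces could be read off the t.test() result follows, assuming a significance level of α = 0.05 (the question does not fix one) and R's default Welch test with unequal variances; the names res, alpha and reject are illustrative.

```r
# Sketch: extract the pieces of the two-sample (Welch) t-test for Part 2d.
x <- sleep$extra[1:10]
y <- sleep$extra[11:20]

res <- t.test(x, y)   # var.equal = FALSE by default, i.e. Welch's test
alpha <- 0.05         # assumption: significance level chosen for illustration

print(res$statistic)  # test statistic t
print(res$parameter)  # degrees of freedom
print(res$p.value)    # two-sided p-value

# Decision rule: reject H0 if the p-value is below alpha
reject <- res$p.value < alpha
print(reject)
```

With the built-in sleep data this gives t ≈ -1.86 on about 17.8 degrees of freedom and a p-value of about 0.079, so H0 is not rejected at α = 0.05: the two means are not significantly different under this test.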
##Part 2e : Define vector z
z<-x-y
##Part 2f : Conduct 1-sample t-test on z to check whether mean(z) is significantly different from 0.
t.test(z)
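The one-sample test on z = x - y is the same as a paired t-test on x and y, which is the natural analysis when both measurements come from the same subjects (as the sleep help page describes). A minimal sketch comparing the two calls is given below; the names one_sample and paired are illustrative.

```r
# Sketch: the one-sample t-test on differences equals the paired t-test.
x <- sleep$extra[1:10]
y <- sleep$extra[11:20]
z <- x - y

one_sample <- t.test(z)                # H0: mean(z) = 0
paired <- t.test(x, y, paired = TRUE)  # H0: mean difference = 0

print(one_sample$statistic)
print(one_sample$p.value)

# Both calls give the same t statistic and p-value
print(all.equal(unname(one_sample$statistic), unname(paired$statistic)))
print(all.equal(one_sample$p.value, paired$p.value))
```

For the sleep data this test gives t ≈ -4.06 with 9 degrees of freedom and a p-value of about 0.0028, so the mean of z is significantly different from 0 at α = 0.05, unlike the unpaired Welch test on x and y, whose p-value is about 0.079.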