(a) Use R to generate two random numbers n11, n21 from the Binomial distribution: Bin(10, 0.4). Print your results.
Please don’t forget to use the command set.seed(101) before the commands generating the random numbers.
(b) (2 points) Use the R command ntable <- array(data = c(n11, n21, n1plus-n11, n2plus-n21), dim = c(2,2)) to create a 2 × 2 table using the numbers generated in part (a) above. Print your table.
(c) (3 points) Perform the Fisher’s exact test on the 2 × 2 table you created in part (b) above. Print your output. What is the P-value? Do you reject the null hypothesis of independence of the row and column variables at the 5 percent level of significance? Give reasons to your answer.
(d) (4 points) Repeat the above steps 10000 times. You don’t have to print the 10000 tables. Print the first five p-values. How many times (out of all 10000 tests), do you reject the null hypothesis of independence of the row and column variables at the 5 percent level of significance? Comment on your results.
(e) (2 points) Repeat part (d) above, but this time with n11, n21 from the Binomial distribution: Bin(100, 0.4). How many times (out of all 10000 tests), do you reject the null hypothesis of independence of the row and column variables at the 5 percent level of significance? Comment on your results.
Please do part d and e.
Things that are given: n11=3, n21=1, n1plus=10, n2plus=10
Please do not copy and paste the solution for this question that was solved last time. The parameters are clearly defined.
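For reference, parts (b) and (c) with the given values can be sketched as follows. This is only a minimal sketch: it assumes the given n11 = 3, n21 = 1, n1plus = 10, n2plus = 10 are meant to be used directly, and the name `result` is illustrative.

```r
# Given values from the question (assumed to be used as-is)
n11 <- 3
n21 <- 1
n1plus <- 10
n2plus <- 10

# array() fills column by column, so the rows are (3, 7) and (1, 9)
ntable <- array(data = c(n11, n21, n1plus - n11, n2plus - n21), dim = c(2, 2))
print(ntable)

# Two-sided Fisher's exact test of independence of rows and columns
result <- fisher.test(ntable, alternative = "two.sided")
print(result$p.value)
```

For this table the two-sided p-value is 2820/4845, about 0.582, so independence is not rejected at the 5 percent level.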
d) The R code to run a simulation 10000 times is given below
N <- 10000
set.seed(199)
n <- 10
p <- 0.4
n_rej <- 0
n1plus <- n
n2plus <- n
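for (i in 1:N)
{
  # Draw the first-column counts of the two rows from the same Bin(n, p)
  n11 <- rbinom(1, n, p)
  n21 <- rbinom(1, n, p)
  ntable <- array(data = c(n11, n21, n1plus - n11, n2plus - n21), dim = c(2, 2))
  # Keep the p-value in its own variable so the success probability p is not overwritten
  p_value <- fisher.test(ntable, alternative = "two.sided")$p.value
  if (i <= 5)
  {
    print(p_value)
  }
  if (p_value < 0.05)
  {
    n_rej <- n_rej + 1
  }
}
n_rej
The output originally posted with this answer is reproduced below. It came from a version of the loop that stored the test result in p, which overwrote the success probability 0.4 after the first iteration. The second pair of counts was therefore drawn with success probability 0.1408669 and gave a p-value of 1; from then on every draw was rbinom(1, 10, 1) = 10, every table was (10, 10, 0, 0), and every p-value was 1:
[1] 0.1408669
[1] 1
[1] 1
[1] 1
[1] 1
> n_rej
[1] 0
The reported 0 rejections out of 10000 therefore reflect that overwriting of p, not the behaviour of the test. With the p-value kept in p_value as above, both rows are drawn from the same Bin(10, 0.4) distribution, so the null hypothesis of independence is true and every rejection at the 5 percent level is a Type I error. Fisher's exact test keeps that error rate at or below 5 percent, and with only 10 observations per row it is conservative, so the rejection count should be positive but well under 500 out of 10000.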
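The simulation for part (d) can also be wrapped in a function, so that part (e) only changes one argument. The following is a minimal sketch, not the original poster's code: the names `count_rejections`, `prob`, `alpha`, `seed` and `p_values` are hypothetical, and each p-value is stored in its own slot so the success probability stays fixed.

```r
# Hypothetical helper: counts rejections of independence over N simulated 2 x 2 tables
count_rejections <- function(N, n, prob, alpha = 0.05, seed = 199) {
  set.seed(seed)
  p_values <- numeric(N)
  for (i in seq_len(N)) {
    # Both rows are drawn from the same Bin(n, prob), so the null hypothesis holds
    n11 <- rbinom(1, n, prob)
    n21 <- rbinom(1, n, prob)
    ntable <- array(data = c(n11, n21, n - n11, n - n21), dim = c(2, 2))
    # The p-value never overwrites prob
    p_values[i] <- fisher.test(ntable, alternative = "two.sided")$p.value
  }
  list(first_five = head(p_values, 5), n_rej = sum(p_values < alpha))
}

res_d <- count_rejections(N = 10000, n = 10, prob = 0.4)
print(res_d$first_five)
print(res_d$n_rej)
```

Part (e) would then be the call `count_rejections(N = 10000, n = 100, prob = 0.4)`.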
e) The same code with n <- 100, so that n11 and n21 are drawn from Bin(100, 0.4):
N <- 10000
set.seed(199)
n <- 100
p <- 0.4
n_rej <- 0
n1plus <- n
n2plus <- n
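for (i in 1:N)
{
  # Draw the first-column counts of the two rows from the same Bin(n, p)
  n11 <- rbinom(1, n, p)
  n21 <- rbinom(1, n, p)
  ntable <- array(data = c(n11, n21, n1plus - n11, n2plus - n21), dim = c(2, 2))
  # Keep the p-value in its own variable so the success probability p is not overwritten
  p_value <- fisher.test(ntable, alternative = "two.sided")$p.value
  if (i <= 5)
  {
    print(p_value)
  }
  if (p_value < 0.05)
  {
    n_rej <- n_rej + 1
  }
}
n_rej
The output originally posted with this answer is reproduced below. As in part (d), it came from a version of the loop that stored the test result in p, so each pair of counts after the first was drawn with the previous p-value as its success probability. After the fourth test returned a p-value of 1, every later table was (100, 100, 0, 0) and every later p-value was 1:
[1] 0.3123252
[1] 0.5427761
[1] 0.6714837
[1] 1
[1] 1
The reported 0 rejections out of 10000 therefore reflect that overwriting of p, not the behaviour of the test. With the p-value kept in p_value as above, the null hypothesis of independence is again true by construction, so the rejections are Type I errors and their expected share is at most 5 percent. With 100 observations per row the test is typically less conservative than with 10, so the rejection rate is expected to be closer to 5 percent than in part (d).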
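As a cross-check on the simulated counts, the rejection probability under the null hypothesis can be computed exactly by enumerating every possible pair (n11, n21). The following is a sketch under the same assumptions as the simulation (independent rows, both Bin(n, prob)); the names `exact_rejection_prob`, `prob`, `alpha` and `size_10` are hypothetical.

```r
# Hypothetical helper: exact probability that Fisher's test rejects
# when both rows are independent Bin(n, prob) draws
exact_rejection_prob <- function(n, prob, alpha = 0.05) {
  total <- 0
  for (a in 0:n) {
    for (b in 0:n) {
      ntable <- array(data = c(a, b, n - a, n - b), dim = c(2, 2))
      p_value <- fisher.test(ntable, alternative = "two.sided")$p.value
      if (p_value < alpha) {
        # Add the probability of drawing exactly this pair of counts
        total <- total + dbinom(a, n, prob) * dbinom(b, n, prob)
      }
    }
  }
  total
}

size_10 <- exact_rejection_prob(n = 10, prob = 0.4)
print(size_10)
```

Multiplying the result by 10000 gives the expected number of rejections in the simulation. The call with n = 100 loops over 10201 tables and takes correspondingly longer.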
Check your definitions of n11=3, n21=1, n1plus=10, n2plus=10. This is not correct. Another student defined n1plus, n2plus in a different way. I am not responsible. I have given you the code.