Question

2. Write a simulation in R that shows the distribution of the t-test statistic when the null hypothesis is true. To do this, you should use a for loop that repeatedly performs t-tests comparing sample means of data that come from distributions with the same population mean and standard deviation. Use rnorm() to take samples, t.test() to perform the t-tests, and use “$statistic” to extract the t-test statistic from the t.test() procedure (e.g. t.test(x,y)$statistic). Make a histogram of the test statistics. If you need help, look back at the notes on for loops.

One assumption of the t-test is that the populations you sample from have the same standard deviation. Violating this assumption can affect the distribution of the t-test statistic. This is especially the case when sample sizes are unequal.
a. Re-do the simulation from part 2, but this time sample from normal distributions with the same mean but where one has a standard deviation of 1 and a sample size of 20, and the other has a standard deviation of 5 and a sample size of 100. Plot a histogram of the test statistics. How does this differ from the histogram in part 2?
b. Perform the procedure in part a. above, but this time use the “pooled variance” t-test. To do this, add “var.equal=TRUE” as an argument in the t.test function. Plot a histogram of the test statistics. How does this differ from the histogram in part a. above?

Use RStudio.

Expert Solution

We need to write a simulation in R that shows the distribution of the t-test statistic when the null hypothesis is true.

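Let $x_1, x_2, \ldots, x_{n_1}$ be a random sample of size $n_1$ from a variable $X$ with mean $\mu_1$, and let $y_1, y_2, \ldots, y_{n_2}$ be a random sample of size $n_2$ from a variable $Y$ with mean $\mu_2$.

Then the null hypothesis is

$$H_0 : \mu_1 = \mu_2$$

We reject the null hypothesis if the absolute value of the test statistic is greater than the t-table (critical) value.

The test statistic, in the form that t.test(x,y) computes by default, is

$$TS = \frac{\bar{x} - \bar{y}}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

where $\bar{x}$, $\bar{y}$ are the sample means and $s_1^2$, $s_2^2$ are the sample variances.

The t-table value is $t_{\alpha/2,\,\nu}$, the upper $\alpha/2$ critical value of the t distribution with $\nu$ degrees of freedom.

If $|TS| > t_{\alpha/2,\,\nu}$, we reject the null hypothesis.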

We do not need to do this manually; we will use R to calculate the test statistic values that we need to plot.
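As a sanity check on what t.test() returns, the statistic can be reproduced by hand. The sketch below assumes the default two-sample call, t.test(x, y), which R documents as the Welch (unequal-variance) test. The seed, the sample settings, and the names t_manual and t_builtin are illustrative choices, not part of the original answer.

```r
set.seed(1)                       # arbitrary seed, only for reproducibility
x <- rnorm(50, mean = 5, sd = 2)  # illustrative sample 1
y <- rnorm(55, mean = 5, sd = 2)  # illustrative sample 2

# Welch standard error: each sample contributes its own variance / size
se <- sqrt(var(x) / length(x) + var(y) / length(y))
t_manual <- (mean(x) - mean(y)) / se

# t.test() returns a named value; unname() drops the "t" label
t_builtin <- unname(t.test(x, y)$statistic)

print(c(manual = t_manual, builtin = t_builtin))
print(isTRUE(all.equal(t_manual, t_builtin)))
```

If the two printed values agree, the hand formula and t.test() are computing the same quantity, so only $statistic is needed inside the loop.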

We need to simulate the case where the null hypothesis is true, so the two population means must be the same.

Thus we will take two different samples from normal distributions, of different sizes but with the same population mean and standard deviation.

R - Code and output

TS <- numeric(100)                   # vector to hold the 100 test statistics
for (i in 1:100) {                   # simulate 100 times
  x <- rnorm(50, 5, 2)               # first sample: size 50, mean = 5, sd = 2
  y <- rnorm(55, 5, 2)               # second sample: size 55, mean = 5, sd = 2
  TS[i] <- t.test(x, y)$statistic    # store the test statistic
}
hist(TS, xlab = "Test Statistic", col = 2)   # plot histogram of the test statistics
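With only 100 replications the histogram is fairly rough. A variation on the loop above, sketched here with replicate() and a larger, arbitrarily chosen number of replications (n_sims), makes the shape easier to see and overlays a t density for comparison. The 103 degrees of freedom (n1 + n2 - 2) are an approximation assumed here for the overlay; Welch's test estimates its own degrees of freedom for each pair of samples.

```r
set.seed(42)     # arbitrary seed for reproducibility
n_sims <- 10000  # assumption: many more replications than the original 100

# Each replication draws two samples with the same mean and sd (H0 true)
ts_null <- replicate(n_sims, {
  x <- rnorm(50, mean = 5, sd = 2)
  y <- rnorm(55, mean = 5, sd = 2)
  unname(t.test(x, y)$statistic)
})

# Density-scaled histogram with an approximate reference t curve
hist(ts_null, breaks = 50, freq = FALSE,
     xlab = "Test Statistic", main = "t statistics when H0 is true")
curve(dt(x, df = 103), add = TRUE, lwd = 2)

# Share of statistics beyond the two-sided 5% critical value
reject_rate <- mean(abs(ts_null) > qt(0.975, df = 103))
print(reject_rate)
```

When the null hypothesis is true, the rejection share should land close to 0.05, and the histogram should track the overlaid curve.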

Next, we repeat the simulation, but this time sample from normal distributions with the same mean, where one has a standard deviation of 1 and a sample size of 20 and the other has a standard deviation of 5 and a sample size of 100.

Let both have the same mean, equal to 5.

R - Code and output

TS <- numeric(100)                   # vector to hold the 100 test statistics
for (i in 1:100) {                   # simulate 100 times
  x <- rnorm(20, 5, 1)               # first sample: size 20, mean = 5, sd = 1
  y <- rnorm(100, 5, 5)              # second sample: size 100, mean = 5, sd = 5
  TS[i] <- t.test(x, y)$statistic    # store the test statistic
}
hist(TS, xlab = "Test Statistic", col = 2)   # plot histogram of the test statistics

How does this differ from the histogram in part 2? Very little, in principle. t.test() uses Welch's test by default, which does not assume equal standard deviations, so the statistic still approximately follows a t distribution centered at 0 when the null hypothesis is true, and values beyond about ±2 should appear only about 5% of the time. With just 100 replications, a single histogram may happen to show a few more (or fewer) extreme values than the one in part 2, but that is sampling noise rather than an inflated rate of false rejections.
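One way to check how this case compares with part 2, without relying on a single histogram of 100 values, is to run many more replications and summarize the spread of the statistics. This is a sketch under assumptions: n_sims, the seed, and the helper name sim_t are made up for illustration, and the rejection rate is computed from the p-value that t.test() reports.

```r
set.seed(123)    # arbitrary seed for reproducibility
n_sims <- 5000   # assumption: far more replications than the original 100

# Hypothetical helper: one t statistic and p-value for the given design
sim_t <- function(n1, sd1, n2, sd2, ...) {
  x <- rnorm(n1, mean = 5, sd = sd1)
  y <- rnorm(n2, mean = 5, sd = sd2)
  res <- t.test(x, y, ...)
  c(t = unname(res$statistic), p = res$p.value)
}

equal_sd   <- replicate(n_sims, sim_t(50, 2, 55, 2))    # part 2 design
unequal_sd <- replicate(n_sims, sim_t(20, 1, 100, 5))   # part a design

# Spread of the statistics and share of p-values below 0.05
summary_table <- rbind(
  equal_sd   = c(sd_t = sd(equal_sd["t", ]),
                 reject = mean(equal_sd["p", ] < 0.05)),
  unequal_sd = c(sd_t = sd(unequal_sd["t", ]),
                 reject = mean(unequal_sd["p", ] < 0.05))
)
print(round(summary_table, 3))
```

If both rows show a standard deviation near 1 and a rejection rate near 0.05, the two histograms have effectively the same shape.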

Now we perform the procedure in part a. above, but this time use the “pooled variance” t-test. To do this, we add “var.equal=TRUE” as an argument in the t.test function.

This time the test assumes that the two population variances are equal.

We draw new random samples from the same distributions as before, rnorm(20,5,1) and rnorm(100,5,5).

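R - Code and output

TS <- numeric(100)                   # vector to hold the 100 test statistics
for (i in 1:100) {                   # simulate 100 times
  x <- rnorm(20, 5, 1)               # first sample: size 20, mean = 5, sd = 1
  y <- rnorm(100, 5, 5)              # second sample: size 100, mean = 5, sd = 5
  TS[i] <- t.test(x, y, var.equal = TRUE)$statistic   # t-test with “var.equal=TRUE”
}
hist(TS, xlab = "Test Statistic (with var.equal=TRUE)", col = 3)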

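With “var.equal=TRUE” the test statistics are much smaller in magnitude than in part a: here they range from about -1.5 to 1.5.

As a result, the null hypothesis is not rejected in any of the 100 simulations, since no statistic comes close to the critical value of about ±2.

This does not mean the pooled test performs better. The pooled variance is dominated by the larger sample, which also has the larger standard deviation, so the standard error of the difference in means is overestimated (about 1.13 instead of the true 0.55). The statistic is then too narrow to follow a t distribution, and the test rejects a true null hypothesis far less often than the nominal 5%, i.e. it is too conservative. The histogram in part a, by contrast, has roughly the spread a t distribution should have.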
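To put numbers on the comparison, the sketch below runs the pooled and the default (Welch) test on the same simulated samples. It also tries the reverse design, where the smaller sample has the larger standard deviation. The reverse design is not part of the question; it is added here as an assumption-labelled illustration, as are n_sims, the seed, and the helper names compare_tests and summarise_design.

```r
set.seed(2024)   # arbitrary seed for reproducibility
n_sims <- 5000   # assumption: far more replications than the original 100

# Hypothetical helper: pooled and Welch results on the same pair of samples
compare_tests <- function(n1, sd1, n2, sd2) {
  x <- rnorm(n1, mean = 5, sd = sd1)
  y <- rnorm(n2, mean = 5, sd = sd2)
  pooled <- t.test(x, y, var.equal = TRUE)
  welch  <- t.test(x, y)
  c(t_pooled = unname(pooled$statistic), p_pooled = pooled$p.value,
    t_welch  = unname(welch$statistic),  p_welch  = welch$p.value)
}

# Hypothetical helper: spread of the statistics and 5% rejection rates
summarise_design <- function(sims) {
  c(sd_pooled     = sd(sims["t_pooled", ]),
    sd_welch      = sd(sims["t_welch", ]),
    reject_pooled = mean(sims["p_pooled", ] < 0.05),
    reject_welch  = mean(sims["p_welch", ] < 0.05))
}

part_b   <- replicate(n_sims, compare_tests(20, 1, 100, 5))  # design from the question
reversed <- replicate(n_sims, compare_tests(20, 5, 100, 1))  # assumption: sds swapped

results <- rbind(part_b = summarise_design(part_b),
                 reversed = summarise_design(reversed))
print(round(results, 3))
```

In the design from the question, the pooled statistic should come out much less variable than Welch's and almost never reject. In the reversed design, it should be much more variable and reject a true null hypothesis far more often than 5%, while Welch's rejection rate should stay near 0.05 in both.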

orchestra answered 2 years ago
