Question


These are a few short answer questions I am stumped on.

1. What is the sampling distribution of the difference between means? Why can’t you conduct an independent samples t-test without it?

2. What are the assumptions of a two-sample t-test?

3. Why do we “pool” variance for a two-sample t-test? What are the assumptions that make this possible? How does it benefit us?

5. Why is a confidence interval not a probability statement?

9. What is an effect size? What happens to an effect size when sample size increases/decreases? Why?

10. What is power? How is it related to the α-level, sample size, and effect size?

Solutions

Expert Solution

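1. The sampling distribution of the difference between means is centered at $\mu_1 - \mu_2$, which is the difference between the two population means. The variance of the distribution of the sample differences is equal to $\sigma_1^2/n_1 + \sigma_2^2/n_2$. Therefore, the standard error of the difference between two means is equal to $\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}$. To convert to the standard normal distribution, we use the formula

$$z = \frac{(M_1 - M_2) - (\mu_1 - \mu_2)}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}$$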

The sampling distribution of the difference between means can be thought of as the distribution that would result if we repeated the following three steps over and over again: (1) sample n1 scores from Population 1 and n2 scores from Population 2, (2) compute the means of the two samples (M1 and M2), and (3) compute the difference between means, M1 - M2. The distribution of the differences between means is the sampling distribution of the difference between means.

As you might expect, the mean of the sampling distribution of the difference between means is:

$$\mu_{M_1 - M_2} = \mu_1 - \mu_2$$

which says that the mean of the distribution of differences between sample means is equal to the difference between population means.

For example, say that the mean test score of all 12-year-olds in a population is 34 and the mean of 10-year-olds is 25. If numerous samples were taken from each age group and the mean difference computed each time, the mean of these numerous differences between sample means would be 34 - 25 = 9.
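The repeated-sampling idea can be checked with a small simulation. The sketch below is only illustrative: the population means 34 and 25 come from the example, while the standard deviations (8), the sample sizes (20 per group), the number of repetitions, and the function name `simulate_mean_differences` are assumptions made for the demonstration.

```python
import random
import statistics

def simulate_mean_differences(mu1, mu2, sigma1, sigma2, n1, n2, reps, seed=0):
    """Repeat: draw two samples, compute both sample means, record M1 - M2."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(reps):
        m1 = statistics.fmean([rng.gauss(mu1, sigma1) for _ in range(n1)])
        m2 = statistics.fmean([rng.gauss(mu2, sigma2) for _ in range(n2)])
        diffs.append(m1 - m2)
    return diffs

# Means 34 and 25 are from the example; SDs and sample sizes are assumed.
diffs = simulate_mean_differences(34, 25, sigma1=8, sigma2=8, n1=20, n2=20, reps=5000)

mean_diff = statistics.fmean(diffs)      # should land close to 34 - 25 = 9
var_diff = statistics.variance(diffs)    # should land close to the value below
theoretical_var = 8**2 / 20 + 8**2 / 20  # sigma1^2/n1 + sigma2^2/n2 = 6.4

print(round(mean_diff, 2), round(var_diff, 2), theoretical_var)
```

The simulated mean and variance only approximate the theoretical values; more repetitions bring them closer.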

In order to test whether there is a difference between population means, we are going to make three assumptions:

  1. The two populations have the same variance. This assumption is called the assumption of homogeneity of variance.
  2. The populations are normally distributed.
  3. Each value is sampled independently from each other value. This assumption requires that each subject provide only one value. If a subject provides two scores, then the scores are not independent. The analysis of data with two scores per subject is shown in the section on the correlated t test later in this chapter.

The consequences of violating the first two assumptions are investigated in the simulation in the next section. For now, suffice it to say that small-to-moderate violations of assumptions 1 and 2 do not make much difference. It is important not to violate assumption 3.

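We saw the following general formula for significance testing in the section on testing a single mean:

$$t = \frac{\text{statistic} - \text{hypothesized value}}{\text{estimated standard error of the statistic}}$$

In this case, our statistic is the difference between sample means and our hypothesized value is 0. The hypothesized value is the null hypothesis that the difference between population means is 0. This is why an independent samples t-test cannot be conducted without the sampling distribution of the difference between means: that distribution supplies the standard error in the denominator and the reference distribution against which the observed difference is judged.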

2. The assumptions of the two-sample t-test are:

  1. The data are continuous (not discrete).
  2. The data follow the normal probability distribution.
  3. The variances of the two populations are equal. (If not, the Aspin-Welch Unequal-Variance test is used.)
  4. The two samples are independent. There is no relationship between the individuals in one sample as compared to the other (as there is in the paired t-test).
  5. Both samples are simple random samples from their respective populations. Each individual in the population has an equal probability of being selected in the sample.

3. The pooled t-test or TS-pooled, which is theoretically the correct t-statistic, has fallen into some disfavor because of its ‘claimed’ sensitivity to departures from the assumption of equal population variances (Peck, Olsen, & Devore). We use a simulation study to disprove this claim. The study consists of 240 comparisons of the two test statistics. For the sake of simplicity, we will describe here the basics of one such comparison. We will also introduce a few terms which we will be using throughout the paper.

For a single comparison of the two test statistics, we draw two independent random samples from two simulated populations. The two populations may (or may not) have equal means and/or equal variances. For a particular comparison, if the two populations indeed have equal variances, we designate TS-pooled to be the ‘correct’ test statistic. Similarly, if the two populations have unequal variances, we designate TS-unpooled to be the ‘correct’ test statistic. Once the two independent samples from the two simulated populations are drawn, we perform the test of hypothesis of equality of two means using both TS-pooled and TS-unpooled. Since the tests are conducted on samples from known populations, we record the conclusions of both TS-pooled and TS-unpooled as correct or incorrect.

Furthermore, we label one of the two test statistics as the ‘better’ one if the p-value corresponding to that test statistic is closer to the correct conclusion (unless, obviously, both test statistics have exactly the same p-value). For example, if the two populations from which the two samples were drawn indeed had the same mean, then the test statistic that yielded the bigger p-value is labeled as the ‘better’ one, whereas if the two populations had unequal means, then the test statistic that yielded the smaller p-value is labeled as the ‘better’ one. In addition, we label the test statistic that yielded the p-value which is farther away (when compared to each other) from the correct conclusion to be the ‘underperformer’.

Also, if the two samples were drawn from two populations with the same variance, we refer to it as the “equal-variance setting”, whereas if the two samples were drawn from two populations with unequal variances, we refer to it as the “unequal-variance setting”. Although we reveal, in section 3, the number of times each test statistic arrives at the correct conclusion, it is important to note that as far as the comparisons are concerned, we are strictly interested in finding the ‘better’ one (or the ‘underperformer’) of the two.
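To make the two statistics concrete, here is a minimal sketch of how TS-pooled and TS-unpooled might be computed. It assumes the standard textbook definitions: the pooled statistic combines both sample variances into one estimate with n1 + n2 - 2 degrees of freedom, while the unpooled (Welch) statistic keeps them separate and uses the Welch-Satterthwaite degrees of freedom. The function names and the two small data sets are made up for illustration and are not from the study quoted above.

```python
import math
import statistics

def pooled_t(x, y):
    """Pooled-variance t statistic and its degrees of freedom."""
    n1, n2 = len(x), len(y)
    v1, v2 = statistics.variance(x), statistics.variance(y)
    # Weighted average of the two sample variances (assumes equal population variances).
    sp2 = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    t = (statistics.fmean(x) - statistics.fmean(y)) / se
    return t, n1 + n2 - 2

def unpooled_t(x, y):
    """Unpooled (Welch) t statistic and Welch-Satterthwaite degrees of freedom."""
    n1, n2 = len(x), len(y)
    a = statistics.variance(x) / n1
    b = statistics.variance(y) / n2
    se = math.sqrt(a + b)
    df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    t = (statistics.fmean(x) - statistics.fmean(y)) / se
    return t, df

# Hypothetical data, equal sample sizes.
x = [5.1, 4.9, 6.2, 5.8, 6.0, 5.5]
y = [4.2, 4.8, 5.1, 3.9, 4.5, 4.4]

t_pooled, df_pooled = pooled_t(x, y)
t_unpooled, df_unpooled = unpooled_t(x, y)
print(t_pooled, df_pooled)
print(t_unpooled, df_unpooled)
```

With equal sample sizes the two t values coincide and only the degrees of freedom differ; the pooled version never has fewer degrees of freedom, which is the benefit pooling buys when the equal-variance assumption holds.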

5. I have seen posts argue along the lines of "the actually computed CI either contains the population mean or it doesn't, so its probability is either 1 or 0", but this seems to imply a strange definition of probability that depends on unknown states (e.g., a friend flips a fair coin, hides the result, and I am not allowed to say there is a 50% chance that it is heads).
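One way to see what the confidence level does describe, under the usual frequentist reading, is to simulate the procedure many times and count how often the interval captures the true mean. The sketch below is an illustration under stated assumptions: a normal population with known standard deviation (so a z-interval applies), and made-up values for the mean, standard deviation, sample size, and number of repetitions. The function name `coverage` is hypothetical.

```python
import math
import random
import statistics
from statistics import NormalDist

def coverage(mu, sigma, n, reps, level=0.95, seed=1):
    """Fraction of simulated z-intervals that contain the true mean mu."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for a 95% interval
    half_width = z * sigma / math.sqrt(n)
    hits = 0
    for _ in range(reps):
        m = statistics.fmean([rng.gauss(mu, sigma) for _ in range(n)])
        # Each individual interval either contains mu or it does not.
        if m - half_width <= mu <= m + half_width:
            hits += 1
    return hits / reps

# Assumed population and design values, chosen only for the demonstration.
rate = coverage(mu=50, sigma=10, n=25, reps=4000)
print(rate)
```

The long-run fraction should sit near the nominal level, which is a statement about the interval-building procedure rather than about any single computed interval.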

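9. An effect size is a measure of the magnitude of a difference, and so of its practical importance: large effect sizes mean the difference is substantial; small effect sizes mean the difference is minor. It normalizes the average raw gain in a population by the standard deviation in individuals’ raw scores, giving you a measure of how substantially the pre- and post-test scores differ. Because the gain is scaled by the standard deviation rather than by the standard error, the effect size does not systematically grow or shrink as sample size increases or decreases; a larger sample only estimates it more precisely.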
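As a rough illustration of the calculation described above, the sketch below divides the mean gain by a standard deviation of the raw scores. Which standard deviation to use is a choice the text does not specify; here the root mean square of the pre and post sample standard deviations is assumed. The scores and the function name `effect_size` are hypothetical.

```python
import math
import statistics

def effect_size(pre, post):
    """Standardized gain: (mean post - mean pre) / pooled SD of raw scores."""
    gain = statistics.fmean(post) - statistics.fmean(pre)
    # Assumed choice of denominator: average the two sample variances.
    pooled_sd = math.sqrt((statistics.variance(pre) + statistics.variance(post)) / 2)
    return gain / pooled_sd

# Made-up pre-test and post-test scores.
pre = [10, 12, 14, 16, 18]
post = [13, 15, 17, 19, 21]

d = effect_size(pre, post)
print(round(d, 3))
```

Note that the sample size appears nowhere in the ratio except through the means and variances themselves, which is why the effect size does not shrink or grow just because more subjects are added.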

10. Power is the probability that the null hypothesis will be correctly rejected, that is, rejected when it is in fact false.

When conducting an a priori power analysis, a researcher typically needs to know three parameters to calculate an appropriate sample size: the alpha level, the power, and the effect size. The alpha level is the significance threshold at which you decide to reject the null hypothesis. An alpha level of .05 is typically used when the statistical analysis is conducted in the social sciences. Power is the probability that the null hypothesis will be correctly rejected, and according to Howell (2010), a generally accepted power is .80. Holding the other quantities fixed, power increases as the alpha level, the sample size, or the effect size increases.

Regarding effect size, it is often acceptable to use a medium effect in the sample size calculation. However, an effect size based on what has been found in previous studies will give a more accurate calculation.
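The relationship among alpha, power, effect size, and sample size can be sketched with the common normal-approximation formula for a two-sided, two-sample comparison, n per group = 2((z_{1-α/2} + z_{power}) / d)^2. This is an approximation (a t-based calculation gives a slightly larger n), and treating a "medium" effect as d = 0.5 and a "small" effect as d = 0.2 follows Cohen's convention, which is an assumption not stated in the answer above. The function name `n_per_group` is hypothetical.

```python
import math
from statistics import NormalDist

def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate per-group sample size for a two-sided two-sample test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the alpha level
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)

n_medium = n_per_group(0.5)               # assumed "medium" effect, alpha .05, power .80
n_small = n_per_group(0.2)                # smaller effect needs many more subjects
n_lenient = n_per_group(0.5, alpha=0.10)  # larger alpha needs fewer subjects

print(n_medium, n_small, n_lenient)
```

Reading the formula the other way gives the qualitative picture: for a fixed sample size, power rises with a larger effect size or a larger alpha level, and for a fixed effect size and alpha, power rises with sample size.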

