Find a research article that uses either: t-test, ANOVA, chi-square, or correlation. What was their hypothesis? What were their conclusions? What were the limitations of the analysis/analyses that they used?
If you were to conduct a similar experiment, what would your hypothesis be? How would you go about collecting data (describe the nature of your sample: where would you access participants, how many people would you sample, etc.)? What analysis/analyses would you use to test your hypothesis? Why is this the correct analysis to use?
I am not sure what kind of answer you are expecting, but I can explain what the t-test, ANOVA, chi-square test, and correlation are.
t-test :
The t-test tells you how significant the differences between group means are; in other words, it lets you know whether those differences (measured in means/averages) could have happened by chance.
Let’s say you have a cold and you try a homeopathic remedy. Your cold lasts a couple of days. The next time you have a cold, you buy an over-the-counter pharmaceutical and the cold lasts a week. You survey your friends, and they all tell you that their colds were of a shorter duration (an average of 3 days) when they took the homeopathic remedy. What you really want to know is: are these results repeatable? A t-test can tell you by comparing the means of the two groups and letting you know the probability of those results happening by chance.
Student’s T-tests can be used in real life to compare means. For example, a drug company may want to test a new cancer drug to find out if it improves life expectancy. In an experiment, there’s always a control group (a group who are given a placebo, or “sugar pill”). The control group may show an average life expectancy of +5 years, while the group taking the new drug might have a life expectancy of +6 years. It would seem that the drug might work. But it could be due to a fluke. To test this, researchers would use a Student’s t-test to find out if the results are repeatable for an entire population.
There are three main types of t-test: the one-sample t-test, the independent (two-sample) t-test, and the paired t-test.
The t-score is a ratio of the difference between the two group means to the variability within the groups. The larger the t-score, the more difference there is between the groups; the smaller the t-score, the more similar the groups are. A t-score of 3 means that the difference between the groups is three times as large as the variability within them. When you run a t-test, the bigger the t-value, the more likely it is that the results are repeatable.
How big is “big enough”? Every t-value has a p-value to go with it. A p-value is the probability that the results from your sample data occurred by chance. P-values range from 0% to 100% and are usually written as a decimal; for example, a p-value of 5% is 0.05. Low p-values are good; they indicate your data are unlikely to have occurred by chance alone. For example, a p-value of 0.01 means there is only a 1% probability that the results from an experiment happened by chance. In most cases, a p-value of 0.05 (5%) or smaller is taken as the threshold for statistical significance.
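To make this concrete, here is a minimal sketch of an independent two-sample t-test in Python using scipy.stats; the cold-duration numbers below are invented purely for illustration.

```python
# Minimal sketch of an independent two-sample t-test (hypothetical data).
from scipy import stats

# Days each cold lasted under the two remedies (invented numbers)
homeopathic = [3, 2, 4, 3, 3, 2, 4]
pharmaceutical = [7, 6, 8, 5, 7, 6, 7]

# Welch's t-test: does not assume the two groups have equal variances
t_stat, p_value = stats.ttest_ind(homeopathic, pharmaceutical, equal_var=False)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")

# If p < 0.05, we would reject the null hypothesis that the two group
# means (average cold durations) are equal.
```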
Limitations :
The t-test assumes the observations are independent and roughly normally distributed, it is sensitive to outliers, and it can only compare two groups at a time.
ANOVA test :
An ANOVA test is a way to find out whether survey or experiment results are significant. In other words, it helps you figure out whether you need to reject the null hypothesis or accept the alternative hypothesis. Basically, you’re testing groups to see if there’s a difference between them; for example, you might want to compare the exam scores of students from three different colleges.
There are two main types: one-way and two-way. Two-way tests can be with or without replication.
What Does “Replication” Mean?
- It’s whether you are replicating your test(s) with multiple groups. With a two-way ANOVA with replication, you have two groups and individuals within those groups are doing more than one thing (e.g., two groups of students from two colleges taking two tests). If you only have one group taking two tests, you would use a two-way ANOVA without replication.
Limitations of the One Way ANOVA :
A one-way ANOVA will tell you that at least two groups were different from each other, but it won’t tell you which groups were different. To find that out, you need to follow it up with a post hoc test (such as Tukey’s HSD).
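As an illustration, a one-way ANOVA comparing exam scores from three hypothetical colleges might look like this in Python with scipy.stats (the scores are made up):

```python
# Minimal sketch of a one-way ANOVA across three groups (hypothetical data).
from scipy import stats

college_a = [78, 85, 82, 88, 75]
college_b = [80, 83, 79, 91, 86]
college_c = [90, 94, 89, 93, 97]

f_stat, p_value = stats.f_oneway(college_a, college_b, college_c)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")

# A significant p-value only says that at least two group means differ;
# a post hoc test (e.g. Tukey's HSD) is needed to find out which ones.
```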
The results from a two-way ANOVA include a main effect and an interaction effect. The main effect is similar to a one-way ANOVA: each factor’s effect is considered separately. With the interaction effect, all factors are considered at the same time. Interaction effects between factors are easier to test if there is more than one observation in each cell. Suppose, for example, you are studying how income level and gender affect stress scores; in that case, multiple stress scores could be entered into each cell. If you do enter multiple observations into cells, the number in each cell must be equal.
Two null hypotheses are tested if you are placing one observation in each cell. For this example, those hypotheses would be:
H01: All the income groups have equal mean stress.
H02: All the gender groups have equal mean stress.
For multiple observations in cells, you would also be testing a third hypothesis:
H03: The factors are independent, or the interaction effect does not exist.
An F-statistic is computed for each hypothesis you are testing.
To determine whether any of the differences between the means are statistically significant, compare the p-value to your significance level to assess the null hypothesis. The null hypothesis states that the population means are all equal. Usually, a significance level (denoted as α or alpha) of 0.05 works well. A significance level of 0.05 indicates a 5% risk of concluding that a difference exists when there is no actual difference.
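As a rough sketch, the income-by-gender stress example could be analysed in Python with pandas and statsmodels (the stress scores below are invented, and other tools would work just as well):

```python
# Rough sketch of a two-way ANOVA with replication (2 observations per cell).
# The income/gender/stress data are invented for illustration.
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

data = pd.DataFrame({
    "income": ["low", "low", "mid", "mid", "high", "high"] * 2,
    "gender": ["f"] * 6 + ["m"] * 6,
    "stress": [8, 7, 6, 6, 4, 5, 9, 8, 7, 6, 5, 4],
})

# 'C(income) * C(gender)' expands to both main effects plus their interaction
model = smf.ols("stress ~ C(income) * C(gender)", data=data).fit()
print(anova_lm(model, typ=2))
```

The table printed by anova_lm has one row, with an F-statistic and p-value, for each of the three hypotheses above (the income main effect, the gender main effect, and their interaction).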
The limitations of the ANOVA model are similar to those of the t-test.
Chi-Square test :
The Chi-Square statistic is most commonly used to evaluate Tests of Independence when using a crosstabulation (also known as a bivariate table). Crosstabulation presents the distributions of two categorical variables simultaneously, with the intersections of the categories of the variables appearing in the cells of the table. The Test of Independence assesses whether an association exists between the two variables by comparing the observed pattern of responses in the cells to the pattern that would be expected if the variables were truly independent of each other. Calculating the Chi-Square statistic and comparing it against a critical value from the Chi-Square distribution allows the researcher to assess whether the observed cell counts are significantly different from the expected cell counts.
The calculation of the Chi-Square statistic is quite straightforward and intuitive:
χ² = Σ (fo − fe)² / fe
where fo = the observed frequency (the observed counts in the cells) and fe = the expected frequency if NO relationship existed between the variables.
Null hypothesis: Assumes that there is no association between the two variables.
Alternative hypothesis: Assumes that there is an association between the two variables.
As depicted in the formula, the Chi-Square statistic is based on the difference between what is actually observed in the data and what would be expected if there was truly no relationship between the variables.
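For example, a chi-square test of independence on a small hypothetical 2x2 crosstabulation could be run in Python with scipy.stats like this:

```python
# Sketch of a chi-square test of independence on a hypothetical 2x2 table.
import numpy as np
from scipy.stats import chi2_contingency

# Rows: remedy A vs remedy B; columns: recovered within 3 days (yes / no)
observed = np.array([[30, 10],
                     [20, 25]])

chi2, p_value, dof, expected = chi2_contingency(observed)
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}, dof = {dof}")
print("expected counts under independence:")
print(expected)

# A small p-value suggests the observed counts differ from the counts
# expected if the two variables were independent.
```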
Limitation :
Chi-square is highly sensitive to sample size. As sample size increases, absolute differences become a smaller and smaller proportion of the expected value. What this means is that a reasonably strong association may not come up as significant if the sample size is small; conversely, in large samples we may find statistical significance when the findings are small and uninteresting, i.e., the findings are statistically significant but not substantively significant.
Correlation :
Correlation is a statistical technique that can show whether and how strongly pairs of variables are related. Like all statistical techniques, correlation is only appropriate for certain kinds of data. Correlation works for quantifiable data in which numbers are meaningful, usually quantities of some sort. It cannot be used for purely categorical data, such as gender, brands purchased, or favorite color.
The main result of a correlation is called the correlation coefficient (or "r"). It ranges from -1.0 to +1.0. The closer r is to +1 or -1, the more closely the two variables are related. If r is close to 0, it means there is little or no linear relationship between the variables. If r is positive, it means that as one variable gets larger the other gets larger. If r is negative, it means that as one gets larger the other gets smaller (often called an "inverse" correlation).
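A minimal sketch of computing r in Python with scipy.stats, using made-up height and weight pairs:

```python
# Sketch of a Pearson correlation on made-up height/weight pairs.
from scipy.stats import pearsonr

heights = [150, 160, 165, 170, 175, 180, 185]
weights = [52, 58, 63, 68, 72, 80, 85]

r, p_value = pearsonr(heights, weights)
print(f"r = {r:.2f}, p = {p_value:.4f}")

# r close to +1 indicates a strong positive linear relationship.
```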
Limitation :
Correlation only captures linear relationships, it is sensitive to outliers, and, most importantly, correlation does not imply causation.
So to sum it all up: the t-test compares the mean(s) of one or two samples; the ANOVA test is used to check whether a group of means (possibly more than two) are equal, and a two-way ANOVA also tests whether the interaction effects are statistically significant; the chi-square test is used for testing independence between categorical variables; and correlation measures how strongly two quantitative variables vary together.
Hope this helps.