In: Statistics and Probability
Chi Square? Outline the basic ideas behind the chi-square test of independence. What null and alternative hypotheses are used? How does a chi-square test of homogeneity differ from a chi-square test of independence? Describe a Goodness of Fit Test. What are the cells? How many degrees of freedom are there?
Chi square test-
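A chi-square (χ2) test compares the counts actually observed with the counts expected under a hypothesis (or model). The test statistic is the sum, over all cells, of (observed - expected)^2 / expected, so a larger value means the data depart further from what was expected. The data must be raw counts (not percentages or averages) from a random sample, the categories must be mutually exclusive, the observations must be independent of one another, and the sample must be large enough.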
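As a minimal sketch of how the statistic is built up, the snippet below sums (observed - expected)^2 / expected over the cells. The function name chi_square_statistic and the counts are made up for illustration.

```python
def chi_square_statistic(observed, expected):
    """Return the sum of (O - E)^2 / E over all cells."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts for three categories (illustration only).
observed = [18, 22, 20]
expected = [20, 20, 20]

# (18-20)^2/20 + (22-20)^2/20 + (20-20)^2/20 = 0.2 + 0.2 + 0 = 0.4
stat = chi_square_statistic(observed, expected)
```

A statistic near zero, as here, means the observed counts sit close to the expected ones.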
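Chi square test of independence-
The test of independence asks whether two categorical variables, both recorded on the same sample, are associated. The observed counts are arranged in a two-way (contingency) table and compared with the counts expected if the variables were independent. It requires: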
1. Two categorical variables.
2. Two or more categories (groups) for each variable.
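3. Independence of observations. Each subject is counted in exactly one cell, and the subjects are unrelated to one another.
4. Relatively large sample size. A common rule of thumb is that expected frequencies should be at least 5 in at least 80% of the cells and never below 1.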
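To illustrate the idea, here is a rough sketch that computes the expected counts under independence (row total x column total / grand total) and then the chi-square statistic for a two-way table. The function names and the 2 x 2 table are hypothetical.

```python
def expected_counts(table):
    """Expected count per cell if rows and columns are independent."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square_independence(table):
    """Return (statistic, degrees of freedom) for a two-way table."""
    expected = expected_counts(table)
    stat = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


# Hypothetical 2 x 2 table of observed counts (illustration only).
table = [[30, 10],
         [20, 40]]

expected = expected_counts(table)          # [[20.0, 20.0], [30.0, 30.0]]
stat, df = chi_square_independence(table)  # stat = 50/3 (about 16.67), df = 1
```

Every expected count here is well above 5, so the sample-size condition in item 4 is met for this made-up table.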
Use of Null hypothesis-
The null hypothesis (H0) states that there is no association between the two variables, that is, that they are independent. It is the hypothesis a researcher or experimenter tries to reject. For the test of homogeneity, H0 states that the category proportions are the same in every population; for a goodness of fit test, H0 states that the data follow the specified distribution.
Use of Alternative Hypothesis-
The alternative hypothesis (Ha) is the hypothesis contrary to the null hypothesis. In the test of independence it states that the two variables are associated (not independent). It is usually taken to mean that the observations are the result of a real effect, with some amount of chance variation superimposed.
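How the chi square test of homogeneity differs from the chi square test of independence-
The chi-square test of independence takes one sample from a single population, classifies each subject on two categorical variables, and tests whether those variables are associated. The chi-square test of homogeneity takes separate samples from two or more populations and tests whether one categorical variable is distributed in the same proportions in each of them. The expected counts, the test statistic and the degrees of freedom are calculated the same way in both tests; the difference lies in how the data are sampled and how the hypotheses are worded.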
Goodness of fit test-
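The goodness of fit test is a statistical hypothesis test of how well sample data for one categorical variable fit a hypothesized distribution. The distribution can be any specified set of category proportions (for example equal proportions, or known population percentages); it does not have to be normal. Put differently, this test shows whether your sample data match what you would expect to find in the actual population or whether they differ from it by more than chance would explain.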
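As a small worked example, suppose a die is rolled 60 times and the hypothesis is that it is fair, so each face is expected 10 times. The counts below are invented for illustration, and the variable names are only placeholders.

```python
# Hypothetical counts for faces 1 to 6 over 60 rolls (illustration only).
observed = [8, 12, 9, 11, 14, 6]

n = sum(observed)                       # 60 rolls
k = len(observed)                       # 6 categories (cells)
hypothesized = [1 / 6] * k              # fair die: equal proportions
expected = [n * p for p in hypothesized]

# (4 + 4 + 1 + 1 + 16 + 16) / 10 = 4.2
stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Number of categories minus 1, since no parameters were estimated.
df = k - 1                              # 5
```

A statistic of 4.2 with 5 degrees of freedom is below the 0.05 critical value of 11.070, so these made-up data would not lead to rejecting the hypothesis of a fair die.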
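Cells-
The cells are the individual categories into which the counts fall. In a goodness of fit test there is one cell per category of the variable. In a test of independence or homogeneity the data form a table, and each cell is one combination of a row category and a column category. Every cell holds an observed count and has a matching expected count, and the chi-square statistic adds up (observed - expected)^2 / expected across all the cells to measure how well the observed distribution fits the distribution expected under the null hypothesis.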
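Degrees of freedom-
For the chi-square test of independence and the test of homogeneity, the degrees of freedom are calculated using the following formula: df = (r-1)(c-1), where r is the number of rows and c is the number of columns. For a goodness of fit test, df = k-1, where k is the number of cells (categories), less one more for each parameter estimated from the data. If the observed chi-square test statistic is greater than the critical value for that df at the chosen significance level, the null hypothesis can be rejected.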
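The decision rule can be sketched as follows. The critical values are the usual upper-tail chi-square table values at the 0.05 level, rounded to three decimals; the function names and the statistic of 7.2 are hypothetical.

```python
# Chi-square critical values at alpha = 0.05, taken from a standard table
# and rounded to three decimals (only df 1 to 5 are listed here).
CRITICAL_VALUES_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}


def degrees_of_freedom(rows, cols):
    """df = (r - 1)(c - 1) for a two-way table."""
    return (rows - 1) * (cols - 1)


def reject_null(statistic, df):
    """Reject H0 when the statistic exceeds the 0.05 critical value."""
    return statistic > CRITICAL_VALUES_05[df]


df = degrees_of_freedom(3, 2)     # 3 rows, 2 columns -> df = 2
decision = reject_null(7.2, df)   # 7.2 > 5.991 -> True, reject H0
```

For tables with more degrees of freedom or a different significance level, the critical value would have to come from a fuller table or from statistical software.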