Question

  1. Describe how to obtain a p-value for a chi-squared test for goodness of fit. Then describe how to obtain a p-value for a chi-squared test for independence. Make sure to point out the differences from your answer to the question above.

Please use simple terms! I have no idea what's going on.

Solutions

Expert Solution

In a chi-squared test for goodness of fit, the observed data are compared with a hypothesized distribution that they are expected to follow. Generally, we are given a frequency distribution of the values, and the expected frequencies are computed by assuming that the data follow the hypothesized distribution. This distribution may be a standard one with specified parameters (e.g. a Normal distribution with a given mean and variance, a Poisson distribution with a given rate, or a Uniform distribution), or it may be an explicitly defined discrete distribution.

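To find the p-value, we first need the test statistic. This is obtained as

$$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$

where O_i is the observed frequency in category i, E_i is the expected frequency in category i, and k is the number of categories.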

To find the p-value, we also need the degrees of freedom of the statistic. When the hypothesized distribution is fully specified, this is one less than the number of categories.

The p-value is finally obtained from a Chi-square calculator, as the upper-tail (right-tail) probability for this statistic value and these degrees of freedom.

This might not be easy to follow from the theory alone, so here is an illustration of this type of Chi-square problem.

We need to conduct the Chi-square Goodness of Fit test, based on a given discrete probability distribution.

Null Hypothesis, H0: New machines follow the same probability distribution to perform the job as the old machines

Alternate hypothesis, Ha: New machines do not follow the same probability distribution to perform the job as the old machines

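We can compute the expected frequencies as in the table below, by multiplying each probability by the total frequency (400). The Chi-square contribution of each category has been computed using

$$\frac{(O_i - E_i)^2}{E_i}$$

where O_i is the observed frequency and E_i is the expected frequency of category i.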

Type          Old Machine Probability  Observed Frequency  Expected Frequency  Chi-Square Statistic
Top Grade     0.4                      174                 160                 1.225
High Grade    0.3                      115                 120                 0.208333333
Medium Grade  0.2                      71                  80                  1.0125
Low Grade     0.1                      40                  40                  0
SUM           1                        400                 400                 2.445833333

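Degrees of freedom: The test has 3 degrees of freedom, i.e. one less than the number of categories (4 - 1 = 3). Hence, for a statistic of 2.4458, the p-value is approximately 0.485. This is well above the usual significance levels (0.05, 0.01), so we do not reject H0.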

The calculator used is

https://www.danielsoper.com/statcalc/calculator.aspx?id=11
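The goodness-of-fit steps above can also be checked in a few lines of Python. This is only a minimal sketch using the standard library: the variable names and the helper `chi2_sf_3df` are illustrative assumptions, and the helper uses a closed-form upper-tail formula that is valid only for exactly 3 degrees of freedom.

```python
import math

# Old-machine probabilities and new-machine observed counts from the table above.
probabilities = [0.4, 0.3, 0.2, 0.1]
observed = [174, 115, 71, 40]

# Expected frequency = hypothesized probability * total number of observations.
total = sum(observed)
expected = [p * total for p in probabilities]

# Chi-square statistic: sum of (O - E)^2 / E over all categories.
statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Degrees of freedom: one less than the number of categories.
df = len(observed) - 1


def chi2_sf_3df(x):
    """Upper-tail probability P(X > x) for a chi-square variable with exactly 3 df."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


p_value = chi2_sf_3df(statistic)
print(f"statistic = {statistic:.4f}, df = {df}, p-value = {p_value:.4f}")
```

If SciPy is installed, `scipy.stats.chisquare(observed, f_exp=expected)` should report the same statistic and p-value without the hand-written tail formula.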

In a chi-squared test for independence, the data are given as a table of counts classified by two categorical variables, one along the rows and the other along the columns, each of which can have several levels. The objective is to determine whether there is an association between the two variables. That is, does the distribution of counts across the levels of one variable depend on the level of the other, or is it the same at every level?

The formula for the Chi-square statistic looks the same, but the computation differs: the data are in the form of a two-way table instead of a single frequency distribution, and the sum runs over every cell of the table.

The computation of expected values is also a bit different. First, find the sum of the given values along each row and each column. To find the expected value for the (i, j) cell, multiply the row sum and the column sum for that cell, and divide by the total of all observations. Symbolically,

$$E_{ij} = \frac{R_i \times C_j}{N}$$

where R_i is the sum of row i, C_j is the sum of column j, and N is the total number of observations.

To find the p-value, we need the degrees of freedom of the statistic. This is the product of one less than the number of rows and one less than the number of columns, i.e. (r - 1) × (c - 1).

The p-value is finally obtained from a Chi-square calculator, as the upper-tail (right-tail) probability for this statistic value and these degrees of freedom.

This might not be easy to follow from the theory alone, so here is an illustration of this type of Chi-square problem.

(1) We need to perform the Chi-square test for independence between the verdict and the season group. That is, we test whether the distribution of verdicts differs from one season group to another. If there is an association, we expect a significant result.

(2) The hypothesis statements are

H0, Null Hypothesis: The distribution (proportion) of the different types of verdicts among Guilty, Not Guilty, Plea Bargain and Other does not change across the different 5-season groups of Suits. That is, the two variables are independent of each other.

H1, Alternate hypothesis: The distribution (proportion) of at least one of the verdicts among Guilty, Not Guilty, Plea Bargain and Other changes across the different 5-season groups of Suits. That is, the two variables depend on each other.

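Such problems are best done in Excel, since the Chi-square test involves repeated cross-referencing across tables. The process is to first find the expected value for each cell. This is computed as

$$E_{ij} = \frac{R_i \times C_j}{N}$$

where R_i is the total of row i, C_j is the total of column j, and N = 336 is the grand total.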

Hence, the table of expected values can be tabulated as below

Observed Values
              Guilty       Not Guilty   Plea Bargain  Other        Total
Season 1-5    31           4            26            20           81
Season 6-10   28           7            33            20           88
Season 11-15  33           8            30            19           90
Season 16-20  26           4            24            23           77
Total         118          23           113           82           336

Expected Values
              Guilty       Not Guilty   Plea Bargain  Other        Total
Season 1-5    28.44642857  5.544642857  27.24107143   19.76785714  81
Season 6-10   30.9047619   6.023809524  29.5952381    21.47619048  88
Season 11-15  31.60714286  6.160714286  30.26785714   21.96428571  90
Season 16-20  27.04166667  5.270833333  25.89583333   18.79166667  77
Total         118          23           113           82           336

For example, the expected value for Season 1-5, Guilty is obtained as the product of 81 and 118 (its row sum and column sum), divided by 336, the total sum: 81 × 118 / 336 ≈ 28.45.
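The expected-value step can be reproduced outside a spreadsheet as well. The sketch below is an illustration in plain Python, with variable names chosen here rather than taken from the original workbook; it rebuilds the table of expected values from the observed counts.

```python
# Observed verdict counts from the table above. Rows are season groups;
# columns are Guilty, Not Guilty, Plea Bargain, Other.
observed = [
    [31, 4, 26, 20],   # Season 1-5
    [28, 7, 33, 20],   # Season 6-10
    [33, 8, 30, 19],   # Season 11-15
    [26, 4, 24, 23],   # Season 16-20
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count for each cell = row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

for row in expected:
    print([round(e, 2) for e in row])
```

Each row and column of the expected table adds up to the same totals as the observed table, which is a quick way to check the arithmetic.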

The next step is to compute the Chi-square statistic. This is the sum, over all cells in the table, of the squared deviation between the observed and expected values divided by the expected value.

Using the tables above, the contribution of each cell and the overall total are

Chi-Square Statistic
              Guilty       Not Guilty   Plea Bargain  Other
Season 1-5    0.22922832   0.430311134  0.056541766   0.002726158
Season 6-10   0.273020765  0.158196876  0.391698272   0.101467638
Season 11-15  0.061380145  0.549120083  0.002370417   0.400058072
Season 16-20  0.040125835  0.306406456  0.138793913   0.94244272
Total (sum of all cells): 4.083888569
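The cell-by-cell contributions and their total can be verified with a short script. This is a sketch under the same assumptions as the tables above; the names `contributions` and `statistic` are illustrative.

```python
# Observed verdict counts. Rows are season groups;
# columns are Guilty, Not Guilty, Plea Bargain, Other.
observed = [
    [31, 4, 26, 20],
    [28, 7, 33, 20],
    [33, 8, 30, 19],
    [26, 4, 24, 23],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Contribution of each cell: (observed - expected)^2 / expected.
contributions = []
for obs_row, row_total in zip(observed, row_totals):
    contrib_row = []
    for o, col_total in zip(obs_row, col_totals):
        e = row_total * col_total / grand_total
        contrib_row.append((o - e) ** 2 / e)
    contributions.append(contrib_row)

# The test statistic is the sum of all sixteen cell contributions.
statistic = sum(sum(row) for row in contributions)
print(f"chi-square statistic = {statistic:.4f}")
```

Looking at the individual contributions shows which cells differ most from what independence would predict; here no single cell contributes as much as 1.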

To reach a conclusion, we either need to find the p-value of this observed statistic, or find a critical value from the Chi-square table for the given degrees of freedom and level of significance. Since no level of significance is specified, let us find the p-value.

The degrees of freedom of the test are

$$(r - 1)(c - 1) = (4 - 1)(4 - 1) = 9$$

since the table has 4 rows (season groups) and 4 columns (verdicts).

Hence, the p-value is obtained from the Chi-square probability calculator for 9 degrees of freedom and a statistic of 4.0839. It is approximately 0.906.
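For readers without access to an online calculator, the upper-tail probability can be computed directly. The helper `chi2_sf` below is a hand-written sketch, not a library function: it assumes whole-number degrees of freedom and uses the standard finite-sum expressions for the chi-square tail. The statistic value is copied from the table above.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
        term = 1.0
        total = 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + exp(-x/2) * sum_{k=0}^{(df-3)/2} (x/2)^(k + 1/2) / Gamma(k + 3/2)
    term = math.sqrt(half) / math.gamma(1.5)
    total = 0.0
    for k in range((df - 1) // 2):
        total += term
        term *= half / (k + 1.5)
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total


rows, cols = 4, 4                 # season groups x verdict types
df = (rows - 1) * (cols - 1)      # degrees of freedom for the independence test
statistic = 4.083888569           # total of the Chi-Square Statistic table

p_value = chi2_sf(statistic, df)
print(f"df = {df}, p-value = {p_value:.4f}")
```

If SciPy is available, `scipy.stats.chi2.sf(statistic, df)` should return the same tail probability, and `scipy.stats.chi2_contingency` can run the whole test from the observed table.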

Conclusion: The p-value is very large compared with the usual significance levels (a result is generally considered significant when the p-value is below 0.05 or 0.01). Thus, we do not have sufficient evidence that the two variables are associated. Hence we fail to reject the null hypothesis: the data are consistent with the distribution (proportion) of the different types of verdicts among Guilty, Not Guilty, Plea Bargain and Other being the same across the different 5-season groups of Suits.

Excel Link: https://drive.google.com/file/d/1doePWbdFk51yp1HQBs-KrPbWFbWUQtuk/view?usp=sharing

