When evaluating a chi-square test, describe the importance of the goodness of fit test. Provide an example and explain how the test is used to evaluate the data.
Solution:
Uses of the chi-square distribution:
It is used
1) To test goodness of fit.
2) To test the independence of attributes.
3) To test the validity of a hypothetical ratio.
4) To test the homogeneity of several population variances.
5) To test the equality of several population correlation coefficients.
Note: The points of inflexion of the chi-square distribution curve are equidistant from its mode.
When to Use the Chi-Square Goodness of Fit Test
The chi-square goodness of fit test is appropriate when the following conditions are met:
1) The sampling method is simple random sampling.
2) The variable under study is categorical.
3) The expected value of the number of sample observations in each level of the variable is at least 5.
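The third condition can be checked before running the test. The following is a minimal sketch only: the function names and the sample sizes in the illustration are hypothetical and are not taken from the problem.

```python
def expected_counts(n_total, probabilities):
    """Expected frequency of each category under the hypothesized distribution."""
    return [n_total * p for p in probabilities]


def meets_minimum_expected(n_total, probabilities, minimum=5):
    """Return True if every expected frequency is at least `minimum`."""
    return all(e >= minimum for e in expected_counts(n_total, probabilities))


# Hypothetical illustration: four equally likely categories.
print(meets_minimum_expected(40, [0.25] * 4))  # expected count is 10 per category
print(meets_minimum_expected(12, [0.25] * 4))  # expected count is 3 per category
```

When the check fails, a common remedy is to pool adjacent categories until every expected frequency reaches the minimum.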
The test procedure consists of four steps: (1) state the hypotheses, (2) formulate an analysis plan, (3) analyze sample data, and (4) interpret results.
Categorical: Categorical variables take on values that are names or labels. The color of a ball (e.g., red, green, blue) or the breed of a dog (e.g., collie, shepherd, terrier) would be examples of categorical variables.
Procedure for Chi-Square Goodness of Fit Test:
A. Null hypothesis: In the chi-square goodness of fit test, the null hypothesis states that there is no significant difference between the observed and the expected frequencies.
B. Alternative hypothesis: In the chi-square goodness of fit test, the alternative hypothesis states that there is a significant difference between the observed and the expected frequencies.
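The two hypotheses are compared through the usual chi-square statistic. The sketch below writes it out in the notation used in the worked solution ($O_i$ observed, $e_i$ expected, $k$ classes, $p$ estimated parameters); the symbol $N$ for the total number of observations is an assumption introduced here for convenience.

```latex
\chi^2 = \sum_{i=1}^{k} \frac{(O_i - e_i)^2}{e_i}
       = \sum_{i=1}^{k} \frac{O_i^2}{e_i} - 2\sum_{i=1}^{k} O_i + \sum_{i=1}^{k} e_i
       = \sum_{i=1}^{k} \frac{O_i^2}{e_i} - N,
\qquad \text{since } \sum_{i=1}^{k} O_i = \sum_{i=1}^{k} e_i = N .
```

Under the null hypothesis this statistic approximately follows a chi-square distribution with $k - p - 1$ degrees of freedom, and large values are evidence against the null hypothesis.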
Example: A nationalized bank uses four teller windows to serve its customers quickly. On a particular day, 800 customers were observed, and they were served at the different windows as follows:
Window number | Number of customers |
1 | 150 |
2 | 250 |
3 | 170 |
4 | 230 |
Test whether the customers are uniformly distributed over the windows.
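As a rough sketch, the statistic for the teller-window counts can be computed directly from the table above. The function name is hypothetical, and the critical value 7.815 (3 degrees of freedom, 5% level) is the standard table value, hard-coded here as an assumption rather than computed.

```python
def chi_square_uniform(observed):
    """Chi-square goodness-of-fit statistic against a uniform distribution."""
    n_total = sum(observed)
    k = len(observed)
    expected = n_total / k  # same expected count in every class under uniformity
    return sum((o - expected) ** 2 / expected for o in observed)


# Customers served at windows 1 to 4, taken from the table above.
window_counts = [150, 250, 170, 230]

# Assumption: standard table value for 3 degrees of freedom at the 5% level.
CRITICAL_5PCT_DF3 = 7.815

statistic = chi_square_uniform(window_counts)
print(statistic)                      # 34.0
print(statistic > CRITICAL_5PCT_DF3)  # True
```

With an expected count of 200 per window, the statistic comes to 34.0, which exceeds 7.815, so under these assumptions the hypothesis of a uniform distribution over the windows would be rejected at the 5% level.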
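Solution:
The calculation below is carried out on blood-type data: 200 observations classified into four blood types.
Hypotheses:
1) Here, $k = 4$ (number of classes) and $p = 0$ (number of parameters estimated in fitting the probability distribution).
2) Null hypothesis $H_0$: All blood types are equally likely.
Alternative hypothesis $H_1$: Not all blood types are equally likely.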
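Test statistic:
Under $H_0$, each blood type has the same expected frequency, $e_i = N/k = 200/4 = 50$, where $N = 200$ is the total number of observations: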
Blood Type | Observed Frequency (Oi) | Expected Frequency (ei) | Oi^2/ei |
O | 82 | 50 | 134.48 |
A | 80 | 50 | 128 |
B | 20 | 50 | 8 |
AB | 18 | 50 | 6.48 |
Total | 200 | 200 | 276.96 |
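The test statistic is
$\chi^2_{\text{cal}} = \sum_{i=1}^{k} \frac{O_i^2}{e_i} - N = 276.96 - 200 = 76.96$ (calculated value)
Table value:
$\chi^2_{k-p-1,\,0.05} = \chi^2_{3,\,0.05} = 7.815$ (from the chi-square table)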
Decision rule:
Here, $\chi^2_{\text{cal}} = 76.96 > \chi^2_{3,\,0.05} = 7.815$; hence we reject $H_0$ at the 5% level of significance (l.o.s.).
Conclusion: We conclude that not all blood types are equally likely.
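The blood-type calculation can be cross-checked with a short script. This is a sketch under stated assumptions: the function names are hypothetical, the statistic uses the shortcut form (sum of Oi^2/ei minus the total count), and the tail probability uses a closed-form expression that holds only for 3 degrees of freedom, in place of a table lookup.

```python
import math


def chi_square_shortcut(observed, expected):
    """Goodness-of-fit statistic computed as sum(O_i^2 / e_i) - N."""
    n_total = sum(observed)
    return sum(o * o / e for o, e in zip(observed, expected)) - n_total


def chi2_upper_tail_df3(x):
    """Upper-tail probability of a chi-square variable with 3 degrees of freedom.

    Closed form that is valid only for df = 3.
    """
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


# Observed and expected frequencies for blood types O, A, B, AB (from the table).
observed = [82, 80, 20, 18]
expected = [50, 50, 50, 50]

statistic = chi_square_shortcut(observed, expected)
p_value = chi2_upper_tail_df3(statistic)

print(round(statistic, 2))  # 76.96
print(p_value < 0.05)       # True
```

A tail probability far below 0.05 leads to the same decision as comparing 76.96 with the table value.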