For the following questions, identify the type of test that should be used. Simply use the corresponding letter:
A) One-sample z-test (for a mean)
B) One-sample t-test
C) One-sample z-test for a proportion (or a chi-squared goodness-of-fit)
D) Chi-square goodness of fit (and a z-test is not appropriate)
E) Two-sample z-test for a difference between proportions (or a chi-squared test for independence)
F) Chi-square test for independence (and a z-test is not appropriate)
G) Simple regression
H) Multiple regression
I) Two-independent samples t-test (with homogeneity of variance)
J) Two-independent samples t-test (without homogeneity of variance)
K) Two-related samples t-test
L) One-way (independent measures) ANOVA
M) One-way repeated measures ANOVA
N) Two-way ANOVA (independent, mixed, or repeated measures)
O) Mann-Whitney
P) Wilcoxon
Q) Kruskal-Wallis
R) Friedman
If you are going down the interval/ratio branch, it is safe to use parametric measures, unless something is directly stated that clearly indicates otherwise, or unless the data strongly and unambiguously indicate otherwise. Similarly, it is safe to assume homogeneity of variance unless it is clearly indicated otherwise. Do not try to analyze whether or not the experiment is tenable or practical or flawless. This is not your concern right now. Most of the questions below were written by students in this class.
7. Suzie wants to see if attending group counseling sessions affects the frequency of fights at-risk children are involved in. She counts the number of fights six children were involved in the month preceding group counseling, the month during group counseling, and the month following group counseling. The data is below. The null hypothesis is that counseling does not affect the number of fights.
Child    1   2   3   4   5   6
Before  20  20  21  20  37  37
During  25  11  20  11  10  11
After    5   0   4   1   0   7
8. Arlene thinks that the agricultural output of a farm is affected by how many days a farmer works and the size of the farm.
9. We want to know whether cats or dogs take longer to eat their dinner. The data is found below.
Cats: 8, 10, 19, 11, 3, 16, 14, 12, 6
Dogs: 8, 10, 10, 11, 9, 11, 10, 12, 9
This is a simple question set about identifying the correct statistical test for a hypothesis.
To choose the test, we need to know:
1. The type of sampling
2. The type of sample (dependent or independent)
3. What is to be compared
4. The sample size, etc.
Now let us analyse the questions one by one.
7. In this question we are given three samples and must check whether the mean differs between any of them.
This type of statistical analysis is usually done in two phases:
1. ANOVA, which gives insight into whether there is any significant difference between the samples
2. The Tukey HSD test, which determines which pairs of samples differ in their means
In the current case, six children (the sample size) are observed under three conditions (before, during, and after counseling), and we only want to know whether the treatment makes any difference, in other words whether the mean differs between the conditions.
Thus we only need to conduct the ANOVA. Since there is only one factor (counseling) and the same children are measured each time (repeated measures), the right test is a one-way ANOVA for repeated measures. Hence option M.
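To make the choice concrete, here is a minimal sketch of the F statistic that a one-way repeated-measures ANOVA would compute for the data in question 7. It uses plain Python only. The function name repeated_measures_f is my own, not a library call, and the question itself asks only for the test letter, not for this calculation.

```python
# Fight counts from question 7: one list per condition, one entry per child.
before = [20, 20, 21, 20, 37, 37]
during = [25, 11, 20, 11, 10, 11]
after = [5, 0, 4, 1, 0, 7]


def repeated_measures_f(conditions):
    """Return (F, df_conditions, df_error) for a one-way repeated-measures ANOVA.

    Illustrative helper (hypothetical name); assumes every condition lists
    the same subjects in the same order.
    """
    k = len(conditions)        # number of conditions
    n = len(conditions[0])     # number of subjects
    values = [x for cond in conditions for x in cond]
    correction = sum(values) ** 2 / (n * k)

    ss_total = sum(x * x for x in values) - correction
    ss_conditions = sum(sum(c) ** 2 for c in conditions) / n - correction
    # Variation between subjects is removed from the error term.
    subject_totals = [sum(c[i] for c in conditions) for i in range(n)]
    ss_subjects = sum(t * t for t in subject_totals) / k - correction
    ss_error = ss_total - ss_conditions - ss_subjects

    df_conditions = k - 1
    df_error = (k - 1) * (n - 1)
    f_stat = (ss_conditions / df_conditions) / (ss_error / df_error)
    return f_stat, df_conditions, df_error


f_stat, df_conditions, df_error = repeated_measures_f([before, during, after])
print(f"F({df_conditions}, {df_error}) = {f_stat:.2f}")
```

With 2 and 10 degrees of freedom the 5% critical value of F is about 4.10, so a statistic of roughly 17.4 would indicate a significant effect of counseling.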
8. Arlene thinks that the agricultural output of a farm is affected by how many days a farmer works and the size of the farm.
This has to be determined with a multiple regression equation. But why?
Because Arlene is trying to find the impact of two predictors, the days a farmer works and the size of the farm, on one outcome, the output of the farm.
Thus the regression equation can be written as
Farm output = A + B*(farmer working days) + C*(farm size), where A is the intercept and B and C are the coefficients of the independent variables, farmer working days and farm size respectively.
Hence option H (multiple regression).
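As a rough illustration of what fitting such an equation involves, the sketch below estimates A, B and C by ordinary least squares in plain Python. The question gives no data, so every number here is hypothetical: the outputs are generated from made-up coefficients (5, 0.2 and 1.5) purely to show that the fit recovers them. The function name fit_two_predictors is also my own.

```python
def fit_two_predictors(x1, x2, y):
    """Least-squares fit of y = a + b*x1 + c*x2 via the normal equations.

    Illustrative helper (hypothetical name), not a library function.
    """
    rows = [[1.0, u, v] for u, v in zip(x1, x2)]
    # Augmented system [X^T X | X^T y], one row per coefficient.
    m = [
        [sum(r[i] * r[j] for r in rows) for j in range(3)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(3)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [p - factor * q for p, q in zip(m[r], m[col])]
    return [m[i][3] / m[i][i] for i in range(3)]


# HYPOTHETICAL data, invented for this example only.
days_worked = [200, 250, 300, 220, 280, 310]
farm_size = [10, 40, 25, 60, 15, 50]
# Noise-free outputs built from assumed coefficients A=5, B=0.2, C=1.5.
farm_output = [5 + 0.2 * d + 1.5 * s for d, s in zip(days_worked, farm_size)]

a, b, c = fit_two_predictors(days_worked, farm_size, farm_output)
print(f"A = {a:.3f}, B = {b:.3f}, C = {c:.3f}")
```

With real farm data the outputs would contain noise, and one would also look at the significance of B and C rather than only their values.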
9. We want to know whether cats or dogs take longer to eat their dinner.
Observe!
What is being asked?
Data on eating time are given for two independent samples, cats and dogs. Also, each sample has fewer than 30 observations.
We are asked whether the eating times of the two samples differ and, if so, which is longer.
Clearly, since the samples are independent and each has fewer than 30 observations, we can conduct the t-test for two independent samples, which is designed specifically for cases like these.
But we have only unravelled half of the story so far.
Why?
Because up to this point we only know that a t-test for independent samples has to be conducted.
But there are two versions of it:
1. The t-test assuming the two samples have equal variances
2. The t-test assuming the two samples have unequal variances
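So let us check the variances first.
The sample variance is 24.75 for the cats and 1.5 for the dogs, a ratio of 16.5. That is far above the 5% critical F value of about 3.44 for 8 and 8 degrees of freedom.
Thus we now know that the variances are not equal, and hence we need the t-test for independent means assuming unequal variances.
Hence option J.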
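The variance check behind option J can be reproduced with Python's standard library. This is a minimal sketch: it computes the two sample variances, their ratio, and the unequal-variance (Welch) t statistic with its approximate degrees of freedom. The variable names are my own, and the question itself asks only for the test letter.

```python
import math
import statistics

# Eating times from question 9.
cats = [8, 10, 19, 11, 3, 16, 14, 12, 6]
dogs = [8, 10, 10, 11, 9, 11, 10, 12, 9]

# Sample variances (n - 1 in the denominator).
var_cats = statistics.variance(cats)
var_dogs = statistics.variance(dogs)
variance_ratio = max(var_cats, var_dogs) / min(var_cats, var_dogs)

# Welch's t statistic: each sample keeps its own variance estimate.
se_cats = var_cats / len(cats)
se_dogs = var_dogs / len(dogs)
t_stat = (statistics.mean(cats) - statistics.mean(dogs)) / math.sqrt(se_cats + se_dogs)

# Welch-Satterthwaite approximation for the degrees of freedom.
welch_df = (se_cats + se_dogs) ** 2 / (
    se_cats ** 2 / (len(cats) - 1) + se_dogs ** 2 / (len(dogs) - 1)
)

print(f"variances: cats = {var_cats}, dogs = {var_dogs}, ratio = {variance_ratio}")
print(f"Welch t = {t_stat:.3f} with about {welch_df:.2f} degrees of freedom")
```

The cats' times are far more spread out than the dogs', which is why the equal-variance version (option I) is not the right choice here.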