PART 1: CHOOSING THE RIGHT STATISTIC. Below are sample studies to illustrate how statistical tests depend on the DV. What test is appropriate? Be specific about the type of test, for example: T-tests (One sample, Independent Sample, Paired), ANOVA (1-way, 2-way, Factorial), Chi-square (Independence or Goodness of Fit), Regression (Simple or Multiple), among others. NOT every type of test appears below…
Research Question: Do men and women have different values when choosing a career?
1. Survey data (ratings from 1-7) were collected from a community sample of men and women and combined into a Life Roles Inventory scale (DV, Interval scale). Assume the DV is normally distributed in the population and test to see if the mean ratings of life values are different for men and women (IV, 2 groups).
2. What if you wanted to see if the mean of a scale regarding the values assigned to different careers (DV, Interval scale) differed not only between men and women (IV1) but also between four age groups (IV2: <30, 31-45, 46-60, >61 years)? You might predict differences related to both gender and age separately, or interactions of gender and age.
3. If the data were counts (DV, Nominal) of how many men and women (IV1, 2 groups) were actually working in various professions: engineers, nurses, physicians, and elementary school teachers (IV2, 4 groups), how would we test for differences in the % of men and women holding different jobs?
Research Question: Do rental rates differ by neighborhood?
4. An apartment rental agent claims that mean rental rates for two-bedroom apartments are the same in three parts of the city. To test this claim, students randomly sample apartment complexes in each neighborhood and collect rental data (DV, Interval) from 15 apartments in each of the three areas or regions (IV, 3 groups). Assuming that the data were normally distributed, what type of test statistic should they use to compare the means across the three neighborhoods?
5. Suppose the data being compared across the three areas was the percentage (or counts) of low-income housing (DV, Nominal) in each area (IV, 3 groups). Area 1 had 11%, Area 2 had 15%, and Area 3 had 23% low-income apartments. What test could we use to compare percentages across three categories?
Please help, thanks!
First, here is what each of the listed tests is used for, together with its hypotheses and test statistic:
T-test:
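One sample: Applied on a continuous variable, to compare the population mean to a hypothesized value, where we test
$H_0: \mu = \mu_0$ Vs $H_a: \mu \ne \mu_0$ (or $\mu < \mu_0$ or $\mu > \mu_0$, depending on whether the test is two-tailed, left-tailed or right-tailed)
Test statistic: $t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$, with $n - 1$ degrees of freedom.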
Independent sample:
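Applied on a continuous variable measured in 2 independent groups, to compare the population means of the groups, where we test
$H_0: \mu_1 = \mu_2$ Vs $H_a: \mu_1 \ne \mu_2$ (or $\mu_1 < \mu_2$ or $\mu_1 > \mu_2$, depending on whether the test is two-tailed, left-tailed or right-tailed)
Test statistic (assuming equal variances in the two groups): $t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$
where $s_p^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}$ is the pooled variance.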
Paired sample:
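Applied on 2 paired continuous variables (for example, the same subjects measured twice), to compare the population means of the paired groups, where we test
$H_0: \mu_d = 0$ Vs $H_a: \mu_d \ne 0$ (or $\mu_d < 0$ or $\mu_d > 0$, depending on whether the test is two-tailed, left-tailed or right-tailed)
Test statistic: $t = \frac{\bar{d}}{s_d / \sqrt{n}}$
where $\bar{d}$ = mean of the paired differences, $s_d$ = their standard deviation, and $n$ = number of pairs.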
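As a rough illustration of the three t statistics described above, here is a minimal Python sketch using only the standard library. The function names and the ratings are made up for the example, and the independent-samples version assumes equal variances (pooled). It returns only the t statistic; a statistics package would also give the p-value.

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """t = (mean - mu0) / (s / sqrt(n)), with n - 1 degrees of freedom."""
    n = len(sample)
    return (statistics.mean(sample) - mu0) / (statistics.stdev(sample) / math.sqrt(n))

def independent_t(a, b):
    """Pooled-variance t for two independent groups (assumes equal variances)."""
    n1, n2 = len(a), len(b)
    pooled_var = ((n1 - 1) * statistics.variance(a)
                  + (n2 - 1) * statistics.variance(b)) / (n1 + n2 - 2)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))

def paired_t(first, second):
    """One-sample t on the paired differences, testing mean difference = 0."""
    diffs = [x - y for x, y in zip(first, second)]
    return one_sample_t(diffs, 0.0)

# Hypothetical 1-7 ratings, for illustration only (not data from the question)
men = [4, 5, 6, 5, 4]
women = [5, 6, 7, 6, 6]
print(round(independent_t(men, women), 3))
```

The resulting t is compared with the critical value of the t distribution with n1 + n2 - 2 degrees of freedom.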
ANOVA:
One way: Applied on a continuous variable and categorical variable (> 2 levels), to compare the dependent continuous variable across the categories / levels of the independent variable
$H_0: \mu_1 = \mu_2 = \dots = \mu_k$ Vs Ha: Not all means are equal
Test statistic F = Between treatment mean square / Within mean square.
Two way:
Applied on a continuous variable and 2 categorical variables (each with 2 or more levels) - Factor A and Factor B, to compare the dependent continuous variable across the categories / levels of the independent variables
H01: Factor A has no effect on the dependent variable Vs Ha1: Factor A has a significant effect
H02: Factor B has no effect on the dependent variable Vs Ha2: Factor B has a significant effect
H03: There is no interaction between Factor A and Factor B Vs Ha3: There is a significant interaction between Factor A and Factor B
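Factorial: The same ANOVA procedure extended to two or more independent variables (factors), testing each main effect and the interactions; a two-way ANOVA is the simplest factorial design.
* All the parametric tests mentioned above assume that the data are normally distributed.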
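To make the one-way F ratio (between-treatment mean square over within mean square) concrete, here is a small Python sketch using only the standard library. The function name and the rent figures are invented for illustration; they are not data from the question.

```python
import statistics

def one_way_f(groups):
    """Return (F, df_between, df_within) for a list of samples, one per group."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean
    ss_within = sum(sum((x - statistics.mean(g)) ** 2 for x in g) for g in groups)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical monthly rents in three areas (made-up numbers)
rents = [
    [900, 1000, 1100],
    [1000, 1100, 1200],
    [1200, 1300, 1400],
]
f_stat, df_b, df_w = one_way_f(rents)
print(round(f_stat, 3), df_b, df_w)
```

A large F relative to the F distribution with (k - 1, N - k) degrees of freedom suggests that not all group means are equal.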
Chi-square:
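Applied on count data: A test to study the association between two nominal or categorical variables (test of independence), or to check whether one categorical variable follows a hypothesized distribution (goodness of fit), by comparing the expected and observed frequencies:
H0: The two variables are independent (or, for goodness of fit, the data follow the hypothesized distribution) Vs Ha: The two variables are associated (or the data do not follow the hypothesized distribution)
Test statistic: $\chi^2 = \sum \frac{(O - E)^2}{E}$, where $O$ = observed frequency and $E$ = expected frequency of each cell or category.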
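For the test of independence, the expected count of each cell is (row total × column total) / grand total. The sketch below shows that calculation in plain Python; the function name and the counts are hypothetical and serve only to illustrate the arithmetic.

```python
def chi_square_independence(table):
    """Return (chi2, df) for a contingency table given as rows of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables were independent
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

# Made-up counts: rows = men, women; columns = engineers, nurses, physicians, teachers
counts = [
    [40, 10, 30, 20],
    [10, 40, 20, 30],
]
chi2, df = chi_square_independence(counts)
print(round(chi2, 3), df)
```

The statistic is compared with the chi-square distribution with (rows - 1) × (columns - 1) degrees of freedom.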
Regression:
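Involves modeling a linear relationship between a continuous dependent variable and one predictor (simple regression) or several predictors (multiple regression). Note that regression by itself describes association; it does not establish causation.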
===============================================================
1. Here, a continuous (interval scale) dependent variable - the Life Roles Inventory scale - is compared across the two independent categories (Men and Women) of an independent variable - Gender.
The DV is assumed to be normally distributed in the population, as stated in the question.
Hence, the appropriate statistical test would be an independent sample t-test.
2. Here, we are asked to compare the mean of a continuous variable - the scale regarding the values assigned to different careers - across the categories of two categorical independent variables - Gender (Men and Women) and Age group (<30, 31-45, 46-60, >61 years). In addition to studying the main effects of these two variables on the scale, we are also asked to determine whether their interaction affects the scale.
Hence, the appropriate statistical test would be a Two Way ANOVA (a 2 x 4 factorial design).
3. Here, we are dealing with count data, where we are asked to study the association between two categorical variables - Gender (Men and Women) and Profession (engineers, nurses, physicians, and elementary school teachers).
Hence, the appropriate statistical test would be a Chi-square test for Association/Independence.
4. Here, we are asked to compare the mean of a continuous dependent variable - the rental rate of apartments - across the three categories of the independent variable - Area/Region.
Hence, the appropriate statistical test would be a One Way ANOVA.
5. Here, we are dealing with count data, where we are asked to study the distribution of low-income housing in 3 areas, i.e., we have to compare the observed counts (given as percentages) across the 3 categories with the counts expected under a hypothesized distribution, such as equal shares in every area.
Hence, the appropriate statistical test would be a Chi-square test of goodness of fit.
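As a rough sketch of how the goodness-of-fit calculation for question 5 might look, the Python snippet below converts the stated percentages into counts. The question gives only percentages, so the sample size of 100 apartments per area is an assumption made purely for illustration, as are the function name and the null hypothesis of equal shares across the three areas.

```python
def chi_square_gof(observed, expected_props=None):
    """Return (chi2, df) comparing observed counts with hypothesized proportions.

    If no proportions are given, equal shares across categories are assumed.
    """
    n = sum(observed)
    k = len(observed)
    if expected_props is None:
        expected_props = [1 / k] * k
    expected = [n * p for p in expected_props]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, k - 1

# ASSUMPTION: 100 apartments sampled per area, so 11%, 15%, 23% become 11, 15, 23 units
low_income_counts = [11, 15, 23]
chi2, df = chi_square_gof(low_income_counts)
print(round(chi2, 3), df)
```

Under these assumed counts the statistic would be compared with the chi-square critical value for 2 degrees of freedom (5.991 at the 5% level). With real data the conclusion depends on the actual sample sizes.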