In 4 sentences, explain when it would be appropriate to choose each of the following tests. [Hint: type of variables involved, information the test provides] (2 pts each)
a) Chi-squared test
b) Kaplan-Meier Survival Curve
c) Correlation Analysis
d) Logistic Regression
e) Tukey’s HSD with ANOVA
(a) The chi-squared test is used when the variables are categorical. It checks whether two categorical variables are associated, or whether the distribution of a categorical variable differs between two or more groups.
For example, if we have two categorical variables such as gender (male and female) and disease (yes and no), then the chi-squared test is used to detect whether there is an association between gender and disease status, that is, between the risk factor and the outcome.
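As a rough illustration of the gender and disease example, the sketch below computes the chi-squared statistic for a 2x2 table by hand. The counts are made up for illustration, the function name `chi_square_independence` is hypothetical, no continuity correction is applied, and the p-value shortcut via `math.erfc` is valid only when there is 1 degree of freedom.

```python
import math

def chi_square_independence(table):
    """Chi-squared statistic, degrees of freedom and expected counts
    for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count under independence: row total * column total / n
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    stat = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df, expected

# Hypothetical counts: rows = male, female; columns = disease yes, no
observed = [[30, 20],
            [10, 40]]
stat, df, expected = chi_square_independence(observed)

# For df == 1 only, the upper-tail probability equals erfc(sqrt(stat / 2))
p_value = math.erfc(math.sqrt(stat / 2))
print(f"chi2 = {stat:.3f}, df = {df}, p = {p_value:.5f}")
```

A small p-value here would suggest that gender and disease status are not independent in these made-up data; it does not say how strong the association is.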
(b) The Kaplan-Meier survival curve is used when the outcome is time-to-event data, possibly with censored observations. It estimates the probability of surviving beyond each time point and is used to compare the survival experience of two or more groups.
For example, if we have lifetime data for males and females, then the Kaplan-Meier curve is used to compare the survival probability of males and females. The curve tells us the survival probability in each group over time and shows which group has the higher survival probability.
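The male versus female comparison can be sketched with a hand-written Kaplan-Meier estimator. The follow-up times and event flags below are invented for illustration, and `kaplan_meier` is a hypothetical helper, not a library function. At each event time the running survival probability is multiplied by (1 - deaths / number at risk); censored subjects only reduce the number at risk.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each observed event time.
    events: 1 = event observed, 0 = censored."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect everyone whose follow-up ends at time t
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= removed
    return curve

# Hypothetical follow-up times (e.g. months) and event indicators
male_curve = kaplan_meier([2, 3, 5, 7], [1, 1, 0, 1])
female_curve = kaplan_meier([4, 6, 8, 9], [1, 0, 1, 0])
print("male:  ", male_curve)
print("female:", female_curve)
```

Plotting the two step functions on the same axes gives the usual Kaplan-Meier picture; a formal comparison of the curves would normally use a log-rank test, which is not shown here.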
(c) Correlation analysis is used to measure the strength and direction of the linear relationship between quantitative variables, just as the chi-squared test is used to detect association among categorical variables.
For example, if we have two quantitative variables such as height and body mass index (BMI), then we can perform correlation analysis to calculate the correlation coefficient between height and BMI.
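A minimal sketch of the height and BMI example, computing the Pearson correlation coefficient directly from its definition. The data values are invented and `pearson_r` is a hypothetical helper name.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares about the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical measurements: height in cm, BMI in kg/m^2
height = [150, 160, 170, 180, 190]
bmi = [22.0, 24.5, 23.0, 26.0, 25.5]
r = pearson_r(height, bmi)
print(f"r = {r:.3f}")
```

The coefficient lies between -1 and 1: values near 1 or -1 indicate a strong linear relationship, and values near 0 indicate little or no linear relationship.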
(d) Logistic regression is used to model the effect of qualitative and quantitative explanatory variables on a qualitative (usually binary) dependent variable.
For example, suppose we have data on gender, height, weight, body mass index (BMI), and disease status (present or absent). If we want to perform logistic regression, the dependent variable would be either gender or disease, because both are categorical, and height, weight, and BMI would be the explanatory variables.
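To make the disease example concrete, here is a small sketch that fits a logistic regression with one predictor (BMI) by plain gradient ascent on the log-likelihood. The data are invented, the helper `fit_logistic` is hypothetical, and the learning rate and number of steps are arbitrary choices; in practice a statistics package would be used instead.

```python
import math

def fit_logistic(x, y, lr=0.1, steps=5000):
    """Fit P(y = 1) = 1 / (1 + exp(-(b0 + b1 * z))), where z is x
    standardized to mean 0 and standard deviation 1."""
    mean = sum(x) / len(x)
    sd = math.sqrt(sum((v - mean) ** 2 for v in x) / len(x))
    z = [(v - mean) / sd for v in x]
    b0 = b1 = 0.0
    for _ in range(steps):
        p = [1 / (1 + math.exp(-(b0 + b1 * zi))) for zi in z]
        # Gradient of the average log-likelihood
        g0 = sum(yi - pi for yi, pi in zip(y, p)) / len(x)
        g1 = sum((yi - pi) * zi for yi, pi, zi in zip(y, p, z)) / len(x)
        b0 += lr * g0
        b1 += lr * g1

    def predict(v):
        """Estimated probability of disease at predictor value v."""
        return 1 / (1 + math.exp(-(b0 + b1 * (v - mean) / sd)))

    return b0, b1, predict

# Hypothetical data: BMI and disease status (1 = present, 0 = absent)
bmi = [18, 20, 22, 24, 26, 28, 30, 32]
disease = [0, 0, 0, 1, 0, 1, 1, 1]
b0, b1, predict = fit_logistic(bmi, disease)
print(f"slope per SD of BMI = {b1:.3f}, odds ratio = {math.exp(b1):.3f}")
print(f"P(disease | BMI = 32) = {predict(32):.3f}")
```

A positive slope means the estimated odds of disease increase with BMI, and `exp(b1)` is the odds ratio for a one standard deviation increase in BMI.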
(e) When we have more than two independent groups, the dependent variable is quantitative, and we want to compare the group means, we perform a one-way ANOVA, provided the data are approximately normally distributed.
After performing the ANOVA we check the p-value of the test. If the p-value shows a significant difference among the group means, then we use Tukey's HSD (honestly significant difference) to detect which pairs of groups differ significantly.
ANOVA tells us only whether there is a difference somewhere among the group means; multiple comparison tests such as Tukey's HSD and Fisher's LSD tell us which pairs of groups are significantly different.
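The two-step procedure can be sketched as follows: first the one-way ANOVA F statistic, then the studentized range statistic q for each pair of groups, which is the quantity Tukey's HSD compares against a critical value. The data are invented, the helper names are hypothetical, and equal group sizes are assumed. The critical value itself is not computed here; it would come from a studentized range table or a library routine such as `scipy.stats.tukey_hsd` in recent SciPy versions.

```python
import math
from itertools import combinations

def one_way_anova(groups):
    """Return (F statistic, within-group mean square, group means)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups.values())
    grand_mean = sum(sum(g) for g in groups.values()) / n_total
    means = {name: sum(g) / len(g) for name, g in groups.items()}
    ss_between = sum(
        len(g) * (means[name] - grand_mean) ** 2 for name, g in groups.items()
    )
    ss_within = sum(
        (v - means[name]) ** 2 for name, g in groups.items() for v in g
    )
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within, ms_within, means

def tukey_q(groups, ms_within, means):
    """Studentized range statistic for each pair (assumes equal group sizes)."""
    n = len(next(iter(groups.values())))
    se = math.sqrt(ms_within / n)
    return {
        (a, b): abs(means[a] - means[b]) / se
        for a, b in combinations(groups, 2)
    }

# Hypothetical measurements for three independent groups
groups = {"A": [1, 2, 3], "B": [2, 3, 4], "C": [5, 6, 7]}
f_stat, ms_within, means = one_way_anova(groups)
q_stats = tukey_q(groups, ms_within, means)
print(f"F = {f_stat:.3f}")
for pair, q in q_stats.items():
    print(pair, f"q = {q:.3f}")
```

The F statistic answers only whether any difference exists among the means; the pairwise q values show which pairs drive that difference once they are compared with the appropriate critical value.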