I have birth and infant death data for all children born in the state of North Carolina, dating back to 1968. The data set for births in 2001 contains 120,300 records. My data represent a random sample of 800 of those births, with selected variables. My goal is to use the data set to test whether there is an association between premature births (PREMIE) and smoking during pregnancy (SMOKE), using α = .05.
I am going to calculate the critical value of the test statistic and the calculated test statistic. To compare the means, would this be a one-sample t-test, an independent-samples t-test, a paired-samples t-test, or a one-way ANOVA?
Thanks!
None of the tests you have listed is applicable to the goal you want to achieve. You will have to perform a chi-square test of independence or a logistic regression. This is because in your case both variables, premature birth and smoking during pregnancy, are categorical, so there are no means to compare.
You will have to prepare a 2×2 table as follows (the values are filled in from your sample of 800):
| | Smoking During Pregnancy | Not Smoking During Pregnancy | Total |
| Premature birth | (no. of women who smoked and had a premature birth) | (no. of women who did not smoke but had a premature birth) | (total premature births) |
| Not premature birth | (no. of women who smoked but did not have a premature birth) | (no. of women who did not smoke and did not have a premature birth) | (total non-premature births) |
| Total | (total women who smoked during pregnancy) | (total women who did not smoke during pregnancy) | 800 |
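Once the table is filled in, the calculated test statistic and the critical value can be obtained along the following lines. This is only a minimal sketch in Python: the four cell counts are made-up placeholders chosen to sum to 800, not values from your data, and the function names are hypothetical. It computes the Pearson chi-square statistic without a continuity correction. A 2×2 table has (2 − 1)(2 − 1) = 1 degree of freedom, and the critical value for α = .05 with 1 degree of freedom is 3.841.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) for the table
           smoked   did not smoke
    premie    a          b
    not       c          d
    """
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    observed = ((a, b), (c, d))
    stat = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count under independence: row total * column total / n
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed[i][j] - expected) ** 2 / expected
    return stat

def p_value_df1(stat):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2))

# PLACEHOLDER counts (assumption, not real data); replace with your own cells.
a, b, c, d = 20, 80, 80, 620

CRITICAL_VALUE = 3.841  # chi-square critical value for alpha = .05, df = 1

stat = chi_square_2x2(a, b, c, d)
p_value = p_value_df1(stat)
reject_null = stat > CRITICAL_VALUE

print(f"chi-square = {stat:.3f}, p = {p_value:.4f}, reject H0: {reject_null}")
```

If the calculated statistic exceeds the critical value (equivalently, if the p-value is below .05), you reject the null hypothesis that PREMIE and SMOKE are independent. Statistical packages may apply a continuity correction to 2×2 tables by default, so their output can differ slightly from this uncorrected statistic.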