A survey was conducted to study if parental smoking is associated with the incidence of smoking in children when they reach high school. Randomly chosen high school students were asked whether they smoked and whether at least one of their parents smoked.
The results are summarized in the following table:
| | Student Smoke | Student Don’t |
|---|---|---|
| Parents Smoke | 262 | 183 |
| Parents Don’t | 120 | 380 |
(a) For a randomly selected student in this study, find the conditional probability of smoking given his/her parents smoke.
(b) Suppose we are interested in testing whether parental smoking is independent of children smoking. Which statistical test would you consider for this problem?
(c) (4 points) Write down the R code to carry out that test. You first need to store the data into a matrix.
(d) Calculate the test statistic by yourself.
(e) Write down the R code to obtain the p-value based on your answer in part (d).
(f) Suppose the p-value is 0.0001, what would be your real-world conclusion? (You may use α = 0.05.)
With row and column totals added, the results are:
| | Student Smoke | Student Don’t | Total |
|---|---|---|---|
| Parents Smoke | 262 | 183 | 445 |
| Parents Don’t | 120 | 380 | 500 |
| Total | 382 | 563 | 945 |
(a) For a randomly selected student in this study, find the conditional probability of smoking given his/her parents smoke.
P(student smokes | parents smoke) = 262/445 ≈ 0.5888
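As a quick check, the same conditional probability can be computed in R from the table of counts. This is only a sketch: the object name `counts` and the dimension labels are assumptions chosen for illustration.

```r
# Hypothetical object name `counts`; the values are the observed counts from the table
counts <- matrix(c(262, 183, 120, 380), nrow = 2, byrow = TRUE,
                 dimnames = list(Parents = c("Smoke", "Dont"),
                                 Student = c("Smoke", "Dont")))

# P(student smokes | parents smoke) = cell count / row total
p_smoke_given_parents <- counts["Smoke", "Smoke"] / sum(counts["Smoke", ])
p_smoke_given_parents

# Row-wise proportions give the conditional distribution for both parent groups
prop.table(counts, margin = 1)
```

Dividing by the row total (445) rather than the grand total (945) is what makes this a conditional probability.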
(b) Suppose we are interested in testing whether parental smoking is independent of children smoking. Which statistical test would you consider for this problem?
Chi-square test of independence
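For reference, the hypotheses and the Pearson statistic behind this test can be written as follows. The symbols O_ij and E_ij for the observed and expected counts, and n for the grand total, are notation introduced here for illustration.

```latex
H_0:\ \text{parental smoking and student smoking are independent}
\qquad
H_a:\ \text{they are not independent}

\chi^2 = \sum_{i=1}^{2}\sum_{j=1}^{2} \frac{(O_{ij}-E_{ij})^2}{E_{ij}},
\qquad
E_{ij} = \frac{(\text{row } i \text{ total})\,(\text{column } j \text{ total})}{n},
\qquad
\text{df} = (2-1)(2-1) = 1
```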
(c) (4 points) Write down the R code to carry out that test. You first need to store the data into a matrix.
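# Store the 2x2 table of counts in a matrix (rows: parents, columns: student)
mydata <- matrix(c(262, 183, 120, 380), nrow = 2, byrow = TRUE,
                 dimnames = list(c("ParentsSmoke", "ParentsDont"),
                                 c("StudentSmoke", "StudentDont")))
chisq.test(mydata)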
R output:
Pearson's Chi-squared test with Yates' continuity correction
data: mydata
X-squared = 117.48, df = 1, p-value < 2.2e-16
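Note that for a 2×2 table `chisq.test` applies Yates' continuity correction by default, which is why the output shows 117.48 rather than the uncorrected Pearson statistic. A minimal sketch, rebuilding the table under the assumed name `counts`, of how `correct = FALSE` gives the uncorrected value that a hand calculation in part (d) produces:

```r
# Hypothetical object name `counts`; rows are parents, columns are students
counts <- matrix(c(262, 183, 120, 380), nrow = 2, byrow = TRUE)

# Default behaviour for 2x2 tables: Yates' continuity correction is applied
with_yates <- chisq.test(counts)

# Uncorrected Pearson chi-squared statistic
without_yates <- chisq.test(counts, correct = FALSE)

unname(with_yates$statistic)     # about 117.48
unname(without_yates$statistic)  # about 118.92
```

Either version leads to the same decision here, since both statistics are far above the critical value.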
(d) Calculate the test statistic by yourself.
Chi-Square Test

Observed frequencies (fo):
| | Student Smoke | Student Don’t | Total |
|---|---|---|---|
| Parents Smoke | 262 | 183 | 445 |
| Parents Don’t | 120 | 380 | 500 |
| Total | 382 | 563 | 945 |

Expected frequencies (fe):
| | Student Smoke | Student Don’t | Total |
|---|---|---|---|
| Parents Smoke | 179.8836 | 265.1164 | 445 |
| Parents Don’t | 202.1164 | 297.8836 | 500 |
| Total | 382 | 563 | 945 |

fo-fe:
| | Student Smoke | Student Don’t |
|---|---|---|
| Parents Smoke | 82.1164 | -82.1164 |
| Parents Don’t | -82.1164 | 82.1164 |

(fo-fe)^2/fe:
| | Student Smoke | Student Don’t |
|---|---|---|
| Parents Smoke | 37.4859 | 25.4345 |
| Parents Don’t | 33.3625 | 22.6367 |

Data:
| Level of Significance | 0.05 |
|---|---|
| Number of Rows | 2 |
| Number of Columns | 2 |
| Degrees of Freedom | 1 |

Results:
| Critical Value | 3.841 |
|---|---|
| Chi-Square Test Statistic | 118.92 |
| p-Value | 0.0000 |

Reject the null hypothesis
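The hand calculation can also be reproduced step by step in R. This is a sketch under the assumption that the counts are stored in a matrix named `observed` (a name chosen here for illustration); it uses only base R functions.

```r
# Hypothetical object name `observed`; rows are parents, columns are students
observed <- matrix(c(262, 183, 120, 380), nrow = 2, byrow = TRUE)
n <- sum(observed)

# Expected count for each cell: (row total * column total) / grand total
expected <- outer(rowSums(observed), colSums(observed)) / n

# Contribution of each cell, (fo - fe)^2 / fe, and their sum
contrib <- (observed - expected)^2 / expected
chi_sq <- sum(contrib)

# Degrees of freedom and the critical value at the 0.05 level
df <- (nrow(observed) - 1) * (ncol(observed) - 1)
crit <- qchisq(0.95, df)

round(expected, 4)
round(contrib, 4)
chi_sq
crit
```

The statistic is the sum of the four cell contributions, and the null hypothesis is rejected when it exceeds the critical value.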
(e) Write down the R code to obtain the p-value based on your answer in part (d).
pchisq(118.92, df=1, lower.tail=FALSE)
(f) Suppose the p-value is 0.0001, what would be your real-world conclusion? (You may use α = 0.05.)
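Since the p-value 0.0001 is less than the 0.05 level of significance, H0 is rejected.
We conclude that parental smoking and children's smoking are not independent: there is significant evidence of an association between parental smoking and smoking among high school students.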