Never forget that even small effects can be statistically significant if the samples are large. To illustrate this fact, consider a sample of 151 small businesses. During a three-year period, 14 of the 99 headed by men and 9 of the 52 headed by women failed.
(a) Find the proportions of failures for businesses headed by
women and businesses headed by men. These sample proportions are
quite close to each other. Give the P-value for the test of the
hypothesis that the same proportion of women's and men's businesses
fail. (Use the two-sided alternative). What can we conclude (Use
α = 0.05)?
The P-value was ____, so we conclude that (choose a conclusion):
- The test showed strong evidence of a significant difference.
- The test showed no significant difference.
(b) Now suppose that the same sample proportion came from a
sample 30 times as large. That is, 270 out of 1560 businesses
headed by women and 420 out of 2970 businesses headed by men fail.
Verify that the proportions of failures are exactly the same as in
part (a). Repeat the test for the new data. What can we
conclude?
The P-value was ____, so we conclude that (choose a conclusion):
- The test showed strong evidence of a significant difference.
- The test showed no significant difference.
(c) It is wise to use a confidence interval to estimate the size of an effect rather than just giving a P-value. Give 95% confidence intervals for the difference between proportions of men's and women's businesses (men minus women) that fail for the settings of both (a) and (b). (Be sure to check that the conditions are met. If the conditions aren't met for one of the intervals, use the same type of interval for both.)
Answer
A)
P1 = 14/99 = 0.1414141414141
P2 = 9/52 = 0.1730769230769
N1 = 99
N2 = 52
First we need to check the normality conditions: the counts of failures (n1*p1, n2*p2) and of non-failures (n1*(1-p1), n2*(1-p2)) should all be greater than 5.
N1*p1 = 14, N1*(1-p1) = 85
N2*p2 = 9, N2*(1-p2) = 43
As all the conditions are met, we can use the standard normal z test.
Ho : P1 = P2
Ha : P1 is not equal to P2
Test statistic z = (P1-P2)/standard error
Standard error = √{P*(1-P)} * √{(1/n1)+(1/n2)}
P = combined proportion = (14+9)/(99+52) = 23/151 ≈ 0.1523
Standard error ≈ 0.0615
Z = -0.51
From the z table, P(z < -0.51) = 0.3050
But this is for one tail and our test is two tailed, so
P-Value = 2*0.3050
P-value = 0.61
As the obtained P-value is greater than the given significance level of 0.05, we fail to reject the null hypothesis.
So there is not enough evidence to support the claim that the proportions of failures differ.
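The test in part A can be checked numerically. The following is a minimal Python sketch, assuming the pooled two-proportion z test used above. The function name two_prop_ztest is an illustrative choice, not part of the original answer. It uses the unrounded z, so the P-value can differ slightly from a table lookup.

```python
import math

def two_prop_ztest(x1, n1, x2, n2):
    """Pooled two-proportion z test; returns (z, two-sided P-value)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # combined proportion P
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided P-value: 2 * P(Z > |z|) = erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

z, p_value = two_prop_ztest(14, 99, 9, 52)  # men first, then women
print(f"z = {z:.2f}, P-value = {p_value:.2f}")
```

Rounded to two decimals, the result should agree with Z = -0.51 and P-value = 0.61 above.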
B)
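P1 = 270/1560 = 0.1730769230769 (women)
P2 = 420/2970 = 0.1414141414141 (men)
(Note that P1 is now the women's proportion, so Z comes out positive.)
N1 = 1560, N2 = 2970
Z = (P1-P2)/Standard error
Standard error = √{P*(1-P)} * √{(1/N1)+(1/N2)}
P = combined proportion = (270+420)/(1560+2970) = 690/4530 ≈ 0.1523
Z = 2.82
From the z table, P(z > 2.82) = 0.0024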
P-value = 2*0.0024 = 0.0048
As the obtained P-Value is less than the given significance level of 0.05
We reject the null hypothesis.
We have enough evidence to support the claim that there is a difference.
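To see why the conclusion changes although the proportions do not, here is a small Python sketch that reruns the pooled z test with every count multiplied by k. The helper name pooled_z and the multiplier k are illustrative assumptions. Because the standard error shrinks by a factor of √k, the z statistic grows by the same factor.

```python
import math

def pooled_z(x1, n1, x2, n2):
    """Pooled two-proportion z statistic and two-sided P-value."""
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (x1 / n1 - x2 / n2) / se
    return z, math.erfc(abs(z) / math.sqrt(2))

# Women first, then men, matching the P1/P2 labels used in part B.
results = {}
for k in (1, 30):  # k = sample-size multiplier
    results[k] = pooled_z(9 * k, 52 * k, 14 * k, 99 * k)
    z, p_value = results[k]
    print(f"k = {k:2d}: z = {z:.2f}, P-value = {p_value:.4f}")
```

With k = 30 the statistic is √30 ≈ 5.48 times the k = 1 value, which is what moves the P-value below 0.05.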
C)
For part a
P1 = 14/99
P2 = 9/52
N1 = 99
N2 = 52
As the counts of failures (14 and 9) and non-failures (85 and 43) are all greater than 5, we can use the standard normal z table to estimate the interval.
From the z table, the critical value for the 95% confidence level is 1.96
Margin of error (MOE) = Z*√{P1*(1-P1)/n1 + P2*(1-P2)/n2}
Z = 1.96
MOE = 0.1236317104738
Confidence interval is given by
(P1-P2) - MOE < (P1-P2) < (P1-P2) + MOE
−0.155294492136 < (P1-P2) < 0.0919689288110
For part B
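P1 = 270/1560 (women)
P2 = 420/2970 (men)
Here also the counts of failures and non-failures are all greater than 5,
so we can use the standard normal z table to estimate the interval.
Margin of error (MOE) = 1.96*√{P1*(1-P1)/N1 + P2*(1-P2)/N2}
MOE = 0.0225719588832
Confidence interval is given by
(P1-P2) - MOE < (P1-P2) < (P1-P2) + MOE
0.0090908227795 < (P1-P2) < 0.0542347405460
Here P1-P2 is women minus men. For men minus women, as the question asks, the signs flip:
-0.0542347405460 < (men - women) < -0.0090908227795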
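Both intervals can be reproduced with a short Python sketch, assuming the large-sample interval with the unpooled standard error and the critical value 1.96 used above. The function name diff_ci and the parameter z_star are illustrative, not from the original answer.

```python
import math

def diff_ci(x1, n1, x2, n2, z_star=1.96):
    """Large-sample interval for p1 - p2 using the unpooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    moe = z_star * math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return p1 - p2 - moe, p1 - p2 + moe

# Men first, then women, so each interval is for (men - women).
ci_a = diff_ci(14, 99, 9, 52)
ci_b = diff_ci(420, 2970, 270, 1560)
print("part (a):", ci_a)
print("part (b):", ci_b)
```

The part (a) interval contains 0 while the part (b) interval does not, which is consistent with the two test results. Both are centered on the same estimated difference, and only the width changes.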