Never forget that even small effects can be statistically significant if the samples are large. To illustrate this fact, consider a sample of 144 small businesses. During a three-year period, 14 of the 99 headed by men and 8 of the 45 headed by women failed.
(a) Find the proportions of failures for businesses headed by
women and businesses headed by men. These sample proportions are
quite close to each other. Give the P-value for the test of the
hypothesis that the same proportion of women's and men's businesses
fail. (Use the two-sided alternative.) What can we conclude? (Use
α = 0.05.)
The P-value was ____, so we conclude that:
(choose one) The test showed strong evidence of a significant
difference. / The test showed no significant difference.
(b) Now suppose that the same sample proportions came from
samples 30 times as large. That is, 240 out of 1350 businesses
headed by women and 420 out of 2970 businesses headed by men fail.
Verify that the proportions of failures are exactly the same as in
part (a). Repeat the test for the new data. What can we
conclude?
The P-value was ____, so we conclude that:
(choose one) The test showed strong evidence of a significant
difference. / The test showed no significant difference.
(c) It is wise to use a confidence interval to estimate the size
of an effect rather than just giving a P-value. Give 95% confidence
intervals for the difference between proportions of men's and
women's businesses (men minus women) that fail for the settings of
both (a) and (b). (Be sure to check that the conditions are met. If
the conditions aren't met for one of the intervals, use the same
type of interval for both.)
Interval for smaller samples: ____ to ____
Interval for larger samples: ____ to ____
a)
Here sample 1 is the businesses headed by men and sample 2 is the
businesses headed by women. A "success" is a business that failed.

Ho: p1 - p2 = 0
Ha: p1 - p2 ≠ 0

Sample 1 (men):
  sample size, n1 = 99
  number of successes, x1 = 14
  sample proportion, p̂1 = x1/n1 = 0.1414141

Sample 2 (women):
  sample size, n2 = 45
  number of successes, x2 = 8
  sample proportion, p̂2 = x2/n2 = 0.177778

difference in sample proportions, p̂1 - p̂2 = 0.1414 - 0.1778 = -0.0364
pooled proportion, p = (x1 + x2)/(n1 + n2) = 22/144 = 0.1527778
std error, SE = SQRT(p*(1 - p)*(1/n1 + 1/n2)) = 0.06468
Z-statistic = (p̂1 - p̂2)/SE = -0.0364/0.0647 = -0.562
p-value = 0.5740 [Excel formula: =2*NORMSDIST(-ABS(z))]

Since the p-value 0.5740 is greater than α = 0.05, we fail to reject Ho.
The test showed no significant difference.
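The spreadsheet steps in part (a) can be checked with a short script. This is a minimal sketch, not the original author's method: the function name `two_proportion_z_test` is made up for illustration, and the normal tail area is computed with `math.erfc` in place of Excel's NORMSDIST.

```python
from math import sqrt, erfc

def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled two-sample z test of Ho: p1 = p2, two-sided alternative.

    Hypothetical helper written for this example only.
    """
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    pooled = (x1 + x2) / (n1 + n2)            # common proportion under Ho
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / se
    p_value = erfc(abs(z) / sqrt(2))          # same as 2 * (1 - Phi(|z|))
    return z, p_value

# Part (a): sample 1 = men (14 of 99 failed), sample 2 = women (8 of 45 failed)
z_a, p_a = two_proportion_z_test(14, 99, 8, 45)
print(f"z = {z_a:.3f}, p-value = {p_a:.4f}")
```

Using the absolute value of z inside the tail area means the two-sided p-value does not depend on which group is called sample 1.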
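b)
Here sample 1 is the businesses headed by women and sample 2 is the
businesses headed by men (the reverse of part (a)), so the difference
and the Z-statistic change sign. The two-sided p-value is unaffected
by the ordering.

Ho: p1 - p2 = 0
Ha: p1 - p2 ≠ 0

Sample 1 (women):
  sample size, n1 = 1350
  number of successes, x1 = 240
  sample proportion, p̂1 = x1/n1 = 0.1777778

Sample 2 (men):
  sample size, n2 = 2970
  number of successes, x2 = 420
  sample proportion, p̂2 = x2/n2 = 0.141414

The sample proportions are exactly the same as in part (a):
240/1350 = 8/45 and 420/2970 = 14/99.

difference in sample proportions, p̂1 - p̂2 = 0.1778 - 0.1414 = 0.0364
pooled proportion, p = (x1 + x2)/(n1 + n2) = 660/4320 = 0.1527778
std error, SE = SQRT(p*(1 - p)*(1/n1 + 1/n2)) = 0.01181
Z-statistic = (p̂1 - p̂2)/SE = 0.0364/0.0118 = 3.079
p-value = 0.0021 [Excel formula: =2*(1-NORMSDIST(ABS(z)))]

Since the p-value 0.0021 is less than α = 0.05, we reject Ho.
The test showed strong evidence of a significant difference.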
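To see why the conclusion flips when only the sample size changes, the sketch below repeats the pooled z test with every count multiplied by a factor k. The helper `pooled_z_and_p` and the intermediate factors 5 and 10 are illustrative assumptions; only k = 1 and k = 30 come from the exercise.

```python
from math import sqrt, erfc

def pooled_z_and_p(x1, n1, x2, n2):
    """Pooled two-proportion z statistic and two-sided p-value (illustrative helper)."""
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (x1 / n1 - x2 / n2) / se
    return z, erfc(abs(z) / sqrt(2))

# Sample 1 = women (8 of 45), sample 2 = men (14 of 99), as in part (b).
results = {}
for k in (1, 5, 10, 30):
    z, p = pooled_z_and_p(8 * k, 45 * k, 14 * k, 99 * k)
    results[k] = (z, p)
    print(f"k = {k:2d}: z = {z:.3f}, p-value = {p:.4f}")
```

The sample proportions and the pooled proportion do not change with k, while the standard error shrinks by a factor of sqrt(k). The z statistic therefore grows by sqrt(k), which is why the same difference of about 0.036 is not significant at k = 1 but is significant at k = 30.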
c)
Interval for smaller samples (sample 1 = men, n1 = 99; sample 2 =
women, n2 = 45):

level of significance, α = 0.05
Z critical value = Z α/2 = 1.960 [Excel function: =NORMSINV(1-α/2)]

std error, SE = SQRT(p̂1*(1 - p̂1)/n1 + p̂2*(1 - p̂2)/n2) = 0.06689
margin of error, E = Z*SE = 1.960 * 0.0669 = 0.13111

lower limit = (p̂1 - p̂2) - E = -0.036 - 0.1311 = -0.1674721
upper limit = (p̂1 - p̂2) + E = -0.036 + 0.1311 = 0.0947448

So the 95% confidence interval is (-0.1675 < p1 - p2 < 0.0947).
The interval contains 0, which agrees with the non-significant test
in part (a).
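Interval for larger samples (sample 1 = men, n1 = 2970; sample 2 =
women, n2 = 1350):

level of significance, α = 0.05
Z critical value = Z α/2 = 1.960 [Excel function: =NORMSINV(1-α/2)]

std error, SE = SQRT(p̂1*(1 - p̂1)/n1 + p̂2*(1 - p̂2)/n2) = 0.01221
margin of error, E = Z*SE = 1.960 * 0.0122 = 0.02394

lower limit = (p̂1 - p̂2) - E = -0.036 - 0.0239 = -0.0603007
upper limit = (p̂1 - p̂2) + E = -0.036 + 0.0239 = -0.0124266

So the 95% confidence interval is (-0.0603 < p1 - p2 < -0.0124).
The interval lies entirely below 0, which agrees with the significant
test in part (b). The estimated difference, about -0.036, is the same
in both settings; only the width of the interval changes.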
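The question asks for the conditions to be checked. The sketch below assumes the common textbook guideline that the large-sample interval needs at least 10 successes and 10 failures in each sample; that threshold is an assumption here, since the exercise does not state it. Under that guideline the smaller women's sample (8 failed businesses) falls short, so the script also computes the "plus four" interval (one success and one failure added to each sample) as one possible alternative. The function names and the rounded critical value 1.96 are illustrative choices.

```python
from math import sqrt

Z_STAR = 1.96  # approximate 95% critical value, as in the worked answer

def counts_ok(x, n, minimum=10):
    """Assumed guideline: at least `minimum` successes and failures."""
    return x >= minimum and (n - x) >= minimum

def diff_ci(x1, n1, x2, n2, plus_four=False):
    """95% interval for p1 - p2 (unpooled SE); optional plus-four adjustment."""
    if plus_four:
        # add one success and one failure to each sample
        x1, n1, x2, n2 = x1 + 1, n1 + 2, x2 + 1, n2 + 2
    p1, p2 = x1 / n1, x2 / n2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - Z_STAR * se, diff + Z_STAR * se

# Men minus women: sample 1 = men, sample 2 = women
small = (14, 99, 8, 45)
large = (420, 2970, 240, 1350)

for label, data in (("smaller", small), ("larger", large)):
    x1, n1, x2, n2 = data
    ok = counts_ok(x1, n1) and counts_ok(x2, n2)
    lo, hi = diff_ci(*data)
    lo4, hi4 = diff_ci(*data, plus_four=True)
    print(f"{label}: conditions met = {ok}")
    print(f"  large-sample interval: ({lo:.4f}, {hi:.4f})")
    print(f"  plus-four interval:    ({lo4:.4f}, {hi4:.4f})")
```

The two interval types nearly coincide for the larger samples and differ only slightly for the smaller ones, so the qualitative conclusion is the same either way: the interval for the smaller samples contains 0 and the interval for the larger samples does not.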