You have now been asked to study the yearly mean sales of cameras of two competing models at stores throughout the United States. You will also study the proportions of cameras sold that include certain lenses at a large store that sells both lenses. The specific questions you will be asked to answer are stated below. In addition, appropriate sample data for the studies you will be accomplishing are given below. Answer the following questions concerning the situations posed.
4) In addition, you wish to concern yourself with a comparison of the proportions of the sales of the two camera bodies that include the purchase of a certain lens that fits the body of the camera being studied. Just as the camera bodies are considered competing models, so are the two lenses that may or may not be included with the sales of the camera bodies. Random samples of yearly sales of both camera bodies are selected. It is observed with each purchase whether the lens has also been purchased. The data concerning whether the lens is included with the purchase of the camera body are shown in Appendix Two below. At each of the 10% and 5% levels of significance, is the proportion of Nikon D5 purchases that include the purchase of a certain type of lens at the same time less than the proportion of Canon purchases that include the purchase of an equivalent Canon lens? If possible, construct both 90% and 95% confidence intervals for the difference in the population proportions of sales of the camera bodies at the stores that include the sale of the lenses. Explain their meaning. Do not use these intervals to perform any hypothesis tests.
Appendix One: (Sales of Camera Bodies)
Nikon D5: 131 145 150 156 176 154 138 122 130 235 165 168 221 229 154 155 154 160 154 144 240 143 232 238 130
Canon Model: 138 140 237 147 170 155 232 228 135 130 161 160 220 229 155 158 150 250 248 246 139 233 133 230 126
Appendix Two: (Includes the Purchase of a Lens? Y = yes, N = no)
Nikon D5: Y N N N Y Y N Y N Y Y N N N N Y Y Y N Y Y N N N N N N Y N N N Y N N N N
Canon Model: N N Y Y Y N N Y N Y N N Y Y Y Y N N N N Y N Y N Y N N N Y Y Y Y Y N Y Y
Comparison of the proportions of the sales of the two camera bodies that include the purchase of a certain lens that fits the body of the camera:
The question to answer is whether, at each of the 10% and 5% levels of significance, the proportion of Nikon D5 purchases that include a lens is less than the proportion of Canon purchases that include an equivalent Canon lens.
This calls for a hypothesis test comparing the two population proportions of Nikon D5 and Canon purchases that include the appropriate lens.
Let's define
pn : proportion of Nikon D5 purchases with the appropriate lens
pc : proportion of Canon purchases with the appropriate lens.
Null hypothesis: H0 : pn = pc or pn - pc = 0
Alternative hypothesis: H1 : pn < pc or pn - pc < 0
Nikon samples:
n = sample size= 36
sample estimate of pn = 13/36 = 0.3611111
Canon samples:
n = sample size= 36
sample estimate of pc = 19/36 = 0.5277778
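The counts behind these estimates can be checked by tallying the Y entries in Appendix Two. A minimal Python sketch follows; the function and variable names are illustrative and not part of the original solution.

```python
# Lens-purchase indicators copied from Appendix Two (Y = yes, N = no).
nikon = "Y N N N Y Y N Y N Y Y N N N N Y Y Y N Y Y N N N N N N Y N N N Y N N N N".split()
canon = "N N Y Y Y N N Y N Y N N Y Y Y Y N N N N Y N Y N Y N N N Y Y Y Y Y N Y Y".split()


def summarize(flags):
    """Return (sample size, number of 'Y' entries, sample proportion)."""
    n = len(flags)
    x = flags.count("Y")
    return n, x, x / n


n_n, x_n, p_n = summarize(nikon)  # Nikon D5 sample
n_c, x_c, p_c = summarize(canon)  # Canon sample

print(f"Nikon: {x_n}/{n_n} = {p_n:.7f}")
print(f"Canon: {x_c}/{n_c} = {p_c:.7f}")
```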
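Test statistic:
Under H0 the two proportions are equal, so the standard error is based on the pooled sample proportion:
$$z = \frac{\hat{p}_n - \hat{p}_c}{\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_n} + \frac{1}{n_c}\right)}}, \qquad \hat{p} = \frac{13 + 19}{36 + 36} = 0.4444444$$
where \hat{p}_n and \hat{p}_c are the sample estimates of pn and pc, and n_n = n_c = 36 are the two sample sizes.
Substituting the estimates, the standard error is sqrt(0.4444444*0.5555556*(1/36 + 1/36)) = 0.1171214, and the observed test statistic value is -0.1666667/0.1171214 = -1.423025
Since each sample has size 36 (more than 30), and each sample contains at least 10 purchases with a lens and at least 10 without, we can use the large-sample test, under which the test statistic approximately follows the standard normal distribution under H0.
As this is a left-tailed test, the p-value of the test is P(z < -1.423025) ≈ 0.0774
where z follows the standard normal distribution.
At 10% level of significance : p-value < 0.10 , H0 is rejected.
At 5% level of significance : p-value > 0.05 , H0 is not rejected.
So the conclusion depends on the level of significance. At the 10% level, the sample evidence supports the claim that the proportion of Nikon D5 purchases that include the purchase of a certain type of lens at the same time is less than the proportion of Canon purchases that include the purchase of an equivalent Canon lens. At the 5% level, the evidence is not strong enough to support that claim.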
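The left-tailed test can be reproduced numerically from the counts 13/36 and 19/36. The sketch below assumes the pooled two-proportion z-test and uses only the Python standard library (`statistics.NormalDist` for the standard normal CDF); the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Counts of purchases that include a lens, and sample sizes.
x_n, n_n = 13, 36  # Nikon D5
x_c, n_c = 19, 36  # Canon

p_n = x_n / n_n
p_c = x_c / n_c

# Pooled proportion: the common value of pn and pc assumed under H0.
p_pool = (x_n + x_c) / (n_n + n_c)
se_pool = sqrt(p_pool * (1 - p_pool) * (1 / n_n + 1 / n_c))

z = (p_n - p_c) / se_pool
p_value = NormalDist().cdf(z)  # left-tailed: P(Z < z)

print(f"z = {z:.6f}, p-value = {p_value:.6f}")
for alpha in (0.10, 0.05):
    decision = "reject H0" if p_value < alpha else "do not reject H0"
    print(f"alpha = {alpha:.2f}: {decision}")
```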
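A (1-α)100% confidence interval for the difference in the population proportions pn - pc is given by:
$$(\hat{p}_n - \hat{p}_c) \pm z_{\alpha/2}\,\mathrm{SE}, \qquad \mathrm{SE} = \sqrt{\frac{\hat{p}_n(1-\hat{p}_n)}{n_n} + \frac{\hat{p}_c(1-\hat{p}_c)}{n_c}}$$
where \hat{p}_n = 0.3611111 and \hat{p}_c = 0.5277778 are the sample estimates of pn and pc, n_n = n_c = 36, and SE = sqrt(0.3611111*0.6388889/36 + 0.5277778*0.4722222/36) = 0.1154626 (here).
The intervals can be constructed because each sample contains enough purchases with and without a lens (13 and 23 for Nikon, 19 and 17 for Canon) for the normal approximation to hold.
90% confidence interval:
ie. (-0.1666667 - (1.644854*0.1154626), -0.1666667 + (1.644854*0.1154626))
ie. (-0.3566, 0.0233)
We are 90% confident that the true difference pn - pc lies between -0.3566 and 0.0233. If the sampling were repeated many times and an interval constructed the same way each time, about 90% of those intervals would contain the true difference in population proportions.
95% confidence interval:
ie. (-0.1666667 - (1.959964*0.1154626), -0.1666667 + (1.959964*0.1154626))
ie. (-0.3930, 0.0596)
We are 95% confident that the true difference pn - pc lies between -0.3930 and 0.0596. If the sampling were repeated many times and an interval constructed the same way each time, about 95% of those intervals would contain the true difference in population proportions.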
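Both intervals can be recomputed from the counts as a check. This is a minimal sketch assuming the usual large-sample (Wald) interval with the unpooled standard error; `diff_ci` is a hypothetical helper name, and the critical values come from `statistics.NormalDist().inv_cdf` in the Python standard library.

```python
from math import sqrt
from statistics import NormalDist


def diff_ci(x1, n1, x2, n2, level):
    """Large-sample (Wald) confidence interval for p1 - p2."""
    p1, p2 = x1 / n1, x2 / n2
    # Unpooled standard error: each sample keeps its own proportion.
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    # Two-sided critical value, e.g. about 1.645 for 90% and 1.960 for 95%.
    z_crit = NormalDist().inv_cdf(1 - (1 - level) / 2)
    diff = p1 - p2
    return diff - z_crit * se, diff + z_crit * se


# Nikon D5: 13 of 36 with lens; Canon: 19 of 36 with lens.
for level in (0.90, 0.95):
    lower, upper = diff_ci(13, 36, 19, 36, level)
    print(f"{level:.0%} CI for pn - pc: ({lower:.4f}, {upper:.4f})")
```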