Now generate at least 5000 bootstrap samples and observe the bootstrap distribution.
a. What does each dot in the distribution represent?
b. Where is the middle of the distribution?
c. What is the standard error for the distribution?
d. Use the standard error to compute a 95% confidence interval for the correlation.
e. Now use the percentile method to compute a 95% confidence interval (remember, click on the ‘Two-Tail’ box in the distribution plot). Are the two 95% confidence intervals very different?
5. Using the confidence interval from part 4 (either one), can we claim that there is an association between car price and depreciation? Hint: An association would mean the population correlation is not zero; we can only claim association if zero is not a possible value for rho.
6. Would your answer to part 5 change if we built a 99% confidence interval? Hint: Does the interval get wider or narrower when we go to 99%? Does this change whether or not 0 is included in the interval?
Activity 3: Are female students more likely to smoke than male students?
Still in StatKey, go to bootstrap CI for a difference in proportions. We are going to use data from a sample of 169 female students and 193 male students. Each participant was asked whether or not she or he smoked. Our task is to build confidence intervals for the difference in proportion of students who smoke when comparing females to males. Select the data set ‘Student Survey: Smoke by Gender?’.
1. What are the sample sizes?
2. What is the value and notation for the sample statistic?
3. Find a 90% confidence interval using at least 5000 bootstrap statistics.
4. Using your answer from question 3:
a. At 90% confidence, can we claim that there is a difference in the proportion of smokers when comparing females to males?
b. What about at 99% confidence?
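1. Each dot represents the sample correlation computed from one bootstrap sample, that is, a random sample taken with replacement from the original sample, of the same size as the original sample.
2. The middle of the bootstrap distribution is its center, which should fall close to the correlation of the original sample.
3. The standard error (SE) of a statistic is the standard deviation of its sampling distribution, or an estimate of that standard deviation. Here it is estimated by the standard deviation of the bootstrap correlations. (For a sample mean, the variance of the sampling distribution equals the population variance divided by the sample size; that formula does not apply to a correlation.)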
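The bootstrap steps in parts a to e can be sketched in Python. This is a minimal illustration, not StatKey's implementation: the `cars` values are made-up (price, depreciation) pairs, and the helper names `pearson_r`, `bootstrap_stats`, and `percentile_interval` are hypothetical. The standard-error interval uses statistic ± 2·SE, a common approximation for 95% confidence.

```python
import math
import random
import statistics


def pearson_r(pairs):
    """Pearson correlation of a list of (x, y) pairs."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def bootstrap_stats(pairs, statistic, n_boot=5000, seed=0):
    """Resample the pairs with replacement (same size) and recompute the statistic."""
    rng = random.Random(seed)
    n = len(pairs)
    return [statistic(rng.choices(pairs, k=n)) for _ in range(n_boot)]


def percentile_interval(stats, level=0.95):
    """Cut an equal share of bootstrap statistics from each tail."""
    ordered = sorted(stats)
    tail = (1 - level) / 2
    lo = ordered[round(tail * len(ordered))]
    hi = ordered[round((1 - tail) * len(ordered)) - 1]
    return lo, hi


# HYPOTHETICAL data for illustration only: (price, depreciation), both in $1000s.
cars = [
    (14.2, 2.1), (18.5, 2.4), (21.0, 3.6), (24.9, 2.9), (27.3, 4.4),
    (30.1, 3.8), (33.8, 5.5), (36.4, 4.1), (39.9, 6.2), (42.5, 5.0),
    (45.0, 7.1), (48.7, 5.9), (52.3, 8.0), (55.6, 6.4), (60.2, 9.3),
]

r = pearson_r(cars)                     # original sample correlation
boot = bootstrap_stats(cars, pearson_r) # each value is one "dot"
se = statistics.stdev(boot)             # bootstrap standard error
se_interval = (r - 2 * se, r + 2 * se)  # standard-error method
pct_interval = percentile_interval(boot, 0.95)  # percentile method

print(f"sample r = {r:.3f}, bootstrap center = {statistics.fmean(boot):.3f}")
print(f"SE = {se:.3f}")
print(f"95% CI (SE method): ({se_interval[0]:.3f}, {se_interval[1]:.3f})")
print(f"95% CI (percentile): ({pct_interval[0]:.3f}, {pct_interval[1]:.3f})")
```

If zero falls outside the interval, the data support an association; if zero is inside, they do not rule out a population correlation of zero.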
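1. Sample sizes: n1 = 169 (females) and n2 = 193 (males).
2. Sample proportion of females who smoke = 16/169 = 0.095
Sample proportion of males who smoke = 27/193 = 0.140
The sample statistic is the difference in sample proportions, p̂1 - p̂2 = 0.095 - 0.140 = -0.045.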
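The arithmetic for the sample statistic can be checked with a few lines of Python. The counts come from the answer above; the variable names are only illustrative.

```python
# Counts from the answer: smokers and group sizes.
female_smokers, n_female = 16, 169
male_smokers, n_male = 27, 193

p_hat_female = female_smokers / n_female
p_hat_male = male_smokers / n_male

# Sample statistic: difference in proportions, females minus males.
sample_difference = p_hat_female - p_hat_male

print(f"p_hat_female = {p_hat_female:.3f}")            # 0.095
print(f"p_hat_male = {p_hat_male:.3f}")                # 0.140
print(f"sample difference = {sample_difference:.3f}")  # -0.045
```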
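3. The 90% confidence interval for p1 - p2 (females minus males), using at least 5000 bootstrap statistics, should be centered near the sample difference 16/169 - 27/193 = -0.045. Bootstrap results vary slightly from run to run. As a check, the standard-error formula SE = sqrt(p̂1(1 - p̂1)/n1 + p̂2(1 - p̂2)/n2) gives SE ≈ 0.034 with these counts, so the 90% interval is approximately -0.045 ± 1.645(0.034), or about (-0.10, 0.01).
4. a. No. Zero lies inside the 90% interval, so at 90% confidence we cannot claim a difference in the proportion of smokers when comparing females to males.
b. No. The 99% confidence interval is wider, approximately -0.045 ± 2.576(0.034), or about (-0.13, 0.04), and it also contains zero.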
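A percentile bootstrap interval for the difference in proportions can be sketched as follows. This is a rough stand-in for what StatKey does, not its actual code: the helper names are hypothetical, the 0/1 lists are rebuilt from the counts 16/169 and 27/193 given above, and the exact endpoints depend on the random seed.

```python
import random


def diff_in_proportions(group1, group2):
    """Proportion of 1s in group1 minus proportion of 1s in group2."""
    return sum(group1) / len(group1) - sum(group2) / len(group2)


def bootstrap_diffs(group1, group2, n_boot=5000, seed=1):
    """Resample each group with replacement, keeping its original size."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        resample1 = rng.choices(group1, k=len(group1))
        resample2 = rng.choices(group2, k=len(group2))
        diffs.append(diff_in_proportions(resample1, resample2))
    return diffs


def percentile_interval(stats, level):
    """Cut an equal share of bootstrap statistics from each tail."""
    ordered = sorted(stats)
    tail = (1 - level) / 2
    lo = ordered[round(tail * len(ordered))]
    hi = ordered[round((1 - tail) * len(ordered)) - 1]
    return lo, hi


# 1 = smoker, 0 = non-smoker, rebuilt from the sample counts.
females = [1] * 16 + [0] * (169 - 16)
males = [1] * 27 + [0] * (193 - 27)

diffs = bootstrap_diffs(females, males)
for level in (0.90, 0.99):
    lo, hi = percentile_interval(diffs, level)
    contains_zero = lo <= 0 <= hi
    print(f"{level:.0%} CI: ({lo:.3f}, {hi:.3f}), contains 0: {contains_zero}")
```

A difference between the groups can be claimed at a given confidence level only when the printed interval excludes zero.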