In: Statistics and Probability
Data set for ages of students in stats class:
17, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 19, 19, 19, 19, 19, 20, 20, 20, 21, 21, 21, 21, 21, 21, 22, 22, 22, 22, 23, 23, 23, 23, 24, 25, 25, 25, 26, 26, 27, 28, 28, 28, 28, 29, 29, 32, 36, 42
1) Find the standard deviation of the ages and use it to calculate a 95% confidence interval for the true mean age (be sure to use the correct degrees of freedom). Do the median and the mode fall into this range?
2) Someone once told me that the average age of PSU students is 26. Does this value fall within the confidence interval? If 26 is the true mean and σ=5, what is the probability of observing a sample mean less than 24 in a sample of this size (n=49)? Given these answers, would you agree that 26 is the true mean?
3) Find the proportion of students who can legally drink alcohol in the State of Oregon (students who are 21 or over). Use this to find a 90% confidence interval for the true proportion of PSU students who can legally drink. If we wanted the margin of error to be less than 0.05, what is the minimum sample size that we would need?
4) Assume that student ages are normally distributed and that the mean and standard deviation of the sample are the true population values. Under this assumption, what is the probability that a randomly chosen student can legally drink, i.e., find P(X>21)? Does this value fall within the confidence interval you found in problem 3?
5) Compute the 5-number summary and create a modified boxplot, using the interquartile range to identify any outliers.
6) Construct a histogram of the data (starting at 16.5, use a class width of 3). Is the data skewed? If so, is it skewed right or left? What factors in the population (such as a natural boundary) may be causing this? How does this affect problems 2 and 4?
7) Suppose we are trying to use this sample to infer something about the ages of all PSU students. Explain why this is not a truly random sample and how it may be biased compared to the PSU student population at large. How does this affect problem 2?
8) Find a 95% confidence interval for the true standard deviation of ages.
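
For anyone who wants to check their hand calculations on problems 1 through 4, here is a minimal Python sketch using only the standard library. It is one possible way to set up the arithmetic, not an official solution. Assumptions that are not in the original post: the t critical value for 48 degrees of freedom is hard-coded from a t table (about 2.0106), the proportion interval uses the usual normal-approximation (Wald) formula, and the sample-size step plugs in the sample proportion rather than the conservative 0.5. All variable names are made up for illustration.

```python
import math
from statistics import NormalDist, mean, median, mode, stdev

# Ages exactly as listed in the data set above (n = 49).
ages = [
    17, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 19, 19, 19, 19, 19,
    20, 20, 20, 21, 21, 21, 21, 21, 21, 22, 22, 22, 22, 23, 23, 23, 23,
    24, 25, 25, 25, 26, 26, 27, 28, 28, 28, 28, 29, 29, 32, 36, 42,
]
n = len(ages)
xbar = mean(ages)
s = stdev(ages)  # sample standard deviation (divides by n - 1)

# Problem 1: 95% t-interval for the mean with n - 1 = 48 degrees of freedom.
# Assumption: critical value read from a t table, not computed here.
T_CRIT_48 = 2.0106
half_width = T_CRIT_48 * s / math.sqrt(n)
mean_ci = (xbar - half_width, xbar + half_width)

# Problem 2: P(sample mean < 24) if the true mean is 26 and sigma = 5.
# The sample mean is then approximately N(26, 5 / sqrt(n)).
p_below_24 = NormalDist(26, 5 / math.sqrt(n)).cdf(24)

# Problem 3: proportion aged 21 or over, 90% normal-approximation interval,
# and the smallest n that keeps the margin of error under 0.05.
p_hat = sum(a >= 21 for a in ages) / n
z90 = NormalDist().inv_cdf(0.95)  # two-sided 90% -> 0.95 quantile
prop_half = z90 * math.sqrt(p_hat * (1 - p_hat) / n)
prop_ci = (p_hat - prop_half, p_hat + prop_half)
n_needed = math.ceil(z90**2 * p_hat * (1 - p_hat) / 0.05**2)

# Problem 4: P(X > 21) treating the sample mean and SD as population values.
p_over_21 = 1 - NormalDist(xbar, s).cdf(21)

print(f"n={n}, mean={xbar:.3f}, s={s:.3f}")
print(f"median={median(ages)}, mode={mode(ages)}")
print(f"95% CI for mean: ({mean_ci[0]:.2f}, {mean_ci[1]:.2f})")
print(f"P(xbar < 24 | mu=26, sigma=5) = {p_below_24:.4f}")
print(f"p_hat={p_hat:.4f}, 90% CI: ({prop_ci[0]:.4f}, {prop_ci[1]:.4f})")
print(f"minimum n for margin < 0.05: {n_needed}")
print(f"P(X > 21) = {p_over_21:.4f}")
```

If SciPy is available, the hard-coded t value could be replaced by a computed quantile, and the chi-square quantiles needed for problem 8 could be obtained the same way; that is left out here to keep the sketch dependency-free.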