Problem 1: Relations among Useful Discrete Probability Distributions. A Bernoulli experiment consists of only one trial with two outcomes (success/failure) with probability of success p. The Bernoulli distribution is
$P(X = k) = p^{k} q^{1-k}, \quad k = 0, 1,$ where $q = 1 - p$.
A sequence of n independent Bernoulli trials with the same success probability p forms a binomial experiment; the number of successes, i.e. the sum of the n Bernoulli variables, follows a binomial distribution with parameters n and p. The binomial distribution provides a simple, easy-to-compute and reasonably accurate approximation to the hypergeometric distribution with parameters N, M and n (population size N, of which M are successes, and sample size n) when n/N is less than or equal to 0.10. In this case, we can approximate the hypergeometric probabilities by a binomial distribution with parameters n and p = M/N. Further, the Poisson distribution with mean μ = np gives an accurate approximation to binomial probabilities when n is large and p is small.
Child-abuse victims and developing cancer: Truth or myth? Is physical abuse in childhood somehow related to the development of cancer later in life? A recent survey revealed that people who were physically abused as children were 49% more likely to develop cancer as adults.
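As a rough illustration of the first approximation, the sketch below compares hypergeometric and binomial probabilities in a case where n/N ≤ 0.10. The values N = 200, M = 40, n = 10 and the helper names `hypergeom_pmf` and `binom_pmf` are assumptions chosen only for illustration; they are not part of the problem.

```python
from math import comb

def hypergeom_pmf(k, N, M, n):
    """P(X = k) when n items are drawn without replacement from N items, M of them successes."""
    return comb(M, k) * comb(N - M, n - k) / comb(N, n)

def binom_pmf(k, n, p):
    """P(X = k) for n independent trials, each with success probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Hypothetical illustration values (not from the problem): n/N = 0.05 <= 0.10
N, M, n = 200, 40, 10
p = M / N

# Largest pointwise gap between the two probability functions
max_gap = max(abs(hypergeom_pmf(k, N, M, n) - binom_pmf(k, n, p))
              for k in range(n + 1))
print("largest absolute difference:", max_gap)
```

Increasing n while keeping N fixed (so that n/N grows past 0.10) should make the gap visibly larger, which is one way to see why the rule of thumb is stated in terms of n/N.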
Assume that in some part of Quebec the probability that a child will develop cancer is 0.00007. Then the number Z of children among 28 572 who will develop cancer follows a binomial distribution with parameters n = 28 572 and p = 0.00007. We would like to use the Poisson distribution to approximate these binomial probabilities.
(a) What value of the variance should the Poisson distribution have in order to approximate the preceding binomial distribution?
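As a hint for (a), recall that the mean and variance of a Poisson distribution coincide, so matching the binomial mean np fixes both. A worked sketch of the arithmetic, where λ is used here as a name for the Poisson parameter (the text above calls the mean μ):

```latex
\lambda = np = 28\,572 \times 0.00007 = 2.00004,
\qquad
\operatorname{Var}_{\text{Poisson}}(Z) = \lambda = 2.00004,
\qquad
\operatorname{Var}_{\text{binomial}}(Z) = np(1-p) \approx 1.9999 .
```

The two variances differ only in the fourth decimal place because 1 - p is very close to 1.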
(b) Find the probabilities of the first 11 possible values of Z (i.e. Z = 0, 1, …, 10) using the binomial formula and then the Poisson approximation. Plot the two histograms and compare them. Is the approximation close enough? Justify!
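One possible way to tabulate the two sets of probabilities for (b) is sketched below using only the Python standard library. The helper names `binom_pmf` and `poisson_pmf` are assumptions introduced for this sketch; the plotting step is left out, but the `rows` list could be passed to any bar-chart routine.

```python
from math import comb, exp, factorial

n, p = 28572, 0.00007   # values given in the problem
lam = n * p             # Poisson mean (and variance)

def binom_pmf(k, n, p):
    """Exact binomial probability P(Z = k)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    """Poisson probability P(Z = k) with mean lam."""
    return exp(-lam) * lam**k / factorial(k)

# One row per value k = 0, 1, ..., 10: (k, binomial, Poisson)
rows = [(k, binom_pmf(k, n, p), poisson_pmf(k, lam)) for k in range(11)]
max_abs_diff = max(abs(b - q) for _, b, q in rows)

print(f"{'k':>2} {'binomial':>12} {'Poisson':>12}")
for k, b, q in rows:
    print(f"{k:>2} {b:12.8f} {q:12.8f}")
print("largest absolute difference:", max_abs_diff)
```

The largest absolute difference gives a single number to quote when justifying whether the approximation is close enough.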
(c) Use the Poisson probabilities to approximate the binomial probabilities that, among 28 572 children:
i. None will develop cancer.
ii. At most two will develop cancer.
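For (c), the two events translate into Z = 0 and Z ≤ 2. A worked sketch under the Poisson approximation, taking λ = np = 28 572 × 0.00007 = 2.00004 (λ is a name introduced here for the Poisson mean):

```latex
P(Z = 0) \approx e^{-\lambda} = e^{-2.00004} \approx 0.1353,
\qquad
P(Z \le 2) \approx e^{-\lambda}\left(1 + \lambda + \frac{\lambda^{2}}{2}\right)
          \approx 0.1353 \times 5.00012 \approx 0.6767 .
```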
(d) Using the Poisson approximation, calculate the probability that at least seven in a sample of ten children will not develop cancer. Is it a good approximation? Justify!
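One way to approach (d) is to count the children who do develop cancer, since that is the rare event: "at least seven of ten do not develop cancer" is the same event as "at most three of ten do". The sketch below, which assumes this reformulation, compares the Poisson value with the exact binomial one; the variable names are illustrative.

```python
from math import comb, exp, factorial

n, p = 10, 0.00007   # sample of ten children, same p as in the problem
lam = n * p          # Poisson mean for the number who develop cancer

# "At least 7 of 10 do not develop cancer" == "at most 3 develop cancer"
poisson_prob = sum(exp(-lam) * lam**k / factorial(k) for k in range(4))
binom_prob = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(4))

print("Poisson approximation:", poisson_prob)
print("exact binomial:       ", binom_prob)
print("absolute difference:  ", abs(poisson_prob - binom_prob))
```

Here n is small, so the usual "n large" condition does not hold; whether the approximation is still acceptable can be judged from the printed difference, which is driven by how small p is.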