Recall that the population average of the heights in the file "pop1.csv" is μ = 170.035. Using simulations we found that the probability of the sample average of the height falling within 1 centimeter of the population average is approximately equal to 0.626. From the simulations we also got that the standard deviation of the sample average is (approximately) equal to 1.122. In the next 3 questions you are asked to apply the Normal approximation to the distribution of the sample average using this information. The answer may be rounded up to 3 decimal places of the actual value:
1- Using the Normal approximation, the probability that the sample average of the heights falls within 1 centimeter of the population average is _________
2- Using the Normal approximation we get that the central region that contains 95% of the distribution of the sample average is of the form 170.035 ± z · 1.122. The value of z is ________
3- Using the Normal approximation, the probability that the sample average of the heights is less than 168 is ______
4- According to the Internal Revenue Service, the average length of time for an individual to complete (record keep, learn, prepare, copy, assemble and send) IRS Form 1040 is 10.53 hours (without any attached schedules). The distribution is unknown. Let us assume that the standard deviation is 2 hours. Suppose we randomly sample 36 taxpayers and compute their average time to complete the forms. Then the probability that the average is more than 11 hours is approximately equal to (The answer may be rounded up to 3 decimal places of the actual value.)
_____________
Suppose that a category of world class runners are known to run a marathon (26 miles) in an expectation of 145 minutes with a standard deviation of 14 minutes. Consider 49 of the races. In the next 3 questions you are asked to apply the Normal approximation to the distribution of the sample average using this information. The answer may be rounded up to 3 decimal places of the actual value:
5- The probability that the runner will average between 142 and 146 minutes in these 49 marathons is ______
6- The 0.80-percentile for the average of these 49 marathons is_____
7- The median of the average running time is_____
8- The time to wait for a particular rural bus is distributed uniformly from 0 to 75 minutes. 100 riders are randomly sampled and their waiting times are measured. The 90th percentile of the average waiting time (in minutes) for a sample of 100 riders is (approximately):
Select one:
a. 315.0
b. 40.3
c. 38.5
d. 65.2
______________
A switching board receives a random number of phone calls. The expected number of calls is 5.3 per minute. Assume that the distribution of the number of calls is Poisson. The average number of calls per minute is recorded by counting the total number of calls received in one hour, divided by 60, the number of minutes in an hour. In the next 4 questions you are asked to apply the Normal approximation to the distribution of the sample average using this information. The answer may be rounded up to 3 decimal places of the actual value:
9- The expectation of the average is____
10- The standard deviation of the average is____
11- The probability that the average is less than 5 _____
12- The probability that number of calls in a random minute is less than 5 is _____ (Note, the question is with respect to a random minute, and not the average.)
______________
It is claimed that the expected length of time some computer part may work before requiring a reboot is 2 months. In order to examine this claim 80 identical parts are set to work. Assume that the distribution of the length of time the part can work (in months) is Exponential. In the next 4 questions you are asked to apply the Normal approximation to the distribution of the average of the 80 parts that are examined. The answer may be rounded up to 3 decimal places of the actual value:
13- The expectation of the average is______
14- The standard deviation of the average is_____
15- The central region that contains 90% of the distribution of the average is of the form E(X) ± c, where E(X) is the expectation of the sample average. The value of c is ______
16- The probability that the average is more than 2.5 months is ______
Solution
Back-up Theory
If a random variable X ~ N(µ, σ²), i.e., X has a Normal Distribution with mean µ and variance σ², then Z = (X - µ)/σ ~ N(0, 1), i.e., the Standard Normal Distribution, and hence
P(X ≤ t) = P[{(X - µ)/σ} ≤ {(t - µ)/σ}] = P[Z ≤ {(t - µ)/σ}], and likewise with ≥ in place of ≤ ... (1)
Probability values for the Standard Normal Variable, Z, can be directly read off from Standard Normal Tables ... (2a)
or can be found using Excel Function: Statistical, NORMSDIST(z), which gives P(Z ≤ z) ... (2b)
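For readers without Excel, the standardization in (1) and the table look-up in (2b) can be sketched in Python. This is only an illustration: the helper name normal_cdf is hypothetical, and the numbers used are illustrative values, not taken from the questions. It relies on NormalDist from the standard library (Python 3.8 or later).

```python
from statistics import NormalDist

STD_NORMAL = NormalDist(mu=0.0, sigma=1.0)  # Z ~ N(0, 1)

def normal_cdf(t, mu, sigma):
    """P(X <= t) for X ~ N(mu, sigma^2), via Z = (X - mu)/sigma as in (1)."""
    z = (t - mu) / sigma
    return STD_NORMAL.cdf(z)  # plays the role of NORMSDIST(z) in (2b)

# Illustrative values only (hypothetical, not from the questions).
p_lower = normal_cdf(1.0, mu=0.0, sigma=1.0)  # P(Z <= 1)
p_upper = 1.0 - p_lower                       # P(Z >= 1)
print(round(p_lower, 4), round(p_upper, 4))   # 0.8413 0.1587
```

The "≥" case is obtained as one minus the "≤" case, which is how the upper-tail probabilities are found from (2b).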
Now, to work out the solution,
Q1, 2 and 3
Let X represent the height in centimeters.
We are given:
Population mean, µ = 170.035; standard deviation of sample average, σXbar = 1.122 ... (3)
Q1
Vide (3),
"within 1 centimeter of the population average" means 170.035 – 1 to 170.035 + 1, i.e., 169.035 to 171.035 ... (4)
Probability that the sample average is within 1 centimeter of the population average is:
P(169.035 < Xbar < 171.035) [vide (4)]
= P[{(169.035 – 170.035)/1.122} < Z < {(171.035 – 170.035)/1.122}] [vide (1) and (3)]
= P(- 0.8913 < Z < 0.8913)
= P(Z < 0.8913) - P(Z < - 0.8913)
= 0.8136 – 0.1864 [vide (2b)]
= 0.6272, i.e., 0.627 to 3 decimal places Answer 1
[Note that this is close to the simulated value of 0.626 given in the first para of the question]
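The Q1 calculation can be cross-checked with a short Python sketch. It assumes only the values in (3), and the variable names are chosen here for illustration.

```python
from statistics import NormalDist

# Sample average modelled as Normal with the mean and standard deviation from (3).
xbar = NormalDist(mu=170.035, sigma=1.122)

# P(169.035 < Xbar < 171.035) as a difference of two cumulative probabilities.
p_within_1cm = xbar.cdf(171.035) - xbar.cdf(169.035)
print(round(p_within_1cm, 3))
```

Working with the unstandardized distribution avoids rounding the z-value by hand before the look-up.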
Q2
The central region that contains 95% of the distribution of the sample average is of the form: µ ± z0.025 · σXbar, where
z0.025 is the upper 2.5% point of N(0, 1) = 1.96 [vide (2b)].
Thus, the required z-value = 1.96 Answer 2
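The z-value is the 97.5th percentile of N(0, 1), since 2.5% is left in each tail. A minimal sketch using the inverse cumulative function (the variable names are illustrative):

```python
from statistics import NormalDist

# Central 95% leaves 2.5% in each tail, so z is the 0.975 quantile of N(0, 1).
z = NormalDist().inv_cdf(0.975)

# Resulting central region for the sample average, using the values in (3).
lower = 170.035 - z * 1.122
upper = 170.035 + z * 1.122
print(round(z, 3), round(lower, 3), round(upper, 3))
```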
Q3
The probability that sample average of the heights is less than 168 is
P(Xbar < 168)
= P[Z < {(168 – 170.035)/1.122}] [vide (1) and (3)]
= P(Z < - 1.8137)
= 0.0349 [vide (2b)] Answer 3
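As a check on Q3, the same lower-tail probability can be computed directly. This sketch assumes only the values in (3).

```python
from statistics import NormalDist

xbar = NormalDist(mu=170.035, sigma=1.122)  # values from (3)

z = (168 - 170.035) / 1.122      # standardized cut-off, as in (1)
p_below_168 = xbar.cdf(168)      # P(Xbar < 168)
print(round(z, 4), round(p_below_168, 4))
```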
Q4
Back-up theory
CENTRAL LIMIT THEOREM
Let {X1, X2, …, Xn} be a sequence of n independent and identically distributed (i.i.d.) random variables drawn from a distribution [i.e., {x1, x2, …, xn} is a random sample of size n] with expected value µ and finite variance σ². Then, as n gets larger, the distribution of Z = {√n(Xbar − µ)/σ} approaches the Normal distribution with mean 0 and variance 1 (i.e., the Standard Normal Distribution).
Or symbolically, Z = {√n(Xbar − µ)/σ} ~ N(0, 1), approximately ... (5a)
i.e., the sample average from any distribution with mean µ and variance σ² approximately follows a Normal Distribution with
mean µ and variance σ²/n, if the sample size n is large enough, say 30 or more ... (5b)
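The statement in (5b) can be illustrated by simulation. The sketch below is only an illustration under stated assumptions: the parent distribution (Exponential with mean 2), the sample size 36, the 5000 replicates and the seed are all choices made here, not values from the questions.

```python
import random
from math import sqrt
from statistics import mean, stdev

rng = random.Random(0)         # fixed seed so the run is repeatable
mu, sigma, n = 2.0, 2.0, 36    # an Exponential with mean 2 has standard deviation 2
replicates = 5000              # number of simulated samples (assumed)

# Each replicate: draw n i.i.d. values and record their average.
sample_means = [
    sum(rng.expovariate(1.0 / mu) for _ in range(n)) / n
    for _ in range(replicates)
]

observed_mean = mean(sample_means)
observed_sd = stdev(sample_means)
predicted_sd = sigma / sqrt(n)  # square root of the variance in (5b)

print(f"mean of averages: {observed_mean:.3f} (theory: {mu:.3f})")
print(f"sd of averages:   {observed_sd:.3f} (theory: {predicted_sd:.3f})")
```

Even though the parent distribution is skewed, the averages should centre near µ with a spread near σ/√n, which is what the Normal approximation uses.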
Now, to work out the solution,
Let Y = length of time (in hours) for an individual to complete IRS Form 1040.
Given µ = 10.53, σ = 2 and sample size n = 36, vide (5b),
Ybar ~ N(10.53, 2²/36), i.e., the standard deviation of Ybar is 2/√36 = 0.3333 ... (6)
The probability that the average is more than 11 hours is
P(Ybar > 11)
= P[Z > {(11 – 10.53)/0.3333}] [vide (1) and (6)]
= P(Z > 1.41)
= 0.0793 [vide (2b)] Answer 4
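A short Python sketch of the Q4 calculation follows, taking the standard deviation of the sample average to be σ/√n as stated in (5b). The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 10.53, 2.0, 36   # values given in the question
se = sigma / sqrt(n)            # standard deviation of the sample average, per (5b)

ybar = NormalDist(mu=mu, sigma=se)
p_more_than_11 = 1.0 - ybar.cdf(11)   # P(Ybar > 11)
print(round(se, 4), round(p_more_than_11, 3))
```

Note that NormalDist takes the standard deviation, not the variance, so se (and not se squared) is passed as sigma.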