Data on 72 randomly selected flights departing from the three major NYC airports
dep_delay |
-4 |
-3 |
58 |
-5 |
-5 |
-4 |
-1 |
-1 |
-1 |
-3 |
-5 |
-7 |
-5 |
-4 |
-5 |
-8 |
-2 |
4 |
-1 |
0 |
11 |
-5 |
37 |
22 |
65 |
6 |
-1 |
19 |
16 |
-5 |
178 |
-3 |
-5 |
4 |
-1 |
4 |
15 |
-3 |
-7 |
-6 |
-7 |
-3 |
-5 |
51 |
-4 |
-6 |
-1 |
-7 |
-11 |
2 |
1 |
102 |
-7 |
36 |
11 |
1 |
-6 |
-7 |
-5 |
-3 |
9 |
115 |
58 |
-2 |
-6 |
8 |
-4 |
-7 |
2 |
-5 |
303 |
18 |
(a) Calculate the 90% confidence interval estimate of the average departure delay times for flights that departed from the NYC airports in 2013. Remember to define the variable of interest and state the distribution. Show all working.
(b) Assume that we want to be more confident that the true average is contained in the interval. Hence, we increase the confidence level to 95%. Without any calculation, explain what would happen to the confidence interval width and estimation precision if the confidence level is increased while all other factors remain the same.
The data are the departure delays of the 72 flights listed in the question.
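We are to determine the 90% confidence interval for the population mean departure delay. Since the sample is large, this is done by using the formula:
$$\bar{x} \pm z_{\alpha/2} \frac{s}{\sqrt{n}}$$
Here, n is the number of observations in the sample, so n = 72.
$\bar{x}$ is the sample mean, which estimates the population mean $\mu$, and is calculated using the formula:
$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$
Using the data set, we get:
$$\bar{x} = \frac{960}{72} = 13.3333$$
Next, the sample standard deviation s, which estimates $\sigma$, is given as:
$$s = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2}$$
Using the data set, we get:
$$s = \sqrt{\frac{153768}{71}} = 46.5376$$
For a confidence level of 90%, $\alpha = 0.10$ and $z_{\alpha/2} = z_{0.05} = 1.645$. (Strictly, because $\sigma$ is estimated by s, the t distribution with 71 degrees of freedom applies; its critical value of about 1.667 gives a slightly wider interval.)
Substituting the values in the above formula, and noting that the denominator is $\sqrt{n}$ and not n, we get:
$$13.3333 \pm 1.645 \times \frac{46.5376}{\sqrt{72}} = 13.3333 \pm 1.645 \times 5.4845 = 13.3333 \pm 9.0220$$
Hence, the 90% CI is (4.3113, 22.3553).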
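The arithmetic for part (a) can be checked with a short script. This is a minimal sketch using only the Python standard library; the variable names (`dep_delay`, `std_err`, `ci_90`) are illustrative choices, and it assumes the large-sample z interval. The key step is the standard error, which divides s by the square root of n rather than by n.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# Departure delays copied from the question (72 values).
dep_delay = [
    -4, -3, 58, -5, -5, -4, -1, -1, -1, -3, -5, -7,
    -5, -4, -5, -8, -2, 4, -1, 0, 11, -5, 37, 22,
    65, 6, -1, 19, 16, -5, 178, -3, -5, 4, -1, 4,
    15, -3, -7, -6, -7, -3, -5, 51, -4, -6, -1, -7,
    -11, 2, 1, 102, -7, 36, 11, 1, -6, -7, -5, -3,
    9, 115, 58, -2, -6, 8, -4, -7, 2, -5, 303, 18,
]

n = len(dep_delay)
x_bar = mean(dep_delay)    # sample mean
s = stdev(dep_delay)       # sample standard deviation (n - 1 divisor)
std_err = s / sqrt(n)      # standard error of the mean: s / sqrt(n), not s / n

# Two-sided 90% interval: 5% in each tail, so take the 95th percentile of N(0, 1).
z = NormalDist().inv_cdf(0.95)
margin = z * std_err
ci_90 = (x_bar - margin, x_bar + margin)

print(f"n = {n}, mean = {x_bar:.4f}, s = {s:.4f}, SE = {std_err:.4f}")
print(f"z = {z:.4f}, margin = {margin:.4f}")
print(f"90% CI: ({ci_90[0]:.4f}, {ci_90[1]:.4f})")
```

Because `inv_cdf` returns the unrounded critical value rather than 1.645, the printed endpoints can differ from a hand calculation in the third decimal place.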
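The variable of interest is X, the departure delay of a flight in minutes, where negative values are early departures. Plotting the data using a histogram shows that it is skewed to the right, not the left: most flights leave a few minutes early, while a handful of very long delays (up to 303 minutes) pull the sample mean (13.33) well above the median (-2). Because n = 72 is large, the central limit theorem lets us treat the sample mean as approximately normally distributed despite this skew.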
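The direction of the skew can be checked numerically without drawing the histogram. The sketch below compares the mean with the median and computes a moment-based skewness coefficient; the name `skew` and the choice of this particular coefficient (third central moment divided by the cubed population standard deviation) are assumptions made for illustration.

```python
from statistics import mean, median, pstdev

# Departure delays copied from the question (72 values).
dep_delay = [
    -4, -3, 58, -5, -5, -4, -1, -1, -1, -3, -5, -7,
    -5, -4, -5, -8, -2, 4, -1, 0, 11, -5, 37, 22,
    65, 6, -1, 19, 16, -5, 178, -3, -5, 4, -1, 4,
    15, -3, -7, -6, -7, -3, -5, 51, -4, -6, -1, -7,
    -11, 2, 1, 102, -7, 36, 11, 1, -6, -7, -5, -3,
    9, 115, 58, -2, -6, 8, -4, -7, 2, -5, 303, 18,
]

x_bar = mean(dep_delay)
med = median(dep_delay)

# Moment coefficient of skewness: third central moment over sd cubed.
# A positive value indicates a long right tail.
third_moment = mean([(x - x_bar) ** 3 for x in dep_delay])
skew = third_moment / pstdev(dep_delay) ** 3

early = sum(1 for x in dep_delay if x < 0)  # flights that left early

print(f"mean = {x_bar:.2f}, median = {med}")
print(f"skewness = {skew:.2f}")
print(f"{early} of {len(dep_delay)} flights departed early")
```

A mean above the median together with a positive skewness coefficient is the usual signature of right skew.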
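When the confidence level is increased from 90% to 95% with the sample size and standard deviation unchanged, the critical value increases, so the margin of error and therefore the width of the interval increase. Precision refers to how narrow the interval is, so a wider interval is a less precise estimate: the extra confidence is gained at the cost of precision.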
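Although part (b) asks for an explanation without calculation, the effect is easy to illustrate. The sketch below defines a hypothetical helper, `z_interval`, which assumes the large-sample z interval, and compares the widths at the two confidence levels.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# Departure delays copied from the question (72 values).
dep_delay = [
    -4, -3, 58, -5, -5, -4, -1, -1, -1, -3, -5, -7,
    -5, -4, -5, -8, -2, 4, -1, 0, 11, -5, 37, 22,
    65, 6, -1, 19, 16, -5, 178, -3, -5, 4, -1, 4,
    15, -3, -7, -6, -7, -3, -5, 51, -4, -6, -1, -7,
    -11, 2, 1, 102, -7, 36, 11, 1, -6, -7, -5, -3,
    9, 115, 58, -2, -6, 8, -4, -7, 2, -5, 303, 18,
]


def z_interval(data, level):
    """Return (lower, upper) of a large-sample z interval for the mean."""
    std_err = stdev(data) / sqrt(len(data))
    # Split the excluded probability equally between the two tails.
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    centre = mean(data)
    return centre - z * std_err, centre + z * std_err


widths = {}
for level in (0.90, 0.95):
    lower, upper = z_interval(dep_delay, level)
    widths[level] = upper - lower
    print(f"{level:.0%} CI: ({lower:.2f}, {upper:.2f}), width = {widths[level]:.2f}")

ratio = widths[0.95] / widths[0.90]
print(f"width ratio (95% / 90%) = {ratio:.3f}")
```

Only the critical value changes between the two runs, so the ratio of the widths equals the ratio of the critical values (about 1.96 to 1.645).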