A regional airline transfers passengers from small airports to a larger regional hub airport. The airline's data analyst was assigned to estimate the revenue (in thousands of dollars) generated by each of the 22 small airports based on two variables: the distance from each airport to the hub (in miles) and the population (in hundreds) of the city in which each airport is located. The data are given in the following table.
Airport Revenue($1000s) Distance(miles) Population(100s)
1 233 233 56
2 272 209 74
3 253 206 74
4 296 232 78
5 268 125 73
6 296 245 54
7 276 213 100
8 235 134 98
9 253 140 95
10 233 165 81
11 240 234 52
12 267 205 96
13 338 214 96
14 243 183 73
15 252 230 55
16 269 238 91
17 242 144 64
18 233 220 60
19 234 170 60
20 450 170 240
21 340 290 70
22 200 340 75
From your scatterplots in part (a), is there a data point that you think may be an “issue” for the analysis?
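Yes, there are some outliers in the data. The clearest is airport 20, whose population (240) and revenue (450) are both far above those of every other airport; airport 22 also stands apart, with the largest distance (340) and the lowest revenue (200). Such points would normally be candidates for removal, but here the model is fitted without removing them.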
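As a rough numeric complement to the scatterplots, the sketch below standardizes each variable and flags airports whose value lies more than two sample standard deviations from the mean. The cutoff of 2 and the names `DATA`, `z_scores`, and `flag_unusual` are illustrative assumptions, not part of the original answer, and a z-score screen is only one of several ways to look for unusual points.

```python
from statistics import mean, stdev

# (airport, revenue in $1000s, distance in miles, population in 100s),
# copied from the table above.
DATA = [
    (1, 233, 233, 56), (2, 272, 209, 74), (3, 253, 206, 74),
    (4, 296, 232, 78), (5, 268, 125, 73), (6, 296, 245, 54),
    (7, 276, 213, 100), (8, 235, 134, 98), (9, 253, 140, 95),
    (10, 233, 165, 81), (11, 240, 234, 52), (12, 267, 205, 96),
    (13, 338, 214, 96), (14, 243, 183, 73), (15, 252, 230, 55),
    (16, 269, 238, 91), (17, 242, 144, 64), (18, 233, 220, 60),
    (19, 234, 170, 60), (20, 450, 170, 240), (21, 340, 290, 70),
    (22, 200, 340, 75),
]


def z_scores(values):
    """Standardize values using the sample mean and sample standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def flag_unusual(data, threshold=2.0):
    """Map airport -> list of (variable, z) where |z| exceeds the threshold.

    The threshold is an assumed rule of thumb, not a formal test.
    """
    names = ("revenue", "distance", "population")
    flagged = {}
    for col, name in enumerate(names, start=1):
        zs = z_scores([row[col] for row in data])
        for row, z in zip(data, zs):
            if abs(z) > threshold:
                flagged.setdefault(row[0], []).append((name, round(z, 2)))
    return flagged


if __name__ == "__main__":
    for airport, hits in flag_unusual(DATA).items():
        print(f"airport {airport}: {hits}")
```

A screen like this only looks at one variable at a time, so it should be read alongside the scatterplots rather than in place of them.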
The 95% confidence interval of beta_1 is given by
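A minimal sketch of how such an interval can be computed from the table, with all 22 airports kept in. It assumes the model revenue = beta_0 + beta_1 * distance + beta_2 * population + error, so that beta_1 is the distance coefficient; the source does not state which predictor beta_1 refers to. The interval for each coefficient is estimate ± t * standard error with n - 3 = 19 degrees of freedom, and the critical value 2.093 is taken from a standard t table. The helper names `invert` and `ols_with_ci` are hypothetical.

```python
from math import sqrt

# (revenue in $1000s, distance in miles, population in 100s), from the table.
DATA = [
    (233, 233, 56), (272, 209, 74), (253, 206, 74), (296, 232, 78),
    (268, 125, 73), (296, 245, 54), (276, 213, 100), (235, 134, 98),
    (253, 140, 95), (233, 165, 81), (240, 234, 52), (267, 205, 96),
    (338, 214, 96), (243, 183, 73), (252, 230, 55), (269, 238, 91),
    (242, 144, 64), (233, 220, 60), (234, 170, 60), (450, 170, 240),
    (340, 290, 70), (200, 340, 75),
]

# Two-sided 95% critical value of the t distribution with 22 - 3 = 19 df.
T_CRIT_95_DF19 = 2.093


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [
        [float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [v / pv for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def ols_with_ci(rows, t_crit):
    """Return [(estimate, lower, upper)] for the intercept, distance, population."""
    X = [[1.0, d, p] for _, d, p in rows]
    y = [r for r, _, _ in rows]
    n, k = len(X), 3
    xtx = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(k)]
           for a in range(k)]
    xty = [sum(X[i][a] * y[i] for i in range(n)) for a in range(k)]
    inv = invert(xtx)
    beta = [sum(inv[a][b] * xty[b] for b in range(k)) for a in range(k)]
    resid = [y[i] - sum(beta[a] * X[i][a] for a in range(k)) for i in range(n)]
    s2 = sum(e * e for e in resid) / (n - k)      # residual variance estimate
    se = [sqrt(s2 * inv[a][a]) for a in range(k)]  # coefficient standard errors
    return [(b, b - t_crit * s, b + t_crit * s) for b, s in zip(beta, se)]


if __name__ == "__main__":
    labels = ("beta_0 (intercept)", "beta_1 (distance)", "beta_2 (population)")
    for label, (est, lo, hi) in zip(labels, ols_with_ci(DATA, T_CRIT_95_DF19)):
        print(f"{label}: {est:.4f}, 95% CI ({lo:.4f}, {hi:.4f})")
```

Because the outliers are retained, the resulting intervals may be strongly influenced by airports 20 and 22; refitting with those rows dropped (and the matching t critical value for the smaller sample) is one way to check how sensitive the estimates are.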