In what follows use any of the following tests/procedures: Regression, multiple regression, confidence intervals, one sided T-test or two sided T-test. All the procedures should be done with 5% P-value or 95% confidence interval. Some answers are approximated; choose the most appropriate answer.

Open Pollution data. SETUP: Since wind clears the air, some people believe that the cities with wind speed above 10 have less SO2 than the cities with wind speed below 10. Given the data, your job is to decide if this is a reasonable expectation.
I. What test/procedure did you perform? (6.66 points)
II. Statistical interpretation? (6.66 points)
III. Conclusion? (6.66 points)
link to data set: https://www.limes.one/Content/DataFiles/Cars04.txt
CITY                  SO2  MANUF  POP   TEMP  WIND  PRECIP-INCHES  PRECIP-#DAYS
Phoenix               11   213    582   70.3  6     7.05           36
Little Rock           15   91     132   61    8.2   48.52          100
San Francisco         16   453    716   56.7  8.7   20.66          67
Denver                24   454    515   51.9  9     12.95          86
Hartford              82   412    158   49.1  9     43.37          127
Wilmington            43   80     80    54    9     40.25          114
Washington            30   434    757   57.3  9.3   38.89          111
Jacksonville          18   136    529   68.4  8.8   54.47          116
Miami                 14   207    335   75.5  9     59.8           128
Atlanta               32   368    497   61.5  9.1   48.34          115
Chicago               131  3344   3369  50.6  10.4  34.44          122
Indianapolis          40   361    746   52.3  9.7   38.74          121
Des Moines            20   104    201   49    11.2  30.85          103
Wichita               10   125    277   56.6  12.7  30.58          82
Louisville            35   291    593   55.6  8.3   43.11          123
New Orleans           9    204    361   68.3  8.4   56.77          113
Baltimore             47   625    905   55    9.6   41.31          111
Detroit               46   1064   1513  49.9  10.1  30.96          129
Minneapolis-St. Paul  42   699    744   43.5  10.6  25.94          137
Kansas City           18   381    507   54.5  10    37             99
St. Louis             61   775    622   55.9  9.5   35.89          105
Omaha                 17   181    347   51.5  10.9  30.18          98
Albuquerque           15   46     244   56.8  8.9   7.77           58
Albany                56   44     116   47.6  8.8   33.36          135
Buffalo               11   391    463   47.1  12.4  36.11          166
Cincinnati            27   462    453   54    7.1   39.04          132
Cleveland             80   1007   751   49.7  10.9  34.99          155
Columbus              27   266    540   51.5  8.6   37.01          134
Philadelphia          79   1692   1950  54.6  9.6   39.93          115
Pittsburgh            63   347    520   50.4  9.4   36.22          147
Providence            136  343    179   50    10.6  42.75          125
Memphis               10   337    624   61.6  9.2   49.1           105
Nashville             23   275    448   59.4  7.9   46             119
Dallas                11   641    844   66.2  10.9  35.94          78
Houston               10   721    1233  68.9  10.8  48.19          103
Salt Lake City        28   137    176   51    8.7   15.17          89
Norfolk               38   96     308   59.3  10.6  44.68          116
Richmond              38   197    299   57.8  7.6   42.59          115
Seattle               40   379    531   51.1  9.4   38.79          164
Charleston            40   35     71    55.2  6.5   40.75          148
Milwaukee             20   569    717   45.7  11.8  29.07          123
Let's divide the data into the cities with wind speed < 10 (group 1) and those with wind speed >= 10 (group 2).
Let $\bar{x}_1$ be the mean SO2 level in the cities with wind speed < 10, and $\bar{x}_2$ the mean in those with wind speed >= 10.
Let $s_1$ and $s_2$ be the respective sample standard deviations of the SO2 levels.
Hence, using the data given, we get: $\bar{x}_1 = 34.19$, $\bar{x}_2 = 42.14$, $s_1 = 20.16$, $s_2 = 43.3$
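The group summaries can be reproduced with a short script. This is only a sketch: the (SO2, WIND) pairs are transcribed by hand from the data listing in the question, and the variable names are illustrative.

```python
from statistics import mean, stdev

# (SO2, WIND) pairs transcribed from the data listing, in row order.
data = [
    (11, 6), (15, 8.2), (16, 8.7), (24, 9), (82, 9), (43, 9),
    (30, 9.3), (18, 8.8), (14, 9), (32, 9.1), (131, 10.4), (40, 9.7),
    (20, 11.2), (10, 12.7), (35, 8.3), (9, 8.4), (47, 9.6), (46, 10.1),
    (42, 10.6), (18, 10), (61, 9.5), (17, 10.9), (15, 8.9), (56, 8.8),
    (11, 12.4), (27, 7.1), (80, 10.9), (27, 8.6), (79, 9.6), (63, 9.4),
    (136, 10.6), (10, 9.2), (23, 7.9), (11, 10.9), (10, 10.8), (28, 8.7),
    (38, 10.6), (38, 7.6), (40, 9.4), (40, 6.5), (20, 11.8),
]

# Group 1: wind speed < 10. Group 2: wind speed >= 10.
low_wind = [so2 for so2, wind in data if wind < 10]
high_wind = [so2 for so2, wind in data if wind >= 10]

n1, n2 = len(low_wind), len(high_wind)
xbar1, xbar2 = mean(low_wind), mean(high_wind)
s1, s2 = stdev(low_wind), stdev(high_wind)  # sample (n - 1) standard deviations

print(f"wind <  10: n = {n1}, mean = {xbar1:.2f}, sd = {s1:.2f}")
print(f"wind >= 10: n = {n2}, mean = {xbar2:.2f}, sd = {s2:.2f}")
```

Note that Kansas City, with a wind speed of exactly 10, falls into the second group under this split.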
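1) The hypotheses are as follows, where $\mu_1$ and $\mu_2$ are the true mean SO2 levels for cities with wind speed < 10 and wind speed >= 10, respectively:
H0: $\mu_1 - \mu_2 = 0$
H1: $\mu_1 - \mu_2 > 0$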
Hence, this is a one-sided, two-sample t-test (since the population variances are not known, we cannot carry out a Z-test).
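2) The t-statistic is
$t = (\bar{x}_1 - \bar{x}_2) / \sqrt{s_1^2/n_1 + s_2^2/n_2}$
where $\bar{x}_i$, $s_i$, and $n_i$ are the sample mean, standard deviation, and size of group $i$ (wind speed < 10: $n_1 = 27$; wind speed >= 10: $n_2 = 14$). Substituting the values gives
$t = (34.19 - 42.14) / \sqrt{20.16^2/27 + 43.3^2/14} = -0.651$
The lower-tail area is P(T < -0.651) = 0.2594. Because H1 is upper-tailed, the p-value is P(T > -0.651) = 1 - 0.2594 = 0.7406.
Since the p-value is well above 0.05, we cannot reject the null hypothesis.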
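As a rough numerical check of step 2, the sketch below recomputes the t-statistic from the summary values and obtains the tail areas by integrating the Student's t density with Simpson's rule. The helper names `t_pdf` and `t_cdf` are made up for this example, and the choice of n1 + n2 - 2 = 39 degrees of freedom is an assumption (it reproduces the 0.2594 tail area quoted above); a Welch-style approximation would use fewer degrees of freedom and lead to the same decision.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x), via Simpson's rule on [0, |x|] and symmetry about 0."""
    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # P(0 <= T <= |x|)
    return 0.5 + area if x >= 0 else 0.5 - area

# Summary statistics from the answer (group 1: wind < 10, group 2: wind >= 10).
xbar1, s1, n1 = 34.19, 20.16, 27
xbar2, s2, n2 = 42.14, 43.3, 14

se = math.sqrt(s1**2 / n1 + s2**2 / n2)  # unpooled standard error
t = (xbar1 - xbar2) / se

df = n1 + n2 - 2  # assumption: 39 degrees of freedom
lower_tail = t_cdf(t, df)    # P(T < t)
upper_tail = 1 - lower_tail  # P(T > t), the p-value for H1: mu1 - mu2 > 0

print(f"t = {t:.3f}")
print(f"P(T < t) = {lower_tail:.4f}")
print(f"P(T > t) = {upper_tail:.4f}")
print("reject H0" if upper_tail < 0.05 else "fail to reject H0")
```

Because the sample difference points in the opposite direction from H1, the upper-tail p-value is necessarily above 0.5, so the decision does not depend on the exact degrees of freedom.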
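3) Conclusion: We cannot conclude that this is a reasonable expectation. The data give no evidence that cities with wind speed of 10 or more have lower SO2; in this sample their mean SO2 (42.14) is actually higher than that of the less windy cities (34.19).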