Using the data below, we would like to understand why some cities have a higher proportion of creative-class workers than others. Under one theory, the proportion is explained by the city's income alone; under another theory, income together with population and cost of living explains the proportion.
(a) Estimate a linear regression model for each theory and report your results.
(b) Use an F test to choose one theory over the other.
(c) State the null hypothesis, the test statistic, and the specific distribution used (including the degrees of freedom) for your test in (b).
| Metro Area | Population | Income | Cost-of-Living Index | Creative Class (%) |
|---|---|---|---|---|
| New Orleans-Metairie-Kenner, LA | 1,024.68 | 46.459 | 99 | 29.6 |
| Rochester, NY | 1,035.44 | 47.749 | 102 | 33.1 |
| Salt Lake City, UT | 1,067.19 | 53.587 | 101 | 30.4 |
| Birmingham-Hoover, AL | 1,089.88 | 44.534 | 93 | 30.6 |
| Buffalo-Niagara Falls, NY | 1,137.52 | 42.831 | 127 | 29.7 |
| Oklahoma City, OK | 1,173.63 | 42.036 | 92 | 31.6 |
| Hartford-West Hartford-East Hartford, CT | 1,188.84 | 61.753 | 116 | 37.5 |
| Richmond, VA | 1,196.41 | 53.416 | 106 | 31.1 |
| Louisville-Jefferson County, KY-IN | 1,220.64 | 45.115 | 98 | 27.2 |
| Memphis, TN-MS-AR | 1,268.33 | 42.092 | 97 | 26.7 |
| Jacksonville, FL | 1,276.86 | 49.736 | 96 | 27.4 |
| Nashville-Davidson--Murfreesboro, TN | 1,455.30 | 47.699 | 92 | 29.8 |
| Austin-Round Rock, TX | 1,506.43 | 52.882 | 93 | 36.5 |
| Milwaukee-Waukesha-West Allis, WI | 1,509.98 | 50.27 | 100 | 30.3 |
| Charlotte-Gastonia-Concord, NC-SC | 1,582.63 | 50.367 | 92 | 30.9 |
| Providence-New Bedford-Fall River, RI-MA | 1,612.99 | 51.797 | 126 | 30.1 |
| Virginia Beach-Norfolk-Newport News, VA-NC | 1,647.40 | 52.976 | 105 | 29.3 |
| Indianapolis-Carmel, IN | 1,669.37 | 50.841 | 98 | 29.4 |
| Columbus, OH | 1,725.57 | 49.92 | 102 | 30.9 |
| Las Vegas-Paradise, NV | 1,777.54 | 53.536 | 109 | 20.6 |
| San Jose-Sunnyvale-Santa Clara, CA | 1,784.83 | 80.638 | 154 | 44.4 |
| San Antonio, TX | 1,948.44 | 45.019 | 92 | 29.7 |
| Kansas City, MO-KS | 1,966.79 | 52.359 | 94 | 32.1 |
| Orlando-Kissimmee, FL | 1,984.86 | 48.934 | 107 | 27.3 |
| Sacramento--Arden-Arcade--Roseville, CA | 2,067.12 | 56.953 | 122 | 34 |
| Cincinnati-Middletown, OH-KY-IN | 2,105.01 | 50.306 | 93 | 30.3 |
| Cleveland-Elyria-Mentor, OH | 2,114.16 | 45.925 | 101 | 30.4 |
| Portland-Vancouver-Beaverton, OR-WA | 2,137.60 | 52.48 | 109 | 30.1 |
| Pittsburgh, PA | 2,370.78 | 43.26 | 95 | 30.1 |
| Denver-Aurora, CO | 2,408.62 | 54.994 | 102 | 34.5 |
| Baltimore-Towson, MD | 2,658.41 | 61.01 | 121 | 33.9 |
| Tampa-St. Petersburg-Clearwater, FL | 2,697.73 | 43.742 | 101 | 30.2 |
| St. Louis, MO-IL | 2,793.99 | 49.765 | 100 | 30.1 |
| San Diego-Carlsbad-San Marcos, CA | 2,941.45 | 59.591 | 131 | 32.8 |
| Minneapolis-St. Paul-Bloomington, MN-WI | 3,175.04 | 62.223 | 99 | 35.2 |
| Seattle-Tacoma-Bellevue, WA | 3,263.50 | 60.663 | 108 | 35 |
| Riverside-San Bernardino-Ontario, CA | 4,026.14 | 53.243 | 121 | 24.2 |
| Phoenix-Mesa-Scottsdale, AZ | 4,039.18 | 51.862 | 102 | 28.3 |
| San Francisco-Oakland-Fremont, CA | 4,180.03 | 70.463 | 157 | 38.8 |
| Boston-Cambridge-Quincy, MA-NH | 4,455.22 | 64.144 | 139 | 41.6 |
| Detroit-Warren-Livonia, MI | 4,468.97 | 52.004 | 105 | 30.6 |
| Atlanta-Sandy Springs-Marietta, GA | 5,134.87 | 55.552 | 98 | 33 |
| Washington-Arlington-Alexandria, DC-VA-MD-WV | 5,288.67 | 78.978 | 133 | 43.7 |
| Miami-Fort Lauderdale-Miami Beach, FL | 5,463.86 | 46.637 | 115 | 25.6 |
| Houston-Sugar Land-Baytown, TX | 5,542.05 | 50.25 | 88 | 31.3 |
| Philadelphia-Camden-Wilmington, PA-NJ-DE-MD | 5,826.74 | 55.593 | 116 | 34.9 |
| Dallas-Fort Worth-Arlington, TX | 6,006.09 | 52.001 | 92 | 33 |
| Chicago-Naperville-Joliet, IL-IN-WI | 9,506.86 | 57.008 | 106 | 32.3 |
| Los Angeles-Long Beach-Santa Ana, CA | 12,950.13 | 55.516 | 131 | 32.9 |
| New York-Northern New Jersey-Long Island, NY-NJ-PA | 18,818.54 | 59.281 | 148 | 35.6 |
Model 1
Proportion = f(Income)

(a) Regression equation: Creative Class (%) = 9.61129 + 0.4165 Income
(b) Overall F statistic = 72.48856 on (1, 48) degrees of freedom (n = 50 cities, one predictor). Hence, the model is significant.
(c) H0: the coefficient on Income is equal to 0
Ha: the coefficient on Income is not equal to 0
Test statistic: F = 72.48856, p-value < 0.001
Hence, reject H0 in favor of Ha: income is a significant predictor of the creative-class share.
Model 2
Proportion = f(Population, Income, Cost of Living)

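Regression equation: Creative Class (%) = 10.12808 + 0.0000381 Population + 0.4326 Income - 0.38136 Cost of Living
As printed, this equation gives negative fitted values (about -7.5 for New Orleans, against an observed 29.6), so the cost-of-living coefficient should be rechecked against the regression output.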
The overall F test (H0: all three slope coefficients are 0, on (3, 46) degrees of freedom) shows that the model is significant.
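Comparison:
Model 1 is nested in Model 2, so the F test for (b) compares the two models directly.
H0: the coefficients on Population and Cost of Living are both 0 (Model 1 is adequate)
Ha: at least one of these two coefficients is not 0
Test statistic: F = [(SSE1 - SSE2) / 2] / [SSE2 / 46], where SSE1 and SSE2 are the residual sums of squares of Model 1 and Model 2
Distribution under H0: F with (2, 46) degrees of freedom (2 restrictions; n - 4 = 50 - 4 = 46), with a 5% critical value of about 3.20
The adjusted R-squared of Model 1 is higher than that of Model 2, which happens exactly when this F statistic is below 1. An F below 1 is well under the critical value, so we fail to reject H0 and choose Model 1, the income-only theory.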
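
Part (b) asks for an F test between two nested models, so it may help to see that calculation written out. The following is a minimal sketch, not the method used to produce the numbers above: it refits both models from the table by ordinary least squares using only the Python standard library, then forms the nested-model F statistic and both adjusted R-squared values. The names `DATA`, `solve`, and `ols` are illustrative helpers introduced here, and the 5% critical value of 3.20 for F(2, 46) is an approximate table value.

```python
# Columns: population, income, cost-of-living index, creative class (%),
# transcribed from the table in the question (metro names omitted).
DATA = [
    (1024.68, 46.459, 99, 29.6), (1035.44, 47.749, 102, 33.1),
    (1067.19, 53.587, 101, 30.4), (1089.88, 44.534, 93, 30.6),
    (1137.52, 42.831, 127, 29.7), (1173.63, 42.036, 92, 31.6),
    (1188.84, 61.753, 116, 37.5), (1196.41, 53.416, 106, 31.1),
    (1220.64, 45.115, 98, 27.2), (1268.33, 42.092, 97, 26.7),
    (1276.86, 49.736, 96, 27.4), (1455.30, 47.699, 92, 29.8),
    (1506.43, 52.882, 93, 36.5), (1509.98, 50.27, 100, 30.3),
    (1582.63, 50.367, 92, 30.9), (1612.99, 51.797, 126, 30.1),
    (1647.40, 52.976, 105, 29.3), (1669.37, 50.841, 98, 29.4),
    (1725.57, 49.92, 102, 30.9), (1777.54, 53.536, 109, 20.6),
    (1784.83, 80.638, 154, 44.4), (1948.44, 45.019, 92, 29.7),
    (1966.79, 52.359, 94, 32.1), (1984.86, 48.934, 107, 27.3),
    (2067.12, 56.953, 122, 34.0), (2105.01, 50.306, 93, 30.3),
    (2114.16, 45.925, 101, 30.4), (2137.60, 52.48, 109, 30.1),
    (2370.78, 43.26, 95, 30.1), (2408.62, 54.994, 102, 34.5),
    (2658.41, 61.01, 121, 33.9), (2697.73, 43.742, 101, 30.2),
    (2793.99, 49.765, 100, 30.1), (2941.45, 59.591, 131, 32.8),
    (3175.04, 62.223, 99, 35.2), (3263.50, 60.663, 108, 35.0),
    (4026.14, 53.243, 121, 24.2), (4039.18, 51.862, 102, 28.3),
    (4180.03, 70.463, 157, 38.8), (4455.22, 64.144, 139, 41.6),
    (4468.97, 52.004, 105, 30.6), (5134.87, 55.552, 98, 33.0),
    (5288.67, 78.978, 133, 43.7), (5463.86, 46.637, 115, 25.6),
    (5542.05, 50.25, 88, 31.3), (5826.74, 55.593, 116, 34.9),
    (6006.09, 52.001, 92, 33.0), (9506.86, 57.008, 106, 32.3),
    (12950.13, 55.516, 131, 32.9), (18818.54, 59.281, 148, 35.6),
]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def ols(rows, y):
    """Least squares of y on the given predictors plus an intercept.

    Returns (coefficients, residual sum of squares); the intercept comes first.
    """
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)
    sse = sum((yi - sum(b * v for b, v in zip(beta, x))) ** 2 for x, yi in zip(X, y))
    return beta, sse


y = [r[3] for r in DATA]
n = len(DATA)
beta1, sse1 = ols([(r[1],) for r in DATA], y)             # Model 1: income only
beta2, sse2 = ols([(r[0], r[1], r[2]) for r in DATA], y)  # Model 2: population, income, cost of living

q, df2 = 2, n - 4  # 2 restrictions; residual df of Model 2
f_partial = ((sse1 - sse2) / q) / (sse2 / df2)

mean_y = sum(y) / n
sst = sum((v - mean_y) ** 2 for v in y)
adj1 = 1 - (sse1 / (n - 2)) / (sst / (n - 1))
adj2 = 1 - (sse2 / (n - 4)) / (sst / (n - 1))

F_CRIT_5PCT = 3.20  # approximate table value for F(2, 46); an assumption
print("Model 1 coefficients (intercept, income):", beta1)
print("Model 2 coefficients (intercept, population, income, cost of living):", beta2)
print(f"Nested F = {f_partial:.4f} on ({q}, {df2}) df")
print(f"Adjusted R-squared: Model 1 = {adj1:.4f}, Model 2 = {adj2:.4f}")
print("Reject H0 at 5%" if f_partial > F_CRIT_5PCT else "Fail to reject H0 at 5%")
```

Solving the normal equations directly keeps the sketch dependency-free; a statistics package would normally be used instead and would also report the exact p-value. Comparing the printed coefficients with the reported equations is a quick way to check the regression output.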