Using the data below, we would like to understand why some cities have a higher proportion of creative-class workers than others. Under one theory, the proportion is explained by the city's income alone; under another theory, income together with population and cost of living explains the proportion.
(a) Estimate a linear regression model for each theory and report your results.
(b) Use an F test to choose one theory over another.
(c) State the null hypothesis, the test statistic, and the specific distribution used (including the degrees of freedom) for your test in (b).
Metro Area | Population (thousands) | Income | Cost-of-Living Index | Creative Class (%) |
New Orleans-Metairie-Kenner, LA | 1,024.68 | 46.459 | 99 | 29.6 |
Rochester, NY | 1,035.44 | 47.749 | 102 | 33.1 |
Salt Lake City, UT | 1,067.19 | 53.587 | 101 | 30.4 |
Birmingham-Hoover, AL | 1,089.88 | 44.534 | 93 | 30.6 |
Buffalo-Niagara Falls, NY | 1,137.52 | 42.831 | 127 | 29.7 |
Oklahoma City, OK | 1,173.63 | 42.036 | 92 | 31.6 |
Hartford-West Hartford-East Hartford, CT | 1,188.84 | 61.753 | 116 | 37.5 |
Richmond, VA | 1,196.41 | 53.416 | 106 | 31.1 |
Louisville-Jefferson County, KY-IN | 1,220.64 | 45.115 | 98 | 27.2 |
Memphis, TN-MS-AR | 1,268.33 | 42.092 | 97 | 26.7 |
Jacksonville, FL | 1,276.86 | 49.736 | 96 | 27.4 |
Nashville-Davidson--Murfreesboro, TN | 1,455.30 | 47.699 | 92 | 29.8 |
Austin-Round Rock, TX | 1,506.43 | 52.882 | 93 | 36.5 |
Milwaukee-Waukesha-West Allis, WI | 1,509.98 | 50.27 | 100 | 30.3 |
Charlotte-Gastonia-Concord, NC-SC | 1,582.63 | 50.367 | 92 | 30.9 |
Providence-New Bedford-Fall River, RI-MA | 1,612.99 | 51.797 | 126 | 30.1 |
Virginia Beach-Norfolk-Newport News, VA-NC | 1,647.40 | 52.976 | 105 | 29.3 |
Indianapolis-Carmel, IN | 1,669.37 | 50.841 | 98 | 29.4 |
Columbus, OH | 1,725.57 | 49.92 | 102 | 30.9 |
Las Vegas-Paradise, NV | 1,777.54 | 53.536 | 109 | 20.6 |
San Jose-Sunnyvale-Santa Clara, CA | 1,784.83 | 80.638 | 154 | 44.4 |
San Antonio, TX | 1,948.44 | 45.019 | 92 | 29.7 |
Kansas City, MO-KS | 1,966.79 | 52.359 | 94 | 32.1 |
Orlando-Kissimmee, FL | 1,984.86 | 48.934 | 107 | 27.3 |
Sacramento--Arden-Arcade--Roseville, CA | 2,067.12 | 56.953 | 122 | 34 |
Cincinnati-Middletown, OH-KY-IN | 2,105.01 | 50.306 | 93 | 30.3 |
Cleveland-Elyria-Mentor, OH | 2,114.16 | 45.925 | 101 | 30.4 |
Portland-Vancouver-Beaverton, OR-WA | 2,137.60 | 52.48 | 109 | 30.1 |
Pittsburgh, PA | 2,370.78 | 43.26 | 95 | 30.1 |
Denver-Aurora, CO | 2,408.62 | 54.994 | 102 | 34.5 |
Baltimore-Towson, MD | 2,658.41 | 61.01 | 121 | 33.9 |
Tampa-St. Petersburg-Clearwater, FL | 2,697.73 | 43.742 | 101 | 30.2 |
St. Louis, MO-IL | 2,793.99 | 49.765 | 100 | 30.1 |
San Diego-Carlsbad-San Marcos, CA | 2,941.45 | 59.591 | 131 | 32.8 |
Minneapolis-St. Paul-Bloomington, MN-WI | 3,175.04 | 62.223 | 99 | 35.2 |
Seattle-Tacoma-Bellevue, WA | 3,263.50 | 60.663 | 108 | 35 |
Riverside-San Bernardino-Ontario, CA | 4,026.14 | 53.243 | 121 | 24.2 |
Phoenix-Mesa-Scottsdale, AZ | 4,039.18 | 51.862 | 102 | 28.3 |
San Francisco-Oakland-Fremont, CA | 4,180.03 | 70.463 | 157 | 38.8 |
Boston-Cambridge-Quincy, MA-NH | 4,455.22 | 64.144 | 139 | 41.6 |
Detroit-Warren-Livonia, MI | 4,468.97 | 52.004 | 105 | 30.6 |
Atlanta-Sandy Springs-Marietta, GA | 5,134.87 | 55.552 | 98 | 33 |
Washington-Arlington-Alexandria, DC-VA-MD-WV | 5,288.67 | 78.978 | 133 | 43.7 |
Miami-Fort Lauderdale-Miami Beach, FL | 5,463.86 | 46.637 | 115 | 25.6 |
Houston-Sugar Land-Baytown, TX | 5,542.05 | 50.25 | 88 | 31.3 |
Philadelphia-Camden-Wilmington, PA-NJ-DE-MD | 5,826.74 | 55.593 | 116 | 34.9 |
Dallas-Fort Worth-Arlington, TX | 6,006.09 | 52.001 | 92 | 33 |
Chicago-Naperville-Joliet, IL-IN-WI | 9,506.86 | 57.008 | 106 | 32.3 |
Los Angeles-Long Beach-Santa Ana, CA | 12,950.13 | 55.516 | 131 | 32.9 |
New York-Northern New Jersey-Long Island, NY-NJ-PA | 18,818.54 | 59.281 | 148 | 35.6 |
Model 1
Proportion=f(Income)
a) Regression equation: Creative Class (%) = 9.61129 + 0.4165 × Income
b) F statistic = 72.48856; hence, the model is significant.
c) H0: The coefficient on Income is equal to 0.
Ha: The coefficient on Income is not equal to 0.
Test statistic = 72.48856, which under H0 follows an F distribution with (1, 48) degrees of freedom (n = 50 cities, 2 estimated parameters); p-value < 0.001.
Hence, reject H0 in favor of Ha and conclude that the model is significant.
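As a check on the degrees of freedom, the overall F statistic for Model 1 can be written out with n = 50 cities and k = 1 regressor. The R^2 value at the end is only what the reported F statistic implies; it is not taken from the regression output.

```latex
\[
F \;=\; \frac{SSR/k}{SSE/(n-k-1)} \;=\; \frac{R^2/1}{(1-R^2)/48} \;\sim\; F_{1,\,48}
\quad \text{under } H_0:\ \beta_{\text{Income}} = 0,
\]
\[
R^2 \;=\; \frac{F}{F + 48} \;=\; \frac{72.48856}{120.48856} \;\approx\; 0.602 .
\]
```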
Model 2
Proportion=f(Population, Income, Cost of Living)
Regression equation: Creative Class (%) = 10.12808 + 0.0000381 × Population + 0.4326 × Income - 0.38136 × Cost-of-Living Index
The overall F test shows that the model is significant.
Comparison:
Model 1 performs better because its adjusted R-squared is higher than that of Model 2; that is, adding Population and Cost of Living does not improve the fit enough to offset the two extra parameters.
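Part (b) asks for an F test between the two theories. Because Model 1 is Model 2 with the Population and Cost-of-Living coefficients restricted to 0, one way to make that comparison is the partial (nested-model) F test: H0 is that both added coefficients equal 0, the statistic is F = [(SSE1 - SSE2)/2] / [SSE2/46], and under H0 it follows an F distribution with (2, 46) degrees of freedom, since n = 50 and Model 2 estimates 4 parameters. The following is a minimal pure-Python sketch with the data typed in from the table above; the names solve, sse and partial_f are illustrative helpers, not part of any library.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def sse(y, columns):
    """Residual sum of squares from OLS of y on an intercept plus `columns`."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(k)] for a in range(k)]
    xty = [sum(r[a] * yi for r, yi in zip(rows, y)) for a in range(k)]
    beta = solve(xtx, xty)  # normal equations (X'X) beta = X'y
    return sum((yi - sum(bj * rj for bj, rj in zip(beta, r))) ** 2
               for r, yi in zip(rows, y))

def partial_f(y, restricted, extra):
    """F statistic and df for H0: all coefficients on the `extra` columns are 0."""
    q = len(extra)                               # number of restrictions
    df2 = len(y) - (1 + len(restricted) + q)     # n minus parameters in the full model
    sse_r, sse_u = sse(y, restricted), sse(y, restricted + extra)
    return ((sse_r - sse_u) / q) / (sse_u / df2), q, df2

# (population, income, cost-of-living index, creative class %), in table order
DATA = [
    (1024.68, 46.459, 99, 29.6), (1035.44, 47.749, 102, 33.1),
    (1067.19, 53.587, 101, 30.4), (1089.88, 44.534, 93, 30.6),
    (1137.52, 42.831, 127, 29.7), (1173.63, 42.036, 92, 31.6),
    (1188.84, 61.753, 116, 37.5), (1196.41, 53.416, 106, 31.1),
    (1220.64, 45.115, 98, 27.2), (1268.33, 42.092, 97, 26.7),
    (1276.86, 49.736, 96, 27.4), (1455.30, 47.699, 92, 29.8),
    (1506.43, 52.882, 93, 36.5), (1509.98, 50.27, 100, 30.3),
    (1582.63, 50.367, 92, 30.9), (1612.99, 51.797, 126, 30.1),
    (1647.40, 52.976, 105, 29.3), (1669.37, 50.841, 98, 29.4),
    (1725.57, 49.92, 102, 30.9), (1777.54, 53.536, 109, 20.6),
    (1784.83, 80.638, 154, 44.4), (1948.44, 45.019, 92, 29.7),
    (1966.79, 52.359, 94, 32.1), (1984.86, 48.934, 107, 27.3),
    (2067.12, 56.953, 122, 34), (2105.01, 50.306, 93, 30.3),
    (2114.16, 45.925, 101, 30.4), (2137.60, 52.48, 109, 30.1),
    (2370.78, 43.26, 95, 30.1), (2408.62, 54.994, 102, 34.5),
    (2658.41, 61.01, 121, 33.9), (2697.73, 43.742, 101, 30.2),
    (2793.99, 49.765, 100, 30.1), (2941.45, 59.591, 131, 32.8),
    (3175.04, 62.223, 99, 35.2), (3263.50, 60.663, 108, 35),
    (4026.14, 53.243, 121, 24.2), (4039.18, 51.862, 102, 28.3),
    (4180.03, 70.463, 157, 38.8), (4455.22, 64.144, 139, 41.6),
    (4468.97, 52.004, 105, 30.6), (5134.87, 55.552, 98, 33),
    (5288.67, 78.978, 133, 43.7), (5463.86, 46.637, 115, 25.6),
    (5542.05, 50.25, 88, 31.3), (5826.74, 55.593, 116, 34.9),
    (6006.09, 52.001, 92, 33), (9506.86, 57.008, 106, 32.3),
    (12950.13, 55.516, 131, 32.9), (18818.54, 59.281, 148, 35.6),
]
population, income, cost, creative = (list(col) for col in zip(*DATA))

# Model 1 (income only) is the restricted model; Model 2 adds population and cost.
f, q, df2 = partial_f(creative, [income], [population, cost])
print(f"partial F = {f:.4f} on ({q}, {df2}) degrees of freedom")
```

Reject H0 (and prefer Model 2) only if the printed statistic exceeds the F(2, 46) critical value, which is roughly 3.2 at the 5% level; otherwise Model 1 is retained.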