The linear model below explores a potential association between property damage and wind speed based on observational data from 94 hurricanes that hit the United States between 1950 and 2012. The variables are
Damage: property damage in millions of U.S. dollars (adjusted for inflation to 2014) for each hurricane
Landfall.Windspeed: Maximum sustained windspeed in miles per hour measured along U.S. coast for each hurricane
* Assume that the sample data satisfies all assumptions for linear regression.
Level of significance = 0.05.
> summary(model)
Call:
lm(formula = Damage ~ Landfall.Windspeed)
Residuals:
   Min     1Q Median     3Q    Max
 -9294  -4782  -1996   -531  90478

Coefficients:
                    Estimate Std. Error t value Pr(>|t|)
(Intercept)        -10041.78    6064.29  -1.656   0.1012
Landfall.Windspeed    142.07      56.65   2.508   0.0139 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 12280 on 92 degrees of freedom
Multiple R-squared: [ A ], Adjusted R-squared: 0.05381
F-statistic: 6.289 on 1 and 92 DF, p-value: 0.01391
(a) Write the equation for the linear model using the variables Damages and Landfall Windspeed, taking the results of the t-tests into account.
(b) A hurricane is defined as a storm with wind speeds greater than 74 miles per hour. Interpret the value of the intercept in connection to the real-life context of this model (two or three sentences). Hint: Is the intercept truly meaningful, given the definition of a hurricane?
(c) The value of Pearson’s correlation coefficient for Damages and Landfall.Windspeed is 0.2529438. Calculate and interpret the value of R^2, denoted [A] in the table, in relation to the predictor and response variables.
(d) The range of observed maximum wind speeds in the sample data is 75 – 190 miles per hour. Is it appropriate to use the linear model to predict the cost of damage for a hurricane with a maximum wind speed of 150 miles per hour? Why or why not? If so, estimate the typical value of damages (specifying units).
(e) Would it be appropriate to use the linear model to predict the cost of damage for a hurricane with a maximum wind speed of 225 miles per hour? Why or why not? If so, estimate the typical value of damages (specifying units).
a)
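Let y = Damage (in millions of 2014 U.S. dollars) and x = Landfall.Windspeed (in miles per hour).
Then the fitted linear equation is y = -10041.78 + 142.07*x
The p-value of the t-test for the slope is 0.0139, which is less than alpha = 0.05, so the slope is significantly different from zero. The p-value for the intercept is 0.1012, which is greater than 0.05, so the intercept is not significantly different from zero.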
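The t-test results can be cross-checked from the reported coefficients alone. The sketch below is only an illustration in Python: the dictionary and function names are made up for this example, and the numbers are copied from the R summary rather than recomputed from the hurricane data, which is not available here.

```python
# Values copied from the Coefficients table of the R summary above.
ALPHA = 0.05
coefficients = {
    "(Intercept)": {"estimate": -10041.78, "std_error": 6064.29, "p_value": 0.1012},
    "Landfall.Windspeed": {"estimate": 142.07, "std_error": 56.65, "p_value": 0.0139},
}


def t_value(term):
    """t statistic = estimate / standard error."""
    row = coefficients[term]
    return row["estimate"] / row["std_error"]


def is_significant(term, alpha=ALPHA):
    """True when the reported p-value is below the significance level."""
    return coefficients[term]["p_value"] < alpha


for term in coefficients:
    print(f"{term}: t = {t_value(term):.3f}, significant = {is_significant(term)}")
```

The recomputed t statistics agree with the table (about -1.656 and 2.508), and only the slope passes the 0.05 threshold.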
b)
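Interpretation of the intercept: when the wind speed is zero, the expected damage equals the intercept, -10041.78 million dollars.
This is not meaningful, because a hurricane is defined as a storm with wind speeds greater than 74 miles per hour, so a wind speed of zero lies far outside the data the model describes. A negative damage amount is also impossible, and the t-test shows that the intercept is not significantly different from zero (p = 0.1012).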
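A quick numerical check makes the same point. This is a minimal sketch using only the fitted coefficients; the constant and function names are chosen for this example.

```python
# Fitted coefficients from the R summary (damage in millions of 2014 U.S. dollars).
INTERCEPT = -10041.78
SLOPE = 142.07  # millions of dollars per additional mph of wind speed
HURRICANE_THRESHOLD_MPH = 74  # from the definition of a hurricane in part (b)


def predicted_damage(windspeed_mph):
    """Point estimate from the fitted line, with no range check."""
    return INTERCEPT + SLOPE * windspeed_mph


# Wind speed at which the fitted line crosses zero damage.
zero_damage_speed = -INTERCEPT / SLOPE

print(f"Predicted damage at 0 mph: {predicted_damage(0):.2f}")
print(f"Fitted line reaches zero damage at {zero_damage_speed:.2f} mph")
print(f"Below hurricane threshold: {zero_damage_speed < HURRICANE_THRESHOLD_MPH}")
```

The fitted line only becomes positive at about 70.68 mph, which is below the 74 mph hurricane threshold, so the negative intercept describes a region where no hurricane can be observed.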
c)
r = 0.2529438, and in simple linear regression R^2 = r^2 = (0.2529438)^2 = 0.06398057, so [A] = 0.06398.
Interpretation: about 6.398% of the variability in damage is explained by its linear relationship with landfall windspeed.
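The value of R^2 can be sanity-checked against the other numbers in the summary. The sketch below assumes the standard formulas for adjusted R^2 and for the F statistic of a regression with one predictor; the variable names are illustrative.

```python
r = 0.2529438        # Pearson correlation given in part (c)
n = 94               # number of hurricanes
k = 1                # number of predictors
df_resid = n - k - 1  # 92, matching the residual degrees of freedom

r_squared = r ** 2

# Adjusted R^2 penalises for the number of predictors.
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / df_resid

# Overall F statistic expressed through R^2.
f_statistic = (r_squared / k) / ((1 - r_squared) / df_resid)

print(f"R^2          = {r_squared:.8f}")
print(f"Adjusted R^2 = {adj_r_squared:.5f}")
print(f"F statistic  = {f_statistic:.3f}")
```

Both derived values agree with the R output (adjusted R-squared 0.05381 and F-statistic 6.289), which supports [A] = 0.06398.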
d)
In regression, extrapolation does not give a reliable estimate, because the relationship between the variables may behave differently outside the observed range, and the uncertainty of a prediction grows as it moves away from the observed data.
Since 150 is within the observed range of 75 – 190 miles per hour, it is appropriate to use the model here.
The estimated value is y = -10041.78 + 142.07*150 = 11268.72, i.e. typical damage of about 11,268.72 million U.S. dollars (roughly $11.27 billion, adjusted for inflation to 2014).
e) No, for the same reason as above. Since 225 miles per hour is outside the observed range of 75 – 190, the prediction would be an extrapolation and would not be an appropriate estimate.
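The reasoning in parts (d) and (e) can be written as a prediction helper that refuses to extrapolate. This is a minimal sketch, assuming the observed range of 75 to 190 mph stated in the question; the function name and the choice to raise an error are illustrative, not part of the original solution.

```python
# Fitted coefficients from the R summary (damage in millions of 2014 U.S. dollars).
INTERCEPT = -10041.78
SLOPE = 142.07

# Observed range of maximum wind speeds in the sample, from part (d).
MIN_OBSERVED_MPH = 75
MAX_OBSERVED_MPH = 190


def predict_damage(windspeed_mph):
    """Return predicted damage, but only inside the observed wind speed range."""
    if not MIN_OBSERVED_MPH <= windspeed_mph <= MAX_OBSERVED_MPH:
        raise ValueError(
            f"{windspeed_mph} mph is outside the observed range "
            f"{MIN_OBSERVED_MPH}-{MAX_OBSERVED_MPH} mph; prediction would be extrapolation"
        )
    return INTERCEPT + SLOPE * windspeed_mph


print(f"150 mph: {predict_damage(150):.2f} million dollars")

try:
    predict_damage(225)
except ValueError as err:
    print(f"225 mph: {err}")
```

For 150 mph the helper returns the same 11268.72 million dollars computed in part (d), while 225 mph is rejected because it lies beyond the largest observed wind speed.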