Fire Damage: A fire insurance company is interested in investigating the effect of the distance between the burning house and the nearest fire station (miles) on the amount of fire damage (thousands of dollars) in major residential fires. A random sample of 15 recent fires in a suburb is selected. The data set can be found in the Excel file Fire, with variables named Distance and Damage. Source: McClave, J. & Sincich, T. (2015). Statistics. (12th edition). M. A.: Pearson.
Part 1: Describe a scatterplot to learn the association between the distance between the burning house and the nearest fire station (miles) and the amount of fire damage (thousands of dollars) in major residential fires. (36 points total, 9 points per criterion)
Direction of association:
Form of association:
Strength of association:
Presence of outlier:
Note: Save the R code, output, and scatterplot on your machine for future reference
Part 2: Interpret regression parameters in context: (44 points total, 22 points each parameter)
Interpretation of y-intercept (b_0):
Interpretation of slope of regression line (b_1):
Note: Save R code and output on your machine for future reference
Part 3: Interpret the coefficient of determination (R^2) in context: (20 points)
Interpretation:
Note: Save R code and output on your machine for future reference
Distance Damage
1 3.4 26.2
2 1.8 17.8
3 4.6 31.3
4 2.3 23.1
5 3.1 27.5
6 5.5 36.0
7 0.7 14.1
8 3.0 22.3
9 2.6 19.6
10 4.3 31.3
11 2.1 24.0
12 1.1 17.3
13 6.1 43.2
14 4.8 36.4
15 3.8 26.1
Part 1:
The given data are stored in a data frame named fire.
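One way to build that data frame, as a minimal sketch: the values are typed in directly from the table above instead of being read from the Excel file, whose exact file name and path are not given here.

```r
# Enter the 15 observations by hand (values copied from the data table above).
fire <- data.frame(
  Distance = c(3.4, 1.8, 4.6, 2.3, 3.1, 5.5, 0.7, 3.0,
               2.6, 4.3, 2.1, 1.1, 6.1, 4.8, 3.8),
  Damage   = c(26.2, 17.8, 31.3, 23.1, 27.5, 36.0, 14.1, 22.3,
               19.6, 31.3, 24.0, 17.3, 43.2, 36.4, 26.1)
)

# Quick sanity check: 15 rows and two numeric columns are expected.
str(fire)
```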
The scatterplot is generated as follows.
plot(fire$Distance, fire$Damage, pch = 16, xlab = "Distance (miles)", ylab = "Damage (thousands of dollars)")
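From the scatterplot:
Direction of association: Positive. Fire damage tends to increase as the distance from the fire station increases.
Form of association: Linear. The points follow a roughly straight-line pattern.
Strength of association: Strong. The points cluster closely around a straight line.
Presence of outlier: There do not appear to be any outliers.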
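The visual judgment of direction and strength can be backed up with the sample correlation coefficient. The sketch below re-enters the data so it runs on its own; the variable name r is only illustrative. Given the R-squared of 0.9235 reported in Part 3 and the positive slope, the correlation should come out at about 0.96.

```r
# Data re-entered from the table so this block is self-contained.
fire <- data.frame(
  Distance = c(3.4, 1.8, 4.6, 2.3, 3.1, 5.5, 0.7, 3.0,
               2.6, 4.3, 2.1, 1.1, 6.1, 4.8, 3.8),
  Damage   = c(26.2, 17.8, 31.3, 23.1, 27.5, 36.0, 14.1, 22.3,
               19.6, 31.3, 24.0, 17.3, 43.2, 36.4, 26.1)
)

# Pearson correlation: the sign gives the direction and the
# magnitude (close to 1) indicates a strong linear association.
r <- cor(fire$Distance, fire$Damage)
round(r, 3)
```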
Part 2:
The estimated regression model is generated as,
> m = lm(Damage ~ Distance, data = fire)
> m
Call:
lm(formula = Damage ~ Distance, data = fire)
Coefficients:
(Intercept) Distance
10.278 4.919
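Interpretation of y-intercept (b_0): 10.278
The predicted fire damage for a house 0 miles from the fire station is 10.278 thousand dollars (about $10,278). Because the observed distances range from 0.7 to 6.1 miles, a distance of 0 lies outside the data, so this value is an extrapolation and should be interpreted with caution.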
Interpretation of slope of regression line (b_1):
For each additional mile between the house and the nearest fire station, the predicted fire damage increases by 4.919 thousand dollars (about $4,919), on average.
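To see where the two estimates come from, the least-squares formulas b_1 = S_xy / S_xx and b_0 = ybar - b_1 * xbar can be applied directly and compared with lm(). This is a minimal sketch: the names sxy, sxx, b0, and b1 are illustrative, and the 3.5-mile distance used for the prediction is a hypothetical value chosen inside the observed range.

```r
# Data re-entered from the table so this block is self-contained.
fire <- data.frame(
  Distance = c(3.4, 1.8, 4.6, 2.3, 3.1, 5.5, 0.7, 3.0,
               2.6, 4.3, 2.1, 1.1, 6.1, 4.8, 3.8),
  Damage   = c(26.2, 17.8, 31.3, 23.1, 27.5, 36.0, 14.1, 22.3,
               19.6, 31.3, 24.0, 17.3, 43.2, 36.4, 26.1)
)

# Sums of cross-products and squares about the means.
sxy <- sum((fire$Distance - mean(fire$Distance)) *
           (fire$Damage - mean(fire$Damage)))
sxx <- sum((fire$Distance - mean(fire$Distance))^2)

# Least-squares slope and intercept computed by hand.
b1 <- sxy / sxx
b0 <- mean(fire$Damage) - b1 * mean(fire$Distance)
round(c(b0 = b0, b1 = b1), 3)

# The same estimates from lm(), for comparison.
m <- lm(Damage ~ Distance, data = fire)
all.equal(unname(coef(m)), c(b0, b1))

# Predicted damage (thousands of dollars) at a hypothetical 3.5 miles.
predict(m, newdata = data.frame(Distance = 3.5))
```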
Part 3:
Summary of the model is,
> summary(m)
Call:
lm(formula = Damage ~ Distance, data = fire)
Residuals:
Min 1Q Median 3Q Max
-3.4682 -1.4705 -0.1311 1.7915 3.3915
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 10.2779 1.4203 7.237 6.59e-06 ***
Distance 4.9193 0.3927 12.525 1.25e-08 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 2.316 on 13 degrees of freedom
Multiple R-squared: 0.9235, Adjusted R-squared: 0.9176
F-statistic: 156.9 on 1 and 13 DF, p-value: 1.248e-08
Interpretation:
coefficient of determination (R^2) = 0.9235
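About 92.35% of the sample variation in the amount of fire damage is explained by the linear regression model with distance from the fire station as the predictor variable. The remaining 7.65% is due to other factors or random variation.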
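The "percentage of variation explained" reading follows from the definition R^2 = 1 - SSE / SST. The sketch below recomputes it from the residuals and compares it with the value stored in the model summary; the names sse, sst, and r2 are illustrative.

```r
# Data re-entered from the table so this block is self-contained.
fire <- data.frame(
  Distance = c(3.4, 1.8, 4.6, 2.3, 3.1, 5.5, 0.7, 3.0,
               2.6, 4.3, 2.1, 1.1, 6.1, 4.8, 3.8),
  Damage   = c(26.2, 17.8, 31.3, 23.1, 27.5, 36.0, 14.1, 22.3,
               19.6, 31.3, 24.0, 17.3, 43.2, 36.4, 26.1)
)
m <- lm(Damage ~ Distance, data = fire)

# SSE: variation in Damage left unexplained by the fitted line.
sse <- sum(residuals(m)^2)
# SST: total variation in Damage about its mean.
sst <- sum((fire$Damage - mean(fire$Damage))^2)

# Proportion of total variation explained by the model.
r2 <- 1 - sse / sst
round(r2, 4)

# Should agree with the value reported by summary(m).
all.equal(r2, summary(m)$r.squared)
```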