Car manufacturers produced a variety of classic cars that continue to increase in value. Suppose the following data is based upon the Martin Rating System for Collectible Cars, and shows the rarity rating (1–20) and the high price ($1,000) for 15 classic cars.
Model | Rating | Price ($1,000) |
---|---|---|
A | 16 | 275.0 |
B | 19 | 1,325.0 |
C | 17 | 140.0 |
D | 16 | 100.0 |
E | 18 | 975.0 |
F | 15 | 77.5 |
G | 16 | 325.0 |
H | 18 | 350.0 |
I | 13 | 70.0 |
J | 17 | 450.0 |
K | 17 | 375.0 |
L | 19 | 4,000.0 |
M | 14 | 62.0 |
N | 19 | 2,625.0 |
O | 18 | 1,575.0 |
(a) Develop a scatter diagram of the data using the rarity rating as the independent variable and price as the dependent variable.
Does a simple linear regression model appear to be appropriate?
Yes, there appears to be a linear relationship between the two variables.
No, there appears to be a curvilinear relationship between the two variables.
No, there doesn't appear to be a relationship between the two variables.
(b) Develop an estimated multiple regression equation with x = rarity rating and x² as the two independent variables. (Round b0 and b1 to the nearest integer and b2 to one decimal place.)
ŷ =
(c) Consider the nonlinear relationship shown by equation (16.7): E(y) = β0 β1^x. Use logarithms to develop an estimated regression equation for this model. (Round b0 to three decimal places and b1 to four decimal places.)
log(ŷ) =
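The transformation that part (c) relies on can be written out as a short worked step. This sketch assumes equation (16.7) is the exponential form E(y) = β0 β1^x, which is how the flattened text reads. Taking logarithms of both sides turns the model into one that is linear in x, so ordinary least squares on log(y) applies:

```latex
\begin{aligned}
E(y) &= \beta_0\,\beta_1^{x} \\
\log E(y) &= \log\beta_0 + x\,\log\beta_1 \\
\log(\hat{y}) &= b_0 + b_1 x
\end{aligned}
```

Here b0 estimates log β0 and b1 estimates log β1, so the fitted coefficients depend on which base of logarithm is used.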
(d) Do you prefer the estimated regression equation developed in part (b) or part (c)? Explain.
The model in part (b) is preferred because r² is higher and the p-value is lower.
The model in part (c) is preferred because r² is higher and the p-value is lower.
The model in part (b) is preferred because r² is lower and the p-value is lower.
The model in part (c) is preferred because r² is lower and the p-value is lower.
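The quantities that part (d) compares can be pulled out of the two fits directly. The following is a minimal R sketch, not the textbook's procedure; the names `fit_quad`, `fit_log`, `f_pvalue`, and `comparison` are introduced here for illustration only.

```r
# Data from the table: rarity rating (x) and high price in $1,000 (y)
x <- c(16, 19, 17, 16, 18, 15, 16, 18, 13, 17, 17, 19, 14, 19, 18)
y <- c(275, 1325, 140, 100, 975, 77.5, 325, 350, 70, 450, 375, 4000, 62, 2625, 1575)

# Part (b): quadratic model in the rarity rating
fit_quad <- lm(y ~ x + I(x^2))
# Part (c): log-linear model; log() in R is the natural logarithm
fit_log <- lm(log(y) ~ x)

# Overall F-test p-value, recomputed from the F statistic stored by summary()
f_pvalue <- function(fit) {
  f <- summary(fit)$fstatistic
  unname(pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE))
}

# Side-by-side r-squared and p-value for the two models
comparison <- data.frame(
  model     = c("quadratic (b)", "log-linear (c)"),
  r_squared = c(summary(fit_quad)$r.squared, summary(fit_log)$r.squared),
  p_value   = c(f_pvalue(fit_quad), f_pvalue(fit_log))
)
print(comparison)
```

One caveat: the two r² values are measured on different response scales (price versus log price), so the comparison follows the exercise's framing rather than a strict like-for-like test.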
R-Code and results
> x=c(16,19,17,16,18,15,16,18,13,17,17,19,14,19,18)
> y=c(275,1325,140,100,975,77.5,325,350,70,450,375,4000,62,2625,1575)
> plot(x,log(y),main="scatter plot",xlab="rating",ylab="log(price)")
> summary(lm(y ~ x + I(x^2)))
Call:
lm(formula = y ~ x + I(x^2))
Residuals:
Min 1Q Median 3Q Max
-1087.8 -310.6 101.8 271.0 1587.2
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 34138.71 13629.87 2.505 0.0277 *
x -4608.45 1684.79 -2.735 0.0181 *
I(x^2) 154.67 51.62 2.997 0.0111 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 667 on 12 degrees of freedom
Multiple R-squared: 0.703, Adjusted R-squared: 0.6535
F-statistic: 14.2 on 2 and 12 DF, p-value: 0.0006864
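Reading the coefficient estimates off the output above and rounding as part (b) asks (b0 and b1 to the nearest integer, b2 to one decimal place) gives the following estimated equation, with price in $1,000:

```latex
\hat{y} = 34139 - 4608\,x + 154.7\,x^{2}
```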
> summary(lm(log(y)~x))
Call:
lm(formula = log(y) ~ x)
Residuals:
Min 1Q Median 3Q Max
-1.13464 -0.30656 0.03297 0.42267 0.90025
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -5.12260 1.57285 -3.257 0.00624 **
x 0.65876 0.09311 7.075 8.36e-06 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.6343 on 13 degrees of freedom
Multiple R-squared: 0.7938, Adjusted R-squared: 0.778
F-statistic: 50.05 on 1 and 13 DF, p-value: 8.357e-06
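The coefficients above come from R's `log()`, which is the natural logarithm. Rounded as part (c) asks, the fitted equation is shown below. If the exercise intends base-10 logarithms (an assumption, since the question only writes "log"), dividing both coefficients by ln 10 ≈ 2.302585 gives the equivalent base-10 form:

```latex
\begin{aligned}
\ln(\hat{y}) &= -5.123 + 0.6588\,x \\
\log_{10}(\hat{y}) &\approx -2.225 + 0.2861\,x
\end{aligned}
```

For part (d), the two outputs report r² = 0.7938 with p-value 8.357e-06 for this model, against r² = 0.703 with p-value 0.0006864 for the quadratic model. That matches the option stating the model in part (c) is preferred because r² is higher and the p-value is lower.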
Scatter diagram