Use the appropriate test to determine whether X1 can be dropped from the regression model given that X2 is retained. Use level of significance 0.05. Find the value of the appropriate test statistic, the critical value, and the P-value. Please show me how to use R to solve this.
X1 X2 Y
190 130 35
176 174 81.7
205 134 42.5
210 191 98.3
230 165 52.7
192 194 82
220 143 34.5
235 186 95.4
240 139 56.7
230 188 84.4
200 175 94.3
218 156 44.3
220 190 83.3
210 178 91.4
208 132 43.5
225 148 51.7
The appropriate test for whether X1 can be dropped while X2 is retained is the t-test on the coefficient of X1 (equivalently, the partial F-test of the reduced model Y ~ X2 against the full model Y ~ X1 + X2), with H0: beta1 = 0 versus Ha: beta1 ≠ 0 at level 0.05. After loading our data into R under the name Test, we fit the full model:
> Test <- Regression...Sheet1
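If the spreadsheet import is not available, the same data can be entered directly as a data frame (a sketch built from the table in the question, keeping the name Test):
> Test <- data.frame(
+   X1 = c(190, 176, 205, 210, 230, 192, 220, 235, 240, 230, 200, 218, 220, 210, 208, 225),
+   X2 = c(130, 174, 134, 191, 165, 194, 143, 186, 139, 188, 175, 156, 190, 178, 132, 148),
+   Y = c(35, 81.7, 42.5, 98.3, 52.7, 82, 34.5, 95.4, 56.7, 84.4, 94.3, 44.3, 83.3, 91.4, 43.5, 51.7))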
> linearMod <- lm(Y ~ ., data=Test) # build linear regression model on full data
> print(linearMod)
Call:
lm(formula = Y ~ ., data = Test)
Coefficients:
(Intercept) X1 X2
-67.88436 -0.06419 0.90609
> summary(linearMod)
Call:
lm(formula = Y ~ ., data = Test)
Residuals:
Min 1Q Median 3Q Max
-15.172 -8.404 1.026 7.410 16.457
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -67.88436 40.58652 -1.673 0.118
X1 -0.06419 0.16391 -0.392 0.702
X2 0.90609 0.12337 7.344 5.63e-06 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 11.2 on 13 degrees of freedom
Multiple R-squared: 0.8064, Adjusted R-squared: 0.7766
F-statistic: 27.07 on 2 and 13 DF, p-value: 2.319e-05
To break out the individual components, save the model summary first and then extract the coefficient table:
> modelSummary <- summary(linearMod) # save the summary object (needed for the extractions below)
> modelCoeffs <- modelSummary$coefficients # model coefficient table
> modelCoeffs
Estimate Std. Error t value Pr(>|t|)
(Intercept) -67.88435970 40.5865217 -1.6725838 1.182893e-01
X1 -0.06418911 0.1639142 -0.3916018 7.016961e-01
X2 0.90608862 0.1233709 7.3444291 5.628708e-06
> beta.estimate <- modelCoeffs["X1", "Estimate"] # get beta estimate for X1
> beta.estimate
[1] -0.06418911
> std.error <- modelCoeffs["X1", "Std. Error"] # get std.error for X1
> std.error
[1] 0.1639142
> t_value <- beta.estimate/std.error # calc t statistic
> t_value
[1] -0.3916018
> p_value <- 2*pt(-abs(t_value), df=nrow(Test)-ncol(Test)) # calc p-value; df = 16 - 3 = 13
> p_value
[1] 0.7016961
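The question also asks for the critical value; for the two-sided test at level 0.05 on 13 degrees of freedom it can be obtained with qt() (an extra step, sketched here):
> qt(0.975, df=nrow(Test)-ncol(Test)) # 0.05 two-sided critical value on 13 df, approximately 2.160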
> f_statistic <- linearMod$fstatistic[1] # returns NULL: the lm object itself does not store the F statistic
> f_statistic
NULL
> f <- summary(linearMod)$fstatistic # parameters for model p-value calc
> f
value numdf dendf
27.07029 2.00000 13.00000
> model_p <- pf(f[1], f[2], f[3], lower=FALSE)
> model_p
value
2.318617e-05
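Note that the F statistic above (27.07) is the overall test that both coefficients are zero, not the test for X1 alone. The requested test can also be run directly as a partial F-test by comparing the reduced model (X2 only) with the full model via anova(); a sketch, where the name reducedMod is mine:
> reducedMod <- lm(Y ~ X2, data=Test) # reduced model with X1 dropped
> anova(reducedMod, linearMod) # partial F-test for X1 given X2
Because only one coefficient is being tested, the partial F statistic is just the square of the t statistic, F = t^2 ≈ 0.153, with the same p-value of about 0.70.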
As we can see, the test statistic for X1 is t = -0.392, the critical value is t(0.975; 13) ≈ 2.160, and the p-value is 0.702. Since |t| = 0.392 is below the critical value and the p-value exceeds 0.05, we fail to reject H0: beta1 = 0, so X1 can be dropped from the model while X2 is retained. Keeping predictors that are not statistically significant can reduce the model's precision.