Question 2:
Download the Excel data file "Arlington_Homes" from the folder "Data" under "Chapter 12."
a) Read the data file into R.
b) Using R, answer question 65 (a, b, and c) on page 411 of your book. Run the regression, and show the estimates and the tests. State what you are testing in a comment in the R program.
Question #65 (page 411): https://imgur.com/s0SgxP3
Please show every step for the R formulas.
Price | Sqft | Beds | Baths | Col
840000 | 2768 | 4 | 3.5 | 1
822000 | 2500 | 4 | 2.5 | 1
713000 | 2400 | 3 | 3 | 1
689000 | 2200 | 3 | 2.5 | 1
685000 | 2716 | 3 | 3.5 | 1
645000 | 2524 | 3 | 2 | 1
625000 | 2732 | 4 | 2.5 | 0
620000 | 2436 | 4 | 3.5 | 1
587500 | 2100 | 3 | 1.5 | 1
585000 | 1947 | 3 | 1.5 | 1
583000 | 2224 | 3 | 2.5 | 1
569000 | 3262 | 4 | 2 | 0
546000 | 1792 | 3 | 2 | 0
540000 | 1488 | 3 | 1.5 | 0
537000 | 2907 | 3 | 2.5 | 0
516000 | 1951 | 4 | 2 | 1
511000 | 1752 | 3 | 1.5 | 1
510000 | 1727 | 3 | 2 | 1
495000 | 1692 | 3 | 2 | 0
463000 | 1714 | 3 | 2 | 0
457000 | 1650 | 3 | 2 | 0
451000 | 1685 | 3 | 2 | 0
435000 | 1500 | 3 | 1.5 | 1
431700 | 1896 | 2 | 1.5 | 0
414000 | 1182 | 2 | 1.5 | 0
401500 | 1152 | 3 | 1 | 0
399000 | 1383 | 4 | 1 | 0
380000 | 1344 | 4 | 2 | 0
380000 | 1272 | 3 | 1 | 0
375900 | 2275 | 5 | 1 | 0
372000 | 1005 | 2 | 1 | 0
367500 | 1272 | 3 | 1 | 0
356500 | 1431 | 2 | 2 | 1
330000 | 1362 | 3 | 1 | 0
330000 | 1465 | 3 | 1 | 0
307500 | 850 | 1 | 1 | 0
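Solution a:
To read the data:
# Part (a): read the Excel file into R
library(readxl)  # provides read_excel()
library(dplyr)   # provides glimpse()
Arlington_Homes <- read_excel("C:/Users/M1045151/Downloads/Arlington_Homes.xlsx")
View(Arlington_Homes)     # open the data in the viewer
dim(Arlington_Homes)      # number of rows and columns
glimpse(Arlington_Homes)  # column types and first few values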
Output:
Observations: 36
Variables: 5
$ Price <dbl> 840000, 822000, 713000, 689000, 685000, 645000, 625000, 620000, 587500, 585000, 5...
$ Sqft  <dbl> 2768, 2500, 2400, 2200, 2716, 2524, 2732, 2436, 2100, 1947, 2224, 3262, 1792, 148...
$ Beds  <dbl> 4, 4, 3, 3, 3, 3, 4, 4, 3, 3, 3, 4, 3, 3, 3, 4, 3, 3, 3, 3, 3, 3, 3, 2, 2, 3, 4, ...
$ Baths <dbl> 3.5, 2.5, 3.0, 2.5, 3.5, 2.0, 2.5, 3.5, 1.5, 1.5, 2.5, 2.0, 2.0, 1.5, 2.5, 2.0, 1...
$ Col   <dbl> 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, ...
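If the Excel file is not at hand, the same 36 rows can be typed in directly from the table in the question. The sketch below assumes the table was transcribed correctly; the object name `homes` is only an illustrative choice and is not used elsewhere in this solution.

```r
# Inline copy of the 36 listings from the table in the question
# ("homes" is a hypothetical name, chosen for this sketch only)
homes <- data.frame(
  Price = c(840000, 822000, 713000, 689000, 685000, 645000, 625000, 620000, 587500,
            585000, 583000, 569000, 546000, 540000, 537000, 516000, 511000, 510000,
            495000, 463000, 457000, 451000, 435000, 431700, 414000, 401500, 399000,
            380000, 380000, 375900, 372000, 367500, 356500, 330000, 330000, 307500),
  Sqft  = c(2768, 2500, 2400, 2200, 2716, 2524, 2732, 2436, 2100,
            1947, 2224, 3262, 1792, 1488, 2907, 1951, 1752, 1727,
            1692, 1714, 1650, 1685, 1500, 1896, 1182, 1152, 1383,
            1344, 1272, 2275, 1005, 1272, 1431, 1362, 1465, 850),
  Beds  = c(4, 4, 3, 3, 3, 3, 4, 4, 3,
            3, 3, 4, 3, 3, 3, 4, 3, 3,
            3, 3, 3, 3, 3, 2, 2, 3, 4,
            4, 3, 5, 2, 3, 2, 3, 3, 1),
  Baths = c(3.5, 2.5, 3, 2.5, 3.5, 2, 2.5, 3.5, 1.5,
            1.5, 2.5, 2, 2, 1.5, 2.5, 2, 1.5, 2,
            2, 2, 2, 2, 1.5, 1.5, 1.5, 1, 1,
            2, 1, 1, 1, 1, 2, 1, 1, 1),
  Col   = c(1, 1, 1, 1, 1, 1, 0, 1, 1,
            1, 1, 0, 0, 0, 0, 1, 1, 1,
            0, 0, 0, 0, 1, 0, 0, 0, 0,
            0, 0, 0, 0, 0, 1, 0, 0, 0)
)

dim(homes)  # rows and columns
str(homes)  # column types and first values, base-R counterpart of glimpse()
```

`str()` is used here instead of `glimpse()` so that the sketch needs no extra packages.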
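Solution b:
R code to fit the linear regression:
# Model: Price = b0 + b1*Sqft + b2*Beds + b3*Baths + b4*Col + error
# t tests (one per coefficient): H0: b_j = 0 vs. H1: b_j != 0
# F test (overall): H0: b1 = b2 = b3 = b4 = 0 vs. H1: at least one slope is nonzero
regmod <- lm(Arlington_Homes$Price ~ Arlington_Homes$Sqft + Arlington_Homes$Beds +
               Arlington_Homes$Baths + Arlington_Homes$Col)
summary(regmod)  # estimates, standard errors, t tests, and the overall F test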
Call:
lm(formula = Arlington_Homes$Price ~ Arlington_Homes$Sqft + Arlington_Homes$Beds +
    Arlington_Homes$Baths + Arlington_Homes$Col)

Residuals:
    Min      1Q  Median      3Q     Max
-157118  -47479   -4742   38849  168327

Coefficients:
                       Estimate Std. Error t value Pr(>|t|)
(Intercept)           165888.66   52353.54   3.169  0.00343 **
Arlington_Homes$Sqft      91.68      32.34   2.834  0.00801 **
Arlington_Homes$Beds    4372.36   18561.84   0.236  0.81533
Arlington_Homes$Baths  66619.61   24659.48   2.702  0.01109 *
Arlington_Homes$Col    74557.88   27374.26   2.724  0.01051 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 68440 on 31 degrees of freedom
Multiple R-squared: 0.777, Adjusted R-squared: 0.7483
F-statistic: 27.01 on 4 and 31 DF, p-value: 1.031e-09
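To show where the test columns in this output come from, the sketch below recomputes the t values, their two-sided p-values, and the overall F statistic from the printed numbers alone. It assumes only the rounded estimates, standard errors, and R-squared shown above, so the results should agree with the printed ones up to rounding. The vector names are illustrative.

```r
# Estimates and standard errors copied from the summary() output
est <- c(Intercept = 165888.66, Sqft = 91.68, Beds = 4372.36,
         Baths = 66619.61, Col = 74557.88)
se  <- c(Intercept = 52353.54, Sqft = 32.34, Beds = 18561.84,
         Baths = 24659.48, Col = 27374.26)
df_resid <- 31  # n - k - 1 = 36 - 4 - 1

# t statistic for H0: b_j = 0 vs. H1: b_j != 0
t_val <- est / se

# Two-sided p-value from the t distribution with 31 degrees of freedom
p_val <- 2 * pt(abs(t_val), df = df_resid, lower.tail = FALSE)
round(cbind(t_val, p_val), 4)

# Overall F statistic rebuilt from R-squared, for H0: all four slopes are zero
r2 <- 0.777
k  <- 4
f_val <- (r2 / k) / ((1 - r2) / df_resid)
f_p   <- pf(f_val, df1 = k, df2 = df_resid, lower.tail = FALSE)
c(F = f_val, p = f_p)
```

Each t value is just the estimate divided by its standard error, which is why a large standard error (as for Beds) produces a small t value and a large p-value.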
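The estimated regression equation is
Price = 165888.66 + 91.68*Sqft + 4372.36*Beds + 66619.61*Baths + 74557.88*Col
At the 5% significance level, Sqft (p = 0.00801), Baths (p = 0.01109), and Col (p = 0.01051) are significant variables; Beds (p = 0.81533) is not.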
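Overall F test: F = 27.01 on 4 and 31 degrees of freedom
p-value = 1.031e-09
Since p < 0.05, we reject the null hypothesis that all slope coefficients are zero.
The model is significant.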
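As an illustration of how the estimated equation is used, the sketch below plugs in a made-up house. The house (2000 sq ft, 3 beds, 2 baths) is hypothetical and not part of the data. The coefficients are the rounded estimates from the regression output.

```r
# Coefficients copied from the estimated regression equation
b <- c(Intercept = 165888.66, Sqft = 91.68, Beds = 4372.36,
       Baths = 66619.61, Col = 74557.88)

# Hypothetical house (not in the data): 2000 sq ft, 3 beds, 2 baths, Col = 1
# The leading 1 multiplies the intercept
x_col1 <- c(1, 2000, 3, 2, 1)
pred_col1 <- sum(b * x_col1)

# Same house with Col = 0
x_col0 <- c(1, 2000, 3, 2, 0)
pred_col0 <- sum(b * x_col0)

# The difference equals the Col coefficient, holding the other variables fixed
c(Col1 = pred_col1, Col0 = pred_col0, difference = pred_col1 - pred_col0)
```

With the fitted model object available, `predict(regmod, newdata = ...)` is the usual route, but that works cleanly only when the model is fitted with a `data =` argument rather than `Arlington_Homes$` columns in the formula.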