In 2014, a group of students was interested in investigating prices of rental accommodation in suburbs of Brisbane that are close to the CBD and collected information on a total of 200 randomly chosen dwellings in four inner western suburbs. A subset of this data, relating to rental apartments in these suburbs, is included below. The variables are:
Per week: weekly rental price for the apartment ($);
Bedrooms: number of bedrooms in the apartment;
Sqm: size of the apartment (m²);
Furnished: whether the apartment was furnished or not (yes/no).
The values, in the order Per week, Bedrooms, Sqm, Furnished, are:
265,2,59,No
305,2,70,No
300,1,72,No
320,3,66,No
340,2,113,Yes
330,2,58,Yes
355,2,63,No
345,2,57,Yes
355,2,61,No
360,2,114,Yes
355,2,75,Yes
360,2,68,No
365,2,64,No
370,1,69,No
390,2,73,Yes
380,2,85,Yes
390,2,56,Yes
370,2,56,Yes
385,2,59,Yes
380,2,65,Yes
385,2,62,Yes
400,2,65,No
415,2,69,Yes
400,3,63,No
405,3,70,No
420,2,77,No
435,2,84,Yes
435,2,83,Yes
455,2,73,Yes
450,2,72,Yes
485,2,68,No
500,2,76,Yes
535,2,97,No
290,1,60,No
305,1,63,Yes
330,2,65,No
310,2,70,No
335,2,64,No
330,2,62,No
345,2,79,No
355,1,81,No
340,2,66,No
345,1,60,No
345,2,64,No
355,2,73,No
385,2,61,No
380,2,78,No
405,2,81,No
410,2,76,Yes
430,2,80,No
440,2,61,No
450,3,86,No
485,3,91,No
500,1,87,No
545,1,97,Yes
345,3,86,No
400,2,72,No
400,2,74,No
480,2,73,Yes
755,3,87,No
760,3,77,No
770,3,113,No
824,2,109,No
860,3,104,No
295,1,70,No
290,1,54,No
295,1,61,No
325,1,61,No
340,2,56,No
355,2,61,No
365,2,95,No
420,1,75,No
420,2,66,No
440,2,74,No
480,3,72,No
465,3,87,No
470,1,87,Yes
490,1,81,Yes
495,2,76,No
505,2,97,No
530,2,77,No
545,2,97,No
560,2,79,No
550,2,78,No
560,3,75,No
565,1,96,Yes
580,2,85,Yes
605,3,84,No
605,2,93,Yes
610,2,78,Yes
620,2,87,No
665,2,88,No
700,2,80,No
750,3,97,Yes
740,3,124,No
805,3,101,No
860,3,98,No
960,3,123,Yes
990,3,102,Yes
1195,3,133,No
1190,3,137,No
1405,3,148,Yes
1490,3,154,No
Q1)
The students were interested in exploring the relationship between apartment size and weekly rent. Using R, fit a linear regression relating weekly rent to apartment size; that is, a model of the form:
per. week = β0 + β1 sqm + ε,
and answer the following questions:
(a) What is the MSE (mean-squared error) for this regression? What are the degrees of freedom associated with this value?
(b) Find a 95% confidence interval for the intercept parameter in this model. You may take relevant statistics from the R output for your regression, but please show full working.
(c) Another group of students suggest that apartment pricing is known to increase with apartment size at an average rate of $12 per square metre. Carry out a formal test of this hypothesis and interpret the resulting p-value, using α = 0.05.
(d) You intend to use your research to help inform fair weekly rent for a friend, who is looking at renting an apartment of size 60 m². Using R, find a 95% prediction interval for the weekly rent of this apartment.
(e) Construct a residual and normal quantile plot for this analysis. Are there any issues with the underlying assumptions of linearity and/or homoscedasticity?
Please answer this question using R code; it is very important.
Solution (a):
Import the data into R. The R code is:
library(readxl)  # provides read_excel()
Brisbane <- read_excel("C:/Users/M1045151/Downloads/Brisbane.xlsx")
View(Brisbane)
head(Brisbane)
# A tibble: 6 x 4
  `Per week` Bedrooms   Sqm Furnished
       <dbl>    <dbl> <dbl> <chr>
1        265        2    59 No
2        305        2    70 No
3        300        1    72 No
4        320        3    66 No
5        340        2   113 Yes
6        330        2    58 Yes
Fit a linear regression using the lm() function in R. The R code is:
regmod1 = lm(Brisbane$`Per week`~Brisbane$Sqm)
summary(regmod1)
coefficients(regmod1)
Output is
(Intercept) Brisbane$Sqm
-268.376910 9.567441
The fitted regression equation is:
Per week = -268.376910 + 9.567441 * Sqm
What is the MSE (mean-squared error) for this regression, and what are the degrees of freedom associated with this value?
The MSE is the residual mean square in the ANOVA table for the fitted model.
The R code is:
anova(regmod1)
Output:
Analysis of Variance Table
Response: Per week
           Df  Sum Sq Mean Sq F value    Pr(>F)
Sqm         1 3758668 3758668   237.9 < 2.2e-16 ***
Residuals 101 1595759   15800
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
MSE = 15800 (the residual mean square, 1595759 / 101, rounded)
Degrees of freedom = 101 (n - 2, with n = 103 apartments)
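The arithmetic behind these two values can be checked by hand. The short sketch below only uses the numbers printed in the ANOVA table above; the variable names are chosen here for illustration.

```r
# Values copied from the Residuals row of the ANOVA table above.
sse      <- 1595759   # residual sum of squares
df_resid <- 101       # residual degrees of freedom (n - 2)

mse <- sse / df_resid   # mean-squared error, about 15800
rse <- sqrt(mse)        # residual standard error, about 125.7 dollars
n   <- df_resid + 2     # number of apartments implied by the df

print(c(MSE = mse, RSE = rse, n = n))
```

The square root of the MSE is the residual standard error, which is the quantity that drives the width of the intervals in parts (b) and (d).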
Solution (b):
Find a 95% confidence interval for the intercept parameter in this model.
The R code to get the confidence intervals is:
confint(regmod1)
Output:
2.5 % 97.5 %
(Intercept) -369.941568 -166.81225
Brisbane$Sqm 8.336933 10.79795
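Since the question asks for full working, the interval can also be written out by hand. The standard error of the intercept is not shown in the output above (it is the `Std. Error` entry for `(Intercept)` in `summary(regmod1)`), so the value of about 51.2 used below is back-calculated from the confint() limits and should be treated as an assumption to be checked against the summary output.

```latex
\hat\beta_0 \pm t_{0.975,\,101}\,\mathrm{SE}(\hat\beta_0),
\qquad
\mathrm{SE}(\hat\beta_0) = s\sqrt{\frac{1}{n} + \frac{\bar{x}^2}{S_{xx}}},
\quad s = \sqrt{\mathrm{MSE}}

% With \hat\beta_0 = -268.376910, t_{0.975,101} \approx 1.984,
% and SE(\hat\beta_0) \approx 51.2 (back-calculated, see lead-in):
-268.377 \pm 1.984 \times 51.2
\;\approx\; -268.377 \pm 101.565
\;=\; (-369.942,\; -166.812)
```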
The 95% confidence interval for the intercept is (-369.941568, -166.81225).
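Part (c) is not worked above. A minimal sketch of the test of H0: β1 = 12 against H1: β1 ≠ 12 follows, using only the slope estimate and its 95% confidence limits from the confint() output. The standard error of the slope is back-calculated from those limits, which is an assumption of this sketch; with the fitted model available it would be read directly from `summary(regmod1)$coefficients`.

```r
# Values copied from the output above.
b1       <- 9.567441               # estimated slope
ci_slope <- c(8.336933, 10.79795)  # 95% confidence limits for the slope
df_resid <- 101                    # residual degrees of freedom
beta1_h0 <- 12                     # hypothesised slope ($ per square metre)

# Back-calculate the standard error from the CI half-width (assumption:
# the interval is the usual estimate +/- t * SE).
t_crit <- qt(0.975, df_resid)
se_b1  <- diff(ci_slope) / (2 * t_crit)

# Two-sided t-test of H0: beta1 = 12.
t_stat  <- (b1 - beta1_h0) / se_b1
p_value <- 2 * pt(-abs(t_stat), df_resid)

print(c(se = se_b1, t = t_stat, p = p_value))
```

The hypothesised value 12 lies outside the 95% confidence interval for the slope (8.336933, 10.79795), so the p-value is below α = 0.05 and H0 is rejected: the data suggest that weekly rent increases by less than $12 per additional square metre on average.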
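Solution (d):
Use the predict() function in R to obtain the prediction interval.
The R code is:
# Refit with a formula and data argument so that predict() can match Sqm in newdata
regmod1 <- lm(`Per week` ~ Sqm, data = Brisbane)
newdata <- data.frame(Sqm = 60)
predict(regmod1, newdata, interval = "prediction", level = 0.95)
Output is: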
fit lwr upr
305.6696 53.89787 557.4412
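For reference, the interval that predict() returns for a new observation at x0 = 60 has the standard form below. The symbols for the sample mean of Sqm and for S_xx are not given in the output above, so they are left symbolic here.

```latex
\hat{y}_0 \pm t_{0.975,\,101}\; s\,\sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}},
\qquad
\hat{y}_0 = \hat\beta_0 + \hat\beta_1 x_0, \quad
S_{xx} = \sum_{i=1}^{n} (x_i - \bar{x})^2

% Here \hat{y}_0 = 305.6696, s = \sqrt{\mathrm{MSE}} \approx 125.7, n = 103, x_0 = 60.
% The reported half-width is (557.4412 - 53.89787)/2 \approx 251.77,
% only slightly more than t_{0.975,101}\, s \approx 249.3.
```

The leading 1 under the square root is what makes a prediction interval so much wider than a confidence interval for the mean rent: most of the width comes from the scatter of individual rents around the line.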
The predicted weekly rent for an apartment of size 60 m² is $305.67 (-268.376910 + 9.567441 * 60).
The 95% prediction interval for the weekly rent of a 60 m² apartment is ($53.90, $557.44).
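Part (e) is not worked above. The sketch below draws the two requested plots. To keep it runnable without the Excel file, it rebuilds the two relevant columns from the values listed in the question; this assumes the listing is the complete data set (it has 103 rows, which matches the 101 residual degrees of freedom). The names `rent` and `fit` are chosen here for illustration.

```r
# Weekly rent ($) and size (m^2), transcribed in order from the question.
per_week <- c(
  265, 305, 300, 320, 340, 330, 355, 345, 355, 360, 355, 360, 365, 370, 390,
  380, 390, 370, 385, 380, 385, 400, 415, 400, 405, 420, 435, 435, 455, 450,
  485, 500, 535, 290, 305, 330, 310, 335, 330, 345, 355, 340, 345, 345, 355,
  385, 380, 405, 410, 430, 440, 450, 485, 500, 545, 345, 400, 400, 480, 755,
  760, 770, 824, 860, 295, 290, 295, 325, 340, 355, 365, 420, 420, 440, 480,
  465, 470, 490, 495, 505, 530, 545, 560, 550, 560, 565, 580, 605, 605, 610,
  620, 665, 700, 750, 740, 805, 860, 960, 990, 1195, 1190, 1405, 1490)
sqm <- c(
  59, 70, 72, 66, 113, 58, 63, 57, 61, 114, 75, 68, 64, 69, 73,
  85, 56, 56, 59, 65, 62, 65, 69, 63, 70, 77, 84, 83, 73, 72,
  68, 76, 97, 60, 63, 65, 70, 64, 62, 79, 81, 66, 60, 64, 73,
  61, 78, 81, 76, 80, 61, 86, 91, 87, 97, 86, 72, 74, 73, 87,
  77, 113, 109, 104, 70, 54, 61, 61, 56, 61, 95, 75, 66, 74, 72,
  87, 87, 81, 76, 97, 77, 97, 79, 78, 75, 96, 85, 84, 93, 78,
  87, 88, 80, 97, 124, 101, 98, 123, 102, 133, 137, 148, 154)
rent <- data.frame(per_week = per_week, sqm = sqm)

# Same model as in the solution: weekly rent regressed on apartment size.
fit <- lm(per_week ~ sqm, data = rent)

# Residual plot and normal quantile plot, side by side.
op <- par(mfrow = c(1, 2))
plot(fitted(fit), resid(fit),
     xlab = "Fitted weekly rent ($)", ylab = "Residual ($)",
     main = "Residuals vs fitted")
abline(h = 0, lty = 2)   # reference line at zero
qqnorm(resid(fit), main = "Normal Q-Q plot of residuals")
qqline(resid(fit))       # line through the quartiles
par(op)
```

In the residual plot, a curved trend around the zero line would point to a problem with linearity, and a spread that widens as the fitted rent increases would point to heteroscedasticity. In the normal quantile plot, points bending away from the line in the tails would indicate non-normal errors. If the listing is complete, `coef(fit)` should reproduce the estimates reported above.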