Forecasting labour costs is a key aspect of hotel revenue management that enables hoteliers to allocate hotel resources appropriately and set pricing strategies. Mary, the President of the Hellenic Hoteliers Federation (HHF), is interested in investigating how labour costs (variable L_COST) relate to the number of rooms in a hotel (variable Total_Rooms). Suppose that HHF has hired you as a business analyst to develop a linear model to predict hotel labour costs based on the total number of rooms per hotel using the data provided.
3.1 Use the least squares method to estimate the regression coefficients b0 and b1
3.2 State the regression equation
3.3 Plot on the same graph, the scatter diagram and the regression line
3.4 Give the interpretation of the regression coefficients b0 and b1 as well as the result of the t-test on the individual variables (assume a significance level of 5%)
3.5 Determine the correlation coefficient of the two variables and provide an interpretation of its meaning in the context of this problem
3.6 Check statistically, at the 0.05 level of significance, whether there is any evidence of a linear relationship between labour cost and total number of rooms per hotel
I need only the 3.4 and 3.5 questions.
Total_Rooms L_COST
412 2.165.000
313 2.214.985
265 1.393.550
204 2.460.634
172 1.151.600
133 801.469
127 1.072.000
322 1.608.013
241 793.009
172 1.383.854
121 494.566
70 437.684
65 83.000
93 626.000
75 37.735
69 256.658
66 230.000
54 200.000
68 199.000
57 11.720
38 59.200
27 130.000
47 255.020
32 3.500
27 20.906
48 284.569
39 107.447
35 64.702
23 6.500
25 156.316
10 15.950
18 722.069
17 6.121
29 30.000
21 5.700
23 50.237
15 19.670
8 7.888
20
11
15 3.500
18 112.181
23
10 30.000
26 3.575
306 2.074.000
240 1.312.601
330 434.237
139 495.000
353 1.511.457
324 1.800.000
276 2.050.000
221 623.117
200 796.026
117 360.000
170 538.848
122 568.536
57 300.000
62 249.205
98 150.000
75 220.000
62 50.302
50 517.729
27 51.000
44 75.704
33 271.724
25 118.049
42
30 40.000
44
10 10.000
18 10.000
18
73 70.000
21 12.000
22 20.000
25 36.277
25 36.277
31 10.450
16 14.300
15 4.296
12
11
16 379.498
22 1.520
12 45.000
34 96.619
37 270.000
25 60.000
10 12.500
270 1.934.820
261 3.000.000
219 1.675.995
280 903.000
378 2.429.367
181 1.143.850
166 900.000
119 600.000
174 2.500.000
124 1.103.939
112 363.825
227 1.538.000
161 1.370.968
216 1.339.903
102 173.481
96 210.000
97 441.737
56 96.000
72 177.833
62 252.390
78 377.182
74 111.000
33 238.000
30 45.000
39 50.000
32 40.000
25 61.766
41 166.903
24 116.056
49 41.000
43 195.821
9
20 96.713
32 6.500
14 5.500
14 4.000
13 15.000
13 9.500
53 48.200
11 3.000
16 27.084
21 30.000
21 20.000
46 43.549
21 10.000
I used R to solve this question.
R code and output:
First, the observations with missing values must be removed before the regression can be fitted. This leaves 126 complete observations.
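As a minimal sketch of that clean-up step, assuming the data sit in a data frame (the name `hotels` and the four sample rows below are illustrative, not the full data set), rows with a missing cost can be dropped with `complete.cases()`:

```r
# Illustrative data frame; column names follow the question, values are a small sample.
hotels <- data.frame(
  Total_Rooms = c(412, 20, 313, 11),
  L_COST      = c(2165.000, NA, 2214.985, NA)
)

# Keep only the rows where both variables are present.
clean <- hotels[complete.cases(hotels), ]

room <- clean$Total_Rooms
cost <- clean$L_COST
```

The transcript that follows reads the two cleaned columns from the clipboard instead, which gives the same `room` and `cost` vectors.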
> room=scan('clipboard');room
Read 126 items
[1] 412 313 265 204 172 133 127 322 241 172 121 70 65 93 75 69 66
54
[19] 68 57 38 27 47 32 27 48 39 35 23 25 10 18 17 29 21 23
[37] 15 8 15 18 10 26 306 240 330 139 353 324 276 221 200 117 170
122
[55] 57 62 98 75 62 50 27 44 33 25 30 10 18 73 21 22 25 25
[73] 31 16 15 16 22 12 34 37 25 10 270 261 219 280 378 181 166
119
[91] 174 124 112 227 161 216 102 96 97 56 72 62 78 74 33 30 39
32
[109] 25 41 24 49 43 20 32 14 14 13 13 53 11 16 21 21 46 21
> cost=scan('clipboard');cost
Read 126 items
[1] 2165.000 2214.985 1393.550 2460.634 1151.600 801.469 1072.000
1608.013
[9] 793.009 1383.854 494.566 437.684 83.000 626.000 37.735
256.658
[17] 230.000 200.000 199.000 11.720 59.200 130.000 255.020
3.500
[25] 20.906 284.569 107.447 64.702 6.500 156.316 15.950
722.069
[33] 6.121 30.000 5.700 50.237 19.670 7.888 3.500 112.181
[41] 30.000 3.575 2074.000 1312.601 434.237 495.000 1511.457
1800.000
[49] 2050.000 623.117 796.026 360.000 538.848 568.536 300.000
249.205
[57] 150.000 220.000 50.302 517.729 51.000 75.704 271.724
118.049
[65] 40.000 10.000 10.000 70.000 12.000 20.000 36.277 36.277
[73] 10.450 14.300 4.296 379.498 1.520 45.000 96.619 270.000
[81] 60.000 12.500 1934.820 3000.000 1675.995 903.000 2429.367
1143.850
[89] 900.000 600.000 2500.000 1103.939 363.825 1538.000 1370.968
1339.903
[97] 173.481 210.000 441.737 96.000 177.833 252.390 377.182
111.000
[105] 238.000 45.000 50.000 40.000 61.766 166.903 116.056
41.000
[113] 195.821 96.713 6.500 5.500 4.000 15.000 9.500 48.200
[121] 3.000 27.084 30.000 20.000 43.549 10.000
> fit=lm(cost~room)
> summary(fit)
Call:
lm(formula = cost ~ room)
Residuals:
     Min       1Q   Median       3Q      Max
-1485.63  -103.31   -24.63    56.92  1528.86
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  -87.050     41.885  -2.078   0.0397 *
room           6.082      0.315  19.307   <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 340.4 on 124 degrees of freedom
Multiple R-squared: 0.7504, Adjusted R-squared: 0.7484
F-statistic: 372.8 on 1 and 124 DF, p-value: < 2.2e-16
> plot(room,cost)
> abline(fit)
> cor.test(room,cost)
Pearson's product-moment correlation
data: room and cost
t = 19.307, df = 124, p-value < 2.2e-16
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
0.8147969 0.9041651
sample estimates:
cor
0.8662512
Que.1
Regression coefficient b0 = -87.050
b1 = 6.082
Que.2
Regression equation:
Labour cost = -87.050 + 6.082 × Total rooms
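To illustrate how the equation is used, here is a small sketch that plugs a room count into it. The helper name `predict_cost` is hypothetical, and the coefficients are the rounded values from the summary output, so the result is in the same cost units that were read into R.

```r
# Coefficients copied (rounded) from the summary(fit) output.
b0 <- -87.050
b1 <- 6.082

# Predicted labour cost for a hotel with a given number of rooms.
predict_cost <- function(rooms) b0 + b1 * rooms

predict_cost(100)  # -87.050 + 6.082 * 100 = 521.15
```

With the fitted object available, `predict(fit, newdata = data.frame(room = 100))` should give the same value up to rounding of the coefficients.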
Que.3
Scatter plot:
Que.4
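Interpretation of b1:
When the number of rooms in a hotel increases by 1, the predicted labour cost increases by 6.082 units on average (in the units of cost as read into R, where 2.165.000 was entered as 2165.000).
Interpretation of b0:
b0 = -87.050 is the predicted labour cost for a hotel with zero rooms. A negative cost has no practical meaning, and zero rooms lies outside the observed range (8 to 412 rooms), so b0 only fixes the position of the line and should not be interpreted literally.
The t statistic for b0 is -2.078 with a p-value of 0.0397, which is less than 0.05, so we reject the hypothesis that the intercept is zero; b0 is statistically significant at the 5% level.
The t statistic for b1 is 19.307 with a p-value below 2e-16, which is less than 0.05, so we reject the hypothesis that the slope is zero; b1 is statistically significant at the 5% level.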
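The t statistics and p-values quoted above can be checked by hand from the estimates and standard errors in the summary output. The sketch below assumes only those rounded values, so the results should agree with the reported ones up to rounding.

```r
# Estimates and standard errors copied from summary(fit).
est <- c(b0 = -87.050, b1 = 6.082)
se  <- c(b0 = 41.885,  b1 = 0.315)
df  <- 124  # residual degrees of freedom: 126 observations - 2 parameters

t_stat  <- est / se                  # t = estimate / standard error
p_value <- 2 * pt(-abs(t_stat), df)  # two-sided p-value from the t distribution

round(t_stat, 3)
p_value < 0.05  # TRUE for both coefficients
```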
Que.5
The correlation coefficient between total rooms per hotel and labour cost is 0.8663, which indicates a strong positive linear association. In this context, hotels with more rooms tend to have higher labour costs, and hotels with fewer rooms tend to have lower labour costs.
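In simple linear regression the squared correlation equals the coefficient of determination, which gives a second way to read the strength of the relationship. A short check, using the correlation reported by `cor.test`:

```r
r <- 0.8662512        # sample correlation from cor.test(room, cost)
r_squared <- r^2      # share of the variation in cost explained by room
round(r_squared, 4)   # 0.7504, matching Multiple R-squared in summary(fit)
```

So about 75% of the variation in labour cost across these hotels is accounted for by the number of rooms.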
Que.6
The p-value for the test of the correlation coefficient is less than 2.2e-16, which is below 0.05. At the 5% level of significance we therefore reject the null hypothesis of no linear relationship and conclude that there is a linear relationship between labour cost and total number of rooms per hotel.
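The test statistic behind this conclusion can be reproduced from the correlation and the sample size alone. The sketch below assumes the values reported by `cor.test` and should agree with its t = 19.307 up to rounding.

```r
r <- 0.8662512  # sample correlation from cor.test(room, cost)
n <- 126        # number of complete observations

t_stat  <- r * sqrt(n - 2) / sqrt(1 - r^2)   # test statistic for H0: rho = 0
p_value <- 2 * pt(-abs(t_stat), df = n - 2)  # two-sided p-value

round(t_stat, 3)
p_value < 0.05
```

This is the same t value as the slope test in the regression summary, since in a one-predictor model testing that the slope is zero is equivalent to testing that the correlation is zero.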