An agent for a real estate company wanted to predict the monthly rent for apartments based on the size of the apartment. The data for a sample of
25 apartments is available below. Perform a t test for the slope to determine if a significant linear relationship between the size and the rent exists.
a. At the 0.05 level of significance, is there evidence of a linear relationship between the size of the apartment and the monthly rent?
b. Construct a 95% confidence interval estimate of the population slope, β1.
Size (sq. ft) | Rent ($) |
850 | 1950 |
1450 | 2575 |
1085 | 2225 |
1232 | 2500 |
718 | 1975 |
1495 | 2675 |
1126 | 2675 |
726 | 1935 |
700 | 1875 |
966 | 2150 |
1110 | 2400 |
1285 | 2650 |
1985 | 3275 |
1359 | 2775 |
1165 | 2425 |
1235 | 2475 |
1255 | 2075 |
1269 | 2725 |
1160 | 2200 |
896 | 2175 |
1351 | 2625 |
1040 | 2650 |
765 | 2175 |
1000 | 1825 |
1200 | 2775 |
Given:
Let x denote the size of the apartment (sq. ft) and y the monthly rent ($) for the n = 25 apartments listed in the table above.
a)
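The equation for simple linear regression is given as
$$\hat{y} = b_0 + b_1 x$$
where $b_0$ is the intercept and $b_1$ is the slope.
The estimates $b_0$ and $b_1$ of the above equation are obtained by solving the simultaneous equations below
$$\sum y = n b_0 + b_1 \sum x$$
$$\sum xy = b_0 \sum x + b_1 \sum x^2$$
n = number of samples = 25
The quantities required above are calculated below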
x | y | x² | xy |
850 | 1950 | 722500 | 1657500 |
1450 | 2575 | 2102500 | 3733750 |
1085 | 2225 | 1177225 | 2414125 |
1232 | 2500 | 1517824 | 3080000 |
718 | 1975 | 515524 | 1418050 |
1495 | 2675 | 2235025 | 3999125 |
1126 | 2675 | 1267876 | 3012050 |
726 | 1935 | 527076 | 1404810 |
700 | 1875 | 490000 | 1312500 |
966 | 2150 | 933156 | 2076900 |
1110 | 2400 | 1232100 | 2664000 |
1285 | 2650 | 1651225 | 3405250 |
1985 | 3275 | 3940225 | 6500875 |
1359 | 2775 | 1846881 | 3771225 |
1165 | 2425 | 1357225 | 2825125 |
1235 | 2475 | 1525225 | 3056625 |
1255 | 2075 | 1575025 | 2604125 |
1269 | 2725 | 1610361 | 3458025 |
1160 | 2200 | 1345600 | 2552000 |
896 | 2175 | 802816 | 1948800 |
1351 | 2625 | 1825201 | 3546375 |
1040 | 2650 | 1081600 | 2756000 |
765 | 2175 | 585225 | 1663875 |
1000 | 1825 | 1000000 | 1825000 |
1200 | 2775 | 1440000 | 3330000 |
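Summing the columns of the 25 observations gives $\sum x = 28423$, $\sum y = 59760$, $\sum x^2 = 34307415$ and $\sum xy = 70016110$. Plugging these values into the equations, we have
$$59760 = 25 b_0 + 28423 b_1 \quad (1)$$
$$70016110 = 28423 b_0 + 34307415 b_1 \quad (2)$$
Multiplying equation (1) by 28423 and equation (2) by 25, we have
$$1698558480 = 710575 b_0 + 807866929 b_1 \quad (3)$$
$$1750402750 = 710575 b_0 + 857685375 b_1 \quad (4)$$
Subtracting (3) from (4), we have
$$51844270 = 49818446 b_1, \quad b_1 = \frac{51844270}{49818446} \approx 1.0407$$
Substituting the unrounded slope $b_1 \approx 1.040664$ in equation (1), we have
$$b_0 = \frac{59760 - 28423 \times 1.040664}{25} \approx 1207.2481$$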
Hence, the simple linear equation to predict the rent is given as
Rent = 1207.2481 + 1.0407 Size
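The elimination above can be cross-checked numerically. The following is a minimal sketch, not part of the original solution: the names `sizes`, `rents` and `fit_line` are illustrative assumptions, and the closed-form expressions are what you get by eliminating the intercept from the two simultaneous equations.

```python
# Apartment data from the question: size in sq. ft (x) and monthly rent in $ (y).
sizes = [850, 1450, 1085, 1232, 718, 1495, 1126, 726, 700, 966, 1110, 1285, 1985,
         1359, 1165, 1235, 1255, 1269, 1160, 896, 1351, 1040, 765, 1000, 1200]
rents = [1950, 2575, 2225, 2500, 1975, 2675, 2675, 1935, 1875, 2150, 2400, 2650, 3275,
         2775, 2425, 2475, 2075, 2725, 2200, 2175, 2625, 2650, 2175, 1825, 2775]


def fit_line(x, y):
    """Solve the two simultaneous equations for the intercept and the slope."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xx = sum(v * v for v in x)
    sum_xy = sum(a * b for a, b in zip(x, y))
    # Eliminating the intercept (equation (4) minus equation (3) in the text).
    slope = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
    # Back-substitute the slope into equation (1).
    intercept = (sum_y - slope * sum_x) / n
    return intercept, slope


intercept, slope = fit_line(sizes, rents)
print(f"Rent = {intercept:.4f} + {slope:.4f} * Size")
```

Working from the raw sums rather than rounded intermediate values keeps the intercept consistent with the unrounded slope.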
Hence, the predicted rent ŷ for each observed size in the sample is calculated as below
x | y | ŷ |
850 | 1950 | 2091.8431 |
1450 | 2575 | 2716.2631 |
1085 | 2225 | 2336.4076 |
1232 | 2500 | 2489.3905 |
718 | 1975 | 1954.4707 |
1495 | 2675 | 2763.0946 |
1126 | 2675 | 2379.0763 |
726 | 1935 | 1962.7963 |
700 | 1875 | 1935.7381 |
966 | 2150 | 2212.5643 |
1110 | 2400 | 2362.4251 |
1285 | 2650 | 2544.5476 |
1985 | 3275 | 3273.0376 |
1359 | 2775 | 2621.5594 |
1165 | 2425 | 2419.6636 |
1235 | 2475 | 2492.5126 |
1255 | 2075 | 2513.3266 |
1269 | 2725 | 2527.8964 |
1160 | 2200 | 2414.4601 |
896 | 2175 | 2139.7153 |
1351 | 2625 | 2613.2338 |
1040 | 2650 | 2289.5761 |
765 | 2175 | 2003.3836 |
1000 | 1825 | 2247.9481 |
1200 | 2775 | 2456.0881 |
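Null hypothesis:
H0: There is no linear relationship between the size of the apartment and the monthly rent, $\beta_1 = 0$
Alternative hypothesis:
H1: There is a linear relationship between the size of the apartment and the monthly rent, $\beta_1 \neq 0$
Level of significance:
$\alpha = 0.05$
Test statistic:
$$t = \frac{b_1 - \beta_1}{S_{b_1}}, \quad S_{b_1} = \frac{S_{YX}}{\sqrt{\sum (x - \bar{x})^2}}, \quad S_{YX} = \sqrt{\frac{\sum (y - \hat{y})^2}{n - 2}}$$
where $\bar{x}$ is the mean of x (28423 / 25 = 1136.92), $\hat{y}$ is the predicted rent, and $\beta_1 = 0$ under H0. Therefore, the above deviations are calculated as below
x | y | ŷ | x - x̄ | (x - x̄)² | y - ŷ | (y - ŷ)² |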
850 | 1950 | 2091.8431 | -286.92 | 82323.09 | -141.843 | 20119.47 |
1450 | 2575 | 2716.2631 | 313.08 | 98019.09 | -141.263 | 19955.26 |
1085 | 2225 | 2336.4076 | -51.92 | 2695.686 | -111.408 | 12411.65 |
1232 | 2500 | 2489.3905 | 95.08 | 9040.206 | 10.6095 | 112.5615 |
718 | 1975 | 1954.4707 | -418.92 | 175494 | 20.5293 | 421.4522 |
1495 | 2675 | 2763.0946 | 358.08 | 128221.3 | -88.0946 | 7760.659 |
1126 | 2675 | 2379.0763 | -10.92 | 119.2464 | 295.9237 | 87570.84 |
726 | 1935 | 1962.7963 | -410.92 | 168855.2 | -27.7963 | 772.6343 |
700 | 1875 | 1935.7381 | -436.92 | 190899.1 | -60.7381 | 3689.117 |
966 | 2150 | 2212.5643 | -170.92 | 29213.65 | -62.5643 | 3914.292 |
1110 | 2400 | 2362.4251 | -26.92 | 724.6864 | 37.5749 | 1411.873 |
1285 | 2650 | 2544.5476 | 148.08 | 21927.69 | 105.4524 | 11120.21 |
1985 | 3275 | 3273.0376 | 848.08 | 719239.7 | 1.9624 | 3.851014 |
1359 | 2775 | 2621.5594 | 222.08 | 49319.53 | 153.4406 | 23544.02 |
1165 | 2425 | 2419.6636 | 28.08 | 788.4864 | 5.3364 | 28.47716 |
1235 | 2475 | 2492.5126 | 98.08 | 9619.686 | -17.5126 | 306.6912 |
1255 | 2075 | 2513.3266 | 118.08 | 13942.89 | -438.327 | 192130.2 |
1269 | 2725 | 2527.8964 | 132.08 | 17445.13 | 197.1036 | 38849.83 |
1160 | 2200 | 2414.4601 | 23.08 | 532.6864 | -214.46 | 45993.13 |
896 | 2175 | 2139.7153 | -240.92 | 58042.45 | 35.2847 | 1245.01 |
1351 | 2625 | 2613.2338 | 214.08 | 45830.25 | 11.7662 | 138.4435 |
1040 | 2650 | 2289.5761 | -96.92 | 9393.486 | 360.4239 | 129905.4 |
765 | 2175 | 2003.3836 | -371.92 | 138324.5 | 171.6164 | 29452.19 |
1000 | 1825 | 2247.9481 | -136.92 | 18747.09 | -422.948 | 178885.1 |
1200 | 2775 | 2456.0881 | 63.08 | 3979.086 | 318.9119 | 101704.8 |
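Summing the squared-deviation columns of the table gives $\sum (x - \bar{x})^2 \approx 1992737.84$ and $\sum (y - \hat{y})^2 \approx 911447.2$. Applying these values to the standard error (SE) formula, we have
$$S_{YX} = \sqrt{\frac{911447.2}{25 - 2}} \approx 199.07, \quad S_{b_1} = \frac{199.07}{\sqrt{1992737.84}} \approx 0.141$$
Plugging the SE into the t-statistic, we have
$$t = \frac{b_1 - 0}{S_{b_1}} = \frac{1.0407}{0.141} = 7.3809$$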
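The standard error and t-statistic can be reproduced from the raw data. The following is a minimal sketch under the usual simple-regression formulas; the function name `slope_t_test` and the variable names are illustrative assumptions. Because it carries unrounded values throughout, it yields a t-statistic of about 7.38, marginally different from the 7.3809 obtained from the rounded slope and SE.

```python
import math

# Apartment data from the question: size in sq. ft (x) and monthly rent in $ (y).
sizes = [850, 1450, 1085, 1232, 718, 1495, 1126, 726, 700, 966, 1110, 1285, 1985,
         1359, 1165, 1235, 1255, 1269, 1160, 896, 1351, 1040, 765, 1000, 1200]
rents = [1950, 2575, 2225, 2500, 1975, 2675, 2675, 1935, 1875, 2150, 2400, 2650, 3275,
         2775, 2425, 2475, 2075, 2725, 2200, 2175, 2625, 2650, 2175, 1825, 2775]


def slope_t_test(x, y):
    """Return (slope, standard error of the slope, t-statistic for H0: slope = 0)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    ssx = sum((v - mean_x) ** 2 for v in x)  # sum of squared x deviations
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / ssx
    intercept = mean_y - slope * mean_x
    # Sum of squared residuals around the fitted line.
    sse = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    s_yx = math.sqrt(sse / (n - 2))          # standard error of the estimate
    se_slope = s_yx / math.sqrt(ssx)         # standard error of the slope
    return slope, se_slope, slope / se_slope


slope, se_slope, t_stat = slope_t_test(sizes, rents)
print(f"slope = {slope:.4f}, SE = {se_slope:.4f}, t = {t_stat:.4f}")
```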
Critical Value:
The critical value of the t-statistic is taken from the Student's t-table at the 5% level of significance with (n - 2) degrees of freedom for a two-tailed test. For the above hypothesis we have n - 2 = 25 - 2 = 23 degrees of freedom. Hence, at the 5% level of significance for a two-tailed test with 23 degrees of freedom, the t-value from the Student's t distribution table is 2.069. (It can also be calculated in Excel with the function TINV. The inputs are the level of significance, 0.05, and the degrees of freedom, 23, so the function is =TINV(0.05,23).)
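As an alternative to the table or Excel, the critical value can be approximated with the Python standard library alone. The following is a rough sketch, not the method used in the solution: it integrates the Student's t density with Simpson's rule and inverts the cumulative probability by bisection. The helper names `t_pdf`, `t_cdf` and `t_critical_two_tailed` are illustrative assumptions.

```python
import math


def t_pdf(t, df):
    """Density of the Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)


def t_cdf(t, df, steps=2000):
    """P(T <= t), using Simpson's rule on [0, |t|] and the symmetry of the density."""
    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area


def t_critical_two_tailed(alpha, df):
    """Find t such that P(|T| > t) = alpha, by bisection on the CDF."""
    low, high = 0.0, 50.0
    target = 1 - alpha / 2
    for _ in range(60):
        mid = (low + high) / 2
        if t_cdf(mid, df) < target:
            low = mid
        else:
            high = mid
    return (low + high) / 2


t_crit = t_critical_two_tailed(0.05, 23)
print(f"critical t (alpha = 0.05, df = 23): {t_crit:.3f}")
```

A statistics library would normally be used for this; the hand-rolled version is shown only to keep the example dependency-free.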
Inference:
The calculated t-value of 7.3809 is greater than the critical t-value of 2.069. Therefore, the null hypothesis is rejected, and we conclude that there is evidence of a linear relationship between the size of the apartment and the monthly rent.
b)
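The 95% confidence interval for the slope is constructed as below
From part (a) we have
$b_1$ = 1.0407 (estimated slope)
t for 95% CI with 23 degrees of freedom = 2.069
$S_{b_1}$ = 0.141 (standard error of the slope)
Therefore,
$$b_1 \pm t \, S_{b_1} = 1.0407 \pm 2.069 \times 0.141 = 1.0407 \pm 0.2917$$
$$0.7490 \le \beta_1 \le 1.3324$$
Since the interval does not contain 0, it agrees with the conclusion in part (a).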
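The interval is the estimated slope plus or minus the critical t-value times the standard error of the slope. A minimal sketch of that arithmetic, using the three rounded values listed above as inputs (the function name `slope_confidence_interval` is an illustrative assumption):

```python
def slope_confidence_interval(slope, se_slope, t_crit):
    """Return (lower, upper) limits: slope -/+ t_crit * se_slope."""
    margin = t_crit * se_slope
    return slope - margin, slope + margin


# Rounded values from the solution: slope, its standard error, and t(0.025, 23).
lower, upper = slope_confidence_interval(1.0407, 0.141, 2.069)
print(f"{lower:.4f} <= beta1 <= {upper:.4f}")
```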