A real estate investor wants to study how the annual return on his commercial retail shops (measured in thousands of dollars) relates to their location and the number of homes near the shops. Specifically, the investor has collected data on the annual return of the shops, the number of households within 15 miles of the shops (measured in thousands), and the location of the shops (whether the shops are in a suburban area, near a shopping mall, or downtown). The annual return data can be found in the file “RealEstate.csv” in D2L. As demonstrated in the lecture, please create a subset of size 18 and perform your statistical analysis on that subset. Please note that the subset should be a random sample of the given data.
(a) State the mean of the response.
(b) Is the multiple linear regression model useful for prediction? Show details. Use α = 0.05.
(c) Provide detailed interpretations of b1, b2, and b3 in the context of the problem.
(d) Use your estimated regression equation to predict the annual return for a shop in a mall with 120,000 households near the shop.
Shop   Annual Return ($1000s)   Number of Households (1000s)   Location
3   245.81   232   Mall
4   137.07   108   Mall
5   207.36   220   Suburban
6   146.12   150   Suburban
8   188.19   198   Suburban
9   152.23   149   Downtown
10   182.23   192   Suburban
11   198.88   179   Mall
13   156.22   130   Mall
15   195.62   199   Downtown
16   210.38   224   Suburban
17   209.16   215   Downtown
18   260.82   250   Mall
20   127.66   129   Suburban
22   219.93   203   Mall
23   166.61   166   Downtown
27   219.67   227   Downtown
29   232.32   217   Mall
A)
The response is Annual Return ($1000s).
mean = sum(Yi)/n = 3456.28/18
= 192.0155556, i.e. about $192,016
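As a quick check on the arithmetic, here is a minimal Python sketch that recomputes the sample mean from the 18 annual returns listed in the table. The variable names are illustrative only.

```python
# Annual returns ($1000s) for the 18 shops in the subset table
returns = [
    245.81, 137.07, 207.36, 146.12, 188.19, 152.23,
    182.23, 198.88, 156.22, 195.62, 210.38, 209.16,
    260.82, 127.66, 219.93, 166.61, 219.67, 232.32,
]

n = len(returns)
mean_return = sum(returns) / n  # sample mean = sum(Yi) / n

print(f"n = {n}, mean = {mean_return:.4f} ($1000s)")
```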
B)
H0: the model is not significant (all slope coefficients are zero) vs. H1: the model is significant (at least one slope coefficient is nonzero).
With F = 109664 and p-value < 0.05, I reject the null hypothesis at the 5% level of significance and conclude that the model is significant, so it is useful for prediction.
The coefficient of determination, R-squared, is 99.99%, indicating the model is a good fit to the data: 99.99% of the variation in annual return is explained by the number of households and location.
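The overall F test and R-squared can be reproduced from the table with a short least-squares fit. The sketch below is one way to do it in plain Python, assuming Downtown is the baseline category with 0/1 indicators for Mall and Suburban. The `solve` helper and the critical value 3.34 (a standard F-table value for 3 and 14 degrees of freedom at the 5% level) are additions for illustration, not part of the original answer, and rounding may differ slightly from software output.

```python
# Subset data: (annual return in $1000s, households in 1000s, location)
data = [
    (245.81, 232, "Mall"), (137.07, 108, "Mall"), (207.36, 220, "Suburban"),
    (146.12, 150, "Suburban"), (188.19, 198, "Suburban"), (152.23, 149, "Downtown"),
    (182.23, 192, "Suburban"), (198.88, 179, "Mall"), (156.22, 130, "Mall"),
    (195.62, 199, "Downtown"), (210.38, 224, "Suburban"), (209.16, 215, "Downtown"),
    (260.82, 250, "Mall"), (127.66, 129, "Suburban"), (219.93, 203, "Mall"),
    (166.61, 166, "Downtown"), (219.67, 227, "Downtown"), (232.32, 217, "Mall"),
]

def solve(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    size = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, size):
            factor = M[r][col] / M[col][col]
            for c in range(col, size + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * size
    for i in reversed(range(size)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, size))
        x[i] = (M[i][size] - tail) / M[i][i]
    return x

# Design matrix: intercept, households, Mall dummy, Suburban dummy
X = [[1.0, hh, float(loc == "Mall"), float(loc == "Suburban")] for _, hh, loc in data]
y = [ret for ret, _, _ in data]
n, p = len(y), len(X[0])

# Normal equations: (X'X) b = X'y
XtX = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
b = solve(XtX, Xty)

y_bar = sum(y) / n
fitted = [sum(bj * xj for bj, xj in zip(b, row)) for row in X]
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # error sum of squares
sst = sum((yi - y_bar) ** 2 for yi in y)                # total sum of squares
ssr = sst - sse                                         # regression sum of squares

k = p - 1                                  # number of predictors (3)
f_stat = (ssr / k) / (sse / (n - p))       # F = MSR / MSE with (3, 14) df
r_squared = ssr / sst

F_CRIT = 3.34  # assumed table value for F(0.05; 3, 14)
print("coefficients:", [round(v, 4) for v in b])
print("F =", round(f_stat, 1), "R^2 =", round(r_squared, 6))
print("reject H0:", f_stat > F_CRIT)
```

Because the observed F is several orders of magnitude above the critical value, the conclusion does not depend on small rounding differences.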

C)
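The regression equation is: annual return = 21.86 + 0.872*(number of households) + 21.05*Mall - 6.69*Suburban, where Mall and Suburban are 0/1 indicator variables and Downtown is the baseline location.
b1: For each additional 1,000 households, the annual return increases by about $872, holding location fixed.
b2: For a mall location, the annual return is about $21,052 higher than for a downtown location with the same number of households.
b3: For a suburban location, the annual return is about $6,695 lower than for a downtown location with the same number of households.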
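To see why the coefficients read this way, the following sketch plugs the reported (rounded) coefficients into a small prediction function and takes differences. The function name `predict` and the reference value of 150 thousand households are arbitrary choices for illustration.

```python
# Coefficients as reported in the answer (units: $1000s); Downtown is the baseline
B0, B_HH, B_MALL, B_SUB = 21.86, 0.872, 21.05, -6.69

def predict(households_thousands, location):
    """Predicted annual return ($1000s) from the reported equation."""
    mall = 1 if location == "Mall" else 0
    suburban = 1 if location == "Suburban" else 0
    return B0 + B_HH * households_thousands + B_MALL * mall + B_SUB * suburban

hh = 150  # any fixed household count gives the same differences

# One more thousand households, same location: the difference is the slope
per_1000_households = predict(hh + 1, "Downtown") - predict(hh, "Downtown")

# Same households, different location: the difference is the dummy coefficient
mall_vs_downtown = predict(hh, "Mall") - predict(hh, "Downtown")
suburban_vs_downtown = predict(hh, "Suburban") - predict(hh, "Downtown")

print(round(per_1000_households, 3),
      round(mall_vs_downtown, 3),
      round(suburban_vs_downtown, 3))
```

The dollar figures quoted in the interpretations ($21,052 and $6,695) come from the unrounded software coefficients, so they differ slightly from the rounded values used here.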
D)
When the location is a mall and the number of households near the shop is 120 thousand, the predicted annual return is 21.86 + 0.8723*120 + 21.05 = 147.586 ($1000s), i.e. about $147,586.
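The prediction can be checked with a few lines of Python. This sketch only repeats the arithmetic above, with the unit conversions made explicit; the variable names are illustrative.

```python
# Reported coefficients ($1000s); the slope uses the 4-decimal value from part D
intercept, slope_households, mall_effect = 21.86, 0.8723, 21.05

households = 120_000
households_thousands = households / 1000  # the model expects households in 1000s

# Mall shop: Mall = 1, Suburban = 0
predicted_thousands = intercept + slope_households * households_thousands + mall_effect
predicted_dollars = predicted_thousands * 1000

print(f"{predicted_thousands:.3f} ($1000s) = ${predicted_dollars:,.0f}")
```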