Price (Y) | SQFT (X) |
$3,198,000 | 2,089 sqft |
$1,450,000 | 1,961 sqft |
$5,375,000 | 4,415 sqft |
$3,648,000 | 2,083 sqft |
$5,195,000 | 5,352 sqft |
$5,345,000 | 3,458 sqft |
$5,495,000 | 2,763 sqft |
$16,300,000 | 11,438 sqft |
$7,395,000 | 4,847 sqft |
$9,995,000 | 5,007 sqft |
$13,500,000 | 5,639 sqft |
$11,995,000 | 6,306 sqft |
$5,245,000 | 2,892 sqft |
$6,350,000 | 2,626 sqft |
$7,199,000 | 4,346 sqft |
$5,498,000 | 1,814 sqft |
$23,000,000 | 9,000 sqft |
$6,890,000 | 6,195 sqft |
$37,500,000 | 16,450 sqft |
$16,950,000 | 5,269 sqft |
$13,995,000 | 5,047 sqft |
$7,400,000 | 4,173 sqft |
$10,900,000 | 5,134 sqft |
$33,400,000 | 17,000 sqft |
$18,500,000 | 11,872 sqft |
$18,995,000 | 6,424 sqft |
$10,995,000 | 7,451 sqft |
$11,995,000 | 6,244 sqft |
$10,995,000 | 6,495 sqft |
$23,995,000 | 9,200 sqft |
$2,648,000 | 2,204 sqft |
$18,900,000 | 5,880 sqft |
$39,995,000 | 15,328 sqft |
$9,500,000 | 1,518 sqft |
$29,000,000 | 11,491 sqft |
Using the data collected, find the relationship between any two quantitative variables.
a) Perform a correlation analysis.
b) Perform a regression analysis. Develop a linear regression mathematical model that would help make a prediction based on a given dependent variable.
c) Discuss the findings of your regression analysis. Interpret the slope and the coefficient of determination in the context of the problem.
d) Provide the scatter plot. (20)
Excel > Data > Data Analysis > Regression
SUMMARY OUTPUT
Regression Statistics
Multiple R | 0.913814181 |
R Square | 0.835056357 |
Adjusted R Square | 0.830058065 |
Standard Error | 4085011.67 |
Observations | 35 |
ANOVA
Source | df | SS | MS | F | Significance F |
Regression | 1 | 2.78792E+15 | 2.78792E+15 | 167.0683352 | 1.82875E-14 |
Residual | 33 | 5.50682E+14 | 1.66873E+13 |
Total | 34 | 3.3386E+15 |
Term | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% | Lower 95.0% | Upper 95.0% |
Intercept | -620067.5902 | 1266733.768 | -0.489501114 | 0.627725157 | -3197256.819 | 1957121.639 | -3197256.819 | 1957121.639 |
SQFT(X) | 2189.673105 | 169.4073354 | 12.92549168 | 1.82875E-14 | 1845.01129 | 2534.33492 | 1845.01129 | 2534.33492 |
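As a cross-check on the Excel output, the same fit can be reproduced with a short Python sketch. It assumes the prices and sizes are paired in the order listed (first price with first size, and so on). The names `prices`, `sqft`, and `fit_line` are illustrative and not part of the original worksheet.

```python
import math

# Listing prices in dollars (Y) and sizes in square feet (X),
# assumed to be paired in the order they appear in the data.
prices = [
    3_198_000, 1_450_000, 5_375_000, 3_648_000, 5_195_000, 5_345_000, 5_495_000,
    16_300_000, 7_395_000, 9_995_000, 13_500_000, 11_995_000, 5_245_000, 6_350_000,
    7_199_000, 5_498_000, 23_000_000, 6_890_000, 37_500_000, 16_950_000, 13_995_000,
    7_400_000, 10_900_000, 33_400_000, 18_500_000, 18_995_000, 10_995_000, 11_995_000,
    10_995_000, 23_995_000, 2_648_000, 18_900_000, 39_995_000, 9_500_000, 29_000_000,
]
sqft = [
    2089, 1961, 4415, 2083, 5352, 3458, 2763,
    11438, 4847, 5007, 5639, 6306, 2892, 2626,
    4346, 1814, 9000, 6195, 16450, 5269, 5047,
    4173, 5134, 17000, 11872, 6424, 7451, 6244,
    6495, 9200, 2204, 5880, 15328, 1518, 11491,
]


def fit_line(x, y):
    """Return (slope, intercept, r) for a simple least-squares fit of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products about the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


slope, intercept, r = fit_line(sqft, prices)
print(f"slope     = {slope:.4f}")
print(f"intercept = {intercept:.4f}")
print(f"r = {r:.4f}, r^2 = {r * r:.4f}")
```

If the pairing assumption holds, the printed slope, intercept, and r should agree closely with the Coefficients and Multiple R values in the Excel summary.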
a)
Correlation coefficient (r) = 0.9138
Hypothesis:
H0: ρ = 0
HA: ρ ≠ 0
Test:
t stat = r*SQRT((n-2)/(1-r^2)) = 0.9138*SQRT((35-2)/(1-0.9138^2)) = 12.924
P value = 1.83E-14 (effectively 0), with df = n-2 = 33
P value < 0.05, reject H0
There is enough evidence to conclude that there is a significant linear relationship between sqft (x) and price (y).
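The test statistic in part a) can be checked with a few lines of Python. This is a minimal sketch: `correlation_t_stat` is an illustrative helper name, and the critical value is assumed from a standard t table (two-tailed, alpha = 0.05, df = 33) rather than computed.

```python
import math


def correlation_t_stat(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


r, n = 0.9138, 35
t_stat = correlation_t_stat(r, n)

# Two-tailed critical value for alpha = 0.05 and df = 33, taken from a
# standard t table (an assumed lookup, not computed here).
T_CRITICAL = 2.0345

print(f"t = {t_stat:.3f}, df = {n - 2}")
print("reject H0" if abs(t_stat) > T_CRITICAL else "fail to reject H0")
```

The statistic is far beyond the critical value, which is consistent with the very small P value reported by Excel.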
b)
Hypothesis:
H0: β1 = 0
HA: β1 ≠ 0
Test:
F stat = 167.07 (df = 1, 33)
P value (Significance F) = 1.83E-14 (effectively 0)
P value < 0.05, reject H0
Therefore the regression model is statistically significant.
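The F statistic can be rebuilt from the ANOVA sums of squares, and in a regression with one predictor it should equal the square of the slope's t statistic. The sketch below only re-uses numbers copied from the Excel output; the variable names are illustrative.

```python
# Values copied from the ANOVA and coefficient tables in the Excel output.
ss_regression = 2.78792e15
ss_residual = 5.50682e14
df_regression = 1
df_residual = 33
t_slope = 12.92549168

# Mean squares are sums of squares divided by their degrees of freedom.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual
f_stat = ms_regression / ms_residual

print(f"F   = {f_stat:.2f}")
print(f"t^2 = {t_slope ** 2:.2f}")
```

Because F equals t squared here, the F test in part b) and the t test on the slope lead to the same conclusion and the same P value.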
Regression equation:
Price (Y) = -620067.5902 + 2189.6731*SQFT (X)
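To illustrate how the equation would be used for prediction, here is a small sketch. The function name `predict_price` and the 5,000 sqft input are hypothetical choices for illustration; the coefficients are the ones reported above.

```python
# Coefficients from the regression equation above.
INTERCEPT = -620067.5902
SLOPE = 2189.6731


def predict_price(sqft):
    """Predicted listing price in dollars for a home of the given size."""
    return INTERCEPT + SLOPE * sqft


example_sqft = 5000  # hypothetical size, inside the observed range of the data
print(f"Predicted price for {example_sqft} sqft: ${predict_price(example_sqft):,.2f}")
```

For a 5,000 sqft home this gives a predicted price of about $10,328,297.91. Predictions are most trustworthy for sizes within the observed range of roughly 1,518 to 17,000 sqft.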
c)
Slope interpretation:
For each additional square foot, the predicted price increases by about $2,189.67 on average.
Coefficient of determination (r^2) = 0.8351
83.51% of the variation in price is explained by the linear regression on sqft; the remaining 16.49% is left unexplained by the model.
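The coefficient of determination can be verified in two ways from the Excel output: as the share of the total sum of squares captured by the regression, and as the square of Multiple R. A brief sketch, with illustrative variable names:

```python
# Values copied from the Excel summary output.
ss_regression = 2.78792e15
ss_total = 3.3386e15
multiple_r = 0.913814181

# Share of total variation in price explained by the regression on sqft.
r_squared = ss_regression / ss_total
unexplained = 1 - r_squared

print(f"R^2 from sums of squares = {r_squared:.4f}")
print(f"R^2 from Multiple R      = {multiple_r ** 2:.4f}")
print(f"Unexplained share        = {unexplained:.4f}")
```

Both routes give about 0.8351; small differences in later decimals come from the rounded sums of squares shown in the output.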
d)
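The scatter plot itself is not reproduced here. One way to draw it is sketched below with matplotlib (an assumption, since the original analysis was done in Excel). The data are assumed to be paired in the listed order, and `line_endpoints` and `draw_scatter` are illustrative names.

```python
# Prices in dollars (Y) and sizes in square feet (X), assumed paired in listed order.
prices = [
    3_198_000, 1_450_000, 5_375_000, 3_648_000, 5_195_000, 5_345_000, 5_495_000,
    16_300_000, 7_395_000, 9_995_000, 13_500_000, 11_995_000, 5_245_000, 6_350_000,
    7_199_000, 5_498_000, 23_000_000, 6_890_000, 37_500_000, 16_950_000, 13_995_000,
    7_400_000, 10_900_000, 33_400_000, 18_500_000, 18_995_000, 10_995_000, 11_995_000,
    10_995_000, 23_995_000, 2_648_000, 18_900_000, 39_995_000, 9_500_000, 29_000_000,
]
sqft = [
    2089, 1961, 4415, 2083, 5352, 3458, 2763,
    11438, 4847, 5007, 5639, 6306, 2892, 2626,
    4346, 1814, 9000, 6195, 16450, 5269, 5047,
    4173, 5134, 17000, 11872, 6424, 7451, 6244,
    6495, 9200, 2204, 5880, 15328, 1518, 11491,
]

# Coefficients from the regression equation in part b).
INTERCEPT = -620067.5902
SLOPE = 2189.6731


def line_endpoints(x_values):
    """Return ([x_min, x_max], [y_at_x_min, y_at_x_max]) for the fitted line."""
    xs = [min(x_values), max(x_values)]
    ys = [INTERCEPT + SLOPE * x for x in xs]
    return xs, ys


def draw_scatter(path="scatter.png"):
    """Save a price-vs-size scatter plot with the fitted line (requires matplotlib)."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file without needing a display
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.scatter(sqft, prices, label="Listings")
    xs, ys = line_endpoints(sqft)
    ax.plot(xs, ys, color="red", label="Fitted line")
    ax.set_xlabel("Size (sqft)")
    ax.set_ylabel("Price ($)")
    ax.set_title("Price vs. size")
    ax.legend()
    fig.savefig(path)
    plt.close(fig)
```

Calling `draw_scatter()` writes the figure to `scatter.png`. The points should rise from lower left to upper right around the fitted line, in keeping with the strong positive correlation found in part a).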