Please use Excel and show all work and formulas.
| Size (1000s sq. ft) | Selling Price ($1000s) |
|---|---|
| 1.26 | 117.5 |
| 3.02 | 299.9 |
| 1.99 | 139.0 |
| 0.91 | 45.6 |
| 1.87 | 129.9 |
| 2.63 | 274.9 |
| 2.60 | 259.9 |
| 2.27 | 177.0 |
| 2.30 | 175.0 |
| 2.08 | 189.9 |
| 1.12 | 95.0 |
| 1.38 | 82.1 |
| 1.80 | 169.0 |
| 1.57 | 96.5 |
| 1.45 | 114.9 |

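We run a simple linear regression in Excel to model the linear relationship between Size and Selling Price, regressing the dependent (response) variable 'Selling Price' on the independent (predictor) variable 'Size'. A regression of this kind describes association; on its own it does not establish causation.

The key values from the output, used in the parts below, are: slope = 115.091, t statistic for the slope = 10.668, F = 113.81, coefficient of determination = 0.897, correlation coefficient = 0.947, and standard error of regression = 24.60.
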
From the output,
a. P-value for the t test for the slope = 0.000 to three decimals, i.e. < 0.001 (for t = 10.668). P-value for the overall F test = 0.000 to three decimals, i.e. < 0.001 (for F = 113.81).
Both the slope and the overall model are significant at the 5% level. Because Size is measured in 1000s of square feet, the estimated slope coefficient implies that the average selling price of a house increases by 115.091 ($1000s) for every additional 1000 square feet of size, which is roughly $115 per additional square foot.
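The slope, its t statistic, and the overall F statistic can be cross-checked outside Excel. The following is a minimal Python sketch, assuming ordinary least squares on the 15 data pairs from the table; the function name `slope_tests` is made up for illustration and is not part of the original work.

```python
import math

# Data from the table: size (1000s sq. ft) and selling price ($1000s).
size = [1.26, 3.02, 1.99, 0.91, 1.87, 2.63, 2.60, 2.27,
        2.30, 2.08, 1.12, 1.38, 1.80, 1.57, 1.45]
price = [117.5, 299.9, 139.0, 45.6, 129.9, 274.9, 259.9, 177.0,
         175.0, 189.9, 95.0, 82.1, 169.0, 96.5, 114.9]


def slope_tests(x, y):
    """Least-squares fit of y = b0 + b1*x plus the slope t and overall F statistics."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    ss_x = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / ss_x                      # slope
    b0 = y_bar - b1 * x_bar               # intercept
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    ssr = sum((b0 + b1 * xi - y_bar) ** 2 for xi in x)
    mse = sse / (n - 2)                   # residual mean square, n - 2 df
    t_stat = b1 / math.sqrt(mse / ss_x)   # t statistic for H0: slope = 0
    f_stat = ssr / mse                    # overall F statistic (1 and n - 2 df)
    return b0, b1, t_stat, f_stat


b0, b1, t_stat, f_stat = slope_tests(size, price)
print(f"intercept = {b0:.3f}, slope = {b1:.3f}")
print(f"t = {t_stat:.3f}, F = {f_stat:.2f}")
# The slope is in $1000s per 1000 sq. ft, which is the same as dollars per sq. ft.
print(f"about ${b1:.0f} per additional square foot")
```

In simple linear regression the F statistic equals the square of the slope's t statistic, which is why both tests give the same P-value.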
b. The coefficient of determination, a measure of goodness of fit, is r² = 0.897. This means that the predictor 'Size' explains about 89.7% of the total variation in 'Selling Price'. Because r² is close to 1, the model fits the data well.
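As a cross-check on the Excel value, r² can be computed as 1 − SSE/SST. This is a small Python sketch under the same least-squares assumptions; `r_squared` is a hypothetical helper name.

```python
# Data from the table: size (1000s sq. ft) and selling price ($1000s).
size = [1.26, 3.02, 1.99, 0.91, 1.87, 2.63, 2.60, 2.27,
        2.30, 2.08, 1.12, 1.38, 1.80, 1.57, 1.45]
price = [117.5, 299.9, 139.0, 45.6, 129.9, 274.9, 259.9, 177.0,
         175.0, 189.9, 95.0, 82.1, 169.0, 96.5, 114.9]


def r_squared(x, y):
    """Coefficient of determination for the least-squares line of y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    ss_x = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / ss_x
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))  # unexplained
    sst = sum((yi - y_bar) ** 2 for yi in y)                       # total
    return 1 - sse / sst


r2 = r_squared(size, price)
print(f"r^2 = {r2:.3f} ({r2:.1%} of the variation in price explained)")
```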
c. The correlation coefficient (R) is a quantitative measure of the strength and direction of the linear relationship. It ranges from -1 to 1, with negative and positive values indicating a negative and a positive linear relationship respectively. Values close to -1 or 1 indicate a strong linear relationship, and values close to zero indicate a weak or no linear relationship. Here, R = 0.947. This may be interpreted as: there is a strong positive linear relationship between Selling Price and Size. As the size increases, the selling price tends to increase as well.
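The correlation coefficient can be verified directly from its definition, Sxy / sqrt(SSx · SSy). The sketch below is one way to do that in plain Python; `pearson_r` is an illustrative name, not something taken from the Excel output.

```python
import math

# Data from the table: size (1000s sq. ft) and selling price ($1000s).
size = [1.26, 3.02, 1.99, 0.91, 1.87, 2.63, 2.60, 2.27,
        2.30, 2.08, 1.12, 1.38, 1.80, 1.57, 1.45]
price = [117.5, 299.9, 139.0, 45.6, 129.9, 274.9, 259.9, 177.0,
         175.0, 189.9, 95.0, 82.1, 169.0, 96.5, 114.9]


def pearson_r(x, y):
    """Pearson correlation coefficient between x and y."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_x = sum((xi - x_bar) ** 2 for xi in x)
    ss_y = sum((yi - y_bar) ** 2 for yi in y)
    return s_xy / math.sqrt(ss_x * ss_y)


r = pearson_r(size, price)
print(f"R = {r:.3f}, R squared = {r * r:.3f}")
```

With a single predictor, the square of this value equals the coefficient of determination from part b.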
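d. At x = 2, the 95% CI for the mean selling price is computed using the formula:

$$\hat{y} \pm t_{\alpha/2,\,n-2}\; S_{yx}\sqrt{\frac{1}{n} + \frac{(x_0-\bar{x})^2}{SS_x}}$$

Computing predicted value of y for x = 2:
ŷ = -59.0156 + 115.0914(2) ≈ 171.167
Critical value of t:
t(0.025, 13) = 2.1604 (Excel: =T.INV.2T(0.05, 13))
SE = Standard Error of regression (Syx) = 24.60, with n = 15, x̄ = 1.8833 and SSx = 5.2013
Constructing the Lower and Upper limits:
171.167 ± 2.1604 × 24.60 × sqrt(1/15 + (2 − 1.8833)²/5.2013) ≈ 171.167 ± 13.99
We get 95% CI = (157.18, 185.16)
Hence, the 95% confidence interval for the mean value of the selling price of a 2000 square foot house in Winston-Salem, North Carolina = (157.18, 185.16) (in $1000s).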
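The interval for the mean response can be reproduced with a short Python sketch, assuming the standard simple-linear-regression formula ŷ ± t · Syx · sqrt(1/n + (x0 − x̄)²/SSx). The critical value is hard-coded from a t-table for 13 degrees of freedom (an assumption standing in for Excel's T.INV.2T), and `mean_response_ci` is a hypothetical helper name.

```python
import math

# Data from the table: size (1000s sq. ft) and selling price ($1000s).
size = [1.26, 3.02, 1.99, 0.91, 1.87, 2.63, 2.60, 2.27,
        2.30, 2.08, 1.12, 1.38, 1.80, 1.57, 1.45]
price = [117.5, 299.9, 139.0, 45.6, 129.9, 274.9, 259.9, 177.0,
         175.0, 189.9, 95.0, 82.1, 169.0, 96.5, 114.9]

# Assumed t-table value: two-sided 95%, n - 2 = 13 degrees of freedom.
T_CRIT_95_DF13 = 2.160369


def mean_response_ci(x, y, x0, t_crit):
    """Confidence interval for the mean of y at x0 from a least-squares line."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    ss_x = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / ss_x
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s_yx = math.sqrt(sse / (n - 2))       # standard error of regression
    y_hat = b0 + b1 * x0                  # point estimate at x0
    half = t_crit * s_yx * math.sqrt(1 / n + (x0 - x_bar) ** 2 / ss_x)
    return y_hat - half, y_hat + half


lower, upper = mean_response_ci(size, price, 2.0, T_CRIT_95_DF13)
print(f"95% CI for the mean price at 2000 sq. ft: ({lower:.2f}, {upper:.2f}) in $1000s")
```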
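e. The prediction interval for an individual selling price at x = 2.5 can be constructed using the formula:

$$\hat{y} \pm t_{\alpha/2,\,n-2}\; S_{yx}\sqrt{1 + \frac{1}{n} + \frac{(x_0-\bar{x})^2}{SS_x}}$$

Using Excel (the cell layout is an assumption: sizes in A2:A16 and selling prices in B2:B16):
X bar: =AVERAGE(A2:A16) = 1.8833
Syx: =STEYX(B2:B16, A2:A16) = 24.604
SSx: =DEVSQ(A2:A16) = 5.2013
Predicted y value: =FORECAST(2.5, B2:B16, A2:A16) = 228.713
Critical value of t: =T.INV.2T(0.01, 13) = 3.0123
SE: Syx × sqrt(1 + 1/15 + (2.5 − 1.8833)²/5.2013) = 24.604 × 1.0676 ≈ 26.267
Constructing the 99% PI:
228.713 ± 3.0123 × 26.267 ≈ 228.713 ± 79.12

Hence, the 99% prediction interval for the selling price of a 2500 square foot house in Winston-Salem, North Carolina ≈ (149.59, 307.84) (in $1000s).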
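The prediction interval can be checked in the same way. This is a minimal Python sketch assuming the standard formula ŷ ± t · Syx · sqrt(1 + 1/n + (x0 − x̄)²/SSx); the critical value is hard-coded from a t-table for 13 degrees of freedom (an assumption standing in for Excel's T.INV.2T), and `prediction_interval` is a hypothetical helper name.

```python
import math

# Data from the table: size (1000s sq. ft) and selling price ($1000s).
size = [1.26, 3.02, 1.99, 0.91, 1.87, 2.63, 2.60, 2.27,
        2.30, 2.08, 1.12, 1.38, 1.80, 1.57, 1.45]
price = [117.5, 299.9, 139.0, 45.6, 129.9, 274.9, 259.9, 177.0,
         175.0, 189.9, 95.0, 82.1, 169.0, 96.5, 114.9]

# Assumed t-table value: two-sided 99%, n - 2 = 13 degrees of freedom.
T_CRIT_99_DF13 = 3.012276


def prediction_interval(x, y, x0, t_crit):
    """Prediction interval for one new y at x0 from a least-squares line."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    ss_x = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / ss_x
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s_yx = math.sqrt(sse / (n - 2))       # standard error of regression
    y_hat = b0 + b1 * x0                  # point prediction at x0
    # The leading 1 accounts for the scatter of an individual house around the line.
    se_pred = s_yx * math.sqrt(1 + 1 / n + (x0 - x_bar) ** 2 / ss_x)
    half = t_crit * se_pred
    return y_hat - half, y_hat + half


lower, upper = prediction_interval(size, price, 2.5, T_CRIT_99_DF13)
print(f"99% PI for the price at 2500 sq. ft: ({lower:.2f}, {upper:.2f}) in $1000s")
```

The extra 1 under the square root is what makes a prediction interval wider than a confidence interval for the mean at the same x value and confidence level.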