The data below relates diamond carats to purchase prices.
| Carat | 0.31 | 0.32 | 0.36 | 0.38 | 0.4 | 0.43 | 0.45 | 0.48 | 0.5 | 
| Price | 1641 | 1468 | 1685 | 1420 | 1797 | 1824 | 2043 | 2342 | 2122 | 
Run a linear regression model, with the carat being the independent variable ( X ) and the purchase price being the dependent variable ( Y ).
(a) Find the estimate of the intercept for the linear regression model.
(b) Find the estimate of the slope for the linear regression model.
(c) What is the predicted price of a 0.37 carat diamond?
(d) What fraction of the dependent variable’s variation can be explained by this linear regression model?
SOLUTION:
Carat is the independent variable (X) and purchase price is the
dependent variable (Y).
The regression equation has the form
Y = a + bX
where a is the intercept of the regression line and b is its slope.
The slope of the regression line can be calculated as
Slope = (n*Summation(XY) - Summation(X)*Summation(Y)) / (n*Summation(X^2) - (Summation(X))^2)
where n = 9 is the number of observations.
| Carat(X) | Price(Y) | X^2 | Y^2 | XY |
| 0.31 | 1641 | 0.0961 | 2692881 | 508.71 |
| 0.32 | 1468 | 0.1024 | 2155024 | 469.76 |
| 0.36 | 1685 | 0.1296 | 2839225 | 606.6 |
| 0.38 | 1420 | 0.1444 | 2016400 | 539.6 |
| 0.4 | 1797 | 0.16 | 3229209 | 718.8 |
| 0.43 | 1824 | 0.1849 | 3326976 | 784.32 |
| 0.45 | 2043 | 0.2025 | 4173849 | 919.35 |
| 0.48 | 2342 | 0.2304 | 5484964 | 1124.16 |
| 0.5 | 2122 | 0.25 | 4502884 | 1061 |
| 3.63 | 16342 | 1.5003 | 30421412 | 6732.3 |
(The last row gives the column totals.)
Slope = (9*6732.3 - 3.63*16342) / (9*1.5003 - 3.63*3.63) =
1269.24 / 0.3258 = 3895.76
This is the answer to part (b).
The intercept of the regression line can be calculated as
Intercept = (Summation(Y) - Slope*Summation(X))/n = (16342 -
3895.76*3.63)/9 = 244.49
This is the answer to part (a). The regression equation is therefore
Y = 244.49 + 3895.76*X
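As a cross-check on the hand calculation above, here is a minimal Python sketch that applies the same summation formulas to the nine data points. The function name `fit_line` and the variable names are illustrative choices, not part of the original solution.

```python
# Data from the question: carat (X) and purchase price (Y).
carat = [0.31, 0.32, 0.36, 0.38, 0.40, 0.43, 0.45, 0.48, 0.50]
price = [1641, 1468, 1685, 1420, 1797, 1824, 2043, 2342, 2122]


def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line Y = a + bX.

    Uses the same summation formulas as the worked solution.
    """
    n = len(x)
    sum_x = sum(x)
    sum_y = sum(y)
    sum_xx = sum(xi * xi for xi in x)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))

    slope = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
    intercept = (sum_y - slope * sum_x) / n
    return intercept, slope


intercept, slope = fit_line(carat, price)
print(f"intercept = {intercept:.2f}, slope = {slope:.2f}")
```

Rounded to two decimals, the computed values should agree with the 244.49 and 3895.76 obtained by hand.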
Solution(c)
If X = 0.37, then Y can be calculated as
Y = 244.49 + 3895.76*X = 244.49 + 3895.76*0.37 = 1685.92
So the predicted price of a 0.37 carat diamond is about 1685.92.
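The prediction step can be sketched in Python as below. It takes the rounded intercept and slope from the solution as given. The constant names and the helper `predict_price` are hypothetical and used only for illustration.

```python
# Rounded estimates taken from the worked solution above.
INTERCEPT = 244.49
SLOPE = 3895.76


def predict_price(carat):
    """Predicted purchase price for a diamond of the given carat weight."""
    return INTERCEPT + SLOPE * carat


predicted = predict_price(0.37)
print(round(predicted, 2))
```

Since 0.37 lies inside the observed carat range (0.31 to 0.50), this is an interpolation. The same line would be less trustworthy for carat values far outside that range.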
Solution(d)
To find the coefficient of determination, first calculate the
correlation coefficient:
Correlation coefficient = (n*Summation(XY) - Summation(X)*Summation(Y)) / sqrt((n*Summation(X^2) - (Summation(X))^2) * (n*Summation(Y^2) - (Summation(Y))^2))
= (9*6732.3 - 3.63*16342) / sqrt((9*1.5003 - 3.63*3.63) * (9*30421412 - 16342*16342))
= 1269.24 / sqrt(0.3258*6731744) = 0.8570
The coefficient of determination is the square of the correlation
coefficient:
Coefficient of determination = (Correlation coefficient)^2 =
(0.8570)^2 = 0.7345
So about 73.45% of the variation in the dependent variable (price)
is explained by this linear regression model.
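The correlation coefficient and coefficient of determination can be checked with a short Python sketch that uses the same summation formula as the solution. The function name `correlation` is an illustrative assumption, not something from the original answer.

```python
import math

# Data from the question: carat (X) and purchase price (Y).
carat = [0.31, 0.32, 0.36, 0.38, 0.40, 0.43, 0.45, 0.48, 0.50]
price = [1641, 1468, 1685, 1420, 1797, 1824, 2043, 2342, 2122]


def correlation(x, y):
    """Pearson correlation coefficient computed from raw sums."""
    n = len(x)
    sum_x = sum(x)
    sum_y = sum(y)
    sum_xx = sum(xi * xi for xi in x)
    sum_yy = sum(yi * yi for yi in y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))

    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_xx - sum_x ** 2) * (n * sum_yy - sum_y ** 2)
    )
    return numerator / denominator


r = correlation(carat, price)
r_squared = r ** 2  # coefficient of determination for simple linear regression
print(f"r = {r:.4f}, r^2 = {r_squared:.4f}")
```

Squaring r gives the coefficient of determination only for a simple linear regression with one predictor, which is the case here.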