Let’s consider mortgage application data collected under HMDA (the Home
Mortgage Disclosure Act). Here is a sample of 30 mortgage
applications.
ID | Loanamt | Income | hprice |
1 | 109 | 63 | 155 |
2 | 185 | 137 | 264 |
3 | 121 | 53 | 128 |
4 | 125 | 78 | 125 |
5 | 119 | 37 | 149 |
6 | 153 | 65 | 171 |
7 | 380 | 188 | 484 |
8 | 100 | 58 | 125 |
9 | 110 | 78 | 158 |
10 | 41 | 31 | 116.5 |
11 | 115 | 54 | 128 |
12 | 248 | 117 | 280 |
13 | 126 | 60 | 157.5 |
14 | 260 | 192 | 325 |
15 | 90 | 40 | 145 |
16 | 50 | 36 | 230 |
17 | 125 | 45 | 125 |
18 | 125 | 55 | 145 |
19 | 158 | 62 | 175 |
20 | 130 | 29 | 209 |
21 | 204 | 77 | 260 |
22 | 30 | 28 | 150 |
23 | 114 | 60 | 143 |
24 | 188 | 91 | 253 |
25 | 187 | 85 | 285 |
26 | 84 | 44 | 105 |
27 | 450 | 265 | 650 |
28 | 108 | 49 | 120 |
29 | 100 | 53 | 125 |
30 | 53 | 24 | 66 |
Loanamt = amount of the mortgage loan applied for (in $1000)
Income = applicant’s annual income (in $1000)
hprice = purchase price of the house (in $1000)
Regression Analysis
Let’s consider the following regression model. Estimate the model using Minitab and answer the questions using the output.
Loanamt_i = b0 + b1 * Income_i + e_i
Write the equations for the following statistics, find or calculate them from the Minitab or Excel output, and explain what each statistic means:
1. Equations for R² and r (the correlation coefficient), and perform the t test of the hypothesis that the correlation coefficient between Loanamt and Income is zero.
2. Standard error of b1 and variance of b1.
3. Standard deviation and variance of e_i.
4. Plot the residuals, and explain whether you find any possible violations of the assumptions of the regression model.
The estimated model is:
Loanamt_i = 29.3726 + 1.5558 * Income_i + e_i
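As a cross-check on the fitted equation, the following minimal sketch recomputes the intercept and slope from the sample table using the closed-form least-squares formulas. It is not the Minitab procedure itself; the function name `fit_simple_ols` and the variable names are illustrative assumptions.

```python
# Sample data copied from the table above (both in $1000), in ID order 1..30.
income = [63, 137, 53, 78, 37, 65, 188, 58, 78, 31, 54, 117, 60, 192, 40,
          36, 45, 55, 62, 29, 77, 28, 60, 91, 85, 44, 265, 49, 53, 24]
loanamt = [109, 185, 121, 125, 119, 153, 380, 100, 110, 41, 115, 248, 126, 260, 90,
           50, 125, 125, 158, 130, 204, 30, 114, 188, 187, 84, 450, 108, 100, 53]


def fit_simple_ols(x, y):
    """Return (b0, b1) for the least-squares line y = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sums of cross-products and squares about the means.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx           # slope
    b0 = y_bar - b1 * x_bar    # intercept
    return b0, b1


b0, b1 = fit_simple_ols(income, loanamt)
print(f"b0 = {b0:.4f}, b1 = {b1:.4f}")
```

The printed values should agree with the reported coefficients 29.3726 and 1.5558 up to rounding. The slope says that each additional $1000 of annual income is associated with roughly $1556 more in the requested loan amount.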
The hypothesis being tested is:
H0: ρ = 0
Ha: ρ ≠ 0
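The test statistic is t = r * sqrt(n - 2) / sqrt(1 - r²) = 13.741 with n - 2 = 28 degrees of freedom, and the p-value is 5.72E-14 (0.0000 to four decimal places).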
Since the p-value (0.0000) is less than the significance level (0.05), we can reject the null hypothesis.
Therefore, we can conclude that the correlation coefficient between Loanamt and Income is not zero.
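The test above can be reproduced by hand. The sketch below computes r from the sample data and then the statistic t = r * sqrt(n - 2) / sqrt(1 - r²). Instead of a p-value it compares |t| with 2.048, the two-sided 5% critical value of the t distribution with 28 degrees of freedom. That critical value is taken from a standard t table and is an assumption of this sketch, not a figure from the output.

```python
import math

# Sample data copied from the table above (both in $1000), in ID order 1..30.
income = [63, 137, 53, 78, 37, 65, 188, 58, 78, 31, 54, 117, 60, 192, 40,
          36, 45, 55, 62, 29, 77, 28, 60, 91, 85, 44, 265, 49, 53, 24]
loanamt = [109, 185, 121, 125, 119, 153, 380, 100, 110, 41, 115, 248, 126, 260, 90,
           50, 125, 125, 158, 130, 204, 30, 114, 188, 187, 84, 450, 108, 100, 53]

n = len(income)
x_bar = sum(income) / n
y_bar = sum(loanamt) / n
s_xx = sum((x - x_bar) ** 2 for x in income)
s_yy = sum((y - y_bar) ** 2 for y in loanamt)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(income, loanamt))

# Pearson correlation coefficient and its square.
r = s_xy / math.sqrt(s_xx * s_yy)
r2 = r ** 2

# t statistic for H0: rho = 0, with n - 2 degrees of freedom.
t_stat = r * math.sqrt(n - 2) / math.sqrt(1 - r2)

# Assumed critical value: two-sided 5% point of t with 28 df (from a t table).
T_CRIT_28_DF = 2.048
reject_h0 = abs(t_stat) > T_CRIT_28_DF

print(f"r = {r:.3f}, r^2 = {r2:.3f}, t = {t_stat:.3f}, reject H0: {reject_h0}")
```

In simple regression this t statistic is identical to the t statistic for the slope on Income, which is why the two share the same p-value in the output.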
The residual plot is:
No violation of the regression assumptions is apparent from the plot.
The calculations are:
r² | 0.871 |
r | 0.933 |
Std. Error | 33.414 |
n | 30 |
k | 1 |
Dep. Var. | Loanamt |
ANOVA table
Source | SS | df | MS | F | p-value |
Regression | 210,816.8918 | 1 | 210,816.8918 | 188.83 | 5.72E-14 |
Residual | 31,260.9749 | 28 | 1,116.4634 | | |
Total | 242,077.8667 | 29 | | | |
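As a worked check, R², r, and the variance and standard deviation of the error term follow directly from the sums of squares in the ANOVA table. The symbols SSR, SSE, SST and s are labels introduced here for the Regression, Residual and Total sums of squares and for the standard error of the regression; they are not names used in the output.

```latex
\begin{align*}
R^2 &= \frac{SSR}{SST} = \frac{210816.8918}{242077.8667} \approx 0.871, \\
r   &= \sqrt{R^2} \approx 0.933 \quad (\text{positive because } b_1 > 0), \\
s^2 &= \frac{SSE}{n - 2} = \frac{31260.9749}{28} \approx 1116.4634, \\
s   &= \sqrt{s^2} \approx 33.414 .
\end{align*}
```

So about 87.1% of the variation in Loanamt is explained by Income, and a typical observation lies roughly 33.4 (in $1000) from the fitted line.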
Regression output
Variable | Coefficient | Std. error | t (df=28) | p-value | 95% CI lower | 95% CI upper |
Intercept | 29.3726 | | | | | |
Income | 1.5558 | 0.1132 | 13.741 | 5.72E-14 | 1.3239 | 1.7877 |
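The standard error and variance of b1 asked for in question 2 can be recomputed from the data with Var(b1) = s² / Sxx, where s² = SSE / (n - 2) and Sxx is the sum of squared deviations of Income from its mean. The sketch below is a hand calculation under those textbook formulas, with illustrative variable names, rather than the routine the software runs.

```python
import math

# Sample data copied from the table above (both in $1000), in ID order 1..30.
income = [63, 137, 53, 78, 37, 65, 188, 58, 78, 31, 54, 117, 60, 192, 40,
          36, 45, 55, 62, 29, 77, 28, 60, 91, 85, 44, 265, 49, 53, 24]
loanamt = [109, 185, 121, 125, 119, 153, 380, 100, 110, 41, 115, 248, 126, 260, 90,
           50, 125, 125, 158, 130, 204, 30, 114, 188, 187, 84, 450, 108, 100, 53]

n = len(income)
x_bar = sum(income) / n
y_bar = sum(loanamt) / n
s_xx = sum((x - x_bar) ** 2 for x in income)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(income, loanamt))

# Least-squares estimates.
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Residual sum of squares and estimated error variance s^2 = SSE / (n - 2).
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(income, loanamt))
s2 = sse / (n - 2)
s = math.sqrt(s2)

# Variance and standard error of the slope, and its t statistic.
var_b1 = s2 / s_xx
se_b1 = math.sqrt(var_b1)
t_b1 = b1 / se_b1

print(f"s = {s:.3f}, s^2 = {s2:.4f}")
print(f"SE(b1) = {se_b1:.4f}, Var(b1) = {var_b1:.4f}, t = {t_b1:.3f}")
```

The results should match the reported standard error of 0.1132 for Income, giving a variance of b1 of about 0.0128 (0.1132 squared), and the regression standard error of 33.414.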
Observation | Loanamt | Predicted | Residual |
1 | 109.0 | 127.4 | -18.4 |
2 | 185.0 | 242.5 | -57.5 |
3 | 121.0 | 111.8 | 9.2 |
4 | 125.0 | 150.7 | -25.7 |
5 | 119.0 | 86.9 | 32.1 |
6 | 153.0 | 130.5 | 22.5 |
7 | 380.0 | 321.9 | 58.1 |
8 | 100.0 | 119.6 | -19.6 |
9 | 110.0 | 150.7 | -40.7 |
10 | 41.0 | 77.6 | -36.6 |
11 | 115.0 | 113.4 | 1.6 |
12 | 248.0 | 211.4 | 36.6 |
13 | 126.0 | 122.7 | 3.3 |
14 | 260.0 | 328.1 | -68.1 |
15 | 90.0 | 91.6 | -1.6 |
16 | 50.0 | 85.4 | -35.4 |
17 | 125.0 | 99.4 | 25.6 |
18 | 125.0 | 114.9 | 10.1 |
19 | 158.0 | 125.8 | 32.2 |
20 | 130.0 | 74.5 | 55.5 |
21 | 204.0 | 149.2 | 54.8 |
22 | 30.0 | 72.9 | -42.9 |
23 | 114.0 | 122.7 | -8.7 |
24 | 188.0 | 171.0 | 17.0 |
25 | 187.0 | 161.6 | 25.4 |
26 | 84.0 | 97.8 | -13.8 |
27 | 450.0 | 441.7 | 8.3 |
28 | 108.0 | 105.6 | 2.4 |
29 | 100.0 | 111.8 | -11.8 |
30 | 53.0 | 66.7 | -13.7 |
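As a rough numeric companion to the residual plot, the sketch below recomputes the residuals, confirms that they sum to zero (as least-squares residuals always do when the model has an intercept), and flags any observation whose residual exceeds two regression standard errors in absolute value. The cutoff of 2 is a common rule of thumb assumed here, and dividing by s gives a simple scaled residual rather than a studentized one. The names `scaled` and `flagged` are illustrative.

```python
import math

# Sample data copied from the table above (both in $1000), in ID order 1..30.
income = [63, 137, 53, 78, 37, 65, 188, 58, 78, 31, 54, 117, 60, 192, 40,
          36, 45, 55, 62, 29, 77, 28, 60, 91, 85, 44, 265, 49, 53, 24]
loanamt = [109, 185, 121, 125, 119, 153, 380, 100, 110, 41, 115, 248, 126, 260, 90,
           50, 125, 125, 158, 130, 204, 30, 114, 188, 187, 84, 450, 108, 100, 53]

n = len(income)
x_bar = sum(income) / n
y_bar = sum(loanamt) / n
s_xx = sum((x - x_bar) ** 2 for x in income)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(income, loanamt))
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Residual = observed Loanamt minus predicted Loanamt.
residuals = [y - (b0 + b1 * x) for x, y in zip(income, loanamt)]

# Regression standard error s = sqrt(SSE / (n - 2)).
s = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))

# Scale each residual by s and flag IDs (1-based) beyond the assumed cutoff of 2.
scaled = [e / s for e in residuals]
flagged = [i + 1 for i, z in enumerate(scaled) if abs(z) > 2]

print(f"sum of residuals = {sum(residuals):.6f}")
print("observations with |residual| > 2s:", flagged)
```

Plotting `residuals` against `income` with any plotting tool gives the residual plot requested in question 4. A funnel shape or a curve in that plot would point to non-constant variance or a nonlinear relationship.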