| Purchase | Income ($ '000) | Age | Gender |
| 0 | 71.9 | 42 | 2 |
| 0 | 100.4 | 42 | 1 |
| 0 | 105.6 | 44 | 1 |
| 1 | 83.1 | 39 | 2 |
| 0 | 114.2 | 43 | 1 |
| 1 | 113.5 | 44 | 1 |
| 0 | 115.2 | 42 | 1 |
| 0 | 100.4 | 35 | 2 |
| 0 | 92.6 | 43 | 2 |
| 0 | 123.8 | 42 | 1 |
| 0 | 122.8 | 45 | 1 |
| 1 | 98.6 | 46 | 2 |
| 0 | 107.6 | 41 | 2 |
| 0 | 108.4 | 42 | 2 |
| 1 | 138.8 | 41 | 1 |
| 1 | 109.9 | 44 | 2 |
| 1 | 136.2 | 47 | 1 |
| 1 | 117.6 | 38 | 2 |
| 1 | 122.8 | 43 | 2 |
| 0 | 121.8 | 45 | 2 |
| 1 | 126.6 | 41 | 2 |
| 1 | 125.8 | 46 | 2 |
| 1 | 138.8 | 42 | 2 |
| 0 | 149.6 | 37 | 1 |
| 1 | 159.5 | 33 | 2 |
Code definitions: Purchase: 0 = not purchased, 1 = purchased; Gender: 1 = male, 2 = female.
Fit a logistic regression model to predict the purchase decision. Identify the significant predictors and comment on the classification accuracy.
Submit a Word document with the key results and their interpretation for both parts A and B. Attach Excel files to support your results; this is required to get credit for the assignment.
Data:
| Purchase | Income | Age | Gender | 
| 0 | 71.9 | 42 | 2 | 
| 0 | 100.4 | 42 | 1 | 
| 0 | 105.6 | 44 | 1 | 
| 1 | 83.1 | 39 | 2 | 
| 0 | 114.2 | 43 | 1 | 
| 1 | 113.5 | 44 | 1 | 
| 0 | 115.2 | 42 | 1 | 
| 0 | 100.4 | 35 | 2 | 
| 0 | 92.6 | 43 | 2 | 
| 0 | 123.8 | 42 | 1 | 
| 0 | 122.8 | 45 | 1 | 
| 1 | 98.6 | 46 | 2 | 
| 0 | 107.6 | 41 | 2 | 
| 0 | 108.4 | 42 | 2 | 
| 1 | 138.8 | 41 | 1 | 
| 1 | 109.9 | 44 | 2 | 
| 1 | 136.2 | 47 | 1 | 
| 1 | 117.6 | 38 | 2 | 
| 1 | 122.8 | 43 | 2 | 
| 0 | 121.8 | 45 | 2 | 
| 1 | 126.6 | 41 | 2 | 
| 1 | 125.8 | 46 | 2 | 
| 1 | 138.8 | 42 | 2 | 
| 0 | 149.6 | 37 | 1 | 
| 1 | 159.5 | 33 | 2 | 
> model = glm(Purchase ~ Income + Age + Gender, data = data, family = "binomial")
> summary(model)
Call:
glm(formula = Purchase ~ Income + Age + Gender, family = "binomial",
    data = data)
Deviance Residuals:
    Min       1Q   Median       3Q      Max
-1.9208  -0.7992  -0.4139   0.8686   1.9216
Coefficients:
             Estimate Std. Error z value Pr(>|z|)
(Intercept) -17.66520    9.26768  -1.906   0.0566 .
Income        0.06139    0.03015   2.036   0.0417 *  (significant)
Age           0.16189    0.15741   1.028   0.3037
Gender        2.28756    1.11809   2.046   0.0408 *  (significant)
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 34.617 on 24 degrees of freedom
Residual deviance: 26.600 on 21 degrees of freedom
AIC: 34.6
Number of Fisher Scoring iterations: 4
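The printed numbers can be cross-checked without refitting. The sketch below converts the reported coefficients to odds ratios and runs a likelihood-ratio test of the fitted model against the intercept-only model, using the two deviances. It only uses values copied from the output above; the variable names are illustrative and not part of the original answer.

```r
# Values copied from the summary output above.
coefs <- c(Income = 0.06139, Age = 0.16189, Gender = 2.28756)

# Odds ratios: multiplicative change in the odds of purchase per one-unit
# increase in each predictor (for Gender, 2 = Female versus 1 = Male).
odds_ratios <- exp(coefs)
print(round(odds_ratios, 3))

# Likelihood-ratio test: drop in deviance relative to the null model.
null_dev  <- 34.617; null_df  <- 24
resid_dev <- 26.600; resid_df <- 21
lr_stat <- null_dev - resid_dev   # difference in deviance
lr_df   <- null_df - resid_df     # number of predictors added
lr_p    <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)
cat("LR statistic:", lr_stat, "on", lr_df, "df, p =", round(lr_p, 4), "\n")

# For ungrouped 0/1 data the residual deviance equals -2 * log-likelihood,
# so AIC = residual deviance + 2 * (number of estimated coefficients).
aic <- resid_dev + 2 * 4
cat("AIC:", aic, "\n")
```

The deviance drop of 8.017 on 3 degrees of freedom lies just above the 5% chi-square critical value (7.815), so the model as a whole is only marginally better than the intercept-only model. The Gender odds ratio is close to 10, but with 25 observations its standard error is large, so it should be read with caution.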
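Here, Income (p = 0.0417) and Gender (p = 0.0408) are significant predictors of purchase at the 5% level, while Age (p = 0.3037) is not. Both significant coefficients are positive, so higher income and Gender = 2 (female) are associated with higher odds of purchase.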
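The question also asks about classification accuracy, which the summary output does not report. A minimal sketch follows, assuming a 0.5 probability cutoff (a common default, not something the assignment specifies) and rebuilding `data` from the table above so the block runs on its own. The names `prob`, `pred`, `conf_mat` and `accuracy` are illustrative.

```r
# Rebuild the data frame from the table (25 observations).
data <- data.frame(
  Purchase = c(0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0,
               0, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1),
  Income = c(71.9, 100.4, 105.6, 83.1, 114.2, 113.5, 115.2, 100.4, 92.6,
             123.8, 122.8, 98.6, 107.6, 108.4, 138.8, 109.9, 136.2, 117.6,
             122.8, 121.8, 126.6, 125.8, 138.8, 149.6, 159.5),
  Age = c(42, 42, 44, 39, 43, 44, 42, 35, 43, 42, 45, 46, 41,
          42, 41, 44, 47, 38, 43, 45, 41, 46, 42, 37, 33),
  Gender = c(2, 1, 1, 2, 1, 1, 1, 2, 2, 1, 1, 2, 2,
             2, 1, 2, 1, 2, 2, 2, 2, 2, 2, 1, 2)
)

# Same model as in the answer above.
model <- glm(Purchase ~ Income + Age + Gender, data = data, family = "binomial")

# Fitted purchase probabilities, classified with a 0.5 cutoff (assumption).
prob <- predict(model, type = "response")
pred <- as.integer(prob > 0.5)

# Confusion matrix: rows = actual outcome, columns = predicted outcome.
conf_mat <- table(Actual = data$Purchase, Predicted = pred)
print(conf_mat)

# Share of cases whose predicted class matches the observed class.
accuracy <- mean(pred == data$Purchase)
cat("Accuracy:", accuracy, "\n")
```

Applying the reported coefficients by hand puts 19 of the 25 cases on the correct side of the cutoff (76%), with three misclassified in each class. Because the same 25 cases are used both to fit and to evaluate the model, this in-sample figure is likely optimistic.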