The data show graduate program admission decisions (Yes: 1, No: 2), GRE scores and undergraduate GPAs for twenty-five students.
Tasks:
Examine if the given data is suitable for the application of linear discriminant analysis.
Create a linear discriminant function predicting admission decisions.
Comment on the classification accuracy.
Predict the admission decision given GRE score = 690 and GPA = 3.2.
Perform logistic regression analysis for the data.
Compare the classification accuracies of both methods.
Admit | GRE | GPA |
2 | 790 | 3.8 |
1 | 370 | 3.4 |
2 | 480 | 2.9 |
1 | 580 | 3.3 |
1 | 620 | 3.9 |
1 | 740 | 3.2 |
2 | 490 | 3.1 |
2 | 720 | 3.7 |
1 | 740 | 3.9 |
2 | 460 | 3.4 |
1 | 610 | 3.3 |
1 | 260 | 2.5 |
2 | 740 | 4 |
1 | 700 | 3.6 |
1 | 760 | 3.5 |
1 | 410 | 2.8 |
1 | 700 | 4 |
1 | 800 | 3.4 |
2 | 680 | 2.9 |
2 | 520 | 3.2 |
1 | 700 | 3.5 |
1 | 580 | 3.3 |
2 | 470 | 3.9 |
1 | 640 | 3.8 |
2 | 410 | 3.8 |
I have solved the problem in R:
library(MASS)
data_GRE=read.csv(file.choose())
data_GRE$Admit=as.factor(data_GRE$Admit)
lda_model=lda(Admit~GRE+GPA,data_GRE)
lda_predtest=predict(lda_model,data_GRE)
table(lda_predtest$class,data_GRE$Admit)
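Linear Discriminant Function:
D = W0 + W1*GRE + W2*GPA
D is the discriminant score used to assign a student to an Admit class, W0 is the bias (intercept) weight, and W1 and W2 are the coefficients of linear discriminants stored in lda_model$scaling.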
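To make the weights concrete, here is a minimal sketch of how W0, W1 and W2 could be recovered from the fitted model. The data frame is typed in from the table above instead of being read from a CSV file, and the names w, w0, centre, score_manual and score_pred are my own. It assumes that predict() centres the predictors at the prior-weighted mean of the group means before applying the coefficients, which is how MASS computes the LD1 scores as far as I know; the last line checks that assumption.

```r
library(MASS)

# Data typed in from the table above (25 students, in row order)
data_GRE <- data.frame(
  Admit = factor(c(2,1,2,1,1,1,2,2,1,2,1,1,2,1,1,1,1,1,2,2,1,1,2,1,2)),
  GRE = c(790,370,480,580,620,740,490,720,740,460,610,260,740,
          700,760,410,700,800,680,520,700,580,470,640,410),
  GPA = c(3.8,3.4,2.9,3.3,3.9,3.2,3.1,3.7,3.9,3.4,3.3,2.5,4.0,
          3.6,3.5,2.8,4.0,3.4,2.9,3.2,3.5,3.3,3.9,3.8,3.8)
)

lda_model <- lda(Admit ~ GRE + GPA, data = data_GRE)

# W1 and W2: coefficients of the single linear discriminant LD1
w <- lda_model$scaling[, "LD1"]

# Centre used by predict(): prior-weighted mean of the group means
centre <- colSums(lda_model$prior * lda_model$means)

# Implied bias weight W0
w0 <- -sum(centre * w)

# Discriminant score for every student, computed by hand
score_manual <- w0 + w["GRE"] * data_GRE$GRE + w["GPA"] * data_GRE$GPA

# Scores reported by predict(); these should agree with the manual ones
score_pred <- predict(lda_model, data_GRE)$x[, "LD1"]
all.equal(unname(score_manual), unname(score_pred))
```

Printing w0 and w gives the numbers to plug into D = W0 + W1*GRE + W2*GPA.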
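a)
Classification accuracy
table(lda_predtest$class,data_GRE$Admit)
Predicted \ Actual | 1 | 2 | Total |
1 | 14 | 8 | 22 |
2 | 1 | 2 | 3 |
Total | 15 | 10 | 25 |
The diagonal gives the number of correct predictions = 14 + 2 = 16
accuracy = 16/25 = 0.64
The LDA accuracy is 64%. This is measured on the same 25 students used to fit the model, so it is likely optimistic. Only 2 of the 10 "No" cases are classified correctly.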
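The same arithmetic can be done in R directly from the counts in the table. This is only a sketch: conf_mat, accuracy and recall are names I picked, and the counts are copied from the table above, not recomputed from the model.

```r
# Confusion matrix from the table above (rows = predicted, columns = actual)
conf_mat <- matrix(c(14, 1, 8, 2), nrow = 2,
                   dimnames = list(Predicted = c("1", "2"),
                                   Actual = c("1", "2")))

# Overall accuracy: correct predictions (diagonal) over all cases
accuracy <- sum(diag(conf_mat)) / sum(conf_mat)
accuracy  # 0.64

# Share of each actual class that is predicted correctly
recall <- diag(conf_mat) / colSums(conf_mat)
recall    # 14/15 for class 1, 2/10 for class 2
```

The per-class figures show that almost all of the accuracy comes from the "Yes" group.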
b) Predict the admission decision given GRE score = 690 and GPA = 3.2.
lda_pred=predict(lda_model,data.frame("GRE"=690,"GPA"=3.2))
lda_pred$class
Ans: Yes (Admit = 1)
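For anyone who wants to reproduce this prediction without the CSV file, here is a self-contained sketch. The data frame is typed in from the table above and new_student is a name I made up. Besides the predicted class, predict() also returns the posterior probabilities, which show how confident the model is.

```r
library(MASS)

# Data typed in from the table above (25 students, in row order)
data_GRE <- data.frame(
  Admit = factor(c(2,1,2,1,1,1,2,2,1,2,1,1,2,1,1,1,1,1,2,2,1,1,2,1,2)),
  GRE = c(790,370,480,580,620,740,490,720,740,460,610,260,740,
          700,760,410,700,800,680,520,700,580,470,640,410),
  GPA = c(3.8,3.4,2.9,3.3,3.9,3.2,3.1,3.7,3.9,3.4,3.3,2.5,4.0,
          3.6,3.5,2.8,4.0,3.4,2.9,3.2,3.5,3.3,3.9,3.8,3.8)
)

lda_model <- lda(Admit ~ GRE + GPA, data = data_GRE)

# The new applicant to classify
new_student <- data.frame(GRE = 690, GPA = 3.2)
lda_pred <- predict(lda_model, new_student)

lda_pred$class      # predicted admission code (1 = Yes, 2 = No)
lda_pred$posterior  # estimated probability of each class
```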
c)
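# Logistic regression
# With a factor response, glm() treats the first level (Admit = 1) as failure,
# so the fitted probabilities are P(Admit = 2)
log_model=glm(Admit~GRE+GPA,data_GRE,family = binomial)
log_probs=predict(log_model,data_GRE,type="response")
log_pred=rep(1,25)
log_pred[log_probs>0.5]=2
table(log_pred,data_GRE$Admit)
Classification accuracy is
Predicted \ Actual | 1 | 2 | Total |
1 | 14 | 8 | 22 |
2 | 1 | 2 | 3 |
Total | 15 | 10 | 25 |
= (14 + 2)/25
= 0.64
Logistic regression classifies 64% of the students correctly, which is the same as LDA. Hence the two methods perform equally well in this problem.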
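The class labelling is the easy part to get wrong here, so below is a small self-contained sketch that checks it. When the response is a factor, glm() with family = binomial treats the first level as failure, so for this data the fitted probabilities are P(Admit = 2). The data frame is typed in from the table above, and check_model, conf_mat and accuracy are names I introduced for illustration.

```r
# Data typed in from the table above (25 students, in row order)
data_GRE <- data.frame(
  Admit = factor(c(2,1,2,1,1,1,2,2,1,2,1,1,2,1,1,1,1,1,2,2,1,1,2,1,2)),
  GRE = c(790,370,480,580,620,740,490,720,740,460,610,260,740,
          700,760,410,700,800,680,520,700,580,470,640,410),
  GPA = c(3.8,3.4,2.9,3.3,3.9,3.2,3.1,3.7,3.9,3.4,3.3,2.5,4.0,
          3.6,3.5,2.8,4.0,3.4,2.9,3.2,3.5,3.3,3.9,3.8,3.8)
)

log_model <- glm(Admit ~ GRE + GPA, data = data_GRE, family = binomial)
log_probs <- predict(log_model, data_GRE, type = "response")

# Check: a fit on an explicit 0/1 indicator for Admit == 2
# gives the same coefficients, so log_probs is P(Admit = 2)
check_model <- glm(as.integer(Admit == "2") ~ GRE + GPA,
                   data = data_GRE, family = binomial)
all.equal(unname(coef(log_model)), unname(coef(check_model)))

# Probabilities above 0.5 therefore map to class 2, the rest to class 1
log_pred <- ifelse(log_probs > 0.5, "2", "1")

# Fixing the levels keeps the table 2 x 2 even if one class is never predicted
conf_mat <- table(Predicted = factor(log_pred, levels = c("1", "2")),
                  Actual = data_GRE$Admit)
accuracy <- sum(diag(conf_mat)) / sum(conf_mat)
conf_mat
accuracy
```

Recoding Admit as a 0/1 variable before fitting is another way to avoid the ambiguity.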