You are trying to predict whether audit clients will go bankrupt in the next year. Based on the results from a logistic regression, the actual vs predicted numbers of clients are as follows, when using a 50 percent cutoff probability for predicting that a client will go bankrupt:
              | Prediction from Model       |
Actual Status | Not Bankrupt | Bankrupt     |
Not Bankrupt  | 950          | 50           |
Bankrupt      | 40           | 60           |
One of your colleagues, John, suggests that you should use a 70 percent cutoff probability for predicting that a client will go bankrupt.
Another colleague, Mike, suggests that you should use a 30 percent cutoff probability for predicting that a client will go bankrupt.
Answer the following questions:
(a) Who is correct, in this situation? Explain your answer with appropriate logic.
(b) What will happen to the Actual vs Predicted matrix if you use:
(i) 70 percent cutoff probability suggested by John? That is, how would the total numbers in the two columns and two rows change, if they change at all?
(ii) 30 percent cutoff probability suggested by Mike? That is, how would the total numbers in the two columns and two rows change, if they change at all?
We can explain the choice of a better cutoff point with the help
of sensitivity and specificity values.
Sensitivity is the proportion of actual bankrupts (in the
current situation) who are correctly classified as
bankrupts. At the 50 percent cutoff it is 60/100 = 0.60.
Specificity is the proportion of actual non-bankrupts who
are correctly classified as non-bankrupts. At the 50 percent
cutoff it is 950/1000 = 0.95.
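As a quick check of these two definitions, here is a minimal Python sketch that computes both values from the 50 percent cutoff table in the question. The variable names (tn, fp, fn, tp) are my own labels for the four cells, not part of the original problem.

```python
# Confusion matrix at the 50 percent cutoff, taken from the question.
# Rows are actual status, columns are the model's prediction.
tn, fp = 950, 50   # actual Not Bankrupt: predicted Not Bankrupt, predicted Bankrupt
fn, tp = 40, 60    # actual Bankrupt:     predicted Not Bankrupt, predicted Bankrupt

# Share of actual bankrupts that the model flags as bankrupt.
sensitivity = tp / (tp + fn)
# Share of actual non-bankrupts that the model clears as not bankrupt.
specificity = tn / (tn + fp)

print(f"sensitivity = {sensitivity:.2f}")  # 0.60
print(f"specificity = {specificity:.2f}")  # 0.95
```

So at the 50 percent cutoff the model misses 40 percent of the clients that actually go bankrupt, while wrongly flagging only 5 percent of the healthy ones.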
Effect of a shift in the cutoff on sensitivity and specificity:
The contingency table above (actual versus predicted status)
corresponds to the 50 percent cutoff.
Fixing the cutoff to a lower value than the one chosen presently
will result in a greater sensitivity and lower specificity (more
false positives), while fixing the cutoff to a higher value than the
one chosen presently will result in a lower sensitivity and higher
specificity (more false negatives).
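The direction of this trade-off can be seen on a toy example. The sketch below uses a small, made-up list of predicted probabilities (the question does not give the individual client probabilities, so these twelve values are purely hypothetical) and recomputes sensitivity and specificity at the 30, 50 and 70 percent cutoffs.

```python
# Hypothetical predicted bankruptcy probabilities (NOT from the question).
# Each pair is (actually_bankrupt, predicted_probability).
clients = [
    (False, 0.05), (False, 0.10), (False, 0.15), (False, 0.20),
    (False, 0.35), (False, 0.45), (False, 0.55), (False, 0.65),
    (True, 0.25), (True, 0.40), (True, 0.60), (True, 0.80),
]

def rates(cutoff):
    """Return (sensitivity, specificity) when p >= cutoff means 'Bankrupt'."""
    tp = sum(1 for actual, p in clients if actual and p >= cutoff)
    fn = sum(1 for actual, p in clients if actual and p < cutoff)
    tn = sum(1 for actual, p in clients if not actual and p < cutoff)
    fp = sum(1 for actual, p in clients if not actual and p >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)

for cutoff in (0.30, 0.50, 0.70):
    sens, spec = rates(cutoff)
    print(f"cutoff {cutoff:.2f}: sensitivity {sens:.2f}, specificity {spec:.2f}")
```

On this toy data, sensitivity falls and specificity rises as the cutoff goes up, which is the pattern described above. The exact numbers depend entirely on the invented probabilities.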
a)
Here, in the case of predicting bankruptcy, a loss will be
incurred if clients that go bankrupt are not classified as
bankrupt (false negatives).
So we need a higher sensitivity value (that is, false negatives
should be reduced), even at the price of more false positives.
Mike's suggestion of a 30 percent cutoff would serve better and give better results.
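b)
In both cases the row totals (actual status) do not change: 1,000 clients are actually Not Bankrupt and 100 are actually Bankrupt, whatever cutoff is used. Only the column totals (predictions), currently 990 Not Bankrupt and 110 Bankrupt, can change.
(i) The 70 percent cutoff suggested by John classifies fewer clients as bankrupt, so the predicted Bankrupt column total can only fall from 110 and the predicted Not Bankrupt column total can only rise from 990. It will give more false negatives, i.e. more actual Bankrupts predicted as Not Bankrupt, and fewer false positives.
(ii) The 30 percent cutoff suggested by Mike classifies more clients as bankrupt, so the predicted Bankrupt column total can only rise from 110 and the predicted Not Bankrupt column total can only fall from 990. It will give more false positives, i.e. more actual Non Bankrupts predicted as Bankrupt, and fewer false negatives.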
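To illustrate what happens to the row and column totals, here is a small sketch that starts from the 50 percent table and moves some clients between the two prediction columns. The function names and the numbers of clients moved (100, 20, 30, 25) are illustrative assumptions only; the real shifts depend on the fitted probabilities, which the question does not provide.

```python
# Confusion matrix at the 50 percent cutoff, from the question.
# Layout: matrix[actual][predicted], index 0 = Not Bankrupt, 1 = Bankrupt.
base = [[950, 50],
        [40, 60]]

def reclassify(matrix, nonbankrupt_moved, bankrupt_moved):
    """Move clients from 'predicted Not Bankrupt' to 'predicted Bankrupt'.

    Negative values move clients the other way. The moved counts are
    hypothetical inputs, not output of the logistic regression.
    """
    (tn, fp), (fn, tp) = matrix
    return [[tn - nonbankrupt_moved, fp + nonbankrupt_moved],
            [fn - bankrupt_moved, tp + bankrupt_moved]]

def totals(matrix):
    """Return (row totals, column totals) of a 2x2 matrix."""
    rows = [sum(row) for row in matrix]
    cols = [sum(col) for col in zip(*matrix)]
    return rows, cols

lower_cutoff = reclassify(base, 100, 20)     # e.g. moving toward 30 percent
higher_cutoff = reclassify(base, -30, -25)   # e.g. moving toward 70 percent

for name, m in [("50% cutoff", base), ("lower cutoff", lower_cutoff),
                ("higher cutoff", higher_cutoff)]:
    rows, cols = totals(m)
    print(f"{name}: row totals {rows}, column totals {cols}")
```

Whatever counts are moved, the row totals stay at 1000 and 100, because changing the cutoff only relabels predictions and never changes a client's actual status. Only the column totals shift.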