In a few sentences, explain why we do a logistic transformation of the outcome data before doing the logistic regression.
Solution:
Consider the situation where the response variable in a regression problem takes only two possible values, such as 0 and 1.
Suppose that the model has the form

$$y_i = \mathbf{x}_i'\boldsymbol{\beta} + \varepsilon_i$$

where $\mathbf{x}_i' = [1, x_{i1}, x_{i2}, \ldots, x_{ik}]$, $\boldsymbol{\beta}' = [\beta_0, \beta_1, \ldots, \beta_k]$, and the response variable $y_i$ takes on the value either 0 or 1. We assume that the response variable $y_i$ is a Bernoulli random variable with the following probability distribution:
$y_i$ | Probability
1 | $P(y_i = 1) = \pi_i$
0 | $P(y_i = 0) = 1 - \pi_i$
Now, since $E(\varepsilon_i) = 0$, the expected value of the response variable is

$$E(y_i) = 1(\pi_i) + 0(1 - \pi_i) = \pi_i.$$

This implies that

$$E(y_i) = \mathbf{x}_i'\boldsymbol{\beta} = \pi_i.$$

This means that the expected response given by the response function $E(y_i) = \mathbf{x}_i'\boldsymbol{\beta} = \pi_i$ is just the probability that the response variable takes on the value 1.
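A minimal simulation sketch may make this expectation concrete (the probability value and the NumPy usage below are illustrative assumptions, not part of the original solution): for a Bernoulli response with success probability $\pi_i$, the sample mean of many 0/1 outcomes is approximately $\pi_i$.

```python
import numpy as np

# Hypothetical illustration: for a Bernoulli response y with P(y = 1) = pi,
# E(y) = 1*pi + 0*(1 - pi) = pi, so the sample mean of many simulated
# 0/1 outcomes should be close to pi.
rng = np.random.default_rng(seed=0)

pi = 0.3                                    # assumed success probability
y = rng.binomial(n=1, p=pi, size=100_000)   # simulated Bernoulli (0/1) responses

print(y.mean())  # approximately 0.3, i.e. approximately pi = E(y)
```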
Generally, when the response variable is binary, there is considerable empirical evidence indicating that the shape of the response function should be nonlinear: a monotonically increasing (or decreasing) S-shaped (or reverse S-shaped) function. This function is called the logistic response function and has the form

$$E(y) = \frac{\exp(\mathbf{x}'\boldsymbol{\beta})}{1 + \exp(\mathbf{x}'\boldsymbol{\beta})}.$$
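As a small illustrative sketch (the grid of $\eta$ values is an assumption chosen for demonstration), evaluating $\exp(\eta)/(1 + \exp(\eta))$ over a range of linear-predictor values $\eta = \mathbf{x}'\boldsymbol{\beta}$ shows a monotonically increasing curve that stays strictly between 0 and 1:

```python
import numpy as np

def logistic_response(eta):
    """Logistic response function: E(y) = exp(eta) / (1 + exp(eta))."""
    return np.exp(eta) / (1.0 + np.exp(eta))

# Evaluate over a grid of linear-predictor values eta = x'beta.
eta = np.linspace(-6, 6, 7)
probs = logistic_response(eta)

print(probs)                      # monotonically increasing, S-shaped values
print(probs.min(), probs.max())   # all values stay strictly between 0 and 1
```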
This logistic response function can be easily linearized. One approach defines the structural portion of the model in terms of a function of the response function mean. Let

$$\eta = \mathbf{x}'\boldsymbol{\beta}$$

be the linear predictor, where $\eta$ is defined by the transformation

$$\eta = \ln\left(\frac{\pi}{1 - \pi}\right).$$
This transformation is often called the logit transformation of the probability $\pi$, and the ratio $\pi/(1 - \pi)$ in the transformation is called the odds. Sometimes the logit transformation is called the log-odds.
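An illustrative sketch of the logit and its inverse (assuming SciPy is available; `scipy.special.logit` and `scipy.special.expit` compute the log-odds and the logistic function, respectively; the example probabilities are arbitrary):

```python
import numpy as np
from scipy.special import logit, expit  # logit = log-odds, expit = logistic function

pi = np.array([0.1, 0.3, 0.5, 0.7, 0.9])   # example probabilities in (0, 1)

odds = pi / (1.0 - pi)        # odds pi / (1 - pi)
eta = np.log(odds)            # logit (log-odds) transformation

print(np.allclose(eta, logit(pi)))    # True: matches scipy's logit
print(np.allclose(expit(eta), pi))    # True: the logistic function inverts the logit
```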
So we apply the logit transformation before doing the logistic regression because it linearizes the logistic response function: the probability $\pi$, which is bounded between 0 and 1, is mapped onto the whole real line, and the transformed mean can then be modeled as a linear function of the predictors, $\eta = \mathbf{x}'\boldsymbol{\beta}$.