In a few sentences, explain why we do a logistic transformation of the outcome data before doing the logistic regression.
Solution:
Consider the situation where the response variable in a regression problem takes on only two possible values, say 0 and 1.
Suppose that the model has the form

$$y_i = \mathbf{x}_i'\boldsymbol{\beta} + \varepsilon_i,$$

where $\mathbf{x}_i' = [1, x_{i1}, x_{i2}, \ldots, x_{ik}]$, $\boldsymbol{\beta}' = [\beta_0, \beta_1, \ldots, \beta_k]$, and the response variable $y_i$ takes on the value either 0 or 1. We assume that the response variable $y_i$ is a Bernoulli random variable with the following probability distribution:
| $y_i$ | Probability |
| --- | --- |
| 1 | $P(y_i = 1) = \pi_i$ |
| 0 | $P(y_i = 0) = 1 - \pi_i$ |
Now, since $E(\varepsilon_i) = 0$, the expected value of the response variable is

$$E(y_i) = 1(\pi_i) + 0(1 - \pi_i) = \pi_i.$$

This implies that

$$E(y_i) = \mathbf{x}_i'\boldsymbol{\beta} = \pi_i.$$

This means that the expected response given by the response function $E(y_i) = \mathbf{x}_i'\boldsymbol{\beta}$ is just the probability that the response variable takes on the value 1.
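To see the identity $E(y_i) = \pi_i$ numerically, here is a minimal sketch (plain NumPy, with an arbitrary illustrative value $\pi_i = 0.3$) showing that the mean of simulated 0/1 Bernoulli outcomes approaches $\pi_i$:

```python
import numpy as np

# Illustrative success probability; any value in (0, 1) would do.
pi_i = 0.3

rng = np.random.default_rng(0)
y = rng.binomial(n=1, p=pi_i, size=100_000)  # Bernoulli(pi_i) draws of 0/1

# The sample mean of the 0/1 outcomes estimates E(y_i) = 1*pi_i + 0*(1 - pi_i) = pi_i.
print(y.mean())  # close to 0.3
```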
Generally, when the response variable is binary, there is considerable empirical evidence indicating that the shape of the response function should be nonlinear: a monotonically increasing (or decreasing) S-shaped (or reverse S-shaped) function. This function is called the logistic response function and has the form

$$E(y) = \pi = \frac{\exp(\mathbf{x}'\boldsymbol{\beta})}{1 + \exp(\mathbf{x}'\boldsymbol{\beta})}.$$
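As a quick sketch of this response function (plain NumPy over an illustrative grid of linear-predictor values), the curve is monotone, S-shaped, and stays strictly between 0 and 1:

```python
import numpy as np

def logistic(eta):
    """Logistic response function: exp(eta) / (1 + exp(eta))."""
    return np.exp(eta) / (1.0 + np.exp(eta))

eta = np.linspace(-6.0, 6.0, 7)   # grid of linear-predictor values x'beta
pi = logistic(eta)
print(np.round(pi, 3))            # increases from ~0.002 to ~0.998, never reaching 0 or 1
```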
This logistic response function can be easily linearized. One approach defines the structural portion of the model in terms of a function of the response function mean. Let

$$\eta = \mathbf{x}'\boldsymbol{\beta}$$

be the linear predictor, where $\eta$ is defined by the transformation

$$\eta = \ln\!\left(\frac{\pi}{1 - \pi}\right).$$

This transformation is often called the logit transformation of the probability $\pi$, and the ratio $\pi/(1 - \pi)$ in the transformation is called the odds. Sometimes the logit transformation is called the log-odds.
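A short sketch (with hypothetical coefficients $\beta_0 = -1$ and $\beta_1 = 2$) illustrates the linearization: applying the logit to probabilities generated by the logistic response function recovers the linear predictor exactly.

```python
import numpy as np

def logistic(eta):
    return np.exp(eta) / (1.0 + np.exp(eta))

def logit(pi):
    """Logit (log-odds) transformation: ln(pi / (1 - pi))."""
    return np.log(pi / (1.0 - pi))

beta0, beta1 = -1.0, 2.0              # hypothetical coefficients
x = np.linspace(-3.0, 3.0, 7)
eta = beta0 + beta1 * x               # linear predictor x'beta
pi = logistic(eta)                    # probabilities from the logistic response function

# The logit transformation undoes the S-shape and returns the linear predictor.
print(np.allclose(logit(pi), eta))    # True
```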
So we do a logistic (logit) transformation of the outcome probability before doing the logistic regression: it maps the S-shaped response function onto a scale where the mean response, $\ln\big(\pi/(1-\pi)\big)$, is a linear function of the predictors, $\mathbf{x}'\boldsymbol{\beta}$.
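As a usage illustration (a sketch assuming scikit-learn is available, with data simulated from the same hypothetical coefficients as above), a fitted logistic regression works on exactly this log-odds scale and recovers coefficients close to the ones used to generate the 0/1 outcomes:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
x = rng.normal(size=(1000, 1))
eta = -1.0 + 2.0 * x[:, 0]                   # linear predictor on the logit scale
pi = np.exp(eta) / (1.0 + np.exp(eta))       # logistic response probabilities
y = rng.binomial(n=1, p=pi)                  # observed 0/1 outcomes

model = LogisticRegression(C=1e6).fit(x, y)  # large C: effectively no regularization
print(model.intercept_, model.coef_)         # roughly [-1.] and [[2.]]
```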