In: Statistics and Probability
Explain models with binary independent variables and possible issues with them.
A binary variable can assume only two values. Numerically, it is usually represented as 0 or 1.
Logistic regression is a statistical model that in its basic form uses a logistic function to model a binary dependent variable, although many more complex extensions exist. In regression analysis, logistic regression is estimating the parameters of a logistic model (a form of binary regression). Mathematically, a binary logistic model has a dependent variable with two possible values, such as pass/fail which is represented by an indicator variable, where the two values are labeled "0" and "1".
But there are some real world problems in implementing logistic regression discussed below,
1) Logistic Regression is also not one of the most powerful algorithms out there and can be easily outperformed by more complex ones.
Also, we can’t solve non-linear problems with logistic
regression since it’s decision surface is linear.
2) Logistic regression will not perform well with independent
variables that are not correlated to the target variable and are
very similar or correlated to each other that is we can’t solve
non-linear problems with logistic regression since it’s decision
surface is linear. Just take a look at the example,
3) Your likelihood function won’t converge if there is full
separation in the data.
4) You’re assuming a specific functional form, and in particular
monotonicity.
5) There are complications with heteroskedastic and clustered
standard errors.
6) If you’re using fixed effects in a panel model with a short time
dimension, your estimates will be severely biased.
7) You can’t use it as the first stage of an IV without committing
the sin of “forbidden regression.”
Most of these complaints are true of any nonlinear parametric
model, and the logit does have some nice properties. For example,
it is globally concave.
8) verfitting the modelL
9) limited to Linear relationships between variables
10) Sensitivity to Outliers
11) Large Sample Size