If a dependent variable is binary, is it optimal to use linear regression or logistic regression? Explain your answer and include the theoretical and practical concerns associated with each regression model. Provide a business-related example to illustrate your ideas.
Let's say you have a dependent variable which is dichotomous, i.e. it takes only the values 1 or 0. That doesn't leave much of a range or variance for linear regression to work with. Moreover, these are discrete values rather than continuous values, which means we can't use linear regression appropriately: its fitted values are not constrained to lie between 0 and 1, and its errors cannot be normally distributed with constant variance. Linear regression is best for modelling dependent variables which are continuous in nature. So, practically, logistic regression makes more sense if the dependent variable is binary in nature.
However, interpreting the output of logistic regression is trickier, because its coefficients are expressed on the log-odds scale. Hence, wherever we have continuous dependent variables, we should use linear regression for its ease of interpretation.
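To make the concern about linear regression on a 0/1 outcome concrete, here is a minimal sketch. The data values and variable names are made up for illustration; the point is only that an ordinary least squares line fitted to a binary outcome can return "probabilities" below 0 or above 1.

```python
import numpy as np

# Hypothetical data: x is a single predictor, y is a binary (0/1) outcome.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0])
y = np.array([0, 0, 0, 0, 1, 0, 1, 1, 1, 1], dtype=float)

# Design matrix with an intercept column.
X = np.column_stack([np.ones_like(x), x])

# Ordinary least squares fit of the binary outcome
# (often called a linear probability model).
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# Predict at points inside and outside the observed range of x.
x_new = np.array([-5.0, 0.0, 5.0, 10.0, 15.0])
X_new = np.column_stack([np.ones_like(x_new), x_new])
linear_pred = X_new @ beta

for xi, pi in zip(x_new, linear_pred):
    print(f"x = {xi:5.1f}  fitted value = {pi:6.3f}")
```

The fitted values at the low and high ends fall outside the interval from 0 to 1, so they cannot be read as probabilities. A logistic model avoids this because its output is always strictly between 0 and 1.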
To put it another way:
If the probabilities that we are modeling are extreme (close to 0 or 1), then you should probably use logistic regression. If the probabilities are more moderate (say between 0.20 and 0.80, or a little beyond), then the linear and logistic models fit about equally well, and the linear model should be favored for its ease of interpretation.
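The rule of thumb above can be checked numerically. The sketch below is an illustration, not a formal test: it takes the logistic curve itself, fits the best straight line to it over a chosen probability range, and reports the largest gap between the two. The helper name `max_gap_from_line` and the 0.01 to 0.99 "extreme" range are assumptions chosen for the example.

```python
import numpy as np

def max_gap_from_line(p_low, p_high, n_points=201):
    """Largest gap between the logistic curve and its best-fitting
    straight line over the stretch where the probability runs from
    p_low to p_high."""
    # Convert the probability bounds to log-odds, the scale on which
    # the logistic model is linear.
    z_low = np.log(p_low / (1.0 - p_low))
    z_high = np.log(p_high / (1.0 - p_high))
    z = np.linspace(z_low, z_high, n_points)

    # Logistic curve evaluated on that stretch.
    p = 1.0 / (1.0 + np.exp(-z))

    # Least-squares straight line through the same points.
    slope, intercept = np.polyfit(z, p, 1)
    return float(np.max(np.abs(p - (slope * z + intercept))))

moderate_gap = max_gap_from_line(0.20, 0.80)
extreme_gap = max_gap_from_line(0.01, 0.99)

print(f"max gap, probabilities 0.20 to 0.80: {moderate_gap:.3f}")
print(f"max gap, probabilities 0.01 to 0.99: {extreme_gap:.3f}")
```

Over the moderate range the straight line stays within a few percentage points of the logistic curve, while over the extreme range the gap is several times larger. That is why the two models give similar answers only when the probabilities are moderate.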
Real-life example:
In business, let's say you have given people a promotion and want to gauge whether they will buy or not buy the item because of the promotion. The outcome is binary (buy or not buy), so you should model it using logistic regression rather than linear regression.
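The promotion example could be sketched as follows. Everything here is hypothetical: the discount percentages, the purchase outcomes, and the helper `fit_logistic`, which fits the model with Newton's method using only NumPy. In practice you would more likely use a statistics library; this is just a small self-contained illustration.

```python
import numpy as np

def fit_logistic(X, y, n_iter=25):
    """Fit logistic regression by Newton's method.
    X must already contain an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))   # current predicted probabilities
        weights = p * (1.0 - p)                  # variance of each binary outcome
        gradient = X.T @ (y - p)
        hessian = X.T @ (X * weights[:, None])
        beta = beta + np.linalg.solve(hessian, gradient)
    return beta

# Hypothetical data: discount offered (in percent) and whether the customer bought.
discount = np.array([0, 5, 5, 10, 10, 15, 15, 20, 20, 25, 30, 30], dtype=float)
bought = np.array([0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 1, 1], dtype=float)

X = np.column_stack([np.ones_like(discount), discount])
beta = fit_logistic(X, bought)

# Predicted purchase probability at each discount level in the data.
purchase_prob = 1.0 / (1.0 + np.exp(-(X @ beta)))

print("intercept and slope on the log-odds scale:", beta)
print("predicted probability with no discount:", purchase_prob[0])
print("predicted probability with a 30% discount:", purchase_prob[-1])
```

The slope is on the log-odds scale, which is the interpretation difficulty mentioned earlier: a positive slope means that a larger discount raises the odds of purchase, and the predicted probabilities always stay between 0 and 1.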