What are ALL the possible difficulties with fitting a linear regression? Please explain the reasoning.
Possible difficulties are:
Linear Regression Is Limited to Linear Relationships:
The main difficulty with linear regression is that it assumes a linear (straight-line) relationship between the dependent variable and the independent variables. Real-world data is rarely exactly linear. For example, the relationship between income and age is curved: income tends to rise in early adulthood, flatten out in later adulthood, and decline after people retire. You can check whether this is a problem by plotting the data, for instance a scatter plot of the dependent variable against each independent variable, or a plot of the residuals against the fitted values. A curved pattern in either plot suggests the linearity assumption does not hold.
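As a rough illustration of the income and age example, the sketch below fits a straight line to made-up data and prints the residuals. The ages, the income formula, and the helper name `fit_line` are all assumptions chosen for illustration, not real data.

```python
# Hypothetical, made-up data: income (in thousands) rises, peaks at age 50, then falls.
ages = list(range(20, 81, 5))
incomes = [60 - 0.04 * (a - 50) ** 2 for a in ages]

def fit_line(xs, ys):
    """Ordinary least squares with one predictor; returns (intercept, slope)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    return y_mean - slope * x_mean, slope

intercept, slope = fit_line(ages, incomes)
residuals = [y - (intercept + slope * x) for x, y in zip(ages, incomes)]

# R^2: the share of the variation in income that the straight line explains.
y_mean = sum(incomes) / len(incomes)
ss_res = sum(r ** 2 for r in residuals)
ss_tot = sum((y - y_mean) ** 2 for y in incomes)
r_squared = 1 - ss_res / ss_tot

print(f"slope = {slope:.4f}, R^2 = {r_squared:.4f}")
# The residuals are not randomly scattered: they are negative at both ends
# and positive in the middle, which is the signature of a curved relationship.
for a, r in zip(ages, residuals):
    print(f"age {a}: residual {r:+.2f}")
```

Because this toy curve is symmetric around age 50, the fitted line is essentially flat and explains almost none of the variation, even though income depends strongly on age.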
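Prone to Outliers:
Linear regression is sensitive to outliers. For example, if most of your data lies in the range (20, 50) on the x-axis, but you have one or two points out at x = 200, those few points could significantly swing your regression results. This happens because least squares minimizes the squared errors, so a point far from the rest of the data has a disproportionate influence on the fitted line.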
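A minimal sketch of this effect, using made-up numbers: seven points in the range 20 to 50 that lie exactly on the line y = 2x + 1, plus one hypothetical outlier at x = 200. The data and the helper name `fit_slope` are assumptions for illustration.

```python
def fit_slope(xs, ys):
    """Least-squares slope for a single predictor."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return sxy / sxx

# Made-up data: x in the range 20..50, y exactly on the line y = 2x + 1.
xs = [20, 25, 30, 35, 40, 45, 50]
ys = [2 * x + 1 for x in xs]

slope_clean = fit_slope(xs, ys)

# Add a single hypothetical outlier far to the right of the other points.
slope_with_outlier = fit_slope(xs + [200], ys + [50])

print(f"slope without outlier: {slope_clean:.3f}")
print(f"slope with outlier:    {slope_with_outlier:.3f}")
```

In this example a single extra point is enough to change the slope from 2 to a slightly negative value, so the fitted line no longer describes the bulk of the data.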
Data Must Be Independent:
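Linear regression assumes that the observations are independent. That means that the scores of one subject (such as a person) have nothing to do with those of another. In practice this assumption is often violated, for example when the same subject is measured more than once, when observations are collected over time, or when subjects belong to the same group (such as students in one class).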
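Independence is hard to verify directly, but when the observations have a natural order (for example, measurements over time), one common diagnostic is the Durbin-Watson statistic computed on the residuals. Values near 2 suggest no first-order autocorrelation, while values well below 2 suggest that neighboring residuals are positively correlated. The sketch below uses made-up data whose errors drift slowly; the data and helper names are assumptions for illustration.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares with one predictor; returns (intercept, slope)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    return y_mean - slope * x_mean, slope

def durbin_watson(residuals):
    """Sum of squared successive differences divided by sum of squared residuals."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals)))
    den = sum(r ** 2 for r in residuals)
    return num / den

# Made-up time-ordered data: a linear trend plus a slowly drifting error,
# so each error resembles its neighbor (the observations are not independent).
xs = list(range(30))
ys = [2 * x + 3 * math.sin(x / 5) for x in xs]

intercept, slope = fit_line(xs, ys)
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

dw = durbin_watson(residuals)
print(f"Durbin-Watson statistic = {dw:.3f}")
```

A value far below 2, as in this toy example, is a sign that the residuals are correlated and that the usual standard errors and p-values should not be trusted.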
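Linear Regression Only Looks at the Mean of the Dependent Variable:
Linear regression models the relationship between the independent variables and the mean of the dependent variable. It therefore says little about other features of the dependent variable's distribution, such as its extremes or its spread, and it can give misleading results when those features are what you care about.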
Prone to Noise and Overfitting:
If the number of observations is smaller than the number of features, ordinary linear regression should not be used. With that many features the model can fit the training data perfectly, which means it is fitting the noise rather than the underlying relationship, so it overfits.
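The sketch below illustrates this with pure noise: 4 observations, 6 features, and a target that has nothing to do with the features. When there are more features than observations, many coefficient vectors fit the training data exactly; this example builds one of them by solving a square system on the first 4 feature columns and setting the remaining coefficients to zero. The sizes, the seed, and the helper names are assumptions for illustration.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable

n_obs, n_features = 4, 6  # fewer observations than features

def noise_data(n):
    """Pure noise: the features carry no information about the target."""
    X = [[rng.gauss(0, 1) for _ in range(n_features)] for _ in range(n)]
    y = [rng.gauss(0, 1) for _ in range(n)]
    return X, y

def solve(A, b):
    """Solve the square system A w = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    w = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (M[i][n] - s) / M[i][i]
    return w

def mse(X, y, w):
    """Mean squared error of the linear predictions X w against y."""
    preds = [sum(wj * xj for wj, xj in zip(w, row)) for row in X]
    return sum((p - t) ** 2 for p, t in zip(preds, y)) / len(y)

X_train, y_train = noise_data(n_obs)
X_test, y_test = noise_data(50)

# The first n_obs feature columns already give a square system that can be
# solved exactly; the coefficients of the remaining features are set to zero.
w_used = solve([row[:n_obs] for row in X_train], y_train)
w = w_used + [0.0] * (n_features - n_obs)

train_mse = mse(X_train, y_train, w)
test_mse = mse(X_test, y_test, w)
print(f"training MSE = {train_mse:.2e}, test MSE = {test_mse:.2e}")
```

The training error is zero up to rounding even though there is nothing to learn, while the error on fresh data is large. Regularized methods such as ridge or lasso regression, or reducing the number of features, are the usual remedies in this situation.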