Discuss the most challenging topic in regression inference.
Provide a good description of the mentioned topic
Discuss the reasons for the challenges
Discuss exactly what the challenges are
Regression analysis does not have one single most challenging problem. Some of the main challenges are:
1. Outliers: The data used to build a regression model often contain extreme values, in the response variable, in the regressor variables, or in both. These values are called outliers. Because least squares minimizes squared errors, a few outliers can pull the fitted line toward themselves, distorting the coefficient estimates and making inference from the model unreliable.
2. Multicollinearity: In some cases the regressor variables are correlated with each other. For example, if we want to predict an individual's income from Age and Working Experience (in years), these two regressors are likely to be strongly correlated. The least-squares coefficients are then still unbiased, but their variances become very large, so the individual estimates are unstable and the tests and confidence intervals for them are unreliable.
3. Heteroscedasticity: When we build a simple linear regression model, we assume that the errors are independent and normally distributed with mean 0 and constant variance. Under heteroscedasticity the error variance is not the same for all observations. The coefficient estimates remain unbiased, but the usual standard errors are wrong, so tests and confidence intervals become invalid. This problem has to be treated separately, for example with weighted least squares or a variance-stabilizing transformation of the response.
4. Variable selection: When building a regression model, we have to choose the regressors carefully. Including too many variables can introduce multicollinearity and overfitting, while including too few may leave much of the variability in the response unexplained. Choosing an appropriate set of variables that explains as much of the variability as possible without causing these problems is an important part of building a model.
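To make the first two challenges concrete, here is a minimal Python sketch. The Age, Experience, and Income figures are made up for illustration and do not come from any real data set, and the helper names (ols_slope, pearson_r, vif_two_regressors) are hypothetical. The sketch fits a one-regressor least-squares slope with and without a single extreme income value, and it computes the variance inflation factor (VIF) for Age and Experience. With exactly two regressors, the VIF is 1 / (1 - r^2), where r is the correlation between them.

```python
from math import sqrt

def mean(xs):
    return sum(xs) / len(xs)

def ols_slope(x, y):
    """Least-squares slope of y on a single regressor x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def pearson_r(x, y):
    """Sample correlation coefficient between x and y."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def vif_two_regressors(x1, x2):
    """VIF for a model with exactly two regressors: 1 / (1 - r^2)."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)

# Made-up illustrative data (assumption, not real observations).
age = [25, 30, 35, 40, 45, 50, 55, 60]
experience = [2, 6, 12, 15, 22, 26, 30, 37]   # years
income = [30, 38, 45, 52, 61, 66, 75, 82]     # thousands

# Outlier: replace the last income with one extreme value and refit.
income_outlier = income[:-1] + [300]
slope_clean = ols_slope(experience, income)
slope_outlier = ols_slope(experience, income_outlier)

# Multicollinearity: Age and Experience move almost in lockstep.
vif = vif_two_regressors(age, experience)

print("slope without outlier:", round(slope_clean, 3))
print("slope with outlier:   ", round(slope_outlier, 3))
print("VIF for Age/Experience:", round(vif, 1))
```

In this toy data a single extreme income more than doubles the fitted slope, and the VIF is far above 10, a commonly used rule-of-thumb threshold for problematic multicollinearity. The exact numbers depend entirely on the invented data.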