1. What is specification error?
2. What are the criteria for model selection for an empirical analysis?
3. How will you estimate parameters in case of inclusion of irrelevant variables?
4. How would you detect that a model is overfitted?
(P.S. - please answer every part)
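1. Specification error means that at least one key feature or assumption of the model is incorrect, for example a relevant variable is omitted, an irrelevant variable is included, or the wrong functional form is used. As a consequence, estimating the model may yield results that are incorrect or misleading.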
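To make the idea concrete, here is a minimal simulation sketch of one kind of specification error, the omission of a relevant variable. The data-generating process, the variable names (`x1`, `x2`) and the helper `ols` are assumptions chosen for illustration only, not part of the question.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients for y = X @ beta + error."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

# Assumed true model: y = 1 + 2*x1 + 3*x2 + noise, with x2 correlated with x1.
rng = np.random.default_rng(0)
n = 5000
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + rng.normal(size=n)
y = 1.0 + 2.0 * x1 + 3.0 * x2 + rng.normal(size=n)

ones = np.ones(n)
beta_full = ols(np.column_stack([ones, x1, x2]), y)   # correctly specified
beta_short = ols(np.column_stack([ones, x1]), y)      # x2 wrongly omitted

print("correct model, coefficient on x1:     ", round(float(beta_full[1]), 2))
print("misspecified model, coefficient on x1:", round(float(beta_short[1]), 2))
```

In the correctly specified regression the coefficient on `x1` should come out close to its true value of 2. In the misspecified regression it should come out close to 2 + 3 × 0.8 = 4.4, because `x1` picks up part of the effect of the omitted `x2`.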
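2. Empirical analysis is an evidence-based approach to the study and interpretation of information. It relies on real-world data, metrics and results rather than on theories and concepts alone. When choosing among candidate models for such an analysis, commonly used criteria are consistency with the underlying theory, parsimony, goodness of fit (for example adjusted R-squared) and information criteria such as AIC and BIC.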
Steps for conducting empirical research
3. Including irrelevant variables does not bias the OLS estimates, but it makes them less efficient, that is, their standard errors are larger than necessary. We can find the irrelevant variables by testing the significance of each coefficient (t-test, etc.), then keep the important variables and drop the rest. We can also identify correlated variables and reduce multicollinearity using the Variance Inflation Factor (VIF) in regression, and Factor Analysis, PCA or other variable reduction techniques for other models.
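The significance test and the Variance Inflation Factor check described above can be sketched with plain NumPy. This is only an illustration on simulated data: the helper names `ols_t_stats` and `vif`, and the three regressors, are assumptions made for the example. As a rough large-sample rule of thumb, an absolute t-statistic below about 2 suggests the coefficient is not significant at the 5% level, and a VIF above about 10 is often read as a sign of serious multicollinearity.

```python
import numpy as np

def ols_t_stats(X, y):
    """OLS coefficients and their t-statistics; X must contain an intercept column."""
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - k)            # unbiased error-variance estimate
    cov = sigma2 * np.linalg.inv(X.T @ X)       # covariance matrix of beta
    return beta, beta / np.sqrt(np.diag(cov))

def vif(X_no_const):
    """VIF per column: 1 / (1 - R^2) from regressing it on the other columns."""
    n, k = X_no_const.shape
    out = []
    for j in range(k):
        target = X_no_const[:, j]
        others = np.column_stack([np.ones(n), np.delete(X_no_const, j, axis=1)])
        fitted = others @ np.linalg.lstsq(others, target, rcond=None)[0]
        ss_res = np.sum((target - fitted) ** 2)
        ss_tot = np.sum((target - target.mean()) ** 2)
        out.append(ss_tot / ss_res)             # algebraically equal to 1 / (1 - R^2)
    return np.array(out)

# Simulated data (assumed): only x1 drives y; x2 is irrelevant noise;
# x3 is irrelevant and almost a copy of x1.
rng = np.random.default_rng(1)
n = 500
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = x1 + 0.1 * rng.normal(size=n)
y = 1.0 + 2.0 * x1 + rng.normal(size=n)

regressors = np.column_stack([x1, x2, x3])
beta, t_stats = ols_t_stats(np.column_stack([np.ones(n), regressors]), y)
vifs = vif(regressors)

print("t-statistics (const, x1, x2, x3):", np.round(t_stats, 2))
print("VIF (x1, x2, x3):", np.round(vifs, 1))
```

Because collinearity inflates standard errors, it is usually worth looking at the VIF values before dropping a variable on the basis of its t-statistic alone.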
4. Overfitting is a scenario where the model performs well on the training data but poorly on data not seen during training. This means the model has memorized the training data instead of learning the underlying relationship between features and labels. To detect it, compare performance on the training data with performance on a held-out validation set (or use cross-validation): a large gap between the two points to overfitting.
If you are familiar with the bias/variance trade-off, you can think of overfitting as a high-variance situation, where the model fits the random noise in the training set.
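The comparison between training performance and performance on unseen data can be sketched as follows. The data, the polynomial models and the 30/30 split are assumptions for illustration; in practice you would apply the same comparison to your own model and a held-out set.

```python
import numpy as np

# Simulated data (assumed): a smooth signal plus random noise.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=60)
y = np.sin(3 * x) + rng.normal(scale=0.3, size=60)

# Hold out half of the data; it is never used for fitting.
x_train, y_train = x[:30], y[:30]
x_test, y_test = x[30:], y[30:]

def mse(coeffs, xs, ys):
    """Mean squared error of a fitted polynomial on the points (xs, ys)."""
    return float(np.mean((np.polyval(coeffs, xs) - ys) ** 2))

results = {}
for degree in (1, 3, 12):
    coeffs = np.polyfit(x_train, y_train, degree)   # fit on training data only
    results[degree] = (mse(coeffs, x_train, y_train), mse(coeffs, x_test, y_test))
    print(f"degree {degree:2d}: train MSE = {results[degree][0]:.3f}, "
          f"test MSE = {results[degree][1]:.3f}")
```

The training error can only go down as the polynomial degree grows, since each model contains the simpler ones. The warning sign is a high-degree fit whose training error is very small while its test error is much larger, which is the high-variance behaviour described above.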