Problem 1, Total / 10 (1 each). Fill in the letter of the matching definition for each term. Each letter is used exactly once. Write clearly.
1. Akaike Information Criterion (AIC) ________
2. Dummy variable _________
3. Imputation _________
4. Interaction __________
5. Leave-One-Out __________
6. Overfitted ______
7. Quantile-Quantile _________
8. Stepwise Method _________
9. Training Set ________
10. Variance Inflation Factor ___________
A) Used to find a good model according to some criterion. Works by repeatedly 'dropping' or 'adding' one term to the current model.
B) Used to specify whether an observation is in a specific category, rather than some baseline.
C) Term for a model that applies well to the data set given, but poorly to new observations.
D) The part of the data used to make a model in a cross-validation.
E) Plot used to explore any deviations from normality (or other specified distribution) in a collection of values.
F) A measure to compare statistical models. Considers both model fit and complexity.
G) A special case of K-fold cross validation using training sets of N-1 observations.
H) A regression term made of two (or more) variables, multiplied together.
I) A measure of collinearity of a regression term.
J) General term for methods of replacing missing data.
1. Akaike Information Criterion (AIC) - F) A measure to compare statistical models. Considers both model fit and complexity.
2. Dummy variable - B) Used to specify whether an observation is in a specific category, rather than some baseline.
3. Imputation - J) General term for methods of replacing missing data.
4. Interaction - H) A regression term made of two (or more) variables, multiplied together.
5. Leave-One-Out - G) A special case of K-fold cross validation using training sets of N-1 observations.
6. Overfitted - C) Term for a model that applies well to the data set given, but poorly to new observations.
7. Quantile-Quantile - E) Plot used to explore any deviations from normality (or other specified distribution) in a collection of values.
8. Stepwise Method - A) Used to find a good model according to some criterion. Works by repeatedly 'dropping' or 'adding' one term to the current model.
9. Training Set - D) The part of the data used to make a model in a cross-validation.
10. Variance Inflation Factor - I) A measure of collinearity of a regression term.
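
Definitions D and G can be illustrated with a short sketch. It is not part of the original problem. The helper names (k_fold_splits, fit_line, cv_mse) and the six data points are made up for illustration. The sketch splits N observations into K folds, fits a straight line on each training set, and scores it on the held-out fold. Setting K = N gives leave-one-out, where every training set has N-1 observations.

```python
def k_fold_splits(n, k):
    """Yield (train, test) index lists for K-fold cross-validation on n observations."""
    indices = list(range(n))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]  # the training set
        yield train, test
        start += size


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def cv_mse(xs, ys, k):
    """Cross-validated mean squared prediction error of the straight-line model."""
    sq_errors = []
    for train, test in k_fold_splits(len(xs), k):
        intercept, slope = fit_line([xs[i] for i in train], [ys[i] for i in train])
        sq_errors.extend((ys[i] - (intercept + slope * xs[i])) ** 2 for i in test)
    return sum(sq_errors) / len(sq_errors)


# Hypothetical data, chosen only to make the example runnable.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]
n = len(xs)

# Leave-one-out is K-fold with K = N: each training set has N-1 observations.
loo_train_sizes = [len(train) for train, _ in k_fold_splits(n, n)]
loo_mse = cv_mse(xs, ys, k=n)
print(loo_train_sizes)  # [5, 5, 5, 5, 5, 5]
print(loo_mse)
```

A model that fits the full data set closely but has a much larger cross-validated error than in-sample error would be a candidate for the "overfitted" label in definition C.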