Question

The library would like to compare the regression and exponential smoothing models to determine which is a better predictor, using the mean absolute error, Σ|(books borrowed) – (model’s estimate)|/n, as a measure of prediction quality.
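As a rough illustration of the error measure named above, the sketch below computes the mean absolute error for a handful of observations. The function name and the borrowing counts are hypothetical and chosen only for the example; they do not come from the library's data.

```python
def mean_absolute_error(actual, predicted):
    """Average of |actual - predicted| over all n observations."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    n = len(actual)
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / n


# Hypothetical counts of books borrowed and one model's estimates.
books_borrowed = [120, 135, 150, 110]
model_estimate = [100, 140, 160, 130]

# Absolute errors are 20, 5, 10 and 20, so the mean is 55 / 4.
mae = mean_absolute_error(books_borrowed, model_estimate)
print(mae)
```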

Select the best of the following four options for splitting the data:

A. 15% for training, 15% for validation, 70% for test
B. 15% for training, 70% for validation, 15% for test
C. 70% for training, 15% for validation, 15% for test
D. 55% for training, 15% for cross-validation, 15% for validation, 15% for test

The person who built these models discovered that although the regression model performed much better on the training set, the two models performed about the same on the validation set:

Model                          Mean absolute error (training set)   Mean absolute error (validation set)
Regression model               110                                  140
Exponential smoothing model    140                                  150

Select all reasonable suggestions below:

A. To choose between the models, we should see which one does better on the training set.

B. The regression model is clearly better, because it does better on the training set and about the same on the validation set.

C. The regression model is probably fit too much to random patterns (i.e., it is overfit), because it performs much worse on the validation set than on the training set.

D. If there had been 20 models, the one that performed best on the validation set would probably not perform as well on the test set as it did on the validation set.

Solutions

Expert Solution

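The best splitting strategy gives most of the data to training, so the models have enough observations to fit, and divides the remainder equally between validation and test. Options A and B leave only 15% of the data for training.

Hence choice C is the most appropriate: 70% for training, 15% for validation, 15% for test.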
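A minimal sketch of a 70/15/15 split follows. It assumes the observations are ordered in time and keeps that order instead of shuffling, which is a common choice for forecasting models but is not stated in the question. The function name and the placeholder data are hypothetical.

```python
def split_70_15_15(observations):
    """Split an ordered sequence into training, validation, and test parts."""
    n = len(observations)
    # Integer arithmetic avoids floating-point rounding at the boundaries.
    train_end = n * 70 // 100
    valid_end = n * 85 // 100
    training = observations[:train_end]
    validation = observations[train_end:valid_end]
    test = observations[valid_end:]
    return training, validation, test


# Placeholder data: 100 consecutive periods, labeled 0..99.
periods = list(range(100))
training, validation, test = split_70_15_15(periods)
print(len(training), len(validation), len(test))
```

The models are fit on the training part, compared on the validation part, and the chosen model is evaluated once on the test part.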

A. To choose between the models, we should see which one does better on the training set.


No. We should choose the model that performs better on the validation set, because a model can be overfit to the training set. The test set is then used to estimate how well the chosen model really performs.


B. The regression model is clearly better, because it does better on the training set and about the same on the validation set.


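No. The validation error is 140 for the regression model and 150 for the exponential smoothing model, so the two perform about the same on data they were not fit to. Doing better on the training set is not evidence of being the better predictor.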


C. The regression model is probably fit too much to random patterns (i.e., it is overfit), because it performs much worse on the validation set than on the training set.


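Yes, this is likely. The regression model's error rises from 110 on the training set to 140 on the validation set, a much larger jump than the exponential smoothing model's rise from 140 to 150. A low training error paired with a noticeably higher validation error is the classic sign of overfitting.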
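One simple way to look at this is to compute the gap between validation and training error for each model, using the figures from the table in the question. The sketch below is only illustrative; the dictionary layout and variable names are assumptions, and a gap by itself is a hint of overfitting, not proof.

```python
# Mean absolute errors taken from the table in the question.
mae = {
    "Regression": {"training": 110, "validation": 140},
    "Exponential smoothing": {"training": 140, "validation": 150},
}

# Gap between validation and training error for each model.
gaps = {
    name: errors["validation"] - errors["training"]
    for name, errors in mae.items()
}

# The same gap expressed as a fraction of the training error.
relative_gaps = {
    name: gaps[name] / errors["training"]
    for name, errors in mae.items()
}

for name in mae:
    print(f"{name}: gap = {gaps[name]}, relative gap = {relative_gaps[name]:.1%}")
```

The regression model's gap (30) is three times the exponential smoothing model's gap (10), which is consistent with the regression model having fit some random patterns in the training set.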

D. If there had been 20 models, the one that performed best on the validation set would probably not perform as well on the test set as it did on the validation set.

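Yes, this is reasonable. Part of any model's validation error is due to random effects, so when 20 models are compared, the one with the lowest validation error has probably benefited from favorable randomness on that particular set. That luck does not carry over to the test set, so its test error will probably be somewhat higher than its validation error. This is why a separate test set is held out to estimate the chosen model's performance.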
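Statement D can be checked with a small simulation. The sketch below assumes, purely for illustration, that all 20 models have the same true mean absolute error (150) and that each measured error is the true value plus independent normal noise. The function name, the noise level, and the number of trials are all hypothetical choices, not values from the question.

```python
import random


def simulate_selection(num_models=20, true_mae=150.0, noise_sd=10.0,
                       trials=2000, seed=0):
    """Average validation and test error of the model picked on validation."""
    rng = random.Random(seed)
    total_valid = 0.0
    total_test = 0.0
    for _ in range(trials):
        # Measured error = true error + random noise, drawn independently
        # for the validation set and the test set.
        valid = [true_mae + rng.gauss(0, noise_sd) for _ in range(num_models)]
        test = [true_mae + rng.gauss(0, noise_sd) for _ in range(num_models)]
        # Pick the model with the lowest validation error.
        best = min(range(num_models), key=valid.__getitem__)
        total_valid += valid[best]
        total_test += test[best]
    return total_valid / trials, total_test / trials


avg_valid, avg_test = simulate_selection()
print(f"selected model, validation MAE: {avg_valid:.1f}")
print(f"selected model, test MAE:       {avg_test:.1f}")
```

Under these assumptions the selected model's validation error looks better than its true error, while its test error stays close to the true value, which is the effect statement D describes.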

