Question

Some statisticians prefer complex models, models that try to fit the data as closely as one can. Others prefer a simple model. They claim that although simpler models are more remote from the data, they are easier to interpret and thus provide more insight. What do you think? Which type of model is best to use? When formulating your answer to this question, you may think of a situation that involves inference that you do and need to present to other people. Would the consumers of your analysis benefit more from your having used a complex model or from your having used a simpler model? What would be the best way to report your findings and explain them to the consumers?

Solutions

Expert Solution

Answer:

I think the preference for complex or simple models depends on the purpose of the analysis.

For instance, if we are analyzing data for policy makers, who need straightforward answers such as whether the death rate in North Carolina has increased over time, or whether more gun laws can reduce the number of crimes, we should focus on simple models that can give explicit answers to these questions.
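As a rough illustration of the kind of simple model meant here, the sketch below fits a straight line to a yearly rate and reports the slope, which answers "has the rate increased over time?" with one number. The function name `fit_line` and the data are assumptions made up for this example; they are not real North Carolina figures.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross-products of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical, made-up numbers for illustration only (NOT real data).
years = [2010, 2011, 2012, 2013, 2014]
rates = [8.0, 8.1, 8.3, 8.4, 8.7]  # deaths per 1,000 people

slope, intercept = fit_line(years, rates)
print(f"Estimated change in rate per year: {slope:.3f}")
```

A positive slope is something a policy maker can read directly. A real analysis would also report the uncertainty of the slope, for example a confidence interval.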

On the other hand, if we are trying to understand the mechanism behind a random variable and to build a predictive model, I think a more complex model should be preferred, because in nature nothing is static and processes rarely behave linearly.

For instance, if we want to classify customers as good or bad borrowers based on their credit details, many features should be included in the model, and a complex classification method such as a random forest can then be used to get the best classifier, which in the long run will give us a risk assessment of the customers.

In practice there should be a balance between a simple model and a complex model: this trade-off is called parsimony.

This is the reason statisticians now use methods such as LASSO, where you keep adding features to the model but also impose a penalty on the size of the coefficients. The penalty shrinks some coefficients exactly to zero, so the number of effective features stays limited.
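The penalty idea can be sketched in a few lines of plain Python. The code below is a minimal coordinate-descent solver for the LASSO objective (1/(2n))·||y − Xw||² + alpha·||w||₁. It assumes the columns of X and the response y are already centered, so no intercept is fitted, and that no column is all zeros. The function names, the value of `alpha`, and the toy data are assumptions chosen for illustration; in practice one would normally use an established library.

```python
def soft_threshold(rho, alpha):
    """Shrink rho toward zero by alpha; values within [-alpha, alpha] become 0."""
    if rho > alpha:
        return rho - alpha
    if rho < -alpha:
        return rho + alpha
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iter=100):
    """Minimize (1/(2n)) * ||y - Xw||^2 + alpha * ||w||_1 by cycling over features."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # squared norm of feature j
            for i in range(n):
                pred_without_j = sum(w[k] * X[i][k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            w[j] = soft_threshold(rho / n, alpha) / (z / n)
    return w


# Hypothetical toy data: three centered, mutually orthogonal features.
X = [
    [1, 1, 1],
    [1, -1, -1],
    [-1, 1, -1],
    [-1, -1, 1],
]
# The response depends strongly on feature 0, weakly on feature 1, not on feature 2.
y = [3 * a + 0.2 * b for a, b, _ in X]

weights = lasso_coordinate_descent(X, y, alpha=0.5)
print(weights)
```

With this penalty the weak and the irrelevant features are dropped (their weights become exactly zero), while the strong feature is kept with a coefficient shrunk below its unpenalized value of 3. A larger `alpha` gives a simpler model and a smaller `alpha` a more complex one, which is the trade-off described above.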

