Question



What statistical information should one look for in order to determine that a given linear regression model is not a good fit? If you shouldn't use such a linear model then what would be a good estimate for a predicted output?

Solutions

Expert Solution


There are several criteria to check to decide whether a linear regression model is a good fit:

1. The residual plot should show no pattern: plotted against the fitted values (or against a predictor), the residuals should scatter randomly around zero with roughly constant spread (homoscedasticity). If there is a trend, then either the relation between the predictors and the dependent variable is not linear or you need more predictor variables in your linear model.

2. R-squared is low, meaning your predictors explain little of the variation in the dependent variable and so have weak predictive strength.

3. The p-value of the regression (the overall F-test, or the t-test on each coefficient) should be below 0.05 (or another small threshold, at the modeller's discretion). If it is not, the regression is not statistically significant, i.e. the predictor variables don't have a significant linear relation with the dependent variable.

4. Multicollinearity between two or more predictor variables, indicated by a high variance inflation factor (VIF), means that some of those variables should be dropped from the linear model.

5. You can also look at out-of-sample validation, i.e. measure the error / prediction accuracy on a dataset that was not used to build the model.
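As a rough illustration of checks 1, 2, 4 and 5 above, here is a minimal sketch in plain Python. The data (y = x^2) is made up purely for illustration, and the helper names `fit_line`, `predict`, `r_squared` and `rmse` are my own, not from any library. The p-value check (3) is left out because it needs an F or t distribution, which a statistics package would normally provide.

```python
from math import sqrt


def fit_line(x, y):
    """Least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def predict(a, b, x):
    """Fitted values a + b*x for each x."""
    return [a + b * xi for xi in x]


def r_squared(y, fitted):
    """Share of the variance in y explained by the fitted values."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


def rmse(y, fitted):
    """Root mean squared prediction error."""
    return sqrt(sum((yi - fi) ** 2 for yi, fi in zip(y, fitted)) / len(y))


# Made-up data with a curved relationship (y = x^2), for illustration only.
x = list(range(1, 11))
y = [xi ** 2 for xi in x]

# Checks 1 and 2: residual pattern and R-squared on the full data.
a, b = fit_line(x, y)
fitted = predict(a, b, x)
residuals = [yi - fi for yi, fi in zip(y, fitted)]
r2 = r_squared(y, fitted)

# Check 4: with exactly two predictors, VIF = 1 / (1 - r^2), where r^2 comes
# from regressing one predictor on the other. The hypothetical pair here is
# x and x^2.
x_sq = [xi ** 2 for xi in x]
a12, b12 = fit_line(x, x_sq)
vif = 1 / (1 - r_squared(x_sq, predict(a12, b12, x)))

# Check 5: fit on the first 8 points, then score on the 2 held-out points.
a_tr, b_tr = fit_line(x[:8], y[:8])
train_rmse = rmse(y[:8], predict(a_tr, b_tr, x[:8]))
test_rmse = rmse(y[8:], predict(a_tr, b_tr, x[8:]))

print("residuals:", [round(r, 2) for r in residuals])
print("R-squared:", round(r2, 3))
print("VIF:", round(vif, 1))
print("train RMSE:", round(train_rmse, 2), "test RMSE:", round(test_rmse, 2))
```

On this data the residuals are positive at both ends and negative in the middle (a U shape), even though R-squared is about 0.95, so a high R-squared on its own does not prove the fit is good. The held-out error is also much larger than the training error. In practice you would usually hold out a random subset rather than the last few points.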

If a linear model is not the correct fit, then you can do either of the following to estimate the predicted output:

1. You can try fitting a higher-degree polynomial, which may model the relation between your variables much better.

2. If your output is a binary (yes/no) variable, you should fit a logistic regression instead of a linear regression.

3. If you want to keep using a linear model, you can transform either the dependent or the independent variable (for example, by taking logs) to get a linear relation between your variables.

4. Outliers may be biasing the linear regression and degrading any of the diagnostics above. Check each participating variable for outliers and treat them (for example, by correcting, capping or removing them) to remove that bias.
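To make option 3 concrete, here is a minimal sketch of a variable transformation in plain Python. It assumes made-up data that grows exponentially (y = 3 * exp(0.5 * x)); the helper names `fit_line` and `r_squared` are my own. A straight line fits the raw values poorly, but after taking the log of the dependent variable the relation becomes linear.

```python
from math import exp, log


def fit_line(x, y):
    """Least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def r_squared(y, fitted):
    """Share of the variance in y explained by the fitted values."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Made-up data that grows exponentially, for illustration only.
x = list(range(1, 11))
y = [3 * exp(0.5 * xi) for xi in x]

# Straight line on the raw scale.
a_raw, b_raw = fit_line(x, y)
r2_raw = r_squared(y, [a_raw + b_raw * xi for xi in x])

# Straight line after log-transforming the dependent variable.
log_y = [log(yi) for yi in y]
a_log, b_log = fit_line(x, log_y)
r2_log = r_squared(log_y, [a_log + b_log * xi for xi in x])

# Predict on the original scale by undoing the transformation.
x_new = 12
y_pred = exp(a_log + b_log * x_new)

print("R-squared, raw scale:", round(r2_raw, 3))
print("R-squared, log scale:", round(r2_log, 3))
print("slope on log scale:", round(b_log, 3))
print("prediction at x = 12:", round(y_pred, 1))
```

The two R-squared values are measured on different scales, so treat the comparison as a rough guide and confirm with a residual plot of the transformed model. With noisy real data, simply exponentiating the prediction also tends to underestimate the mean on the original scale.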

