Question

In linear regression, why can we see the lack-of-fit test as a comparison between the model we choose and the saturated model?

Solutions

Expert Solution

Suppose two alternative models are under consideration, one simpler (more parsimonious) than the other. Often one of the models is the saturated model. Another common situation is to consider 'nested' models, where one model is obtained from the other by setting some of the parameters to zero. Suppose now we test the simpler model against the larger one.

Note the difference in the null and alternative hypotheses from the section above. Here, to test the null hypothesis that an arbitrary group of k coefficients in the model is equal to zero (i.e. those predictors have no relationship with the response), we need to fit two models:

  • the reduced model which omits the k predictors in question, and
  • the current model which includes them

The likelihood-ratio statistic is

ΔG2 = (−2 log L from reduced model)
            − (−2 log L from current model),

and the degrees of freedom is k (the number of coefficients in question).

To perform the test, we look at the "Model Fit Statistics" section and examine the value of "−2 Log L" for "Intercept and Covariates." Here, the reduced model is the "intercept-only" model (i.e. no predictors) and "intercept and covariates" is the current model we fitted. For our running example, this is equivalent to testing the "intercept-only" model against the full (saturated) model, since we have only one covariate.

For our example, ΔG2 = 5176.510 − 5147.390 ≈ 29.12 (reported as 29.1207 in the output) with df = 2 − 1 = 1. Notice that this matches the deviance we got in the earlier text above.
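The arithmetic above can be checked with a short Python sketch. It uses only the two −2 Log L values quoted in the text; the variable names are made up for illustration. The p-value line relies on the fact that, for 1 degree of freedom, the chi-square upper tail equals erfc(sqrt(x/2)), so it would not carry over to tests with k > 1.

```python
import math

# -2 Log L values quoted in the text (SAS "Model Fit Statistics")
neg2logl_reduced = 5176.510   # intercept-only model
neg2logl_current = 5147.390   # intercept and covariates

# Likelihood-ratio statistic: reduced minus current
delta_g2 = neg2logl_reduced - neg2logl_current

# Degrees of freedom: number of coefficients dropped in the reduced model
df = 2 - 1

# Upper-tail chi-square probability; this closed form is valid for df = 1 only
p_value = math.erfc(math.sqrt(delta_g2 / 2.0))

print(f"Delta G2 = {delta_g2:.3f}, df = {df}, p-value = {p_value:.2e}")
```

The p-value comes out far below .0001, which is consistent with the "<.0001" shown in the SAS output.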

Another way to calculate the test statistic is

ΔG2 = G2 from reduced model
            −G2 from current model,

where the G2's are the overall goodness-of-fit statistics.

The value of −2 Log L is useful for comparing two nested models that differ by an arbitrary set of coefficients.

Also notice that the ΔG2 we calculated for this example equals the likelihood-ratio entry (columns: test, chi-square, DF, p-value)

Likelihood Ratio     29.1207    1    <.0001

from the "Testing Global Null Hypothesis: BETA=0" section.

Testing the Joint Significance of All Predictors

This tests the null hypothesis that all the coefficients are simultaneously zero:

H0: β1 = β2 = ... = βk = 0 versus the alternative that at least one of the coefficients β1, ..., βk is not zero.

Equivalently, the null hypothesis is that the intercept-only model is adequate, and the alternative is that the current model (in this case the saturated model) is correct.
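Coming back to the linear regression setting of the question, the same reduced-versus-saturated comparison gives the classical lack-of-fit F test when there are replicate observations at some x values. The saturated model fits a separate mean at each distinct x, so its residual sum of squares is the pure error, and the extra residual sum of squares of the straight-line model is the lack of fit. The following is a minimal Python sketch under those assumptions; the data and the function name lack_of_fit_f are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

def lack_of_fit_f(x, y):
    """Compare a straight-line fit (reduced) with the saturated model
    that has one free mean per distinct x value."""
    n = len(x)

    # Reduced model: ordinary least-squares line y = b0 + b1 * x
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse_reduced = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

    # Saturated model: fitted value is the group mean at each distinct x,
    # so its residual sum of squares is the pure-error sum of squares
    groups = defaultdict(list)
    for xi, yi in zip(x, y):
        groups[xi].append(yi)
    sse_saturated = 0.0
    for ys in groups.values():
        mean = sum(ys) / len(ys)
        sse_saturated += sum((yi - mean) ** 2 for yi in ys)

    c = len(groups)          # number of distinct x levels
    df_lof = c - 2           # saturated has c parameters, the line has 2
    df_pe = n - c            # pure-error degrees of freedom
    f_stat = ((sse_reduced - sse_saturated) / df_lof) / (sse_saturated / df_pe)
    return f_stat, df_lof, df_pe, sse_reduced, sse_saturated

# Hypothetical data with two replicates at each x and visible curvature
x = [1, 1, 2, 2, 3, 3, 4, 4]
y = [1.1, 0.9, 4.2, 3.8, 9.1, 8.9, 15.8, 16.2]

f_stat, df_lof, df_pe, sse_reduced, sse_saturated = lack_of_fit_f(x, y)
print(f"F = {f_stat:.2f} on ({df_lof}, {df_pe}) degrees of freedom")
```

A large F relative to the F(c − 2, n − c) reference distribution suggests the straight line fits noticeably worse than the saturated model, which is what "lack of fit" means here. The test needs replicates: without them the saturated model leaves no pure-error degrees of freedom.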


Using SAS:

In the SAS output, three different chi-square statistics for this test are displayed in the section "Testing Global Null Hypothesis: BETA=0," corresponding to the likelihood-ratio, score and Wald tests. Recall their definitions from the very first lessons.

The Hosmer-Lemeshow Statistic

An alternative statistic for measuring overall goodness-of-fit is the Hosmer-Lemeshow statistic.

This is a Pearson-like χ2 that is computed after the data are grouped by similar predicted probabilities. It is more useful when there is more than one predictor and/or there are continuous predictors in the model.
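As a rough illustration of the grouping idea, here is a small Python sketch of a Hosmer-Lemeshow-type statistic. It assumes equal-size groups formed by sorting on the predicted probability, and that each group's average predicted probability lies strictly between 0 and 1; statistical packages may form groups and handle ties differently. The function name and the toy data are hypothetical.

```python
def hosmer_lemeshow(y, p, groups=10):
    """Pearson-like chi-square over groups of similar predicted probability.

    y: observed 0/1 outcomes; p: predicted probabilities from a fitted model.
    """
    n = len(y)
    # Sort observation indices by predicted probability
    order = sorted(range(n), key=lambda i: p[i])
    stat = 0.0
    for g in range(groups):
        idx = order[g * n // groups:(g + 1) * n // groups]
        if not idx:
            continue
        n_g = len(idx)
        observed = sum(y[i] for i in idx)   # observed number of events
        expected = sum(p[i] for i in idx)   # expected number of events
        p_bar = expected / n_g              # average predicted probability
        stat += (observed - expected) ** 2 / (n_g * p_bar * (1.0 - p_bar))
    return stat

# Toy data: 8 observations split into 4 groups of 2
y_obs = [0, 0, 0, 0, 1, 1, 1, 1]
p_hat = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]

hl = hosmer_lemeshow(y_obs, p_hat, groups=4)
print(f"Hosmer-Lemeshow statistic = {hl:.4f}")
```

The statistic is usually compared with a chi-square distribution on (number of groups − 2) degrees of freedom, and about 10 groups is the common choice with real data; the toy example uses 4 only to keep the numbers small.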

