Question

In: Statistics and Probability

Consider one of the subset regression models for each data set obtained in Problem Set 4 and answer the following questions.
(i) Draw the scatter plot matrix, residual vs. predictor variable plots and added variable plots. Comment on the regression model based on these plots.
(ii) Draw the normal-probability plot and comment.
(iii) Draw the correlogram and comment.
(iv) Detect leverage points from the data.
(v) Compute Cook’s distance statistics and detect all outlier points from the data.
(vi) Compute DFFITS statistics and detect all outlier points from the data.
(vii) Compute DFBETAS statistics and comment.
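As a rough illustration of parts (iv) to (vi), the sketch below computes leverage, Cook's distance and DFFITS for a one-predictor model using the closed-form hat values. Everything in it is an assumption made for illustration: the function name `influence_simple`, the toy data (not mtcars or SurgicalUnit), and the cut-offs 2p/n, 4/n and 2·sqrt(p/n), which are common rules of thumb rather than the only accepted choices. In R, `hatvalues()`, `cooks.distance()`, `dffits()` and `dfbetas()` report the same quantities for a fitted `lm` object.

```python
import math


def influence_simple(x, y):
    """Leverage, Cook's distance and DFFITS for the fit y = b0 + b1*x."""
    n, p = len(x), 2  # p counts the intercept and the slope
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    b0 = ybar - b1 * xbar
    resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    sse = sum(e * e for e in resid)
    mse = sse / (n - p)
    rows = []
    for xi, e in zip(x, resid):
        # Hat value for simple regression: 1/n + (x_i - xbar)^2 / Sxx
        h = 1.0 / n + (xi - xbar) ** 2 / sxx
        cook = e * e * h / (p * mse * (1.0 - h) ** 2)
        # Error variance with observation i deleted
        s2_del = (sse - e * e / (1.0 - h)) / (n - p - 1)
        t = e / math.sqrt(s2_del * (1.0 - h))  # externally studentized residual
        dffits = t * math.sqrt(h / (1.0 - h))
        rows.append({"leverage": h, "cook": cook, "dffits": dffits})
    return rows


# Hypothetical toy data: the last point sits far from the other x values.
x = [1, 2, 3, 4, 5, 6, 7, 20]
y = [1.1, 1.9, 3.2, 3.8, 5.1, 6.2, 6.8, 12.0]
rows = influence_simple(x, y)

n, p = len(x), 2
for i, r in enumerate(rows):
    flags = []
    if r["leverage"] > 2 * p / n:
        flags.append("high leverage")
    if r["cook"] > 4 / n:
        flags.append("large Cook's D")
    if abs(r["dffits"]) > 2 * math.sqrt(p / n):
        flags.append("large DFFITS")
    print(i, {k: round(v, 3) for k, v in r.items()}, flags)
```

For a model with several predictors the hat values come from the diagonal of X(XᵀX)⁻¹Xᵀ instead of the closed form used here; the Cook's distance and DFFITS formulas stay the same.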

Two data sets are given for the following variables.
(1) 30 observations on 11 variables: Miles/(US) gallon, Number of cylinders, Displacement (cu.in.), Gross horsepower, Rear axle ratio, Weight (1000 lbs), 1/4 mile time, Engine (0 = V-shaped, 1 = straight), Transmission (0 = automatic, 1 = manual), Number of forward gears, Number of carburettors. This data set is available in R as “mtcars” under the package datasets.
(2) 54 observations on the 10 surgical aspects. This data set is available in R as “SurgicalUnit” under the package ALSM.
Answer the following questions for each data set.
(i) Find appropriate models among all possible subset regression models based on the criteria of adjusted R-square, Mallows' Cp statistic, AIC and BIC.
(ii) Use the forward selection approach to find the appropriate subset regression model.
(iii) Use the backward elimination approach to find the appropriate subset regression model.
(iv) Use the stepwise selection approach to find the appropriate subset regression model.
(v) Comment on the performance of the subset regression models obtained in (i)-(iv).
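The all-subsets comparison in part (i) can be sketched as follows. This is a minimal illustration, not a solution for the assigned data: the helper names (`solve`, `sse`, `all_subsets`) and the three-predictor toy data are hypothetical, and the AIC and BIC forms used, n·ln(SSE/n) + 2p and n·ln(SSE/n) + p·ln(n), drop additive constants, so only differences between models are meaningful. Here p counts the intercept plus the predictors in the subset.

```python
import math
from itertools import combinations


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x


def sse(cols, y):
    """Residual sum of squares of y regressed on an intercept plus cols."""
    X = [[1.0] + [c[i] for c in cols] for i in range(len(y))]
    p = len(X[0])
    xtx = [[sum(r[j] * r[k] for r in X) for k in range(p)] for j in range(p)]
    xty = [sum(r[j] * yi for r, yi in zip(X, y)) for j in range(p)]
    beta = solve(xtx, xty)  # normal equations; fine for small, well-conditioned data
    return sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2 for r, yi in zip(X, y))


def all_subsets(data, y):
    """Adjusted R^2, Mallows' Cp, AIC and BIC for every non-empty predictor subset."""
    n, names = len(y), list(data)
    ybar = sum(y) / n
    sst = sum((yi - ybar) ** 2 for yi in y)
    mse_full = sse([data[k] for k in names], y) / (n - len(names) - 1)
    out = []
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            p = size + 1  # parameters including the intercept
            s = sse([data[k] for k in subset], y)
            out.append({
                "vars": subset,
                "adj_r2": 1 - (s / (n - p)) / (sst / (n - 1)),
                "cp": s / mse_full - (n - 2 * p),
                "aic": n * math.log(s / n) + 2 * p,
                "bic": n * math.log(s / n) + p * math.log(n),
            })
    return out


# Hypothetical toy data: y depends on x1 and x2; x3 is unrelated noise.
data = {
    "x1": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "x2": [5, 3, 8, 1, 9, 2, 7, 4, 10, 6],
    "x3": [1, 0, 1, 0, 0, 1, 1, 0, 0, 1],
}
y = [6.2, 6.4, 10.8, 8.2, 14.1, 11.8, 16.2, 16.1, 20.2, 20.0]

results = all_subsets(data, y)
for row in sorted(results, key=lambda r: r["bic"]):
    print(row["vars"], {k: round(v, 3) for k, v in row.items() if k != "vars"})
```

Enumerating every subset is only practical for a handful of predictors. Forward selection, backward elimination and stepwise selection in parts (ii) to (iv) search the same space greedily; in R this is commonly done with `step()`, and all-subsets search with `regsubsets()` from the leaps package.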

Related Solutions

The data presented in Problem 7 are analyzed using multiple linear regression analysis and the models are shown here. In the models, the data are coded as 1 = new medication and 0 = standard medication, and age 65 and older is coded as 1 = yes and 0 = no.
ŷ = 53.85 − 23.54 (Medication)
ŷ = 45.31 − 19.88 (Medication) + 14.64 (Age 65+)
ŷ = 45.51 − 20.21 (Medication) + 14.29 (Age...
Consider two models that you are to fit to a single data set involving three variables: A, B, and C. Model 1: A ~ B. Model 2: A ~ B + C. (a) When should you say that Simpson’s Paradox is occurring? A. When Model 2 has a lower R² than Model 1. B. When Model 1 has a lower R² than Model 2. C. When the coef. on B in Model 2 has the opposite sign to the coef....
Directions: Use SPSS to compute the Regression Line. Problem: Using the following set of data and Excel, compute the regression line. The data set represents the number of hours of training to predict how severe injuries will be if someone is injured playing football. Briefly summarize your findings.
Training  Injuries  Training  Injuries
12        8         11        5
3         7         16        7
22        2         14        8
12        5         15        3
11        4         16        7
31        1         22        3
27        5         24        8
...
Consider the following statements about unusual observations in linear regression models and pick the correct one. A. An outlier can be neither influential nor a high-leverage point. B. An observation with high leverage can be neither an outlier nor influential. C. Both statements are correct. D. Both statements are incorrect.
Consider the following statements about VIFs and choose the incorrect one. A. If all the predictors are completely uncorrelated with each...
Consider the following quarterly time series. The regression model developed for this data set, which has seasonality and trend, is as follows:
$\hat{y}_t = 864.08 + 87.8\,\mathrm{Qtr1}_t + 137.98\,\mathrm{Qtr2}_t + 106.16\,\mathrm{Qtr3}_t + 28.16\,t$
Compute the quarterly forecasts for next year based on the regression model.
Quarter  Year 1  Year 2  Year 3
1        923     1112    1243
2        1056    1156    1301
3        1124    1124    1254
4        992     1078    1198
For data DEMOG, fit three simple linear regression models of the per capita income on each of the three predictor variables. Does a linear regression model appear to provide a good fit for each of the three predictor variables? Use all appropriate tests, descriptive measures, and plots to conclude your findings here. Which predictor variable leads to significant effect on the per capita income?
         usborn   cap.income  home  pop
Alabama  0.98656  21442       75.9  4040587
Alaska   0.93914  25675       34.0  550043
Arizona  0.90918...
Problem 4: Antismoking Campaign (no data set is posted for this problem). From 2005 to 2015, an intensive antismoking campaign was sponsored by the U.S. federal government. Suppose in both years, the American Cancer Society randomly and independently sampled 2,000 adults. In 2005, they discovered 418 of the 2,000 sampled defined themselves as smokers. In 2015, they discovered 302 of the 2,000 sampled defined themselves as smokers. Define the population parameter in context in one sentence. Calculate the two...
4) In this problem, we will explore how the cardinality of a subset S ⊆ X relates to the cardinality of a finite set X. (i) Explain why |S| ≤ |X| for every subset S ⊆ X when |X| = 1. (ii) Assume we know that if S ⊆ <n>, then |S| ≤ n. Explain why we can show that if T ⊆ <n+ 1>, then |T| ≤ n + 1. (iii) Explain why parts (i) and (ii) imply that...
QSO 320 Problem Set (Problem Set 4-20) Complete problem 4-20 at the end of Chapter 4 in your textbook. You will demonstrate your work using Excel templates provided. You do not need to include a graphical procedure.
Problem 4-20
             X    Y    LHS                            Sign  RHS
Profit       $4   $5   =SUMPRODUCT(B5:C5,$B$4:$C$4)
Constraints
Labor        1    2    =SUMPRODUCT(B7:C7,$B$4:$C$4)   <=    10
Material     6    6    =SUMPRODUCT(B8:C8,$B$4:$C$4)   <=    36
Storage      8    4    =SUMPRODUCT(B9:C9,$B$4:$C$4)   <=    40
Hi All, the homework problem does not seem straightforward, so I put...