Question

Consider two models that you are to fit to a single data set involving three variables: A, B, and C.

Model 1 : A ~B

Model 2 : A ~B + C

(a) When should you say that Simpson’s Paradox is occurring?

A. When Model 2 has a lower R² than Model 1.

B. When Model 1 has a lower R² than Model 2.

C. When the coef. on B in Model 2 has the opposite sign to the coef. on B in Model 1.

D. When the coef. on C in Model 2 has the opposite sign to the coef. on B in Model 1.

(b) True or False: If B is uncorrelated with A, then the coefficient on B in the model A ~ B must be zero.

(c) True or False: If B is uncorrelated with A, then the coefficient on B in a model A ~ B+C must be zero.

(d) True or False: Simpson’s Paradox can occur if B is uncorrelated with C.

Solutions

Expert Solution

a) C. When the coef. on B in Model 2 has the opposite sign to the coef. on B in Model 1.

Simpson’s paradox is said to occur when the direction of the association between an explanatory variable and the response variable reverses once another variable is taken into account, i.e. the regression slopes on B in the two models have opposite signs. It can occur when there is at least one confounding variable (such as age group or gender) that has not been accounted for.

For example, suppose you are studying the effectiveness of a new medicine across three age groups. The effectiveness may vary by age group, and if age is not accounted for, the pooled data may lead you to conclude that the medicine is ineffective when it is actually effective within some groups, e.g. for people under 40.
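The sign reversal described above can be sketched with a small made-up data set. This is only an illustration under stated assumptions: the numbers are invented, C stands in for a group label such as age group, and `ols` is a hypothetical helper built on NumPy's least-squares solver rather than anything from the original question.

```python
import numpy as np

def ols(y, *columns):
    """Return least-squares coefficients [intercept, slope_1, slope_2, ...]."""
    X = np.column_stack([np.ones(len(y)), *columns])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Invented data: C is a group label (0, 1, 2), e.g. an age group.
C = np.repeat([0.0, 1.0, 2.0], 3)
offset = np.tile([-1.0, 0.0, 1.0], 3)
B = 2 * C + offset   # B tends to be larger in the higher groups
A = -B + 5 * C       # within each group, A falls as B rises

coef_b_model1 = ols(A, B)[1]     # Model 1: A ~ B
coef_b_model2 = ols(A, B, C)[1]  # Model 2: A ~ B + C

print("coef. on B, Model 1:", round(coef_b_model1, 3))
print("coef. on B, Model 2:", round(coef_b_model2, 3))
```

In this construction the coefficient on B is positive in Model 1 and negative in Model 2, which is the situation described in option C.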

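b) True. In a regression with only one explanatory variable, the slope b is proportional to the correlation r between A and B, i.e. b = r × (s_A / s_B), where s_A and s_B are the standard deviations of A and B. If r = 0, then b = 0.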
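As a quick numerical check of part (b), the sketch below compares the fitted slope of A ~ B with r × (s_A / s_B) on two invented data sets, one of which has zero correlation by construction. The helper name `slope_and_r` and the data values are assumptions made for illustration.

```python
import numpy as np

def slope_and_r(a, b):
    """Return (fitted slope of a ~ b, r * s_a / s_b) so the two can be compared."""
    slope = np.polyfit(b, a, 1)[0]   # polyfit returns [slope, intercept]
    r = np.corrcoef(a, b)[0, 1]
    return slope, r * np.std(a, ddof=1) / np.std(b, ddof=1)

B = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
A_related = np.array([1.0, 0.0, 3.0, 2.0, 5.0])
A_uncorrelated = B ** 2  # depends on B, but its linear correlation with B is zero

slope1, formula1 = slope_and_r(A_related, B)
slope2, formula2 = slope_and_r(A_uncorrelated, B)

print("correlated case:  ", slope1, formula1)
print("uncorrelated case:", slope2, formula2)
```

The second case also shows that "uncorrelated" only rules out a linear association: A is fully determined by B there, yet the fitted slope is zero up to rounding.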

c) False. The coefficient need not be zero. In a multiple regression with more than one explanatory variable, an explanatory variable that is uncorrelated with the response can still have a nonzero slope: when B is correlated with C, B may explain variation in A that remains after C is accounted for.
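A counterexample for part (c) can be built by hand. In the sketch below (invented numbers, with `ols` as a hypothetical helper), A equals B + C exactly, but C is constructed so that A ends up uncorrelated with B.

```python
import numpy as np

def ols(y, *columns):
    """Return least-squares coefficients [intercept, slope_1, slope_2, ...]."""
    X = np.column_stack([np.ones(len(y)), *columns])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

B = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
e = np.array([1.0, -2.0, 2.0, -2.0, 1.0])  # sums to zero and is orthogonal to B
C = e - B   # C is negatively correlated with B
A = B + C   # equals e, so A is uncorrelated with B

r_ab = np.corrcoef(A, B)[0, 1]
coef_b_simple = ols(A, B)[1]       # A ~ B
coef_b_multiple = ols(A, B, C)[1]  # A ~ B + C

print("corr(A, B):          ", r_ab)
print("coef. on B, A ~ B:    ", coef_b_simple)
print("coef. on B, A ~ B + C:", coef_b_multiple)
```

Here the simple-regression slope is zero up to rounding, while the coefficient on B in A ~ B + C is 1, because the effect of B is masked by C until C is included.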

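d) False. If B is uncorrelated with C in the data, adding C to the model leaves the coefficient on B unchanged, so its sign cannot reverse. Simpson’s paradox requires the omitted variable C to be correlated with B as well as related to A.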
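For part (d), a small check shows what happens to the coefficient on B when B and C have zero sample correlation. The balanced design and the response values below are invented for illustration, and `ols` is a hypothetical helper.

```python
import numpy as np

def ols(y, *columns):
    """Return least-squares coefficients [intercept, slope_1, slope_2, ...]."""
    X = np.column_stack([np.ones(len(y)), *columns])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Balanced design: every value of B appears once with every value of C,
# so the sample correlation between B and C is zero.
B = np.tile([0.0, 1.0, 2.0], 3)
C = np.repeat([0.0, 1.0, 2.0], 3)
A = np.array([1.0, 3.0, 2.0, 5.0, 4.0, 7.0, 6.0, 9.0, 8.0])  # arbitrary response

r_bc = np.corrcoef(B, C)[0, 1]
coef_b_model1 = ols(A, B)[1]     # Model 1: A ~ B
coef_b_model2 = ols(A, B, C)[1]  # Model 2: A ~ B + C

print("corr(B, C):         ", r_bc)
print("coef. on B, Model 1:", coef_b_model1)
print("coef. on B, Model 2:", coef_b_model2)
```

The two coefficients on B agree up to rounding, so with uncorrelated B and C there is no room for the sign on B to flip between the models.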

