Question

In: Statistics and Probability

Suppose you were asked to investigate which predictors explain the number of minutes that 10- to18-year-old...

Suppose you were asked to investigate which predictors explain the number of minutes that 10- to18-year-old students spend on Twitter. To do so, you build a linear regression model with Twitter usage (Y) measured as the number of minutes per week. The four predictors you include in the model are Height, Weight, Grade Level, and Age of each student. You build four simple linear regression models with Y regressed separately on each predictor, and each predictor is statistically significant. Then you build a multiple linear regression model with Y regressed on all four predictors, but only one predictor, Age, is statistically significant, and the others are not. What is likely going on among the four predictors? If you include more than one of these predictors in the model, what are some problems that can result?

Solutions

Expert Solution

Assumption in regression problems is predictors should not have multicolinearity.

Since when you build linear regression model with Y regressed  separately on each predictor(x), then you have only one predictor and predict Y.It means your data have no corelation ( Multicolinearity) between others predictors because no others predictor available in your data set. therefore , each predictor is statistically significant.

On other hand , you build a multiple linear regression model with Y regressed on all four predictors(x1,x2,x3,x4), then you have get corelation between Height, weight and grade level to each other therefore its statistically significant on output (Y) is very low, but age is not corelated with Height, weight and grade level therefore its statistically significant on output (Y) is very high.

If you include more than one of these predictors in the model and these predictors will have corelation( Multicolinearity), then you will get problems and your accuracy will go down(i.e adjusted R - square decreases).


Related Solutions

Suppose you were asked to investigate which predictors explain the number of minutes that 10- to18-year-old...
Suppose you were asked to investigate which predictors explain the number of minutes that 10- to18-year-old students spend on Twitter. To do so, you build a linear regression model with Twitter usage (Y) measured as the number of minutes per week. The four predictors you include in the model are Height, Weight, Grade Level, and Age of each student. You build four simple linear regression models with Y regressed separately on each predictor, and each predictor is statistically significant. Then...
Suppose that you are asked to investigate the hypothesis that a grade on a final test...
Suppose that you are asked to investigate the hypothesis that a grade on a final test in Econ 227 (grade) depends on classes attended (att). The data come from ECON 227 class for the spring 2016 semester. I need help on the part C, D, and E parts of the question. (a) Present economic model of the hypothesis. grade = f (att , others) (b) Turn your economic model in (a) into an econometric model. grade = B0 + B1...
Suppose you were asked to compute national income and product for an economy. Explain the modifications...
Suppose you were asked to compute national income and product for an economy. Explain the modifications of the usual methods that you will apply to correct the existing problems. Do you think it will result in a larger or a smaller income?
You, as a researcher, were asked to investigate differences in fruit and vegetable consumption among males...
You, as a researcher, were asked to investigate differences in fruit and vegetable consumption among males and females in your city. Suggest three different probabilistic sampling methods to design this study. The sampling frame must be also stated. Explain your answers.
A study was conducted in which students were asked to estimate the number of calories in...
A study was conducted in which students were asked to estimate the number of calories in a cheeseburger. One group was asked to do this after thinking about a​ calorie-laden cheesecake. A second group was asked to do this after thinking about an organic fruit salad. The mean number of calories estimated was 771 for the group that thought about the cheesecake and 1008 for the group that thought about the organic fruit salad. Suppose that the study was based...
You were asked to investigate extreme high, unexplained merchandise shortages at a department store chain. You...
You were asked to investigate extreme high, unexplained merchandise shortages at a department store chain. You found the following: The receiving department supervisor owns and operates a boutique carrying many of the same labels as the chain store. The general manager is unaware of the ownership interest. The receiving supervisor signs receiving reports showing that the total quantity shipped by a supplier was received and then diverts 5% to 10 % of each shipment to the boutique. The store is...
The mean number of minutes for app engagement by a table user is 7.5 minutes. Suppose...
The mean number of minutes for app engagement by a table user is 7.5 minutes. Suppose the standard deviation is 1.25 minutes. Take a sample of 80. a. What are the mean and standard deviation for the sample mean number of app engagement by a tablet user? b. Find the 90th percentile for the sample mean time for app engagement for a tablet user. Interpret this value in a complete sentence. c. Find the probability that the sample mean is...
1. You have a number of variables -- a response variable and some proposed predictors. A...
1. You have a number of variables -- a response variable and some proposed predictors. A correlation matrix may help to give an idea of the strength of the linear relationships present. For the purposes of regression analysis, what would be desirable in terms of … a) …the observed correlation between each proposed predictor and the response variable? (2) b) …the observed correlations among the predictors themselves? (2) 2. Explain the fact that, over the course of a year, the...
Suppose that it is known that the number of minutes that the daily 1 pm train...
Suppose that it is known that the number of minutes that the daily 1 pm train from NJ Station arrives late to destination Linden is a random variable X with density function f(x) = {c/15^2(15^2-x^2), -15<x<15; 0, elsewhere. (a) Determine the value of c so that f(x) is a valid probability density function. (b) Explain why c must be a particular value. That is, explain why probability density functions must have the specific property that you used to solve part...
1.2 Suppose that you are asked to evaluate three medical interventions that reduce the number of...
1.2 Suppose that you are asked to evaluate three medical interventions that reduce the number of deaths due to congestive heart failure. Among these interventions, there are no influences on the patient’s quality of life. What evaluation method would you use and why? in the book's solutions: the answer is "cost Utility Analysis" why this method although the the statement said that "there are no influences on the patient’s quality of life" as i understand from lecture cost utility analysis...
ADVERTISEMENT
ADVERTISEMENT
ADVERTISEMENT