Question

1. If a dependent variable Y and independent variable X have a positive correlation, what can we predict about the sign of the slope coefficient between X and Y in a simple regression? How about in a multiple regression?

2. What would you suspect if your multiple regression t-statistics were all insignificant but the F-statistic was significant?

3. Suppose you built a 95% confidence interval around a simple regression slope coefficient and got a lower bound of 5 and an upper bound of 10. Interpret the confidence interval.

4. Suppose you wanted to know whether one categorical variable is associated with another and did a hypothesis test to see whether the proportions in a chosen category of one variable are the same for the two levels of the other variable. What two other methods could you use to see whether there is association between the variables?

Solutions

Expert Solution

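1) In simple linear regression the slope coefficient has the same sign as the correlation, because b_1 = r * (s_Y / s_X) and both standard deviations are positive. A positive correlation therefore implies a positive slope. In multiple linear regression the sign cannot be predicted from the pairwise correlation alone: the coefficient on X measures its effect with the other predictors held fixed, so it can be positive or negative even when the correlation between X and Y is positive.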
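
To make the sign argument concrete, here is a minimal R sketch on simulated data. The variable names x1 and x2, the seed, and the coefficients used to generate y are all assumptions chosen for illustration; they do not come from the question.

```r
set.seed(123)  # assumed seed, only so the sketch is repeatable

n  <- 200
x2 <- rnorm(n)                       # a second predictor
x1 <- x2 + rnorm(n, sd = 0.5)        # x1 is strongly correlated with x2
y  <- -1 * x1 + 3 * x2 + rnorm(n)    # true partial effect of x1 is negative

r      <- cor(x1, y)                 # pairwise correlation between x1 and y
simple <- lm(y ~ x1)                 # simple regression
multi  <- lm(y ~ x1 + x2)            # multiple regression

r
coef(simple)["x1"]                   # same sign as r
coef(multi)["x1"]                    # sign can differ once x2 is controlled for
```

In this setup x1 mostly acts as a stand-in for x2 in the simple regression, which is why its slope there follows the positive correlation, while the multiple regression recovers the negative partial effect.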

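2) I would suspect multicollinearity. This pattern occurs when the predictor variables are highly correlated with each other: together they explain Y (significant F-statistic), but their individual effects cannot be separated, so the standard errors are inflated and the t-statistics are insignificant. A toy example in R illustrates this.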

RSS = 3:10 #Right shoe size
LSS = rnorm(length(RSS), RSS, 0.1) #Left shoe size - nearly identical to RSS
cor(LSS, RSS) #correlation ~ 0.99

weights = 120 + rnorm(length(RSS), 10*RSS, 10)

##Fit a joint model
m = lm(weights ~ LSS + RSS)

##The overall F-test is significant (small p-value), but neither LSS nor RSS is individually significant
summary(m)

Call:
lm(formula = weights ~ LSS + RSS)

Residuals:
     1      2      3      4      5      6      7      8
-16.04   9.80   6.89   7.58  -7.09  11.48 -12.94   0.33

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   141.99      14.04   10.11  0.00016 ***
LSS            -1.74      54.13   -0.03  0.97563
RSS             9.29      54.04    0.17  0.87023
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 12.7 on 5 degrees of freedom
Multiple R-squared:  0.748,  Adjusted R-squared:  0.647
F-statistic: 7.43 on 2 and 5 DF,  p-value: 0.0318

##Fitting RSS or LSS separately gives a significant result.

summary(lm(weights ~ LSS))

Call:
lm(formula = weights ~ LSS)

Residuals:
   Min     1Q Median     3Q    Max
-15.43  -8.58   3.29   7.36  13.20

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   141.42      12.49   11.33 0.000028 ***
LSS             7.56       1.80    4.21   0.0057 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 11.6 on 6 degrees of freedom
Multiple R-squared:  0.747,  Adjusted R-squared:  0.704
F-statistic: 17.7 on 1 and 6 DF,  p-value: 0.00565
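
One way to quantify the collinearity in the shoe-size example is the variance inflation factor, VIF = 1 / (1 - R^2), where R^2 comes from regressing one predictor on the other. The sketch below computes it by hand in base R. The seed is an assumption added for repeatability, so the simulated values will not match the printed output above.

```r
set.seed(42)  # assumed seed, not part of the original example

RSS <- 3:10                            # right shoe size
LSS <- rnorm(length(RSS), RSS, 0.1)    # left shoe size, nearly identical to RSS

# R-squared from regressing one predictor on the other
r2 <- summary(lm(LSS ~ RSS))$r.squared

# Variance inflation factor; with only two predictors it is the same for both
vif <- 1 / (1 - r2)
vif
```

A commonly cited rule of thumb treats a VIF above about 10 as a sign of serious multicollinearity, and two nearly identical predictors like these land far above that.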

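3) Suppose the 95 percent confidence interval for the slope coefficient beta_1 is (5, 10).

We are 95 percent confident that the true slope beta_1 lies between 5 and 10. In other words, each one-unit increase in X is associated with an increase in the mean of Y of between 5 and 10 units.

The 95 percent describes the procedure, not beta_1 itself: if we drew 100 samples and built an interval from each one, about 95 of those intervals would contain the true beta_1. Because the interval does not contain 0, the slope is significantly different from zero (and positive) at the 5 percent level.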
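
The repeated-sampling reading of a 95 percent interval can be checked with a small simulation. This is a sketch under assumed values: the true slope of 7.5, the intercept, the noise level, and the sample size are all hypothetical and chosen only for illustration.

```r
set.seed(1)  # assumed seed, only so the sketch is repeatable

true_slope <- 7.5     # hypothetical true beta_1
n_reps     <- 1000    # number of simulated samples
covered    <- logical(n_reps)

for (i in seq_len(n_reps)) {
  x  <- runif(30, 0, 10)
  y  <- 2 + true_slope * x + rnorm(30, sd = 5)
  ci <- confint(lm(y ~ x), "x", level = 0.95)   # 1 x 2 matrix: lower, upper
  covered[i] <- ci[1, 1] <= true_slope && true_slope <= ci[1, 2]
}

# Share of intervals that contain the true slope; should be close to 0.95
mean(covered)
```

Each individual interval either contains the true slope or it does not. The 95 percent is the long-run share of intervals that do.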

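4) The test described compares the proportion in a chosen category of one variable across the two levels of the other variable (a two-proportion test). Other methods can also be used to check whether the variables are associated.

1) Chi-square test of independence on the contingency table of the two variables.

2) Kruskal-Wallis test, which applies only when one of the variables is at least ordinal.

The Kruskal-Wallis test tells you whether there are significant differences among two or more groups (differences in medians, if the group distributions have the same shape). It is an extension of the Mann-Whitney U test and gives equivalent results when only two groups are compared. For two purely nominal variables it is not appropriate; a confidence interval for the difference between the two proportions, or Fisher's exact test, can be used instead.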
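
As an illustration, the sketch below runs the two-proportion test next to two alternatives on a 2 x 2 table in base R. The counts and the labels (group A/B, outcome yes/no) are hypothetical and were made up for this example.

```r
# Hypothetical counts (not from the question):
# rows = levels of one variable, columns = chosen category vs. the rest
tab <- matrix(c(30, 10,
                15, 25),
              nrow = 2, byrow = TRUE,
              dimnames = list(group = c("A", "B"),
                              outcome = c("yes", "no")))

two_prop <- prop.test(tab)     # two-proportion test described in the question
chi_sq   <- chisq.test(tab)    # chi-square test of independence
fisher   <- fisher.test(tab)   # Fisher's exact test

# p-values from the three tests
c(two_prop = two_prop$p.value,
  chi_sq   = chi_sq$p.value,
  fisher   = fisher$p.value)

# Confidence interval for the difference in proportions (A minus B)
two_prop$conf.int
```

For a 2 x 2 table the two-proportion test and the chi-square test are closely related, so similar conclusions from them are expected. Fisher's exact test is the usual choice when the cell counts are small.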

