Question

A study considered a sample of 50 observations used to predict SALES. Included in the analysis were 9 predictor variables (independent variables).

              X1      X2      X3      X4      X5      X6      X7      X8      X9
X2         0.804
X3         0.625   0.443
X4         0.032   0.032   0.231
X5         0.159   0.214   0.177  -0.194
X6         0.319   0.373   0.308   0.054   0.293
X7        -0.016   0.030   0.079   0.168  -0.309   0.067
X8        -0.026   0.103   0.015   0.151  -0.311   0.059   0.912
X9         0.169  -0.027  -0.104   0.017  -0.248   0.114   0.174   0.223
SALES      0.764   0.630   0.756   0.149   0.171   0.426   0.145   0.141  -0.068

  1. Based on the correlation matrix shown above, is there any concern about multicollinearity?
    - If YES, list all pairs of variables involved:

    - What limit did you use?

    - What action must be taken to eliminate multicollinearity? Be very specific.

    - Which variable appears to have the weakest relationship with SALES?

    - The following is the initial regression printout. Variables X2 and X8 were not included in the regression. Why?

    Source        DF    Adj SS     Adj MS    F-Value
    Regression     7    145157    20736.7      19.62
    Error         42     44385     1056.8
    Total         49    189543



    Use the F test and a 0.05 level of significance to determine whether the regression model is significant.

    What is the value of the F critical point?

    How many degrees of freedom did you use?

    Compute the R-Square:

    Show your work and compute the Adjusted R-Square:

Solutions

Expert Solution

Based on the correlation matrix, is there any concern about multicollinearity?

Yes, there is a concern about multicollinearity.

For the pair (X7, X8) the correlation is 0.912, which means the two variables are highly correlated with each other.

The pair (X1, X2) has a correlation of 0.804, so this pair is also highly correlated.

So the pairs of variables involved are (X1, X2) and (X7, X8).


- What limit did you use?
The limit used is |r| > 0.70 for any pair of predictors.
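
The pair screening with the 0.70 limit can be reproduced mechanically. The following is a minimal Python sketch, assuming the lower triangle of the correlation matrix is typed in by hand. The names `lower`, `LIMIT` and `collinear_pairs` are illustrative and not part of the original solution.

```python
# Lower triangle of the predictor correlation matrix from the question.
# Each row lists correlations with X1, X2, ... in column order.
names = ["X1", "X2", "X3", "X4", "X5", "X6", "X7", "X8", "X9"]
lower = {
    "X2": [0.804],
    "X3": [0.625, 0.443],
    "X4": [0.032, 0.032, 0.231],
    "X5": [0.159, 0.214, 0.177, -0.194],
    "X6": [0.319, 0.373, 0.308, 0.054, 0.293],
    "X7": [-0.016, 0.030, 0.079, 0.168, -0.309, 0.067],
    "X8": [-0.026, 0.103, 0.015, 0.151, -0.311, 0.059, 0.912],
    "X9": [0.169, -0.027, -0.104, 0.017, -0.248, 0.114, 0.174, 0.223],
}
LIMIT = 0.70  # the cutoff chosen in the solution; other texts use other cutoffs


def collinear_pairs(lower, names, limit):
    """Return (column, row, r) for every pair whose |r| exceeds the limit."""
    pairs = []
    for row, values in lower.items():
        for col, r in zip(names, values):
            if abs(r) > limit:
                pairs.append((col, row, r))
    return pairs


flagged = collinear_pairs(lower, names, LIMIT)
for a, b, r in flagged:
    print(f"({a}, {b}): r = {r}")
```

Only (X1, X2) and (X7, X8) exceed the limit. The next largest predictor correlation, 0.625 for (X1, X3), stays below it.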

- What action must be taken to eliminate multicollinearity? Be very specific.

The remedy here is to remove one variable from each highly correlated pair, namely the one that has the weaker relationship with SALES.

In the pair (X7, X8), X8 has the lower correlation with SALES (0.141 versus 0.145), so it is better to remove X8. In the pair (X1, X2), X2 has the lower correlation with SALES (0.630 versus 0.764), so X2 is removed.
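
The removal rule can be written out as a short Python sketch: within each flagged pair, drop the variable with the smaller absolute correlation with SALES. The helper name `variable_to_drop` is hypothetical, and the correlations are copied from the SALES row of the matrix.

```python
# Correlations with SALES for the variables in the two flagged pairs.
sales_corr = {"X1": 0.764, "X2": 0.630, "X7": 0.145, "X8": 0.141}
pairs = [("X1", "X2"), ("X7", "X8")]


def variable_to_drop(pair, sales_corr):
    """Pick the member of the pair with the weaker |correlation| with SALES."""
    return min(pair, key=lambda v: abs(sales_corr[v]))


dropped = [variable_to_drop(p, sales_corr) for p in pairs]
print(dropped)
```

This selects X2 and X8, the two variables that are absent from the regression printout.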

- Which variable appears to have the weakest relationship with SALES?
X9 has the weakest relationship with SALES: its correlation of -0.068 is the smallest in absolute value.
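
As a quick check, the weakest relationship is the one with the smallest absolute correlation, regardless of sign. This is a small Python sketch using the SALES row of the matrix; the variable names are illustrative.

```python
# SALES row of the correlation matrix.
sales_corr = {
    "X1": 0.764, "X2": 0.630, "X3": 0.756, "X4": 0.149, "X5": 0.171,
    "X6": 0.426, "X7": 0.145, "X8": 0.141, "X9": -0.068,
}

# Strength is judged by magnitude, so compare absolute values.
weakest = min(sales_corr, key=lambda v: abs(sales_corr[v]))
print(weakest, sales_corr[weakest])
```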

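- The following is the initial regression printout. Variables X2 and X8 were not included in the regression. Why?

X2 and X8 each belong to one of the highly correlated pairs, (X1, X2) and (X7, X8), and each has the weaker correlation with SALES within its pair, so they were left out to eliminate the multicollinearity.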

Source        DF    Adj SS     Adj MS    F-Value
Regression     7    145157    20736.7      19.62
Error         42     44385     1056.8
Total         49    189543

Use the F test and a 0.05 level of significance to determine whether the Regression model is significant.

To test

H0 : bi = 0 for all i = 1, 3, 4, 5, 6, 7, 9    { Model is not significant }

H1 : bi ≠ 0 for at least one i    { Model is significant }

Test statistic F:

F = MSR / MSRES = 20736.7 / 1056.8 = 19.62216

Thus the calculated F-value is 19.62.
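
The F statistic can also be recomputed directly from the sums of squares and degrees of freedom in the printout. This Python sketch uses illustrative variable names; because it starts from the unrounded mean squares, the later decimals differ slightly from the hand calculation, but the value still rounds to 19.62.

```python
# Values taken from the ANOVA printout.
ss_regression, df_regression = 145157, 7
ss_error, df_error = 44385, 42

ms_regression = ss_regression / df_regression  # mean square for regression (MSR)
ms_error = ss_error / df_error                 # mean square for error (MSRES)
f_value = ms_regression / ms_error

print(round(ms_regression, 1), round(ms_error, 1), round(f_value, 2))
```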

What is the value of the F critical point?

It is given by F(α; df1, df2): under H0 the test statistic is F-distributed with df1 = 7 and df2 = 42 degrees of freedom, and α = 0.05.

It can be looked up in a statistical table or computed more accurately with software such as R or Excel.

From R

> qf(1-0.05,df1=7,df2=42)
[1] 2.23707

Thus the value of the F critical point is 2.23707.

We reject the null hypothesis if the calculated F-value is greater than this critical point.
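
For readers without R, the critical point can be approximated with the Python standard library alone. The sketch below is not how R's `qf` is implemented. It integrates the F density with Simpson's rule and inverts the resulting CDF by bisection. The function names `f_pdf`, `f_cdf` and `f_quantile`, the step count and the search interval are all assumptions made for illustration.

```python
import math


def f_pdf(x, d1, d2):
    """Density of the F distribution with (d1, d2) degrees of freedom."""
    if x <= 0:
        return 0.0
    log_beta = (math.lgamma(d1 / 2) + math.lgamma(d2 / 2)
                - math.lgamma((d1 + d2) / 2))
    log_pdf = (0.5 * (d1 * math.log(d1 * x) + d2 * math.log(d2)
                      - (d1 + d2) * math.log(d1 * x + d2))
               - math.log(x) - log_beta)
    return math.exp(log_pdf)


def f_cdf(x, d1, d2, steps=1000):
    """P(F <= x) by composite Simpson's rule on [0, x]; steps must be even."""
    h = x / steps
    total = f_pdf(0.0, d1, d2) + f_pdf(x, d1, d2)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f_pdf(i * h, d1, d2)
    return total * h / 3


def f_quantile(p, d1, d2, lo=0.0, hi=50.0, iterations=50):
    """Invert the CDF by bisection; assumes the quantile lies in [lo, hi]."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f_cdf(mid, d1, d2) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


f_critical = f_quantile(1 - 0.05, 7, 42)
f_value = 19.62                  # calculated F-value from the printout
reject = f_value > f_critical    # reject H0 when F exceeds the critical point
print(round(f_critical, 3), reject)
```

The approximation should agree with the R result to about three decimal places, and the decision to reject H0 follows from comparing 19.62 with it.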

How many degrees of freedom did you use?

The test statistic is F-distributed with df1 = 7 (regression) and df2 = 42 (error) degrees of freedom, with α = 0.05.

Conclusion -

Since F-value = 19.62 > 2.23707, i.e. the F-value exceeds the F critical point,

we reject the null hypothesis at the 5% level of significance and hence conclude that the model is significant.

Compute the R-Square:

Formula :

R-Square = 1 - SSRES / TSS

         = 1 - 44385 / 189543

         = 0.7658315

Thus R-Square = 0.7658315
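
The R-Square arithmetic can be verified in a few lines of Python. The variable names are illustrative, and the inputs are the Error and Total sums of squares from the printout.

```python
ss_error = 44385    # SSRES, the Error row of the printout
ss_total = 189543   # TSS, the Total row of the printout

# R-Square is the share of total variation not left in the residuals.
r_squared = 1 - ss_error / ss_total
print(round(r_squared, 7))
```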

Show your work and compute the Adjusted R-Square:

Formula :

Adjusted R-Square = 1 - ( SSRES / df(Error) ) / ( TSS / df(Total) )

               or = 1 - ( SSRES / (n-k-1) ) / ( TSS / (n-1) )

Here n = 50 (sample size) and k = 7 (number of regressors),

so n-k-1 = 42 and n-1 = 49 (both given in the printout).

Adjusted R-Square = 1 - ( 44385 / 42 ) / ( 189543 / 49 )

               = 0.7268034

Thus Adjusted R-Square = 0.7268034
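
The adjusted value can be checked the same way. This Python sketch assumes n = 50 observations and k = 7 regressors, so the error degrees of freedom are n - k - 1 = 42. It also evaluates the equivalent form 1 - (1 - R²)(n - 1)/(n - k - 1) as a cross-check. The variable names are illustrative.

```python
n, k = 50, 7          # sample size and number of regressors in the model
ss_error = 44385      # SSRES from the printout
ss_total = 189543     # TSS from the printout

df_error = n - k - 1  # 42, matches the Error row
df_total = n - 1      # 49, matches the Total row

# Ratio of the two mean squares, as in the formula above.
adj_r_squared = 1 - (ss_error / df_error) / (ss_total / df_total)

# Equivalent form written in terms of the unadjusted R-Square.
r_squared = 1 - ss_error / ss_total
adj_alt = 1 - (1 - r_squared) * df_total / df_error

print(round(adj_r_squared, 7), round(adj_alt, 7))
```

The adjusted value is lower than R-Square because it penalises the seven regressors relative to the 50 observations.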

