Question

Sales (Y) Calls (X1) Time (X2) Years (X3) Type
48 168 12.3 5 ONLINE
36 131 16.4 4 NONE
46 162 15.7 3 NONE
47 183 13.0 3 ONLINE
44 177 15.3 3 ONLINE
49 181 12.4 2 ONLINE
35 123 19.0 3 NONE
46 169 14.8 3 GROUP
44 158 13.9 1 GROUP
39 146 15.4 3 GROUP
48 178 12.6 4 ONLINE
42 142 17.0 0 ONLINE
45 137 13.0 2 ONLINE
54 195 15.2 2 ONLINE
43 146 16.4 0 ONLINE
44 165 17.4 3 ONLINE
34 121 13.2 2 NONE
44 146 16.5 1 NONE
40 132 18.2 1 NONE
51 182 17.9 2 ONLINE
41 151 18.0 1 NONE
45 146 15.6 3 ONLINE
52 190 13.2 3 ONLINE
39 150 19.4 0 GROUP
41 149 13.2 3 GROUP
45 167 14.5 4 GROUP
46 189 20.0 1 GROUP
47 162 16.4 3 ONLINE
42 147 13.2 3 GROUP
45 171 19.4 2 ONLINE
44 165 15.0 0 ONLINE
50 175 15.1 3 ONLINE
46 161 13.2 3 GROUP
53 188 11.0 2 ONLINE
39 136 17.3 0 NONE
39 135 17.7 1 ONLINE
48 168 15.9 5 ONLINE
46 167 10.1 0 ONLINE
43 150 17.4 3 GROUP
44 151 15.2 2 GROUP
42 141 12.2 3 NONE
39 131 19.4 2 NONE
49 174 18.3 0 ONLINE
41 154 14.5 4 NONE
42 131 20.2 3 GROUP
39 128 15.3 1 GROUP
37 126 13.4 4 NONE
46 180 15.1 4 NONE
45 166 19.5 5 NONE
44 152 16.0 2 ONLINE
50 179 12.8 3 ONLINE
39 140 18.2 1 NONE
43 154 15.3 1 ONLINE
45 164 17.2 3 ONLINE
42 139 18.6 2 NONE
44 165 19.2 2 NONE
45 172 12.6 3 GROUP
41 147 18.5 3 GROUP
43 152 17.2 1 GROUP
48 160 15.8 2 ONLINE
42 159 13.6 4 GROUP
46 186 14.1 3 GROUP
46 150 20.7 2 GROUP
43 155 11.2 3 ONLINE
45 157 16.3 4 ONLINE
48 170 12.1 1 ONLINE
45 175 18.3 2 GROUP
49 186 17.5 1 GROUP
51 181 11.4 4 GROUP
47 171 17.3 2 ONLINE
50 185 16.4 0 ONLINE
39 146 15.8 1 GROUP
42 156 18.6 2 GROUP
46 157 19.3 2 ONLINE
43 163 11.7 1 GROUP
54 175 14.2 1 ONLINE
51 175 12.0 2 ONLINE
50 173 13.3 1 ONLINE
41 140 14.9 3 NONE
43 156 20.5 2 ONLINE
40 146 18.2 2 NONE
42 148 10.5 2 GROUP
50 183 11.7 1 GROUP
49 191 13.1 2 GROUP
40 149 14.2 4 ONLINE
40 143 18.3 2 NONE
47 185 15.2 2 ONLINE
41 136 17.4 3 GROUP
51 198 13.0 1 ONLINE
43 153 13.2 3 GROUP
38 129 15.2 3 NONE
44 158 11.8 3 ONLINE
43 149 12.7 1 GROUP
47 175 13.9 2 GROUP
40 154 16.4 3 GROUP
43 151 14.3 1 GROUP
46 153 22.0 0 ONLINE
46 167 14.8 1 ONLINE
46 167 15.8 0 ONLINE
39 143 17.7 3 NONE

Part C: Regression and Correlation Analysis

Use the dependent variable (labeled Y) and the independent variables (labeled X1, X2, and X3) in the data file. Use Excel to perform the regression and correlation analysis to answer the following.

1. Generate a scatterplot for the specified dependent variable (Y) and the X1 independent variable, including the graph of the "best fit" line. Interpret.

2. Determine the equation of the "best fit" line, which describes the relationship between the dependent variable and the selected independent variable.

3. Determine the coefficient of correlation. Interpret.

4. Determine the coefficient of determination. Interpret.

5. Test the utility of this regression model. Interpret results, including the p-value.

6. Based on the findings in Steps 1-5, analyze the ability of the independent variable to predict the designated dependent variable.

7. Compute the confidence interval for β1 (the population slope) using a 95% confidence level. Interpret this interval.

8. Using an interval, estimate the average for the dependent variable for a selected value of the independent variable. Interpret this interval.

9. Using an interval, predict the particular value of the dependent variable for a selected value of the independent variable. Interpret this interval.

10. What can be said about the value of the dependent variable for values of the independent variable that are outside the range of the sample values? Explain.

In an attempt to improve the model, use a multiple regression model to predict the dependent variable, Y, based on all of the independent variables, X1, X2, and X3.

11. Using Excel, run the multiple regression analysis using the designated dependent and three independent variables. State the equation for this multiple regression model.

12. Perform the Global Test for Utility (F-Test). Explain the conclusion.

13. Perform the t-test on each independent variable. Explain the conclusions and clearly state how the analysis should proceed; in particular, state which independent variables should be kept and which should be discarded. If any independent variables are to be discarded, re-run the multiple regression, including only the significant independent variables, and summarize the results with a discussion of the analysis.

14. Is this multiple regression model better than the linear model generated in parts 1-10? Explain. Please use the actual data given above in the analysis.

Solutions

Expert Solution

a)

Equation of the best-fit line: Y = 12.243 + 0.2018 X1
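As a rough illustration of how the fitted line is used, the sketch below plugs a value of X1 into the reported equation and flags whether that value lies inside the range of Calls seen in the sample (121 to 198 in the data table above). The function name `predict_sales` and the inputs 170 and 250 are hypothetical choices made for this example, not part of the original solution.

```python
# Coefficients reported in the solution (Y = Sales, X1 = Calls).
INTERCEPT = 12.243
SLOPE = 0.2018

# Smallest and largest Calls (X1) values in the 100-row sample.
X1_MIN, X1_MAX = 121, 198


def predict_sales(calls):
    """Return (predicted_sales, in_sample_range) for a given number of calls."""
    predicted = INTERCEPT + SLOPE * calls
    in_range = X1_MIN <= calls <= X1_MAX
    return predicted, in_range


# 170 calls is a hypothetical input inside the sampled range.
prediction, in_range = predict_sales(170)
print(f"170 calls: predicted sales = {prediction:.3f}, within sample range = {in_range}")

# 250 calls is a hypothetical input outside the sampled range (extrapolation).
prediction_far, in_range_far = predict_sales(250)
print(f"250 calls: predicted sales = {prediction_far:.3f}, within sample range = {in_range_far}")
```

A prediction flagged as outside the sample range should be treated with caution, because the data give no evidence that the linear relationship continues to hold there.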

Coefficient of correlation = 0.871

Coefficient of determination = (coefficient of correlation)^2 = 0.871^2 = 0.759

ANOVA
df SS MS F Significance F
Regression 1 1307.747 1307.747 309.046 4.58E-32
Residual 98 414.6929 4.231561
Total 99 1722.44

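Since the p-value (Significance F = 4.58E-32) is effectively 0 and therefore less than 0.05, we reject the null hypothesis β1 = 0 and conclude that there exists a significant linear relationship between Y and X1.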
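The figures in the ANOVA table can be cross-checked by hand. The following minimal sketch recomputes the mean squares, the F statistic, and the coefficients of determination and correlation from the reported sums of squares. It does not compute the p-value itself, which would need an F-distribution routine from a statistics library.

```python
import math

# Values copied from the simple-regression ANOVA table.
ss_regression = 1307.747
ss_residual = 414.6929
ss_total = 1722.44
df_regression = 1
df_residual = 98

# Mean squares are sums of squares divided by their degrees of freedom.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

# F statistic for the test of model utility.
f_statistic = ms_regression / ms_residual

# Coefficient of determination and (positive, since the slope is positive) correlation.
r_squared = ss_regression / ss_total
r = math.sqrt(r_squared)

print(f"MS residual = {ms_residual:.6f}")
print(f"F = {f_statistic:.3f}")
print(f"R^2 = {r_squared:.3f}, r = {r:.3f}")
```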

Confidence interval for β1 = (0.179 , 0.224)
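The interval can be approximately reproduced from numbers already reported. In simple regression the t statistic for the slope is the square root of F, so the standard error of the slope is the slope divided by that t value. The sketch below assumes a two-sided 95% critical value of about 1.9845 for 98 degrees of freedom, taken from a t table rather than from the original solution. Because the slope is rounded to four decimals, the third decimal of the endpoints may differ slightly from Excel's output.

```python
import math

slope = 0.2018          # reported slope of the best-fit line
f_statistic = 309.046   # reported F from the ANOVA table
t_critical = 1.9845     # assumption: t(0.975, df=98) from a t table

# In simple linear regression, t for the slope equals sqrt(F).
t_statistic = math.sqrt(f_statistic)
standard_error = slope / t_statistic

# 95% confidence interval: slope +/- t_critical * standard error.
margin = t_critical * standard_error
lower = slope - margin
upper = slope + margin

print(f"SE(slope) = {standard_error:.5f}")
print(f"95% CI for beta1: ({lower:.3f}, {upper:.3f})")
```

Since the whole interval lies above zero, it is consistent with the conclusion that additional calls are associated with higher sales.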

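Multiple regression: using the coefficients from the Excel output below, the fitted model is Y = 14.757 + 0.1978 X1 - 0.0938 X2 - 0.1945 X3.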

SUMMARY OUTPUT

Regression Statistics
Multiple R 0.874471
R Square 0.7647
Adjusted R Square 0.757347
Standard Error 2.054696
Observations 100

ANOVA
df SS MS F Significance F
Regression 3 1317.15 439.0499 103.9965 4.76E-30
Residual 96 405.2904 4.221775
Total 99 1722.44

Coefficients Standard Error t Stat P-value Lower 95% Upper 95%
Intercept 14.75674 2.664714 5.537831 2.67E-07 9.467321 20.04615
Calls (X1) 0.197783 0.011964 16.53132 7.75E-30 0.174034 0.221531
Time (X2) -0.0938 0.082642 -1.13501 0.259195 -0.25784 0.070243
Years (X3) -0.19451 0.167938 -1.15821 0.24965 -0.52786 0.138846
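The t-test on each independent variable can be read straight off this output. As a small sketch, the code below recomputes each t statistic as coefficient divided by standard error and compares the reported p-value with a significance level of 0.05 (the level used in the simple-regression test). The "keep"/"drop" labels are this example's own shorthand.

```python
ALPHA = 0.05  # significance level

# (coefficient, standard error, p-value) copied from the Excel output.
coefficients = {
    "Calls (X1)": (0.197783, 0.011964, 7.75e-30),
    "Time (X2)": (-0.0938, 0.082642, 0.259195),
    "Years (X3)": (-0.19451, 0.167938, 0.24965),
}

t_stats = {}
decisions = {}
for name, (coef, std_err, p_value) in coefficients.items():
    # t statistic for H0: the population coefficient equals zero.
    t_stats[name] = coef / std_err
    # A variable is kept only if its p-value is below ALPHA.
    decisions[name] = "keep" if p_value < ALPHA else "drop"
    print(f"{name}: t = {t_stats[name]:.3f}, p = {p_value:.3g} -> {decisions[name]}")
```

On these numbers only Calls (X1) is individually significant, which suggests re-running the regression with X1 alone; that reduced model is the simple regression already reported in this solution.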

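The multiple linear regression model fits marginally better: it explains 76.5% of the variability in Y (R Square = 0.7647), whereas the simple linear regression explains 75.9%. The gain is small, however, and the P-values for Time (X2) and Years (X3), 0.259 and 0.250, both exceed 0.05, so neither added variable is individually significant.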
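Because R Square can never decrease when predictors are added, adjusted R Square is a fairer basis for comparing the two models. The sketch below applies the standard formula 1 - (SSE/SST)(n - 1)/(n - k - 1) to the sums of squares reported in the two ANOVA tables; the helper name `adjusted_r_squared` is hypothetical.

```python
def adjusted_r_squared(ss_residual, ss_total, n, k):
    """Adjusted R^2 for a model with k predictors fitted to n observations."""
    return 1 - (ss_residual / ss_total) * (n - 1) / (n - k - 1)


# Simple regression: Y on X1 only (residual SS from the first ANOVA table).
adj_simple = adjusted_r_squared(414.6929, 1722.44, n=100, k=1)

# Multiple regression: Y on X1, X2, X3 (residual SS from the second ANOVA table).
adj_multiple = adjusted_r_squared(405.2904, 1722.44, n=100, k=3)

print(f"Adjusted R^2, simple model:   {adj_simple:.4f}")
print(f"Adjusted R^2, multiple model: {adj_multiple:.4f}")
print(f"Difference: {adj_multiple - adj_simple:.4f}")
```

The adjusted values differ by well under one percentage point, so any improvement from adding X2 and X3 is very slight.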

