Regression and Correlation Analysis
Use the dependent variable (labeled Y) and one of the independent variables (labeled X1, X2, and X3) in the data file. Select and use one independent variable throughout this analysis. Use Excel to perform the regression and correlation analysis to answer the following.
Generate a scatterplot for the specified dependent variable (Y) and the selected independent variable (X), including the graph of the "best fit" line. Interpret.
Determine the equation of the "best fit" line, which describes the relationship between the dependent variable and the selected independent variable.
Determine the coefficient of correlation. Interpret.
Determine the coefficient of determination. Interpret.
Test the utility of this regression model, represented by a hypothesis test of b=0 using α=0.10. Interpret results, including the p-value.
Based on the findings in steps 1-5, analyze the ability of the independent variable to predict the dependent variable.
Compute the confidence interval for b, using a 95% confidence level. Interpret this interval.
Compute the 99% confidence interval for the dependent variable, for a selected value of the independent variable. Each student can choose a value to use for the independent variable (use same value in the next step). Interpret this interval.
Using the same value chosen in step 8, estimate the 99% prediction interval for the dependent variable. Interpret this interval.
What can be said about the value of the dependent variable for values of the independent variable that are outside the range of the sample values? Explain.
Summarize your results from Steps 1–10 in a 3-page report. The report should explain and interpret the results in ways that are understandable to someone who does not know statistics.
Submission: The summary report and all of the work done in 1–10 (Excel output and interpretations) as an appendix
Format for report:
Summary Report
Steps 1-10 addressed with appropriate output, graphs and interpretations. Be sure to number each step 1-10.
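For steps 7 to 9, the standard textbook interval formulas for a simple linear regression may serve as a reference. The notation below is an assumption introduced here and does not come from the assignment: a and b are the estimated intercept and slope, x_0 is the chosen value of the independent variable, and t* is the critical t value with n - 2 degrees of freedom for the stated confidence level.

```latex
% Assumed notation: \hat{y}_0 = a + b x_0, \quad s_e = \sqrt{SSE/(n-2)},
% S_{xx} = \sum_i (x_i - \bar{x})^2, \quad t^* = t_{\alpha/2,\, n-2}.
\begin{align*}
\text{CI for the slope } b\text{:} \quad
  & b \pm t^* \, \frac{s_e}{\sqrt{S_{xx}}} \\
\text{CI for the mean of } Y \text{ at } x_0\text{:} \quad
  & \hat{y}_0 \pm t^* \, s_e \sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}} \\
\text{Prediction interval for } Y \text{ at } x_0\text{:} \quad
  & \hat{y}_0 \pm t^* \, s_e \sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}}
\end{align*}
```

The prediction interval is always wider than the confidence interval for the mean because of the extra 1 under the square root. Both intervals widen as x_0 moves away from the sample mean of X, which is relevant to the extrapolation question in step 10.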
Sales (Y) | Calls (X1) | Time (X2) | Years (X3) | Type |
46 | 172 | 14.7 | 3 | GROUP |
42 | 161 | 13.2 | 1 | GROUP |
42 | 140 | 17.5 | 2 | GROUP |
38 | 135 | 18.5 | 1 | GROUP |
33 | 152 | 15.0 | 3 | GROUP |
31 | 170 | 14.3 | 4 | GROUP |
44 | 192 | 16.7 | 1 | GROUP |
39 | 150 | 15.3 | 3 | GROUP |
41 | 164 | 17.8 | 3 | GROUP |
49 | 153 | 19.0 | 3 | GROUP |
42 | 154 | 14.3 | 2 | GROUP |
44 | 134 | 19.4 | 5 | GROUP |
49 | 131 | 14.6 | 1 | GROUP |
43 | 169 | 14.0 | 5 | GROUP |
44 | 168 | 12.4 | 2 | GROUP |
43 | 175 | 13.6 | 5 | GROUP |
33 | 150 | 14.9 | 2 | GROUP |
32 | 155 | 17.9 | 1 | GROUP |
48 | 162 | 14.5 | 4 | GROUP |
49 | 178 | 18.3 | 2 | GROUP |
35 | 149 | 15.6 | 1 | GROUP |
44 | 159 | 14.6 | 2 | GROUP |
67 | 166 | 18.9 | 1 | GROUP |
47 | 151 | 16.6 | 2 | GROUP |
41 | 152 | 14.5 | 4 | GROUP |
33 | 139 | 19.3 | 3 | GROUP |
45 | 156 | 13.2 | 3 | GROUP |
50 | 157 | 15.9 | 3 | GROUP |
42 | 154 | 15.3 | 1 | GROUP |
20 | 210 | 8.0 | 1 | NONE |
32 | 139 | 16.9 | 4 | NONE |
32 | 120 | 19.9 | 3 | NONE |
33 | 143 | 15.4 | 3 | NONE |
55 | 160 | 17.0 | 3 | NONE |
36 | 121 | 18.0 | 2 | NONE |
67 | 155 | 17.9 | 1 | NONE |
37 | 159 | 18.1 | 0 | NONE |
37 | 132 | 10.0 | 0 | NONE |
36 | 140 | 15.7 | 1 | NONE |
37 | 142 | 13.9 | 3 | NONE |
37 | 130 | 16.9 | 2 | NONE |
39 | 160 | 14.3 | 4 | NONE |
35 | 130 | 19.4 | 4 | NONE |
39 | 140 | 12.4 | 1 | NONE |
50 | 144 | 15.8 | 2 | NONE |
45 | 138 | 15.3 | 2 | NONE |
40 | 145 | 14.7 | 2 | NONE |
29 | 145 | 19.0 | 2 | NONE |
36 | 131 | 18.5 | 2 | NONE |
39 | 144 | 17.7 | 3 | NONE |
44 | 165 | 15.7 | 3 | ONLINE |
47 | 186 | 13.5 | 3 | ONLINE |
41 | 180 | 14.0 | 2 | ONLINE |
35 | 150 | 13.0 | 4 | ONLINE |
42 | 181 | 11.5 | 4 | ONLINE |
41 | 198 | 13.2 | 2 | ONLINE |
41 | 149 | 17.3 | 0 | ONLINE |
44 | 168 | 11.0 | 5 | ONLINE |
30 | 125 | 11.0 | 5 | ONLINE |
21 | 185 | 18.9 | 2 | ONLINE |
45 | 149 | 13.5 | 1 | ONLINE |
52 | 193 | 13.7 | 5 | ONLINE |
44 | 165 | 12.4 | 3 | ONLINE |
43 | 174 | 12.7 | 2 | ONLINE |
42 | 168 | 16.4 | 0 | ONLINE |
49 | 178 | 15.1 | 3 | ONLINE |
40 | 191 | 19.0 | 5 | ONLINE |
46 | 171 | 14.9 | 5 | ONLINE |
41 | 170 | 12.3 | 0 | ONLINE |
21 | 177 | 17.0 | 0 | ONLINE |
46 | 183 | 15.4 | 4 | ONLINE |
41 | 155 | 16.0 | 2 | ONLINE |
48 | 182 | 13.0 | 2 | ONLINE |
40 | 157 | 15.4 | 1 | ONLINE |
48 | 167 | 14.8 | 3 | ONLINE |
46 | 163 | 16.6 | 2 | ONLINE |
56 | 189 | 15.0 | 3 | ONLINE |
44 | 153 | 15.3 | 2 | ONLINE |
34 | 158 | 14.2 | 3 | ONLINE |
43 | 160 | 10.9 | 4 | ONLINE |
33 | 173 | 17.5 | 1 | ONLINE |
50 | 189 | 14.3 | 1 | ONLINE |
52 | 184 | 11.4 | 4 | ONLINE |
45 | 174 | 13.6 | 2 | ONLINE |
48 | 188 | 13.6 | 0 | ONLINE |
44 | 160 | 14.8 | 2 | ONLINE |
51 | 178 | 16.5 | 1 | ONLINE |
41 | 178 | 13.4 | 2 | ONLINE |
40 | 176 | 12.6 | 1 | ONLINE |
41 | 159 | 18.8 | 2 | ONLINE |
48 | 186 | 14.2 | 1 | ONLINE |
42 | 194 | 13.6 | 2 | ONLINE |
48 | 188 | 11.3 | 2 | ONLINE |
48 | 201 | 12.5 | 1 | ONLINE |
43 | 161 | 17.3 | 3 | ONLINE |
42 | 152 | 14.6 | 1 | ONLINE |
49 | 178 | 16.4 | 2 | ONLINE |
44 | 156 | 20.0 | 0 | ONLINE |
45 | 170 | 14.2 | 1 | ONLINE |
48 | 170 | 17.4 | 5 | ONLINE |
Let's perform the analysis in Excel by going to Data > Data Analysis and then selecting Regression. Please see the screenshots below of the setup in Excel, along with the scatterplot.
The regression equation, formed from the estimated regression coefficients, is
Y = 19.26 + 0.1121*X1 + 0.2673*X2 + 0.2387*X3
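The assignment asks for a line with a single independent variable. As a minimal sketch of how such a "best fit" line is obtained by least squares, the snippet below regresses Sales (Y) on Calls (X1) using only the first five rows of the data table. The function name `fit_line` is hypothetical, and because this is a five-row illustration, its coefficients will not match the Excel output above.

```python
# First five rows of the data table (illustration only, not the full data set).
calls = [172, 161, 140, 135, 152]  # Calls (X1)
sales = [46, 42, 42, 38, 33]       # Sales (Y)


def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squared deviations of x, and sum of cross-products of deviations.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = fit_line(calls, sales)
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(calls, sales)]
print(f"Y = {intercept:.2f} + {slope:.4f}*X1")
```

Running the same function on the full Calls and Sales columns would give the one-variable line the assignment describes.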
The coefficient of determination is
R Square 0.069835154
This means that the model explains only about 7% (6.98%) of the variation in Y through variation in the independent variables X1, X2, and X3.
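To make the "share of variation explained" reading concrete, here is a small sketch that computes R Square as 1 - SSE/SST for a one-variable line. It uses only the first five rows of the table (Calls and Sales) as an illustration, so its result differs from the Excel value above. The function name `r_squared` is hypothetical.

```python
# First five rows of the data table (illustration only, not the full data set).
calls = [172, 161, 140, 135, 152]  # Calls (X1)
sales = [46, 42, 42, 38, 33]       # Sales (Y)


def r_squared(x, y):
    """Coefficient of determination for the least-squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # SSE: variation left unexplained by the line; SST: total variation in y.
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - sse / sst


r2 = r_squared(calls, sales)
print(f"R Square = {r2:.4f}, i.e. {100 * r2:.1f}% of the variation in Y explained")
```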
The coefficient of correlation is
0.2642
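As a quick consistency check, the reported correlation should equal the square root of the reported R Square. The sketch below uses only the two numbers quoted in this answer.

```python
import math

r_square = 0.069835154        # R Square as reported in the answer
reported_r = 0.2642           # coefficient of correlation as reported

multiple_r = math.sqrt(r_square)
print(f"sqrt(R Square) = {multiple_r:.4f}")
print(f"difference from reported value = {abs(multiple_r - reported_r):.5f}")
```

A square root is never negative. For a regression with a single independent variable, the sign of the correlation coefficient follows the sign of the slope.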
Term | P-value |
Intercept | 0.058399 |
Calls (X1) | 0.009202 |
Time (X2) | 0.42301 |
Years (X3) | 0.663431 |
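At α = 0.10, an independent variable is considered significant for the model if its p-value is less than 0.10. Here only Calls (X1), with a p-value of 0.009202, meets that threshold; Time (X2) and Years (X3) do not.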
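The decision rule can be sketched in a few lines, using the p-values from the table above and the α = 0.10 level given in the assignment. The intercept is left out because it is not an independent variable. The variable names are illustrative.

```python
ALPHA = 0.10  # significance level from the assignment

# P-values for the independent variables, copied from the table above.
p_values = {
    "Calls (X1)": 0.009202,
    "Time (X2)": 0.42301,
    "Years (X3)": 0.663431,
}

# A variable is significant when its p-value is below ALPHA.
significant = [name for name, p in p_values.items() if p < ALPHA]

for name, p in p_values.items():
    verdict = "reject b=0" if p < ALPHA else "fail to reject b=0"
    print(f"{name}: p = {p} -> {verdict}")
```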