Question

Collect data on one response (dependent or y) variable and two different explanatory (independent or x) variables. This will require a survey with three questions. For example: to predict a student’s GPA (y), you might collect data on two x variables: SAT score and age. The goal would then be to determine whether there is a linear correlation between a student’s SAT score and GPA, and between age and GPA. (Note: students may not choose GPA as their dependent variable; they must pick a different topic.)

• This data must be quantitative, not qualitative.

• Collect data from at least 15 people. Each person must answer all three questions for their data to count.

• Prepare a brief report that states the questions used and explains why they are worth studying.

• Present data in table form and as a scatter plot. You can create your tables and graphs in Excel, but they will need to be copied and pasted into your Word document. Do NOT submit an Excel file, as it will not be graded.

• Model the data with two linear regressions (one for each x and y pair).

• Interpret each linear model.

• Use each of your models to make a prediction.

Solutions

Expert Solution

Consider the following dataset of overall Restaurant Score, Food Quality score and Service score, as rated by 20 different customers. The scores have been mapped to a scale of 0 to 100 for analysis. Restaurant Score is the response (y) variable; Food Quality and Service are the two explanatory (x) variables.

Restaurant Score    Food Quality    Service
94.5                90.9            97.8
93.2                84.1            96.8
92.8                99.9            88.7
91.1                95              96.9
90.6                87.8            91.3
90.1                82.2            98.7
90.4                86.3            91.9
89.7                92.5            89.1
89.5                85.7            90.9
89.2                83.1            90.6
89.3                81.9            88.6
89                  93.3            89.5
88.8                78.4            91.3
87.2                91.9            73.4
87.4                75              89.8
86.8                78.2            91.7
86.2                77.5            91.1
86.1                76.7            91.5
85.9                72.2            89.5
85.1                77.5            92

These three scores are worth studying because they let us recommend restaurants to users more objectively, in terms of overall satisfaction, food quality and service standards. This helps customers make an informed choice when dining out or ordering delivery.

The scatter plot of overall Restaurant Score and Food Quality looks as follows:

And the scatter plot of overall Restaurant Score and Service looks as follows:

Now, modeling the Restaurant Score using the Food Quality score in Excel (go to Data tab -> Data Analysis -> Regression and choose Restaurant Score as the Y-column and Food Quality as the X-column), we get the following results:

SUMMARY OUTPUT

Regression Statistics
Multiple R           0.67523679
R Square             0.45594473
Adjusted R Square    0.42571943
Standard Error       1.94117804
Observations         20

ANOVA
              df    SS             MS            F             Significance F
Regression    1     56.84240103    56.842401     15.0848737    0.001087691
Residual      18    67.82709897    3.76817216
Total         19    124.6695

                Coefficients    Standard Error    t Stat        P-value       Lower 95%      Upper 95%
Intercept       69.8613634      4.98392443        14.01734      3.9829E-11    59.39052672    80.33220009
Food Quality    0.22819521      0.058753764       3.88392503    0.00108769    0.104758137    0.351632292

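Hence, the model obtained is: Restaurant Score = 69.86 + 0.228 * Food Quality (i)

Interpretation: each additional point of Food Quality is associated with an increase of about 0.228 points in the predicted Restaurant Score. The R Square of 0.456 means that Food Quality explains about 46% of the variation in Restaurant Score. The intercept (69.86) is the predicted score at a Food Quality of 0, which lies far outside the observed range (72.2 to 99.9), so it should not be interpreted on its own.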
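As a cross-check on the Excel output, the least-squares coefficients of model (i) can be recomputed directly from the data table. The following is a minimal sketch in plain Python rather than the Excel procedure used in the solution; the function name `fit_line` is an illustrative assumption.

```python
# Scores copied from the data table (20 customers).
restaurant = [94.5, 93.2, 92.8, 91.1, 90.6, 90.1, 90.4, 89.7, 89.5, 89.2,
              89.3, 89.0, 88.8, 87.2, 87.4, 86.8, 86.2, 86.1, 85.9, 85.1]
food = [90.9, 84.1, 99.9, 95.0, 87.8, 82.2, 86.3, 92.5, 85.7, 83.1,
        81.9, 93.3, 78.4, 91.9, 75.0, 78.2, 77.5, 76.7, 72.2, 77.5]


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x.

    Returns (intercept, slope, r_squared). Hypothetical helper for illustration.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sums of squares and cross-products around the means.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r_squared = sxy ** 2 / (sxx * syy)
    return intercept, slope, r_squared


intercept, slope, r_squared = fit_line(food, restaurant)
print(f"intercept = {intercept:.2f}, slope = {slope:.3f}, R^2 = {r_squared:.3f}")
```

The printed values should agree with the Intercept, Food Quality and R Square entries of the summary output up to rounding.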

The p-value of the Food Quality coefficient is about 0.0011, well below 0.05 and also below 0.01, so Food Quality is a significant predictor of Restaurant Score even at the 1% significance level.
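The same conclusion can be reached without a p-value by comparing the t statistic of the slope with critical values from a t-table. The sketch below assumes the standard two-sided critical values for 18 degrees of freedom (2.101 at 5% and 2.878 at 1%); these constants and the variable names are assumptions added for illustration, not part of the Excel output.

```python
import math

# Scores copied from the data table (20 customers).
restaurant = [94.5, 93.2, 92.8, 91.1, 90.6, 90.1, 90.4, 89.7, 89.5, 89.2,
              89.3, 89.0, 88.8, 87.2, 87.4, 86.8, 86.2, 86.1, 85.9, 85.1]
food = [90.9, 84.1, 99.9, 95.0, 87.8, 82.2, 86.3, 92.5, 85.7, 83.1,
        81.9, 93.3, 78.4, 91.9, 75.0, 78.2, 77.5, 76.7, 72.2, 77.5]

n = len(food)
x_bar = sum(food) / n
y_bar = sum(restaurant) / n
sxx = sum((x - x_bar) ** 2 for x in food)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(food, restaurant))
slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Residual sum of squares, then the standard error of the slope
# using n - 2 residual degrees of freedom.
sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(food, restaurant))
se_slope = math.sqrt(sse / (n - 2) / sxx)
t_stat = slope / se_slope

# Two-sided critical values for 18 df, taken from a standard t-table (assumed).
T_CRIT_5PCT = 2.101
T_CRIT_1PCT = 2.878
significant_at_5pct = abs(t_stat) > T_CRIT_5PCT
significant_at_1pct = abs(t_stat) > T_CRIT_1PCT

print(f"t = {t_stat:.3f}, significant at 5%: {significant_at_5pct}, "
      f"at 1%: {significant_at_1pct}")
```

A t statistic above both critical values corresponds to a p-value below 0.01, which is consistent with the reported value of about 0.0011.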

Similarly, modeling Restaurant Score using Service score, we get the following results:

SUMMARY OUTPUT

Regression Statistics
Multiple R           0.4147832
R Square             0.1720451
Adjusted R Square    0.12604761
Standard Error       2.3946784
Observations         20

ANOVA
              df    SS             MS            F             Significance F
Regression    1     21.44877664    21.4487766    3.74031461    0.068995054
Residual      18    103.2207234    5.73448463
Total         19    124.6695

             Coefficients    Standard Error    t Stat       P-value       Lower 95%       Upper 95%
Intercept    70.4200815      9.696813355       7.262188     9.4491E-07    50.04783265     90.79233045
Service      0.20564404      0.106331532       1.9339893    0.06899505    -0.017750214    0.429038303

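Now, the model obtained is: Restaurant Score = 70.42 + 0.206 * Service (ii)

Interpretation: each additional point of Service is associated with an increase of about 0.206 points in the predicted Restaurant Score, but the fit is weak: the R Square of 0.172 means that Service explains only about 17% of the variation in Restaurant Score.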

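The p-value of the Service coefficient is about 0.069, which is greater than 0.05; hence Service is not a significant predictor of Restaurant Score at the 5% significance level. Consistent with this, the 95% confidence interval for the slope (-0.018 to 0.429) includes zero.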
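The 95% confidence interval for the Service slope can be rebuilt from the data to see why the result is not significant. This is a minimal sketch in plain Python; it assumes the two-sided 5% critical value of 2.101 for 18 degrees of freedom from a standard t-table, and the variable names are illustrative.

```python
import math

# Scores copied from the data table (20 customers).
restaurant = [94.5, 93.2, 92.8, 91.1, 90.6, 90.1, 90.4, 89.7, 89.5, 89.2,
              89.3, 89.0, 88.8, 87.2, 87.4, 86.8, 86.2, 86.1, 85.9, 85.1]
service = [97.8, 96.8, 88.7, 96.9, 91.3, 98.7, 91.9, 89.1, 90.9, 90.6,
           88.6, 89.5, 91.3, 73.4, 89.8, 91.7, 91.1, 91.5, 89.5, 92.0]

n = len(service)
x_bar = sum(service) / n
y_bar = sum(restaurant) / n
sxx = sum((x - x_bar) ** 2 for x in service)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(service, restaurant))
slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Standard error of the slope from the residual sum of squares (n - 2 df).
sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(service, restaurant))
se_slope = math.sqrt(sse / (n - 2) / sxx)

# Two-sided 5% critical value for 18 df from a standard t-table (assumed).
T_CRIT_5PCT = 2.101
ci_low = slope - T_CRIT_5PCT * se_slope
ci_high = slope + T_CRIT_5PCT * se_slope
contains_zero = ci_low < 0 < ci_high

print(f"slope = {slope:.3f}, 95% CI = ({ci_low:.3f}, {ci_high:.3f}), "
      f"contains zero: {contains_zero}")
```

Because the interval contains zero, a slope of zero (no linear relationship) cannot be ruled out at the 5% level, which matches the p-value above 0.05.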

Making some predictions:

Using model (i), for a Food Quality score of 85, we get

Restaurant Score = 69.86 + 0.228 * 85 = 89.24

Using model (ii), for a Service score of 90, we get

Restaurant Score = 70.42 + 0.206 * 90 = 88.96
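The two predictions can be reproduced with a small helper that also flags inputs outside the range of the observed data, where a linear model is less trustworthy. This is a sketch using the rounded coefficients from the two models; the `MODELS` dictionary and the `predict` function are hypothetical names introduced for illustration, and the range limits are the minimum and maximum values in the data table.

```python
# Rounded coefficients from the two fitted models, plus the observed
# minimum and maximum of each predictor in the data table.
MODELS = {
    "Food Quality": {"intercept": 69.86, "slope": 0.228, "x_min": 72.2, "x_max": 99.9},
    "Service": {"intercept": 70.42, "slope": 0.206, "x_min": 73.4, "x_max": 98.7},
}


def predict(predictor, x):
    """Return (predicted Restaurant Score, whether x is inside the observed range)."""
    m = MODELS[predictor]
    in_range = m["x_min"] <= x <= m["x_max"]
    return m["intercept"] + m["slope"] * x, in_range


for name, x in [("Food Quality", 85), ("Service", 90)]:
    score, in_range = predict(name, x)
    note = "" if in_range else " (extrapolation: outside observed range)"
    print(f"{name} = {x}: predicted Restaurant Score = {score:.2f}{note}")
```

Both example inputs fall inside the observed ranges. Since Service was not a significant predictor at the 5% level, the second prediction should be treated with more caution than the first.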

