Question

OrderNo. DeliveryTime NumberOfPizzas Distance Location
1 16.68 7 5.6 Downtown
2 11.5 3 2.2 Not Downtown
3 12.03 3 3.4 Downtown
4 14.88 8 0.8 Not Downtown
5 13.75 6 1.5 Not Downtown
6 18.11 7 3.3 Downtown
7 8 2 1.1 Downtown
8 17.83 7 2.1 Downtown
9 79.24 30 14.6 Not Downtown
10 21.5 5 6.05 Not Downtown
11 40.33 16 6.88 Downtown
12 21 10 2.15 Not Downtown
13 13.5 4 2.55 Not Downtown
14 19.75 6 4.62 Not Downtown
15 24 9 4.48 Not Downtown
16 29 10 7.76 Not Downtown
17 15.35 6 2 Not Downtown
18 19 7 1.32 Downtown
19 9.5 3 0.36 Not Downtown
20 35.1 17 7.7 Downtown
21 17.9 10 1.4 Downtown
22 52.32 26 8.1 Not Downtown
23 18.75 9 4.5 Downtown
24 19.83 8 6.35 Not Downtown
25 10.75 4 1.5 Downtown
26 26 9 7.3 Not Downtown
27 14.21 5 2.4 Not Downtown
28 21 8 1.4 Downtown
29 10 4 0.9 Not Downtown
30 36 18 8 Downtown
31 18.1 9 1.5 Downtown
  1. What is the correlation between the downtown variable (1 = downtown, 0 = not downtown) and each of the other three variables? What does each relationship suggest?

  2. Model A. Run a linear regression that predicts delivery time using number of pizzas, distance, and downtown as independent variables. Summarize and show results in a table. Explain each relationship.

  3. What is R^2? What does it mean?

  4. What are the business implications of the regression results?

  5. Model B. Run a linear regression that predicts delivery time using number of pizzas and distance. Summarize and show results in a table. Explain any differences from the previous model.

Solutions

Expert Solution

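> library(readxl)  # provides read_excel()
> Book1 <- read_excel("Book1.xlsx")
> d=Book1
> # cor() and lm() need a numeric dummy (1 = downtown, 0 = not downtown); recode if Location was read as text
> if (is.character(d$Location)) d$Location <- as.numeric(d$Location == "Downtown")
> r1=cor(d$Location,d$DeliveryTime);r2=cor(d$NumberOfPizzas,d$Location);r3=cor(d$Distance,d$Location)
> r1;r2;r3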
[1] -0.09029982
[1] -0.006621049
[1] -0.1282853
> ## The correlations of the location dummy (downtown = 1, not downtown = 0) with delivery time (r1 = -0.090), number of pizzas (r2 = -0.007), and distance (r3 = -0.128) are all negative but very weak.
> ## Downtown orders therefore tend to have slightly shorter delivery times and slightly shorter distances, and essentially the same number of pizzas, compared with non-downtown orders.
> ## With only 31 orders, none of these correlations is large enough to indicate a meaningful linear relationship, and a correlation by itself does not show that a downtown location causes any of these differences.
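For readers without Book1.xlsx, the correlations can be recomputed from the table in the question. The following is a minimal sketch, not the original script: the data frame name `pizza`, the vector `downtown_orders`, and the column `Downtown` are illustrative names, and the 0/1 coding is assumed to follow the question (1 = downtown, 0 = not downtown).

```r
# Data typed in from the table in the question (31 orders, in order-number order).
pizza <- data.frame(
  DeliveryTime = c(16.68, 11.5, 12.03, 14.88, 13.75, 18.11, 8, 17.83, 79.24,
                   21.5, 40.33, 21, 13.5, 19.75, 24, 29, 15.35, 19, 9.5, 35.1,
                   17.9, 52.32, 18.75, 19.83, 10.75, 26, 14.21, 21, 10, 36,
                   18.1),
  NumberOfPizzas = c(7, 3, 3, 8, 6, 7, 2, 7, 30, 5, 16, 10, 4, 6, 9, 10, 6, 7,
                     3, 17, 10, 26, 9, 8, 4, 9, 5, 8, 4, 18, 9),
  Distance = c(5.6, 2.2, 3.4, 0.8, 1.5, 3.3, 1.1, 2.1, 14.6, 6.05, 6.88, 2.15,
               2.55, 4.62, 4.48, 7.76, 2, 1.32, 0.36, 7.7, 1.4, 8.1, 4.5, 6.35,
               1.5, 7.3, 2.4, 1.4, 0.9, 8, 1.5)
)

# Order numbers listed as "Downtown" in the table (assumed coding: 1 = downtown).
downtown_orders <- c(1, 3, 6, 7, 8, 11, 18, 20, 21, 23, 25, 28, 30, 31)
pizza$Downtown <- as.integer(seq_len(nrow(pizza)) %in% downtown_orders)

# Correlation of the dummy with each of the other three variables.
cors <- sapply(pizza[c("DeliveryTime", "NumberOfPizzas", "Distance")],
               cor, y = pizza$Downtown)
print(round(cors, 4))
```

With this coding the printed values should agree with r1, r2, and r3 in the transcript. The same data frame could also be passed to `lm()` through its `data` argument instead of using `d$` inside the formula.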
>
> #
> m=lm(d$DeliveryTime~d$NumberOfPizzas+d$Distance+d$Location)
> m

Call:
lm(formula = d$DeliveryTime ~ d$NumberOfPizzas + d$Distance +
d$Location)

Coefficients:
     (Intercept)  d$NumberOfPizzas        d$Distance        d$Location
           2.922             1.619             1.344            -1.346

> # The fitted regression is DeliveryTime = 2.922 + 1.619*NumberOfPizzas + 1.344*Distance - 1.346*Location. Holding the other variables constant, each additional pizza increases the predicted delivery time by 1.619 units, each additional unit of distance increases it by 1.344 units, and a downtown order (Location = 1) is predicted to take 1.346 units less than a non-downtown order.
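As a worked illustration of the fitted equation, the sketch below plugs in order 1 from the question's table (7 pizzas, distance 5.6, downtown, observed delivery time 16.68) using the rounded coefficients printed above. The names `coef_a` and `predict_time` are hypothetical, introduced only for this example.

```r
# Rounded Model A coefficients as printed in the transcript.
coef_a <- c(intercept = 2.922, pizzas = 1.619, distance = 1.344, downtown = -1.346)

# Hypothetical helper: predicted delivery time for a single order.
predict_time <- function(n_pizzas, distance, downtown) {
  unname(coef_a["intercept"] +
    coef_a["pizzas"] * n_pizzas +
    coef_a["distance"] * distance +
    coef_a["downtown"] * downtown)
}

fitted_1 <- predict_time(n_pizzas = 7, distance = 5.6, downtown = 1)
residual_1 <- 16.68 - fitted_1  # observed minus fitted

print(round(fitted_1, 2))    # 20.44
print(round(residual_1, 2))  # -3.76
```

Because the coefficients are rounded to three decimals, the result can differ slightly from what `fitted(m)[1]` would return on the full-precision model.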
> summary(m)

Call:
lm(formula = d$DeliveryTime ~ d$NumberOfPizzas + d$Distance +
d$Location)

Residuals:
    Min      1Q  Median      3Q     Max
-5.4641 -1.5229  0.0274  1.1334  8.1359

Coefficients:
                 Estimate Std. Error t value Pr(>|t|)
(Intercept)        2.9223     1.1123   2.627 0.014016 *
d$NumberOfPizzas   1.6189     0.1470  11.016 1.71e-11 ***
d$Distance         1.3435     0.2978   4.511 0.000113 ***
d$Location        -1.3461     1.1381  -1.183 0.247219
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 3.088 on 27 degrees of freedom
Multiple R-squared: 0.9586,   Adjusted R-squared: 0.954
F-statistic: 208.5 on 3 and 27 DF, p-value: < 2.2e-16

> # From the summary, number of pizzas and distance are significant predictors of delivery time (p < 0.001 for both), while location is not (p = 0.247).
> summary(m)$r.squared ## R^2 = 0.9586 means that 95.86% of the total variation in delivery time is explained by the fitted regression
[1] 0.9586213
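To make the meaning of R^2 more concrete: it equals 1 - SSE/SST, the share of the variation in delivery time around its mean that the model accounts for. The sketch below uses only numbers printed by `summary(m)` (31 orders, 3 predictors, 27 residual degrees of freedom) to show how the adjusted R^2 and the overall F-statistic follow from R^2. The variable names are illustrative.

```r
r2 <- 0.9586213  # Multiple R-squared reported for Model A
n  <- 31         # number of orders
p  <- 3          # predictors: NumberOfPizzas, Distance, Location

# Adjusted R^2 penalizes R^2 for the number of predictors.
adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Overall F-statistic: explained variation per predictor divided by
# unexplained variation per residual degree of freedom.
f_stat <- (r2 / p) / ((1 - r2) / (n - p - 1))

print(round(adj_r2, 3))  # 0.954
print(round(f_stat, 1))  # 208.5
```

Both values match the `Adjusted R-squared` and `F-statistic` lines of the summary output.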
>
> #
> m1=lm(d$DeliveryTime~d$NumberOfPizzas+d$Distance)
> m1

Call:
lm(formula = d$DeliveryTime ~ d$NumberOfPizzas + d$Distance)

Coefficients:
     (Intercept)  d$NumberOfPizzas        d$Distance
           2.275             1.591             1.415

> summary(m1)

Call:
lm(formula = d$DeliveryTime ~ d$NumberOfPizzas + d$Distance)

Residuals:
    Min      1Q  Median      3Q     Max
-6.2376 -1.0912  0.0869  1.3644  8.5679

Coefficients:
                 Estimate Std. Error t value Pr(>|t|)
(Intercept)        2.2746     0.9750   2.333   0.0271 *
d$NumberOfPizzas   1.5912     0.1461  10.890 1.42e-11 ***
d$Distance         1.4151     0.2937   4.818 4.56e-05 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 3.11 on 28 degrees of freedom
Multiple R-squared: 0.9565,   Adjusted R-squared: 0.9534
F-statistic: 307.7 on 2 and 28 DF, p-value: < 2.2e-16

> summary(m1)$r.squared # R^2 = 0.9565 means that 95.65% of the total variation in delivery time is explained by the fitted regression
[1] 0.9564774
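> # Differences between Model A (m) and Model B (m1): dropping Location changes the remaining estimates only slightly (intercept 2.922 -> 2.275, NumberOfPizzas 1.619 -> 1.591, Distance 1.344 -> 1.415).
> # R^2 is slightly higher for Model A (0.9586 vs 0.9565) because R^2 cannot decrease when a predictor is added, but the adjusted R^2 values are almost identical (0.954 vs 0.9534).
> # Since Location is not significant in Model A (p = 0.247), the simpler Model B explains delivery time about equally well.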
>
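Whether Location adds anything beyond Model B can be checked with a partial F-test for nested models. With the fitted objects available, `anova(m1, m)` performs this comparison in R. The sketch below instead reconstructs the statistic from the two reported R^2 values, so it is only as precise as the printed output. The variable names are illustrative.

```r
r2_a <- 0.9586213  # Model A: pizzas, distance, location
r2_b <- 0.9564774  # Model B: pizzas, distance
df_resid_a <- 27   # residual degrees of freedom of Model A
q <- 1             # number of predictors dropped going from A to B

# Partial F: gain in R^2 per dropped predictor, relative to the
# unexplained share per residual degree of freedom in the larger model.
f_partial <- ((r2_a - r2_b) / q) / ((1 - r2_a) / df_resid_a)
p_value <- pf(f_partial, q, df_resid_a, lower.tail = FALSE)

print(round(f_partial, 2))
print(round(p_value, 3))
```

With a single dropped predictor, the partial F equals the square of that predictor's t value, here (-1.183)^2, or about 1.40, so the p-value should be close to the 0.247 reported for `d$Location` in `summary(m)`.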

