Question

QUESTION 3 and QUESTION 4: Context

Suppose you are part of the analytics team for the online retailer Macha Bucks, which sells two types of tea to its online visitors: Rouge Roma (RR) and Emerald Earl (EE). Every day approximately 10,000 people visit the site over a 24-hour period. For simplicity, suppose we consider the “buy one or don’t buy” (BODB) market segment of customers who, when they visit the site, take exactly one of the following actions: (a) buy one order of RR, (b) buy one order of EE, or (c) don’t buy (DB) anything. You have been tasked with determining customer behavior on the website for the BODB segment using a random sample of 35 visits.

In the dataset for the random sample, each row corresponds to a random visitor. For each visitor we provide both the visitor’s action as well as the profit earned on the transaction. In the action column:

if the visitor buys one order of RR, we see a RR,
if the visitor buys one order of EE, we see an EE,
if the visitor doesn’t buy anything, we see a DB.

Note that even if two customers buy the same product, the profit can differ due to the shipping costs, promotions, or coupons that are applied.

Random Sample of Data

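Indicator columns (Bought RR?, Bought EE?, Didn't Buy?): 1 = yes, 0 = no

| Transaction ID | Action | Profit ($) | Bought RR? | Bought EE? | Didn't Buy? | Profit RR ($) | Profit EE ($) |
|---|---|---|---|---|---|---|---|
| 1 | RR | 8.43 | 1 | 0 | 0 | . | 0.00 |
| 2 | DB | 0.00 | 0 | 0 | 1 | 0.00 | 0.00 |
| 3 | EE | 1.75 | 0 | 1 | 0 | 0.00 | 1.75 |
| 4 | DB | 0.00 | 0 | 0 | 1 | 0.00 | 0.00 |
| 5 | EE | 4.37 | 0 | 1 | 0 | 0.00 | 4.37 |
| 6 | EE | 5.79 | 0 | 1 | 0 | 0.00 | 5.79 |
| 7 | RR | 6.27 | 1 | 0 | 0 | 6.27 | 0.00 |
| 8 | RR | 6.22 | 1 | 0 | 0 | 6.22 | 0.00 |
| 9 | DB | 0.00 | 0 | 0 | 1 | 0.00 | 0.00 |
| 10 | EE | 4.49 | 0 | 1 | 0 | 0.00 | 4.49 |
| 11 | RR | 10.54 | 1 | 0 | 0 | 10.54 | 0.00 |
| 12 | EE | 3.79 | 0 | 1 | 0 | 0.00 | 3.79 |
| 13 | DB | 0.00 | 0 | 0 | 1 | 0.00 | 0.00 |
| 14 | DB | 0.00 | 0 | 0 | 1 | 0.00 | 0.00 |
| 15 | RR | 9.03 | 1 | 0 | 0 | 9.03 | 0.00 |
| 16 | EE | 3.54 | 0 | 1 | 0 | 0.00 | 3.54 |
| 17 | DB | 0.00 | 0 | 0 | 1 | 0.00 | 0.00 |
| 18 | DB | 0.00 | 0 | 0 | 1 | 0.00 | 0.00 |
| 19 | EE | 5.02 | 0 | 1 | 0 | 0.00 | 5.02 |
| 20 | DB | 0.00 | 0 | 0 | 1 | 0.00 | 0.00 |
| 21 | EE | 3.60 | 0 | 1 | 0 | 0.00 | 3.60 |
| 22 | DB | 0.00 | 0 | 0 | 1 | 0.00 | 0.00 |
| 23 | EE | 2.61 | 0 | 1 | 0 | 0.00 | 2.61 |
| 24 | RR | 11.75 | 1 | 0 | 0 | 11.75 | 0.00 |
| 25 | RR | 12.22 | 1 | 0 | 0 | 12.22 | 0.00 |
| 26 | DB | 0.00 | 0 | 0 | 1 | 0.00 | 0.00 |
| 27 | DB | 0.00 | 0 | 0 | 1 | 0.00 | 0.00 |
| 28 | EE | 6.17 | 0 | 1 | 0 | 0.00 | 6.17 |
| 29 | RR | 8.83 | 1 | 0 | 0 | 8.83 | 0.00 |
| 30 | DB | 0.00 | 0 | 0 | 1 | 0.00 | 0.00 |
| 31 | DB | 0.00 | 0 | 0 | 1 | 0.00 | 0.00 |
| 32 | DB | 0.00 | 0 | 0 | 1 | 0.00 | 0.00 |
| 33 | DB | 0.00 | 0 | 0 | 1 | 0.00 | 0.00 |
| 34 | RR | 14.16 | 1 | 0 | 0 | 14.16 | 0.00 |
| 35 | EE | 6.06 | 0 | 1 | 0 | 0.00 | 6.06 |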


QUESTION 3

PARTS


Using the sample data, obtain a point estimate for the proportion of customers in this BODB market segment that
a) purchase EE:   

b) purchase RR:
c) don’t buy:       


a) What is the name of the model/distribution that would be appropriate to use for the probability distribution of the sample proportion of the BODB market segment that purchases EE?


b) Please provide as much information as you can about the relevant parameters for the distribution (e.g., mean and standard deviation).


Please provide a 95% confidence interval for the population proportion of the BODB market segment that
a) purchase EE:

b) purchase RR:
c) don’t buy:       


a) What does the 95% confidence interval mean intuitively? Please provide an interpretation.


b) What could you do to obtain a narrower 95% confidence interval?


c) What would you need to do to have a margin of error of 0.05? Please do the calculation.


a) Please provide a 99% confidence interval for the population proportion of the BODB market segment that purchases EE.

b) When would you prefer a 99% confidence interval rather than a 95% confidence interval?


What is the 95% confidence interval for the average profit from a

a) EE customer (i.e., a customer in the BODB market segment that buys EE):

b) RR customer (i.e., a customer in the BODB market segment that buys RR):

c) Clearly state any assumptions you make about the sampling distribution.



QUESTION 4

PARTS


a) What could be an appropriate probability distribution to use for modeling the number of visitors that the website has in an hour?



b) What parameters would you use for the probability distribution?


c) Using that distribution, determine the probability that more than 600 people visit the site in an hour.
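One possible approach to parts a) to c), offered as a sketch rather than as part of the posted solution: if visits arrive independently at a constant rate (an assumption; real traffic usually varies by time of day), the hourly count can be modeled as Poisson with rate 10,000 / 24, about 416.7 visitors per hour. The helper `poisson_pmf` below is a hypothetical name introduced for illustration.

```python
from math import exp, lgamma, log

# Assumed model: hourly visits ~ Poisson(lam), constant rate over the day.
lam = 10_000 / 24  # about 416.67 visitors per hour


def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam), computed in log space to avoid overflow."""
    return exp(-lam + k * log(lam) - lgamma(k + 1))


# P(X > 600): sum the upper tail directly. Terms far beyond 600 underflow
# to 0.0, so stopping at 1500 loses nothing at double precision.
p_more_than_600 = sum(poisson_pmf(k, lam) for k in range(601, 1501))

print(f"rate per hour = {lam:.2f}")
print(f"P(X > 600)    = {p_more_than_600:.3e}")
```

Under this model 600 is roughly nine standard deviations above the mean (the standard deviation is sqrt(416.67), about 20.4), so the probability is effectively zero.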




a) What could be an appropriate probability distribution to use for modeling the number of seconds between customer visits?



b) What parameters would you use for the probability distribution?


c) Using that distribution, determine the probability that the time between customer visits to the website is less than 10 seconds.
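A sketch for parts a) to c), again assuming a constant arrival rate: if arrivals follow a Poisson process, the time between consecutive visits is exponentially distributed. With 10,000 visits over 86,400 seconds the rate is about 0.1157 visits per second, so the mean gap is 8.64 seconds.

```python
from math import exp

# Assumed model: gaps between visits ~ Exponential(rate), constant rate all day.
seconds_per_day = 24 * 60 * 60
rate_per_second = 10_000 / seconds_per_day  # visits per second
mean_gap_seconds = 1 / rate_per_second      # 8.64 seconds

# Exponential CDF: P(T < t) = 1 - exp(-rate * t)
p_gap_under_10s = 1 - exp(-rate_per_second * 10)

print(f"rate = {rate_per_second:.4f} per second, mean gap = {mean_gap_seconds:.2f} s")
print(f"P(T < 10 s) = {p_gap_under_10s:.4f}")
```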



a) What could be an appropriate probability distribution to use for modeling the number of website visitors from 100 visitors that do not buy anything?



b) What parameters would you use for the probability distribution?


c) Using that distribution, determine the probability that from among 100 customers, it turns out that 30 or more customers do not buy anything.


d) What is the average number of visitors (from among 100 customers) that do not buy anything?

e) What is the standard deviation of the number of visitors (from among 100 customers) that do not buy anything?
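A sketch for parts a) to e), under the assumption that the 100 visitors act independently and that each has the same probability of not buying, estimated from the sample as 15/35: the number who do not buy is then Binomial(100, p). The helper `binom_pmf` is a hypothetical name introduced for illustration.

```python
from math import comb, sqrt

n = 100
p = 15 / 35  # sample estimate of P(don't buy); treated here as the true value


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# c) P(X >= 30): sum the exact probabilities from 30 up to n.
p_at_least_30 = sum(binom_pmf(k, n, p) for k in range(30, n + 1))

# d) and e) mean and standard deviation of a Binomial count
mean_db = n * p
sd_db = sqrt(n * p * (1 - p))

print(f"P(X >= 30) = {p_at_least_30:.4f}")
print(f"mean = {mean_db:.2f}, sd = {sd_db:.2f}")
```

Because the mean (about 42.9) sits well above 30, the probability in part c) is close to 1.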


What is the average profit from among 100 random customers that visit the site?
Please explain your answer or show your calculations.
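One way to approach this, sketched under the assumption that the 35 sampled visits are representative: estimate the mean profit per visitor (non-buyers included at $0.00) from the Profit ($) column, then scale by 100 if the question is read as asking for the expected total. Both readings are computed below.

```python
# Profit ($) for transactions 1..35, transcribed from the sample table.
profits = [
    8.43, 0.00, 1.75, 0.00, 4.37, 5.79, 6.27, 6.22, 0.00, 4.49,
    10.54, 3.79, 0.00, 0.00, 9.03, 3.54, 0.00, 0.00, 5.02, 0.00,
    3.60, 0.00, 2.61, 11.75, 12.22, 0.00, 0.00, 6.17, 8.83, 0.00,
    0.00, 0.00, 0.00, 14.16, 6.06,
]

# Mean profit per visitor, counting visitors who did not buy as $0.00.
mean_profit_per_visitor = sum(profits) / len(profits)

# Expected total profit from 100 random visitors (linearity of expectation).
expected_total_100 = 100 * mean_profit_per_visitor

print(f"mean profit per visitor  = ${mean_profit_per_visitor:.2f}")
print(f"expected total, 100 visits = ${expected_total_100:.2f}")
```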

Solutions

Expert Solution

From the sample data:

Number of customers who chose RR = 9

Number of customers who chose EE = 11

Number of customers who did not buy (DB) = 15

Proportion of customers who chose RR = 9/35 = 0.2571

Proportion of customers who chose EE = 11/35 = 0.3143

Proportion of customers who did not buy (DB) = 15/35 = 0.4286
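The counts and point estimates can be reproduced with a short script. This is a minimal sketch; the `actions` list is transcribed by hand from the Action column of the sample table.

```python
from collections import Counter

# Actions for transactions 1..35, transcribed from the sample table.
actions = (
    "RR DB EE DB EE EE RR RR DB EE RR EE DB DB RR EE DB DB EE DB "
    "EE DB EE RR RR DB DB EE RR DB DB DB DB RR EE"
).split()

counts = Counter(actions)
n = len(actions)

# Point estimate of each proportion: count / sample size.
proportions = {action: counts[action] / n for action in ("EE", "RR", "DB")}

for action, p_hat in proportions.items():
    print(f"{action}: {counts[action]}/{n} = {p_hat:.4f}")
```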

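a. The number of sampled customers who buy EE follows a Binomial distribution, since each visitor either buys EE or does not.

The sample proportion is that count divided by n, so for a large enough sample its distribution is approximately Normal. Here np = 11 and n(1 - p) = 24 are both at least 10, so the Normal approximation is reasonable, and it is the basis for the confidence intervals for a proportion.

b. The parameters are n = sample size and p = population proportion of customers who buy EE, estimated by 11/35 = 0.3143.

Mean number of customers choosing EE = np = 35 x 0.3143 = 11

Standard deviation of that count = sqrt(npq) = sqrt(35 x 0.3143 x 0.6857) = 2.7464

For the sample proportion itself: mean = p (estimated as 0.3143) and standard error S.E.(p) = sqrt(pq/n) = sqrt(0.3143 x 0.6857 / 35) = 0.0785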

a. 95% confidence limits for the proportion of customers who chose EE are given by

[p - 1.96 x S.E.(p), p + 1.96 x S.E.(p)], where S.E.(p) = sqrt(p(1 - p)/n)

[0.3143 - 0.1538, 0.3143 + 0.1538]

[0.1605, 0.4681]

With 95% confidence, the population proportion of customers who choose EE lies between 0.1605 and 0.4681.

b. Similarly, 95% confidence limits for the proportion of customers who chose RR are

[0.2571 - 0.1448, 0.2571 + 0.1448]

[0.1123, 0.4019]

With 95% confidence, the population proportion of customers who choose RR lies between 0.1123 and 0.4019.

c. 95% confidence limits for the proportion of customers who did not buy (DB) are

[0.4286 - 0.1640, 0.4286 + 0.1640]

[0.2646, 0.5925]

With 95% confidence, the population proportion of DB lies between 0.2646 and 0.5925.
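The three intervals can be checked with a small helper. This is a sketch of the Normal-approximation (Wald) interval only; `wald_interval` is a hypothetical name, and the critical value comes from Python's standard-library `statistics.NormalDist` rather than the rounded 1.96, so the last digit can differ slightly from hand calculations.

```python
from math import sqrt
from statistics import NormalDist


def wald_interval(successes, n, confidence=0.95):
    """Normal-approximation (Wald) interval for a population proportion."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


n = 35
sample_counts = {"EE": 11, "RR": 9, "DB": 15}
intervals = {name: wald_interval(k, n) for name, k in sample_counts.items()}

for name, (low, high) in intervals.items():
    print(f"{name}: [{low:.4f}, {high:.4f}]")
```

Passing `confidence=0.99` to the same function gives the wider 99% interval.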

a. A 95% confidence interval is a range of plausible values for the unknown population proportion. The "95%" describes the procedure: if we drew many random samples of 35 visits and built an interval from each in the same way, about 95% of those intervals would contain the true proportion. For example, we are 95% confident that between 16.05% and 46.81% of BODB visitors buy EE.

b. Collect a larger sample: the margin of error shrinks in proportion to 1/sqrt(n), so quadrupling the sample size roughly halves the width of the interval. Lowering the confidence level also narrows the interval (a 95% interval is narrower than a 99% interval), but the result is then no longer a 95% interval.

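c. Margin of error at the 95% confidence level with the current sample (n = 35):

   margin of error for EE = 1.96 x S.E.(EE) = 0.1538

   margin of error for RR = 1.96 x S.E.(RR) = 0.1448

   margin of error for DB = 1.96 x S.E.(DB) = 0.1640

   All three are well above 0.05, so a larger sample is needed. Solving 1.96 x sqrt(p(1 - p)/n) = 0.05 for n gives n = 1.96^2 x p(1 - p) / 0.05^2.

   Using the EE estimate p = 0.3143: n = 3.8416 x 0.2155 / 0.0025 = 331.2, so at least 332 visits.

   Using the most conservative value p = 0.5: n = 3.8416 x 0.25 / 0.0025 = 384.2, so at least 385 visits.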
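For a target margin of error E = 0.05, the required sample size follows from solving z x sqrt(p(1 - p)/n) = E for n. The sketch below assumes the Normal approximation and a planning value for p; `required_sample_size` is a hypothetical helper name.

```python
from math import ceil
from statistics import NormalDist


def required_sample_size(p, margin, confidence=0.95):
    """Smallest n with z * sqrt(p * (1 - p) / n) <= margin."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return ceil(z**2 * p * (1 - p) / margin**2)


# Planning value taken from the sample estimate for EE (11/35).
n_using_ee_estimate = required_sample_size(11 / 35, 0.05)

# Worst case: p = 0.5 maximizes p * (1 - p), so this n works for any proportion.
n_conservative = required_sample_size(0.5, 0.05)

print(n_using_ee_estimate, n_conservative)
```

The conservative value is the safer choice when the same sample has to support all three proportions.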

a. 99% confidence interval for the proportion of EE

[0.3143 - 2.58 x S.E.(EE), 0.3143 + 2.58 x S.E.(EE)]

[0.1118, 0.5167]

b. A 99% confidence interval is wider than the 95% confidence interval, so it is less precise but more likely to contain the true proportion. Prefer it when a wrong conclusion would be costly and the extra width is acceptable.

c. 95% confidence interval for average profit from EE customers
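The posted solution stops here. As a sketch of how the intervals for average profit could be computed: with only 11 EE buyers and 9 RR buyers and an unknown population standard deviation, a t interval (mean plus or minus t x s/sqrt(n), with n - 1 degrees of freedom) is the usual choice. This assumes the buyers are a random sample and that profit per order is roughly Normal within each product. The critical values are standard t-table entries; `t_interval` is a hypothetical helper name.

```python
from math import sqrt
from statistics import mean, stdev

# Profits for buyers only, transcribed from the Profit ($) column of the table.
ee_profit = [1.75, 4.37, 5.79, 4.49, 3.79, 3.54, 5.02, 3.60, 2.61, 6.17, 6.06]
rr_profit = [8.43, 6.27, 6.22, 10.54, 9.03, 11.75, 12.22, 8.83, 14.16]

# Two-sided 95% critical values of Student's t, keyed by degrees of freedom.
T_CRIT_95 = {10: 2.228, 8: 2.306}


def t_interval(sample):
    """95% t interval for the mean of a small sample."""
    n = len(sample)
    center = mean(sample)
    standard_error = stdev(sample) / sqrt(n)  # stdev uses the n - 1 divisor
    margin = T_CRIT_95[n - 1] * standard_error
    return center - margin, center + margin


ee_low, ee_high = t_interval(ee_profit)
rr_low, rr_high = t_interval(rr_profit)

print(f"EE: mean {mean(ee_profit):.2f}, 95% CI [{ee_low:.2f}, {ee_high:.2f}]")
print(f"RR: mean {mean(rr_profit):.2f}, 95% CI [{rr_low:.2f}, {rr_high:.2f}]")
```

With this sample the intervals come out to roughly $3.34 to $5.24 for EE and $7.65 to $11.78 for RR.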

