The attached code in Chili's Example HW.R models the Chili's restaurant example as seen in the course materials. A version of the example is repeated here for convenience.

Assume we have two competing models for the effect a proposed Chili’s exterior remodel will have on its stores. One manager is very excited about the effect of exterior renovations and argues that 70% of all Chili’s stores will see at least a 5% increase in sales. To define some notation: the probability of seeing at least a 5% increase in sales at any one store is $\theta$, and for this optimistic manager $\theta = 70\%$. Another manager argues that the old façade was not really dated and expects only 20% of remodeled stores to see at least a 5% increase in sales, i.e. $\theta = 20\%$. As a compromise, the managers agree to roll the renovation out to 20 randomly selected stores and then decide whether to remodel all the stores or wait to remodel the others. They elect you to decide who is right.

Run the code line by line and digest what happens with each line. Confirm that the final plot echoes the math done for the example as presented in the course materials. Then modify the code to accommodate the following changes to the business assumptions.

Assume a prior that gives equal weight to all managers' models. Also assume that you are interested in comparing more than just the two candidate values of $\theta$: modify the code to compare 8 potential values, $\theta \in \{0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8\}$. Finally, assume that you collect data from five stores. Sales at the first, second, and third stores increased by over 5%; sales at the fourth and fifth stores did not. For coding purposes, assume that a coin flip of heads represents a success (i.e. store sales increased by 5% or more) and tails represents a failure (i.e. store sales did not increase).
From your graph of the posterior distribution, which value of $\theta$ seems most likely (i.e. has the highest $P(\theta \mid D)$)? Enter your answer with one decimal place.
#install.packages(c("gridExtra"))
library("ggplot2")
library("gridExtra")

# theta is a parameter representing the probability that a
# remodeled store sees at least a 5% increase in sales
# Two bosses: Boss1 theta = 20% and Boss2 theta = 70%
nThetaVals = 2  ## we have two competing models of the world (Boss1 and Boss2)

# Vector of theta values that represent the models of the world that we have:
thetaVals = c(0.2, 0.7)

# CALCULATE PRIOR
# pTheta is the vector of prior probabilities on the theta values.
pTheta = c(0.5, 0.5)  # Makes a uniform belief distribution.

# SPECIFY OBSERVED DATA
# Specify the data (1 = heads/success, 0 = tails/failure).
# The two commented-out alternatives are 3 heads and 9 tails,
# and 1 head and 11 tails.
#Data = c(1,1,1,0,0,0,0,0,0,0,0,0)
#Data = c(1,0,0,0,0,0,0,0,0,0,0,0)
Data = c(1)  ## one store is a success
nSuccess = sum(Data == 1)  ## count of successful stores
nFail = sum(Data == 0)     ## count of failure stores

# CALCULATE LIKELIHOOD
# Compute the likelihood of the data for each value of theta:
pDataGivenTheta = thetaVals^nSuccess * (1 - thetaVals)^nFail

# CALCULATE POSTERIOR PROBABILITIES
# Compute the posterior:
pData = sum(pDataGivenTheta * pTheta)
pThetaGivenData = pDataGivenTheta * pTheta / pData  # This is Bayes' rule!

# Make data frame for plotting
plotDF = data.frame(thetaVals, prior = pTheta,
                    likelihood = pDataGivenTheta,
                    posterior = pThetaGivenData)

# Code below plots the data frame and labels each bar with its probability
baseLayer = ggplot(data = plotDF, aes(x = thetaVals)) +
  coord_cartesian(xlim = c(0, 1))
priorLayer = geom_bar(aes(y = prior), stat = "identity",
                      width = 0.01, fill = "blue")
likelihoodLayer = geom_bar(aes(y = likelihood), stat = "identity",
                           width = 0.01, fill = "red4")
posteriorLayer = geom_bar(aes(y = posterior), stat = "identity",
                          width = 0.01, fill = "purple4")
p1 = baseLayer + priorLayer +
  geom_text(aes(y = prior + 0.1, label = signif(prior, 3)))
p2 = baseLayer + likelihoodLayer +
  geom_text(aes(y = 1.2 * likelihood, label = signif(likelihood, 2)))
p3 = baseLayer + posteriorLayer +
  geom_text(aes(y = posterior + 0.1, label = signif(posterior, 3)))
grid.arrange(p1, p2, p3)  ## see the first line to install the gridExtra package
Please see the R code below, modified for the new business assumptions (eight candidate theta values, a uniform prior over them, and data from five stores).

library("ggplot2")
library("gridExtra")

# theta is a parameter representing the probability that a
# remodeled store sees at least a 5% increase in sales
nThetaVals = 8  ## eight competing models of the world

# Vector of theta values that represent the models of the world that we have:
thetaVals = c(0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8)

# CALCULATE PRIOR
# pTheta is the vector of prior probabilities on the theta values.
pTheta = rep(1 / nThetaVals, nThetaVals)  # Equal weight on every model.

# SPECIFY OBSERVED DATA
# 1 = heads/success, 0 = tails/failure.
# Stores 1-3 saw at least a 5% increase; stores 4-5 did not.
Data = c(1, 1, 1, 0, 0)
nSuccess = sum(Data == 1)  ## count of successful stores
nFail = sum(Data == 0)     ## count of failure stores

# CALCULATE LIKELIHOOD
# Compute the likelihood of the data for each value of theta:
pDataGivenTheta = thetaVals^nSuccess * (1 - thetaVals)^nFail

# CALCULATE POSTERIOR PROBABILITIES
# Compute the posterior:
pData = sum(pDataGivenTheta * pTheta)
pThetaGivenData = pDataGivenTheta * pTheta / pData  # This is Bayes' rule!

# Make data frame for plotting
plotDF = data.frame(thetaVals, prior = pTheta,
                    likelihood = pDataGivenTheta,
                    posterior = pThetaGivenData)

# Code below plots the data frame and labels each bar with its probability
baseLayer = ggplot(data = plotDF, aes(x = thetaVals)) +
  coord_cartesian(xlim = c(0, 1))
priorLayer = geom_bar(aes(y = prior), stat = "identity",
                      width = 0.01, fill = "blue")
likelihoodLayer = geom_bar(aes(y = likelihood), stat = "identity",
                           width = 0.01, fill = "red4")
posteriorLayer = geom_bar(aes(y = posterior), stat = "identity",
                          width = 0.01, fill = "purple4")
p1 = baseLayer + priorLayer +
  geom_text(aes(y = prior + 0.1, label = signif(prior, 3)))
p2 = baseLayer + likelihoodLayer +
  geom_text(aes(y = 1.2 * likelihood, label = signif(likelihood, 2)))
p3 = baseLayer + posteriorLayer +
  geom_text(aes(y = posterior + 0.1, label = signif(posterior, 3)))
grid.arrange(p1, p2, p3)  ## requires the gridExtra package
The results are
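As a numerical check that does not depend on reading the plot, the posterior under the modified assumptions can be computed with base R alone. This is a minimal sketch, assuming the uniform prior, the eight candidate values, and the five observations (three successes, then two failures) stated in the question. The names posteriorTable and mostLikelyTheta are introduced here for illustration and are not part of the original script.

```r
# Candidate models and a uniform prior over them (as stated in the question)
thetaVals = c(0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8)
pTheta = rep(1 / length(thetaVals), length(thetaVals))

# Five stores: the first three succeed (1), the last two fail (0)
Data = c(1, 1, 1, 0, 0)
nSuccess = sum(Data == 1)
nFail = sum(Data == 0)

# Likelihood of the data under each theta (independent Bernoulli trials)
pDataGivenTheta = thetaVals^nSuccess * (1 - thetaVals)^nFail

# Bayes' rule: posterior = likelihood * prior / evidence
pData = sum(pDataGivenTheta * pTheta)
pThetaGivenData = pDataGivenTheta * pTheta / pData

# Hypothetical helper names, used only for this check
posteriorTable = data.frame(theta = thetaVals,
                            posterior = round(pThetaGivenData, 3))
print(posteriorTable)

# Theta value with the highest posterior probability
mostLikelyTheta = thetaVals[which.max(pThetaGivenData)]
print(mostLikelyTheta)
```

With a uniform prior the posterior is proportional to the likelihood, $\theta^3 (1 - \theta)^2$, which on this grid is largest at $\theta = 0.6$ (posterior probability of about 0.217). The answer to enter is therefore 0.6.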