Task 1: Roulette wheel simulation
A roulette wheel has 38 slots, of which 18 are red, 18 are black, and 2 are green. If a ball spun onto the wheel stops on the color a player bets on, the player wins. Consider a player betting on red. Winning streaks follow a Geometric(p = 20/38) distribution, in which we count the red spins in a row until the first black or green. Use the derivation of the Geometric distribution from the Bernoulli distribution to simulate the game. Namely, generate Bernoulli(p = 20/38) random variates (0 = red; 1 = black or green) until a black or green occurs.
Code set-up
A while loop allows us to count the number of spins until a loss. If we use the indicator variable lose to note a loss (1) or a win (0), the logic is “while we have not lost (i.e., lose==0), keep spinning.” Once you lose, the while loop ends and the variable streak holds the number of spins, including the final losing spin. Try running it a few times.
streak = 0
lose = 0
p = 20/38
while(lose==0){
lose = (runif(1) < p) # generate Bernoulli with probability p
streak = streak + 1 # tally streak
}
streak
## [1] 2
The problem
The code chunk above performs the experiment once: spin the roulette wheel until you lose and record the number of spins. Simulate 1000 experiments. As usual, do this by wrapping the code chunk above within a for-loop and storing the number of spins streak in a vector.
# [Place code here]
Report the following:
hist(winstreak, br=seq(min(winstreak)-0.5, max(winstreak+0.5)), main="")
[Answer here]
Let X be the random variable for the win streak length, counted as in the code: the number of spins up to and including the first loss. Then X has a Geometric distribution whose probability of success is the probability of a black or green spin, p = 20/38.
The expected value of X is E[X] = 1/p = 38/20 = 1.90.
The standard deviation of X is SD(X) = sqrt((1 - p)/p^2) = sqrt(1 - p)/p ≈ 1.3077.
These are the theoretical average and standard deviation of the win streak length.
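As a quick numerical check, the two theoretical values can be computed directly in R. This is a minimal sketch that assumes X counts the spins up to and including the first loss, so that the mean is 1/p and the standard deviation is sqrt((1 - p)/p^2); the names theo_mean and theo_sd are illustrative and do not appear in the original code.

```r
p <- 20/38                       # probability of losing a single spin (black or green)
theo_mean <- 1 / p               # theoretical mean of X
theo_sd <- sqrt((1 - p) / p^2)   # theoretical standard deviation of X
round(c(mean = theo_mean, sd = theo_sd), 4)
```

The mean works out to 1.9 and the standard deviation to about 1.3077.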
R code already given for task 1
-----
#set the random seed
set.seed(123)
streak = 0
lose = 0
p = 20/38
while(lose==0){
lose = (runif(1) < p) # generate Bernoulli with probability p
streak = streak + 1 # tally streak
}
streak
----
Running this gives a streak of 1: the very first spin is the losing one.
This means we won 0 times betting on red before we lost.
Note: This will not match the streak length of 2 ("## [1] 2") shown in the question, because each run of the simulation gives a different result unless the same random seed is set.
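To illustrate the point about the seed, here is a small sketch that wraps one experiment in a function and runs it twice from the same seed. The function name spin_until_loss is hypothetical (it is not part of the original code), and the seed value 123 is just the one used above.

```r
# Hypothetical helper (not in the original): one experiment as a function.
spin_until_loss <- function(p = 20/38) {
  streak <- 0
  lose <- FALSE
  while (!lose) {
    lose <- runif(1) < p     # TRUE with probability p (black or green)
    streak <- streak + 1     # count this spin, including the losing one
  }
  streak
}

set.seed(123)
first_run <- spin_until_loss()
set.seed(123)
second_run <- spin_until_loss()
identical(first_run, second_run)  # same seed, same sequence of uniforms
```

Resetting the seed before each call replays the same uniform draws, so the two runs agree; without the second set.seed() call the results would generally differ.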
The problem
R code with comments
--
#set the random seed
set.seed(123)
#set the number of experiments
n<-1000
#initialize the vector of streak lengths
winstreak<- numeric(n)
for (i in 1:n){
  streak = 0
  lose = 0
  p = 20/38
  while(lose==0){
    lose = (runif(1) < p) # generate Bernoulli with probability p
    streak = streak + 1 # tally streak
  }
  winstreak[i]<-streak
}
#Histogram of the win streak length.
hist(winstreak, breaks=seq(min(winstreak)-0.5, max(winstreak)+0.5),
     main="Histogram of the win streak length", xlab="win streak length")
#Average length of the win streak.
sprintf('Average length of the win streak is %.4f',mean(winstreak))
#Standard deviation of the winning streak lengths.
sprintf('Standard deviation of the win streak lengths is %.4f',sd(winstreak))
#Compare the empirical average and standard deviation with
#Geometric(p = 20/38) distribution.
sprintf('Theoretical average length of the win streak is %.4f',1/p)
#Theoretical standard deviation of the winning streak lengths.
sprintf('Theoretical Standard deviation of the win streak lengths is %.4f',sqrt((1-p)/p^2))
#Longest winning streak.
sprintf('Longest winning streak is %g',max(winstreak))
--
Running this code produces the histogram and the printed summaries.
ans:
We can see that the theoretical average length of the win streak is 1.90, and the empirical value of 1.9010 is close to it.
The theoretical standard deviation of the win streak lengths is 1.3077, and the empirical value of 1.2903 is close to it.
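As an independent cross-check on the loop-based simulation, the same quantities can be estimated with R's built-in geometric sampler. This is only a sketch: the variable name spins and the seed are illustrative choices, and because rgeom() consumes random numbers differently from the while loop, the estimates will not reproduce 1.9010 and 1.2903 exactly.

```r
set.seed(123)   # arbitrary seed, used only for reproducibility
p <- 20/38      # probability of losing a single spin (black or green)
n <- 1000       # number of experiments

# rgeom() counts failures before the first success (support 0, 1, 2, ...).
# Treating a loss as the "success", it returns the number of winning spins,
# so adding 1 gives the spin count recorded by the loop-based simulation.
spins <- rgeom(n, prob = p) + 1

c(mean = mean(spins), sd = sd(spins), longest = max(spins))
```

The sample mean and standard deviation should again land near 1.90 and 1.3077, which supports the reading that the loop counts spins up to and including the first loss.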