Question

3. A rival music streaming company wishes to make inference for the proportion of individuals in the United States who subscribe to Spotify. They plan to take a survey. Let S1, ..., Sn be the yet-to-be observed survey responses from n individuals, where the event Si = 1 corresponds to the ith individual subscribing to Spotify and the event Si = 0 corresponds to the ith individual not subscribing to Spotify (i = 1, ..., n). Assume that S1, ..., Sn are i.i.d. Ber(π).

(a) What distribution does the random variable [S=\sum_{i=1}^{n}S_i] have? Compute E(S) and var(S). The formulas should involve π and n.

(b) Suppose that n = 30 and π = 0.2. Run a Monte Carlo simulation with m = 10000 replications to verify the formulas for E(S) and var(S) from the previous question. That is, simulate 10000 i.i.d. copies of S and compare the observed average of these to the true mean, and the observed (sample) variance to the true variance. Comment.

(c) Let [\bar{S}=n^{-1}S=n^{-1}\sum_{i=1}^{n}S_i] . What are the mean and variance of [\bar{S}] ?

(d) Verify your answers to the previous question by a Monte Carlo simulation with m = 10000 replications.

(e) Is [\bar{S}] a continuous random variable? Explain.

(f) Run a Monte Carlo simulation to estimate the probability [P\left ( \bar{S}-1/\sqrt{n}\leq \pi \leq \bar{S}+1/\sqrt{n} \right )] when π = 0.2 and n = 10, 20, 80, 160. Hint: For every n considered, do the following m = 10000 times: generate a random variable [\tilde{S}] with the same distribution as [\bar{S}] and record whether [|\tilde{S}-0.2|\leq 1/\sqrt{n}] . The Monte Carlo estimate of the desired probability is the number of times this happened divided by the total number of simulations, m = 10000.

Solutions

Expert Solution

The random variables are [S_i\sim Bernoulli\left ( \pi \right ),i=1,2,...,n] , and they are independent.

a)The random variable

[S=\sum_{i=1}^{n}S_i] has pmf

[{\color{Blue} P\left ( S=k \right )=\binom{n}{k}\pi ^k\left ( 1-\pi \right )^{n-k},k=0,1,2,...,n}]

That is, the sum of n i.i.d. Bernoulli(π) random variables has a Binomial(n, π) distribution. The mean and variance of a Binomial random variable are

[{\color{Blue} E\left ( S \right )=n\pi ,Var\left ( S \right )=n\pi \left ( 1-\pi \right )}] .
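The two formulas can also be checked numerically, without simulation, by summing over the Binomial pmf. The following is a minimal sketch in R; the values n = 30 and p = 0.2 are taken from part (b) purely as an example, and the variable names are illustrative.

```r
# Exact mean and variance of S ~ Binomial(n, p), computed from the pmf.
n <- 30    # example value, borrowed from part (b)
p <- 0.2   # example value, borrowed from part (b)

k <- 0:n
pmf <- dbinom(k, size = n, prob = p)   # P(S = k) for k = 0, ..., n

mean_exact <- sum(k * pmf)                    # E(S) = sum of k * P(S = k)
var_exact  <- sum((k - mean_exact)^2 * pmf)   # Var(S) = E[(S - E(S))^2]

c(mean_exact, n * p)            # both entries should agree (6)
c(var_exact, n * p * (1 - p))   # both entries should agree (4.8)
```

Each pair agrees up to floating-point rounding, which matches the formulas above for this choice of n and p.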

b) The R code below simulates m = 10000 replications. Each replication draws n = 30 independent Bernoulli RVs [S_i\sim Bernoulli\left ( 0.2 \right ),i=1,2,...,30] and sums them to obtain one copy of S.

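m <- 10000   # number of Monte Carlo replications
n <- 30      # number of Bernoulli trials per replication
p <- 0.2     # success probability (pi in the text)
sims <- array(dim = c(m, n))   # row i holds the n Bernoulli draws of replication i
S <- array(dim = c(m))         # S[i] is the sum for replication i

for (i in 1:m) {
  sims[i, ] <- rbinom(n, 1, prob = p)   # n draws from Bernoulli(p)
  S[i] <- sum(sims[i, ])
}
meanS <- mean(S)   # Monte Carlo estimate of E(S)
varS <- var(S)     # sample variance, estimates Var(S)
meanS
varS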

The output is

> meanS
[1] 6.0379
> varS
[1] 4.782342

So the true mean is [E\left ( S \right )=30\times 0.2={\color{Blue} 6}] and the observed average is [{\color{Blue} 6.0379}] .

So the true variance is [Var\left ( S \right )=30\times 0.2\times 0.8={\color{Blue} 4.8}] and the observed variance is [{\color{Blue} 4.782342}] .

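The simulated mean and variance differ from the theoretical values by about 0.04 and 0.02 respectively, which is the size of discrepancy expected from Monte Carlo error with m = 10000 replications. The simulation therefore agrees with the formulas from part a).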
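The same check can be written without a loop, because one draw from Binomial(n, p) has the same distribution as a sum of n Bernoulli(p) draws. The sketch below is an alternative to the code above, not the original author's version. The seed is an arbitrary choice added for reproducibility, and `se_mean` is an illustrative name for the Monte Carlo standard error of the simulated mean.

```r
set.seed(1)   # arbitrary seed, an assumption added for reproducibility
m <- 10000
n <- 30
p <- 0.2

# Each Binomial(n, p) draw is distributed like a sum of n Bernoulli(p) draws.
S <- rbinom(m, size = n, prob = p)

mean_S <- mean(S)
var_S <- var(S)
se_mean <- sqrt(n * p * (1 - p) / m)   # Monte Carlo standard error of mean(S), about 0.022

c(estimate = mean_S, truth = n * p, std_error = se_mean)
c(estimate = var_S, truth = n * p * (1 - p))
```

A simulated mean within a few standard errors of 6 is what the formula predicts.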

c) The RV [\bar{S}=\frac{1}{n}\sum_{i=1}^{n}S_i=\frac{S}{n}] . The mean and variance of [\bar{S}] are

[E\left (\bar{S} \right )=\frac{E\left ( S \right )}{n}=\frac{n\pi }{n}={\color{Blue} \pi }]

[Var\left (\bar{S} \right )=\frac{Var\left ( S \right )}{n^2}=\frac{n\pi\left ( 1-\pi \right ) }{n^2}={\color{Blue}\frac{\pi\left ( 1-\pi \right )}{n} }]

d) The R code below simulates m = 10000 replications of n = 30 Bernoulli RVs, computes the sample mean [\bar{S}] for each replication, and then finds the mean and variance of the simulated values of [\bar{S}] .

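m <- 10000   # number of Monte Carlo replications
n <- 30      # number of Bernoulli trials per replication
p <- 0.2     # success probability (pi in the text)
sims <- array(dim = c(m, n))   # row i holds the n Bernoulli draws of replication i
Sm <- array(dim = c(m))        # Sm[i] is the sample mean for replication i

for (i in 1:m) {
  sims[i, ] <- rbinom(n, 1, prob = p)   # n draws from Bernoulli(p)
  Sm[i] <- sum(sims[i, ]) / n           # sample mean of replication i
}
meanSm <- mean(Sm)   # Monte Carlo estimate of the mean of the sample mean
varSm <- var(Sm)     # Monte Carlo estimate of the variance of the sample mean
meanSm
varSm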

The output is:

> meanSm
[1] 0.1995667
> varSm
[1] 0.005378795

So the true mean is [E\left ( \bar{S} \right )={\color{Blue} 0.2}] and the observed average is [{\color{Blue} 0.1996}] .

So the true variance is [Var\left ( \bar{S} \right )= \frac{0.2\times 0.8}{30}={\color{Blue} 0.00533}] and the observed variance is [{\color{Blue}0.00538}] .
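To see the 1/n scaling of the variance from part c) more directly, the simulation can be repeated for several sample sizes. This is a sketch under assumptions: the sample sizes in `n_values` and the seed are illustrative choices and are not part of the question.

```r
set.seed(2)   # arbitrary seed (assumption)
m <- 10000
p <- 0.2
n_values <- c(10, 30, 100)   # illustrative sample sizes (assumption)

results <- t(sapply(n_values, function(n) {
  S_bar <- rbinom(m, size = n, prob = p) / n   # m copies of the sample mean
  c(n = n,
    mean_sim = mean(S_bar), mean_true = p,
    var_sim = var(S_bar), var_true = p * (1 - p) / n)
}))
results
```

In each row the simulated mean should stay near 0.2, while the simulated variance should shrink roughly in proportion to 1/n.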

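e) No. The RV [\bar{S}=\frac{1}{n}\sum_{i=1}^{n}S_i=\frac{S}{n}] is discrete: S takes only the integer values 0, 1, ..., n, so [\bar{S}] takes only the n + 1 values [\frac{i}{n},i=0,1,...,n] . A continuous random variable would assign probability zero to every single point, which is not the case here.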
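The discreteness can also be seen in simulated data. The sketch below lists the distinct simulated values of the sample mean and checks that each is a multiple of 1/n. It assumes n = 30 and p = 0.2 as in parts (b) and (d), and the seed is arbitrary.

```r
set.seed(3)   # arbitrary seed (assumption)
m <- 10000
n <- 30
p <- 0.2

S_bar <- rbinom(m, size = n, prob = p) / n   # m copies of the sample mean

observed <- sort(unique(S_bar))   # distinct simulated values
length(observed)                  # at most n + 1 = 31 distinct values
# Every simulated value is a multiple of 1/n (up to rounding error).
all(abs(observed * n - round(observed * n)) < 1e-9)
```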

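f) Following the hint, for each n in 10, 20, 80, 160 we simulate m = 10000 copies of [\bar{S}=S/n] with [S\sim Binomial\left ( n,0.2 \right )] and record the fraction of copies satisfying [|\bar{S}-0.2|\leq 1/\sqrt{n}] . This fraction is the Monte Carlo estimate of the probability. As a sanity check, [Var\left ( \bar{S} \right )=\pi \left ( 1-\pi \right )/n] , so Chebyshev's inequality gives [P\left ( |\bar{S}-\pi |\leq 1/\sqrt{n} \right )\geq 1-\pi \left ( 1-\pi \right )=0.84] for every n, and the estimates should not fall noticeably below this bound.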
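The hint in part (f) of the question can be followed as in the sketch below. This is one possible implementation, not the original author's code. The seed is arbitrary, and the `exact` column is an addition for comparison, obtained by summing the Binomial pmf over the values of S that satisfy the inequality.

```r
set.seed(4)   # arbitrary seed (assumption)
m <- 10000
p <- 0.2
n_values <- c(10, 20, 80, 160)

coverage <- sapply(n_values, function(n) {
  S_bar <- rbinom(m, size = n, prob = p) / n   # m copies of the sample mean
  mc <- mean(abs(S_bar - p) <= 1 / sqrt(n))    # fraction of copies within 1/sqrt(n) of p
  k <- 0:n
  # Exact probability from the Binomial(n, p) pmf, for comparison.
  exact <- sum(dbinom(k, n, p)[abs(k / n - p) <= 1 / sqrt(n)])
  c(n = n, monte_carlo = mc, exact = exact)
})
t(coverage)
```

Each row of the printed table gives n, the Monte Carlo estimate, and the exact probability, so the simulation error at m = 10000 can be read off directly.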
