Question

The first problem of this assignment considers a situation where the random variable in question is a sample mean. This exercise addresses the situation where the random variable in question is a proportion.

Suppose you have been hired by the Better Business Bureau (BBB) to investigate the settlement ratio of the complaints they have received. You plan to select a sample of n complaints to estimate the proportion of complaints the BBB is able to settle. We use p to denote the percentage or proportion of complaints settled among all the complaints that the BBB has received. Let Y be the random variable that indicates whether a complaint is settled. Without loss of generality, let Y be 1 if a complaint is settled, which happens with probability p, and 0 if it is not settled.

Unlike in problem 1, where we don’t know the probability distribution of the population random variable X (the amount a retail customer pays for H&R Block’s service), here we do know the probability distribution of Y. What probability distribution does Y follow? Compute its mean and standard deviation.

Now suppose you select a random sample of n complaints and find that a proportion p̄ of them have been settled (not surprisingly, p̄ is called the sample proportion). Assume the sample size n is sufficiently large. What do we know about the probability distribution of p̄ (the sampling distribution of the sample proportion)?

Let’s apply the results above and derive some confidence intervals. Note that the population proportion p is unknown. In order to compute the standard error σ_p̄, we substitute p̄ for p. As long as the sample size n is sufficiently large, a normal distribution approximates the sampling distribution of the sample proportion p̄ well enough. Suppose the sample proportion you’ve found is 0.6. Find a 95% confidence interval for the population proportion if the sample size is 36, 100, and 400, respectively. What effect does the sample size n have on the resulting confidence interval? Please copy your R code and the result and paste them here.

It is often the case that we have a target for the margin of error in mind and want to know the sample size needed to guarantee that margin of error at a given confidence level. Suppose that the margin of error is m, the confidence level is 1-α, and norm.s.inv is the inverse standard normal distribution function. Derive a formula for computing the sample size needed. You can use the R function qnorm.

In the above formula, you will probably need the population proportion p. As we know, p is unknown. You may consider using the sample proportion p̄ instead. But typically when we are deciding the sample size, we haven’t started the sampling process, so the sample proportion p̄ is also unavailable. Thus most people use p=0.5 instead. Provide the revised formula for the sample size needed given p=0.5 and explain why it is reasonable to do so.

Use the above formula to compute the sample sizes needed when the respective value of m is 1%, 3%, and 5% and the respective confidence level is 90%, 95%, and 99%. You may fill out the table below and round your answers up to an integer.

Confidence level | m = 1% | m = 3% | m = 5%
90%              |        |        |
95%              |        |        |
99%              |        |        |

Historically, the Better Business Bureau settled 75% of complaints they received. Suppose you have been hired by the Better Business Bureau to investigate the complaints they received involving new car dealers because the bureau thinks that the settlement ratio of complaints involving new car dealers is significantly different from 75%. You plan to conduct a hypothesis test. You select a sample of 450 new car dealer complaints and find that 70% of them have been settled.

What would be the null and the alternative hypotheses you test? Compute the test statistic for your test above.

Suppose the significance level α is 5%. Compute both critical values of the z test statistic, and explain how we can use these two critical values to draw a conclusion to the hypothesis test. (This is called the critical value approach for hypothesis testing.) Please copy your R code and the result and paste them here. What conclusion should you draw for your test? Provide a practical interpretation of this conclusion.

Suppose the significance level α is 5%. Compute the p value for your test, and explain how we can use the p value to draw a conclusion to the hypothesis test. What conclusion should you draw for your test? (This is called the p value approach for hypothesis testing.) Please copy your R code and the result and paste them here.

The hypothesis test above is a two-tailed test. Now, let’s consider a one-tailed test. There are two types of one-tailed test: the upper-tail test and the lower-tail test. To determine whether a test is upper- or lower-tail, simply look at the sign of the alternative hypothesis. If it is of the “less than” type, then this is a lower-tail test; if it is “greater than”, then this is an upper-tail test.

Let’s reuse our BBB example. Historically, the Better Business Bureau settled 75% of complaints they received. Suppose you have been hired by the Better Business Bureau to investigate the complaints they received involving new car dealers because the bureau thinks that the settlement ratio of complaints involving new car dealers is significantly lower than 75%. You plan to conduct a hypothesis test. You select a sample of 450 new car dealer complaints and find that 70% of them have been settled. What would be the null and the alternative hypotheses you test? Your test statistic remains the same. Suppose the significance level α is 1%. Compute the critical value of the z test statistic. Compute the p value for your test. What conclusion should you draw for your test? Please copy your R code and the result and paste them here.

Solutions

Expert Solution

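Here Y is a Bernoulli random variable with PMF

$$P(Y = y) = p^{y} (1 - p)^{1 - y}, \quad y \in \{0, 1\}.$$

The mean of a Bernoulli random variable is $E(Y) = p$, and its standard deviation is $\sqrt{p(1 - p)}$.

For a sufficiently large sample, the sample proportion $\bar{p}$ is approximately normally distributed, with mean and standard deviation

$$E(\bar{p}) = p, \qquad \sigma_{\bar{p}} = \sqrt{\frac{p(1 - p)}{n}}.$$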
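A quick simulation can make the sampling distribution of the sample proportion concrete. The sketch below is illustrative only: the values of `p_true`, `n`, and `reps` are assumptions chosen for the demonstration, not part of the problem.

```r
# Simulation sketch: sampling distribution of the sample proportion.
# p_true, n, and reps are illustrative assumptions, not values from the problem.
set.seed(1)
p_true <- 0.6
n <- 100
reps <- 10000

# Each replicate counts the settled complaints among n Bernoulli(p_true)
# draws; dividing by n gives that replicate's sample proportion.
p_bar <- rbinom(reps, size = n, prob = p_true) / n

sim_mean <- mean(p_bar)                       # should be close to p_true
sim_sd <- sd(p_bar)                           # should be close to theory_sd
theory_sd <- sqrt(p_true * (1 - p_true) / n)  # sqrt(p(1-p)/n)

c(sim_mean = sim_mean, sim_sd = sim_sd, theory_sd = theory_sd)
```

A histogram of `p_bar` (for example, `hist(p_bar)`) should look roughly bell-shaped and centered near `p_true`.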

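The CI for the population proportion is $\bar{p} \pm z_{\alpha/2} \sqrt{\bar{p}(1 - \bar{p})/n}$, where $z_{\alpha/2}$ is the upper $\alpha/2$ quantile of the standard normal distribution.

When $\bar{p} = 0.6$ and $\alpha = 0.05$, the interval is computed for $n = 36$, $100$, and $400$.

The R code (shown for $n = 36$, with p holding the sample proportion) is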

n <- 36
p <- 0.6
alpha <- 0.05
p + qnorm(alpha/2)*c(1,-1)*sqrt(p*(1-p)/n)

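The outputs (CIs) for n = 36, 100, and 400, respectively, are

0.4399696, 0.7600304

0.5039818, 0.6960182

0.5519909, 0.6480091

The interval narrows as n increases: quadrupling the sample size halves the margin of error.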
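The three intervals can also be produced in one pass by vectorizing over the sample size. This is a minimal sketch; the names `p_bar`, `se`, and `ci` are chosen here for illustration.

```r
# 95% confidence intervals for the proportion at several sample sizes.
p_bar <- 0.6                 # sample proportion from the problem
alpha <- 0.05
n <- c(36, 100, 400)

z <- qnorm(1 - alpha / 2)               # upper alpha/2 normal quantile
se <- sqrt(p_bar * (1 - p_bar) / n)     # standard error, with p_bar in place of p

ci <- data.frame(n = n,
                 lower = p_bar - z * se,
                 upper = p_bar + z * se)
ci$width <- ci$upper - ci$lower         # shrinks like 1/sqrt(n)
ci
```

The `width` column makes the effect of the sample size explicit: each fourfold increase in n cuts the width in half.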

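The margin of error is $m = z_{\alpha/2} \sqrt{p(1 - p)/n}$, where $z_{\alpha/2}$ is the upper $\alpha/2$ quantile of the standard normal distribution (qnorm(1 - alpha/2) in R). Solving for $n$, the sample size is

$$n = \frac{z_{\alpha/2}^{2} \, p(1 - p)}{m^{2}}.$$

With $p = 0.5$ this becomes $n = z_{\alpha/2}^{2} / (4 m^{2})$. Using $p = 0.5$ is reasonable because $p(1 - p)$ is largest at $p = 0.5$, so the resulting sample size is large enough whatever the true value of $p$ is.

The R code for finding the sample size is given below, shown for m = 1% and a 95% confidence level; change m and alpha for the other cases.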

p <- 0.5
alpha <- 0.05
m <- 0.01
n <- (qnorm(alpha/2))^2*p*(1-p)/m^2
n

9603.647
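To fill in the whole table at once, the same formula can be evaluated over a grid of margins and confidence levels and rounded up. This is a sketch under the p = 0.5 assumption; the object name `n_needed` is hypothetical.

```r
# Sample sizes for each (confidence level, margin of error) pair, with p = 0.5.
m <- c(0.01, 0.03, 0.05)         # margins of error
conf <- c(0.90, 0.95, 0.99)      # confidence levels, 1 - alpha
p <- 0.5                         # conservative choice: maximizes p * (1 - p)

z <- qnorm(1 - (1 - conf) / 2)   # upper alpha/2 normal quantiles

# outer() builds the 3 x 3 grid of z^2 / m^2; ceiling() rounds up to an integer.
n_needed <- ceiling(outer(z^2, 1 / m^2) * p * (1 - p))
dimnames(n_needed) <- list(paste0(conf * 100, "%"),
                           paste0("m = ", m * 100, "%"))
n_needed
```

Rows correspond to confidence levels and columns to margins of error, matching the layout of the table in the question.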

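The test hypotheses are $H_0: p = 0.75$ versus $H_1: p \neq 0.75$.

The CI is $\bar{p} \pm z_{\alpha/2} \sqrt{\bar{p}(1 - \bar{p})/n}$.

The 95% CI, with $\bar{p} = 0.7$ and $n = 450$, is computed as follows.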

n <- 450
p <- 0.7
alpha <- 0.05
p + qnorm(alpha/2)*c(1,-1)*sqrt(p*(1-p)/n)

Since the entire CI, approximately (0.658, 0.742), lies below 0.75, we reject the null hypothesis at the 5% significance level.

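The test statistic is

$$z = \frac{\bar{p} - p_0}{\sqrt{p_0 (1 - p_0)/n}},$$

where $p_0 = 0.75$ is the proportion under the null hypothesis.

R code for computing the test statistic is below.

n <- 450
p <- 0.7     # sample proportion
p0 <- 0.75   # proportion under the null hypothesis
alpha <- 0.05
z <- (p - p0)/sqrt(p0*(1-p0)/n)
z

The output is

-2.44949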
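Both the critical value approach and the p value approach for the two-tailed test can be sketched as follows. This assumes the standard error is evaluated at the null value 0.75, which is the usual convention for a one-sample proportion z test; the variable names are chosen here for illustration.

```r
# Two-tailed z test for a proportion: H0: p = 0.75 vs H1: p != 0.75.
n <- 450
p_bar <- 0.70    # sample proportion
p0 <- 0.75       # null value (assumption: SE is computed under H0)
alpha <- 0.05

z <- (p_bar - p0) / sqrt(p0 * (1 - p0) / n)

# Critical value approach: reject H0 if z falls outside [crit[1], crit[2]].
crit <- qnorm(c(alpha / 2, 1 - alpha / 2))
reject_by_crit <- z < crit[1] || z > crit[2]

# p value approach: reject H0 if the two-tailed p value is below alpha.
p_value <- 2 * pnorm(-abs(z))
reject_by_p <- p_value < alpha

list(z = z, critical_values = crit, p_value = p_value,
     reject_by_crit = reject_by_crit, reject_by_p = reject_by_p)
```

The two approaches always agree. Here z is about -2.449, outside the critical values of about ±1.96, and the p value is about 0.014, so the null hypothesis is rejected at the 5% level: the data suggest the settlement ratio for new car dealer complaints differs from 75%.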

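For the lower-tail version of the test ($H_0: p \ge 0.75$ versus $H_1: p < 0.75$, with α = 1%), only the left tail matters. The sketch below again assumes the standard error is computed at the null value 0.75.

```r
# Lower-tail z test for a proportion: H0: p >= 0.75 vs H1: p < 0.75.
n <- 450
p_bar <- 0.70    # sample proportion
p0 <- 0.75       # null value (assumption: SE is computed under H0)
alpha <- 0.01

z <- (p_bar - p0) / sqrt(p0 * (1 - p0) / n)   # same statistic as before

crit_lower <- qnorm(alpha)    # single critical value in the left tail
p_value <- pnorm(z)           # left-tail area below the observed z

reject <- (z < crit_lower) && (p_value < alpha)
list(z = z, critical_value = crit_lower, p_value = p_value, reject = reject)
```

With z about -2.449 below the critical value of about -2.326, and a p value of about 0.007, the null hypothesis is rejected at the 1% level, which supports the claim that the settlement ratio for new car dealer complaints is lower than 75%.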
