Use the following scenario to answer the questions below:
A campaign manager for a local politician is interested in the percentage of Leon County residents who intend to vote in the 2020 elections in November. Residents from each neighborhood in Leon County are randomly sampled. Using simple random sample techniques, larger neighborhoods have a larger number of residents randomly sampled and smaller neighborhoods have a smaller number of residents randomly sampled to get a representative sample.
Altogether 434 Leon County residents are sampled. The result is that 39% of the sampled Leon County residents intend to vote in the 2020 elections in November.
1. Would the result value of 39% be considered a statistic or a parameter?
2. Would using a statistic or a parameter be more appropriate in this situation?
A). It does not matter as statistics and parameters have the same values.
B). A statistic because it would be more accurate than a parameter.
C). A statistic because it may be more easily obtained due to time constraints and available resources.
D). A parameter because it includes responses from everyone in the sample and will not vary from sample to sample.
3. A campaign manager for another candidate claims that the sample taken would not represent the population because the entire population was not surveyed. Which of the following provides the best evidence that the sample taken would be representative of the population?
A). A large enough sample of residents was randomly taken and proportional to the size of each neighborhood in Leon County.
B). 434 is a large enough sample of residents since more than 30 people were interviewed.
C). There is no evidence to say that the sample taken would be representative of the population.
D). 39% is a large enough percentage of Leon County residents who intend to vote in the 2020 elections in November.
4. Suppose that someone at a local news station wants to consider the sampling variability and uses the results of the survey to calculate the margin of error for the result. The person at the station assumes that the normal distribution can be used in calculating the margin of error. Is the assumption of the normal distribution here correct?
A). Although a random sample was taken, the normal distribution is not appropriate since not enough residents were sampled.
B). The normal distribution is appropriate since a random sample was taken and there were enough residents who answered “Yes” as well as “No” to voting in the 2020 elections in November.
C). The normal distribution is not appropriate since only a sample of all the residents of Leon County were surveyed.
D). The normal distribution is appropriate since a random sample was taken and there were more than 30 residents surveyed.
5. In the midterm primaries in August 2018, 37% of Leon County residents voted. Based on the information provided from the sample above, would it be surprising if 37% of all Leon County residents also intend to vote in the 2020 elections? State YES or NO and use statistical reasoning (inference) to support your position.
6. Using the information from the sample above, what is your conclusion regarding the proportion (or percentage) of Leon County residents who intend to vote in the 2020 elections in November? Be as detailed as possible in your explanation.
Concept
A statistic describes a sample, while a parameter describes an entire population.
Q1.
The value of 39% was computed from a sample of 434 residents, so it is a statistic. It describes the sample, not the population, and it could change if a different sample were drawn.
So, the answer is: statistic.
Q2.
We use a statistic because collecting data from the entire population is not practical given time and resource constraints. Collecting sample data is much easier than collecting population data. Among the options, only C gives this reason.
So, Answer is Option C
Q3.
The sample should be random, large enough, and proportional to the makeup of the population (here, the size of each neighborhood). The sampling described above meets all of these criteria, so the sample can be considered representative of the population. Among the options, only A states all of this.
So, Answer is Option A
Q4
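For the normal distribution to be appropriate, the sample should be random, should be large (n >= 30), and should contain enough of both outcomes (at least 10 successes and at least 10 failures). This sample satisfies all of these criteria: with n = 434 and 39% answering "Yes", there are about 169 "Yes" and about 265 "No" responses. Option B is the one that checks both outcomes rather than only the sample size.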
So, Answer is Option B
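The success/failure check above can be sketched in a few lines of Python. This is only an illustration using the figures from the scenario (n = 434, 39% "Yes"); the threshold of 10 is the rule of thumb quoted in the answer, and the variable names are made up for this sketch.

```python
# Success/failure condition for using the normal approximation
# to the sampling distribution of a sample proportion.
n = 434          # sample size from the scenario
p_hat = 0.39     # sample proportion who intend to vote

expected_yes = n * p_hat          # approximate number of "Yes" responses
expected_no = n * (1 - p_hat)     # approximate number of "No" responses

# Rule of thumb used in the answer: at least 10 of each outcome.
normal_ok = expected_yes >= 10 and expected_no >= 10

print(f"Yes: {expected_yes:.1f}, No: {expected_no:.1f}, normal OK: {normal_ok}")
```

Both counts are far above 10, which is why the normal approximation is reasonable here.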
Q5
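We need to find the 95% confidence interval to check the claim.
The CI for a proportion is given by p̂ ± z * sqrt(p̂(1 - p̂)/n)
p̂ = 0.39, n = 434, z = 1.96 for a 95% CI. Substituting these values: 0.39 ± 1.96 * sqrt(0.39 * 0.61 / 434) = 0.39 ± 0.0459, which is approximately (0.344, 0.436)
Since 0.37 lies inside the confidence interval, it would not be surprising if 37% of all Leon County residents intend to vote in the 2020 elections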
So, Answer is NO
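The confidence interval for Q5 can be reproduced with a short Python sketch. It assumes the standard large-sample (Wald) interval for a proportion with the figures from the scenario; the variable names are illustrative, and only the Python standard library is used.

```python
import math
from statistics import NormalDist

n = 434            # sample size
p_hat = 0.39       # sample proportion
confidence = 0.95  # confidence level used in the answer

# Critical value for a two-sided interval (about 1.96 for 95%).
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

# Standard error of the sample proportion and margin of error.
se = math.sqrt(p_hat * (1 - p_hat) / n)
margin = z * se

lower, upper = p_hat - margin, p_hat + margin
claim_inside = lower <= 0.37 <= upper

print(f"95% CI: ({lower:.4f}, {upper:.4f})")
print("0.37 inside interval:", claim_inside)
```

The interval comes out to roughly (0.344, 0.436), so 0.37 is a plausible value for the population proportion.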
Q6
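As we know, the proportion who voted in the 2018 primaries was p0 = 0.37. So the hypotheses are H0: p = 0.37 against Ha: p ≠ 0.37
Doing a two-sided z-test for the hypothesis: z = (p̂ - p0) / sqrt(p0(1 - p0)/n) = (0.39 - 0.37) / sqrt(0.37 * 0.63 / 434) ≈ 0.863
p-value = 0.3881.
Since the p-value is well above the usual 0.05 significance level, we fail to reject the null hypothesis. The sample does not give evidence that the proportion intending to vote differs from 37%. Combined with Q5, our best estimate is 39%, with a 95% confidence interval of roughly 34% to 44%.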
Conclusion: There is no significant evidence that the proportion of Leon County residents who intend to vote in 2020 differs from 37%
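The p-value quoted in Q6 can be checked with a short Python sketch. It assumes a two-sided one-proportion z-test of H0: p = 0.37, which is the setup that reproduces the reported p-value of about 0.388; the variable names are illustrative.

```python
import math
from statistics import NormalDist

n = 434        # sample size
p_hat = 0.39   # sample proportion
p0 = 0.37      # hypothesized proportion (2018 primary turnout)

# Under H0 the standard error uses p0, not p_hat.
se0 = math.sqrt(p0 * (1 - p0) / n)
z = (p_hat - p0) / se0

# Two-sided p-value from the standard normal distribution.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"z = {z:.3f}, p-value = {p_value:.4f}")
print("Reject H0 at the 0.05 level:", p_value < 0.05)
```

A p-value this large means the sample result of 39% is well within what random sampling would produce if the true proportion were 37%.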