3. In a previous election, we wondered whether President Obama would be re-elected. We took a random poll of 1057 Americans and asked them if they would vote for Obama to be re-elected. Of the 1057 people in the poll, 583 said they would support Obama. Is this evidence convincing enough for us to conclude that more than 50% of all Americans would vote to re-elect President Obama? To answer this question we will look at simulations from a population with a population percent of 0.5, create a distribution, and then see how it behaves. If you go to the top left corner you can click on the link for “election poll support Obama” and the numbers will be automatically entered for you.
a) What are the null and alternative hypotheses? Which one is the claim?
b) What was the original sample percent in the poll? Is the original sample percent lower or higher than the population value 0.5?
c) Click on generate 1 sample. The computer has simulated talking to 1057 people when the population percent is 50% (0.5). How many people said they support Obama in the first simulation? What was the simulated percent?
d) Now click on generate 1000 samples. Let’s see if we understand what we are looking at. Again, these are not actual samples from a population. Each dot represents the percent of a simulated data set of size 1057. You now have 1001 samples and have created a randomization distribution. In a sense, we have predicted how we expect data sets from a population with a percentage of exactly 50% to behave. What are the shape, center (mean) and spread (estimated standard error) of the distribution?
e) Our goal was to know whether getting a sample percent of 0.552 was something that could happen by random chance from a population with a population percent of 0.5. Look at the distribution. Since this was a right tail test, let’s look at how many dots were 0.552. Again, if we are wondering if 0.552 is significantly higher than the population value 0.5, wouldn’t dots that have a value greater than 0.552 also cause us to doubt the validity of the population value 0.5? Of course. So we don’t want to just count how many dots are exactly 0.552, but how many dots are that or higher. (Right Tail)
Click on the button that says “right tail”. In the box at the bottom of the distribution, type in “0.552”. How many dots were greater than or equal to 0.552? What percent of the simulated distribution was 0.552 or higher? This percent is called a “P-value”.
P-value = The probability of getting the sample data or more extreme, if the null hypothesis is really true. The randomized simulation has helped us flesh out what we expect to happen if the null hypothesis were really true.
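The right-tail count behind this P-value can be sketched in Python. This is only a rough stand-in for the applet, not its actual code: it assumes each simulated voter says yes independently with probability 0.5, and the seed and the names `simulate_proportion` and `dots` are illustrative choices.

```python
import random

n = 1057          # poll size from the question
p0 = 0.5          # population proportion under the null hypothesis
observed = 0.552  # sample proportion typed into the right-tail box
reps = 1000       # number of simulated polls

rng = random.Random(0)  # arbitrary seed so the run is repeatable


def simulate_proportion():
    """Simulate one poll of n voters and return the share saying yes."""
    yes = sum(rng.random() < p0 for _ in range(n))
    return yes / n


dots = [simulate_proportion() for _ in range(reps)]

# Right-tail count: dots at or above the observed proportion.
at_or_above = sum(d >= observed for d in dots)
p_value = at_or_above / reps

print(f"{at_or_above} of {reps} dots were >= {observed}")
print(f"simulated p-value = {p_value:.4f}")
```

With only 1000 simulated polls the count in the tail will be very small, so the simulated P-value is a coarse estimate.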
f) Decision time. In a simulation of samples of size 1057 from a population with a population proportion of 0.5, would a sample proportion of 0.552 be likely to happen by random chance? What does this tell us about the validity of the “so-called” population value of 0.5 (50%)? Do you still agree that 50% of people will vote for Obama? Obama’s campaign managers are worried. Do we have convincing evidence that more than 50% of Americans will vote for Obama? (This is the same as asking if we have convincing evidence that Obama will win the re-election.) Do you have proof? (Obama of course did win the re-election.)
3)
A one-proportion z-test is used to test whether the population proportion differs from a hypothesized value. Here the hypothesized value is 0.5 and the test is right-tailed.
a) The hypotheses are defined as,
Null hypothesis: the population proportion equals 0.5 (p = 0.5);
Alternative hypothesis: the population proportion is greater than 0.5 (p > 0.5). The alternative hypothesis is the claim.
b) The original sample proportion is obtained by dividing the number of favourable cases by the total number of cases: 583/1057 = 0.5516, or about 0.552.
The sample proportion, 0.552, is higher than the population proportion, 0.5.
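As a cross-check on the arithmetic, the sample proportion and the one-proportion z statistic can be computed in a few lines of Python. This is a minimal sketch using the normal approximation; the variable names are illustrative, and the p-value comes from the standard normal tail rather than from the simulation.

```python
import math

# Poll counts taken from the question text.
supporters = 583
n = 1057
p0 = 0.5  # population proportion under the null hypothesis

p_hat = supporters / n              # sample proportion
se0 = math.sqrt(p0 * (1 - p0) / n)  # standard error under the null
z = (p_hat - p0) / se0              # test statistic

# Right-tail p-value from the standard normal distribution: P(Z >= z).
p_value = 0.5 * math.erfc(z / math.sqrt(2))

print(f"sample proportion = {p_hat:.4f}")
print(f"standard error    = {se0:.6f}")
print(f"z statistic       = {z:.3f}")
print(f"right-tail p      = {p_value:.5f}")
```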
c) Here the sample is generated in Excel using the following steps,
Step 1: Assume the sample proportion is approximately normally distributed with mean 0.5 and standard deviation sqrt(0.5 × 0.5 / 1057) = 0.015379.
Step 2: DATA > Data Analysis > Random Number Generation > OK. [Screenshot]
Step 3: Enter Number of Variables: 1, Number of Random Numbers: 1, Distribution: Normal, Mean = 0.5, Standard Deviation = 0.015379, then click OK. [Screenshot]
The one generated sample has the value 0.490359,
hence the simulated percentage supporting Obama is 49.0359%, which corresponds to about 518 of the 1057 simulated respondents.
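For readers without Excel, the same single draw can be sketched in Python. The first draw mirrors the normal approximation used in the steps above. The second is an alternative that simulates the 1057 individual answers directly, which is an assumption about what the applet in the question does, not something the question confirms. The seed is arbitrary, so the values will differ from the Excel result.

```python
import math
import random

n = 1057
p0 = 0.5
se0 = math.sqrt(p0 * (1 - p0) / n)  # about 0.015379, the value entered in Excel

rng = random.Random(1)  # arbitrary seed, only so the run is repeatable

# Normal approximation, mirroring the Excel Random Number Generation step.
normal_draw = rng.gauss(p0, se0)

# Direct simulation: n simulated voters, each a "yes" with probability p0.
yes_count = sum(rng.random() < p0 for _ in range(n))
simulated_pct = yes_count / n

print(f"normal-approximation draw: {normal_draw:.6f}")
print(f"direct simulation: {yes_count} of {n} = {simulated_pct:.4f}")
```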
d)
As above, 1000 samples are obtained by entering Number of Random Numbers: 1000. [Screenshot]
The mean of the samples is obtained using the Excel function =AVERAGE(). [Screenshot]
The standard deviation of the samples is obtained using the Excel function =STDEV(). [Screenshot]
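The same three steps (generate 1000 values, average them, take their standard deviation) can be sketched in Python. This assumes the same normal approximation as the Excel steps. `statistics.stdev` is the sample standard deviation, which matches what =STDEV() computes. The seed is arbitrary.

```python
import math
import random
import statistics

n, p0, reps = 1057, 0.5, 1000
se0 = math.sqrt(p0 * (1 - p0) / n)  # standard deviation entered in Excel

rng = random.Random(2)  # arbitrary seed so the run is repeatable
samples = [rng.gauss(p0, se0) for _ in range(reps)]

sample_mean = statistics.mean(samples)  # Excel: =AVERAGE(range)
sample_sd = statistics.stdev(samples)   # Excel: =STDEV(range)

print(f"mean of simulated proportions = {sample_mean:.5f}")
print(f"estimated standard error      = {sample_sd:.5f}")
```

The mean should land close to 0.5 and the standard deviation close to 0.0154, which are the center and spread that part d) asks about.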
The histogram is built in Excel to see the shape of the distribution, using the following steps,
Step 1: Make a bin column with the values 0.44, 0.45, ..., 0.56. [Screenshot]
Step 2: DATA > Data Analysis > Histogram > OK. [Screenshot]
Step 3: Enter Input Range: the sample proportion column, Bin Range: the bin column, then click OK. [Screenshot]
The frequency table is obtained.
Step 4: Select the bin and frequency columns, then INSERT > Recommended Charts > Clustered Column. [Screenshot]
The chart is obtained. [Screenshot]
We can see that the simulated sample proportions are approximately normally distributed, with a slight skew to the right.
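The binning step can also be sketched in Python with a plain-text chart. This is a rough equivalent, assuming the same bin edges 0.44 to 0.56 and the usual Excel Histogram convention that each bin counts values greater than the previous edge and at most its own edge, with a final "More" row. The data are freshly simulated with an arbitrary seed, so the counts will not match the Excel output.

```python
import bisect
import math
import random

n, p0, reps = 1057, 0.5, 1000
se0 = math.sqrt(p0 * (1 - p0) / n)

rng = random.Random(3)  # arbitrary seed so the run is repeatable
samples = [rng.gauss(p0, se0) for _ in range(reps)]

# Bin edges 0.44, 0.45, ..., 0.56, as in Step 1.
bins = [round(0.44 + 0.01 * k, 2) for k in range(13)]

# One counter per bin edge plus a final "More" slot.
counts = [0] * (len(bins) + 1)
for v in samples:
    # bisect_left finds the first edge >= v, i.e. the bin with prev < v <= edge.
    counts[bisect.bisect_left(bins, v)] += 1

for edge, c in zip(bins, counts):
    print(f"<= {edge:.2f} | {c:4d} {'#' * (c // 10)}")
print(f"   More | {counts[-1]:4d}")
```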
To answer parts e) and f) in software, let me know in a comment which software you are using, or whether you want a manually calculated solution.