1. Alice thinks that the more often you read or watch the news,
the more likely you are to vote. She asks five people how many days
per week they read or watch the news (X) and how likely they are to
vote on a scale from 1 to 10 (Y).
a. (14 points) Calculate the correlation between the two
variables:
X Y
3 2
2 1
1 3
5 8
6 9
b. (2 points) How much of the variability in likelihood to vote is
explained by frequency of reading or watching the news?
2. Over the years, Coach Bob has developed a formula to predict
how many of the country’s top 100 football recruits will sign with
his university based on how many games his team won that year. The
formula is:
Y = 2(X) – 14
a. (6 points) How many top recruits would be predicted for seasons
that ended with:
12 wins?
10 wins?
7 wins?
b. (2 points) What is the predictor variable? What is the criterion
variable?
c. (2 points) What is the slope? What is the y-intercept?
d. (2 points) Is the correlation between games won and number of
top recruits signed positive or negative? How do we know?
3. (6 points) Calculate the standard error for the following
samples. The population standard deviation for each sample is
15.
a. N = 4
b. N = 16
c. N = 225
4. a. (3 points) If we know the population mean is 50 and the
standard error of the mean is .5, what is the z-score for a sample
mean of 51? What is the likelihood of getting a sample mean of 51
or more?
b. (3 points) If we know the population mean is 20 and the standard
error of the mean is 5, what is the z-score for a sample mean of
105? What is the likelihood of getting a sample mean of 105 or
higher?
Solution:
Question 1)
Given:
X: Number of days per week they read or watch the news
Y: how likely they are to vote on a scale from 1 to 10
X Y
3 2
2 1
1 3
5 8
6 9
Part a) Calculate the correlation between the two variables:
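Formula for correlation:
$$r = \frac{SS_{XY}}{\sqrt{SS_{XX} \, SS_{YY}}}$$
where
$$SS_{XY} = \sum XY - \frac{(\sum X)(\sum Y)}{n}, \quad SS_{XX} = \sum X^2 - \frac{(\sum X)^2}{n}, \quad SS_{YY} = \sum Y^2 - \frac{(\sum Y)^2}{n}$$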
Thus we need to make the following table:
X | Y | X^2 | Y^2 | XY |
3 | 2 | 9 | 4 | 6 |
2 | 1 | 4 | 1 | 2 |
1 | 3 | 1 | 9 | 3 |
5 | 8 | 25 | 64 | 40 |
6 | 9 | 36 | 81 | 54 |
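Thus, with n = 5, the column totals are
$$\sum X = 17, \quad \sum Y = 23, \quad \sum X^2 = 75, \quad \sum Y^2 = 159, \quad \sum XY = 105$$
so
$$SS_{XY} = \sum XY - \frac{(\sum X)(\sum Y)}{n} = 105 - \frac{(17)(23)}{5} = 26.8$$
$$SS_{XX} = \sum X^2 - \frac{(\sum X)^2}{n} = 75 - \frac{17^2}{5} = 17.2$$
$$SS_{YY} = \sum Y^2 - \frac{(\sum Y)^2}{n} = 159 - \frac{23^2}{5} = 53.2$$
Thus
$$r = \frac{SS_{XY}}{\sqrt{SS_{XX} \, SS_{YY}}} = \frac{26.8}{\sqrt{(17.2)(53.2)}} = \frac{26.8}{30.2496} \approx 0.8860$$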
Thus the correlation between the two variables is r = 0.8860
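As a quick check on the hand calculation, the same computational formula can be sketched in Python. This is only an illustrative sketch: the function name `pearson_r` and the variable names are assumptions made for this example, not part of the original solution.

```python
from math import sqrt

# Data from the question: days of news per week (X), likelihood to vote (Y)
xs = [3, 2, 1, 5, 6]
ys = [2, 1, 3, 8, 9]


def pearson_r(xs, ys):
    """Pearson correlation via the sums-of-squares (computational) formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    # SS_XY = sum(XY) - (sum X)(sum Y) / n
    ss_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    # SS_XX = sum(X^2) - (sum X)^2 / n
    ss_xx = sum(x * x for x in xs) - sum_x ** 2 / n
    # SS_YY = sum(Y^2) - (sum Y)^2 / n
    ss_yy = sum(y * y for y in ys) - sum_y ** 2 / n
    return ss_xy / sqrt(ss_xx * ss_yy)


r = pearson_r(xs, ys)
print(f"r = {r:.4f}")
```

Rounded to four decimal places, the result agrees with the value r = 0.8860 obtained by hand.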
Part b) How much of the variability in likelihood to vote is explained by frequency of reading or watching the news?
The coefficient of determination is the proportion of variation in the dependent variable that is explained by variation in the independent variable.
Thus we need to find :
Coefficient of determination = $r^2$
Coefficient of determination = $(0.8860)^2$
Coefficient of determination = 0.7849 (computed from the unrounded r ≈ 0.88596)
Coefficient of determination = 78.49%
Thus about 78.49% of the variability in likelihood to vote is explained by frequency of reading or watching the news.
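The coefficient of determination can also be checked numerically without rounding r first. The following is a minimal sketch; the variable names (`ss_xy`, `ss_xx`, `ss_yy`, `r_squared`) are assumptions chosen for this example.

```python
# Data from the question: days of news per week (X), likelihood to vote (Y)
xs = [3, 2, 1, 5, 6]
ys = [2, 1, 3, 8, 9]

n = len(xs)
sum_x, sum_y = sum(xs), sum(ys)

# Sums of squares and cross-products about the means
ss_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
ss_xx = sum(x * x for x in xs) - sum_x ** 2 / n
ss_yy = sum(y * y for y in ys) - sum_y ** 2 / n

# r^2 = SS_XY^2 / (SS_XX * SS_YY), which avoids rounding r before squaring
r_squared = ss_xy ** 2 / (ss_xx * ss_yy)
print(f"r^2 = {r_squared:.4f} ({r_squared:.2%} of variability explained)")
```

Computing r^2 directly from the sums of squares sidesteps the small discrepancy that appears if the rounded value 0.8860 is squared instead.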