Using the unemployment data provided, investigate the association between the male unemployment rate in 2007 and 2010 for a sample of 52 countries. Complete parts a through d.
a) Find a regression model predicting the 2010 rate from the 2007 rate for the sample of 52 countries. State in simple language what the model says.
2010 Index = _________ + _________ × 2007 Index (Round to two decimal places as needed.)
State in simple language what the model says. Select the correct choice below and fill in the answer box to complete your choice. (Round to two decimal places as needed.)
b) Determine the test statistic.
T=_________ (Round to two decimal places as needed.)
Determine the P-value.
P=_______________(Round to three decimal places as needed.)
Make a conclusion.
Since the P-value is (1) the significance level, α, (2) the null hypothesis. The association (3) significant.
c) What percentage of the variability in the 2010 Index is accounted for by the regression model?
The regression model accounts for _______ % of the variability in the 2010 Index.
(Round to one decimal place as needed.)
Country | Male 2007 | Male 2010 |
1 | 24.1 | 22.4 |
2 | 11.8 | 10.7 |
3 | 8.7 | 11.7 |
4 | 19.6 | 17.7 |
5 | 26.8 | 22.2 |
6 | 13.9 | 13.2 |
7 | 16.4 | 14.3 |
8 | 13.5 | 10.7 |
9 | 12.1 | 11.8 |
10 | 30.5 | 29.5 |
11 | 9.5 | 13.3 |
12 | 22.2 | 18.8 |
13 | 8.5 | 8.2 |
14 | 14.9 | 11.3 |
15 | 22.8 | 15.1 |
16 | 28.4 | 16.6 |
17 | 23.8 | 4.2 |
18 | 25.3 | 26.4 |
19 | 13.8 | 26.1 |
20 | 18.6 | 18.8 |
21 | 8.3 | 4.3 |
22 | 15.2 | 20.8 |
23 | 8.8 | 8.2 |
24 | 19.1 | 8.2 |
25 | 21.3 | 16.8 |
26 | 11.2 | 9.7 |
27 | 11.3 | 12.3 |
28 | 15.2 | 11.5 |
29 | 22.2 | 16.3 |
30 | 21.8 | 20.9 |
31 | 6.8 | 6.2 |
32 | 9.9 | 8.5 |
33 | 12.3 | 12.1 |
34 | 22.8 | 17.2 |
35 | 19.7 | 21.1 |
36 | 18.1 | 15.3 |
37 | 38.3 | 35.4 |
38 | 12.1 | 14.3 |
39 | 22.6 | 24.8 |
40 | 22.1 | 18.5 |
41 | 22.1 | 21.6 |
42 | 34.6 | 30.1 |
43 | 12.8 | 10.4 |
44 | 19.7 | 17.2 |
45 | 22.7 | 20.8 |
46 | 16.3 | 15.7 |
47 | 7.7 | 7.6 |
48 | 5.1 | 5.7 |
49 | 65.3 | 63.8 |
50 | 19.8 | 19.7 |
51 | 11.8 | 12.4 |
52 | 12.3 | 12.8 |
I used R to fit a simple linear regression model to the given data set.
(a) SIMPLE LINEAR REGRESSION R OUTPUT:
ESTIMATED SIMPLE LINEAR REGRESSION EQUATION:
The estimated simple linear regression equation is
$\hat{y} = b_0 + b_1 x$,
where
$\hat{y}$ is the predicted value of the dependent variable "Male unemployment rate in 2010",
$b_0$ is the intercept,
$b_1$ is the slope coefficient of the independent variable "Male unemployment rate in 2007", and
$x$ is the independent variable "Male unemployment rate in 2007".
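The model says that Option (C) is correct: for each 1% (one percentage point) increase in the 2007 rate, the predicted 2010 rate increases by about 0.87% (0.87 percentage points), since the slope coefficient of the independent variable "Male unemployment rate in 2007" is approximately 0.87.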
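The R code and output behind part (a) are not shown, so here is a minimal sketch of the same least-squares fit in plain Python. It is an illustration under assumptions, not the original R code: the function name `fit_line` and the list names are made up for this example, and the data are copied from the table above.

```python
# Male unemployment rates (%) for the 52 countries, copied from the table above.
rate_2007 = [
    24.1, 11.8, 8.7, 19.6, 26.8, 13.9, 16.4, 13.5, 12.1, 30.5, 9.5, 22.2, 8.5,
    14.9, 22.8, 28.4, 23.8, 25.3, 13.8, 18.6, 8.3, 15.2, 8.8, 19.1, 21.3, 11.2,
    11.3, 15.2, 22.2, 21.8, 6.8, 9.9, 12.3, 22.8, 19.7, 18.1, 38.3, 12.1, 22.6,
    22.1, 22.1, 34.6, 12.8, 19.7, 22.7, 16.3, 7.7, 5.1, 65.3, 19.8, 11.8, 12.3,
]
rate_2010 = [
    22.4, 10.7, 11.7, 17.7, 22.2, 13.2, 14.3, 10.7, 11.8, 29.5, 13.3, 18.8, 8.2,
    11.3, 15.1, 16.6, 4.2, 26.4, 26.1, 18.8, 4.3, 20.8, 8.2, 8.2, 16.8, 9.7,
    12.3, 11.5, 16.3, 20.9, 6.2, 8.5, 12.1, 17.2, 21.1, 15.3, 35.4, 14.3, 24.8,
    18.5, 21.6, 30.1, 10.4, 17.2, 20.8, 15.7, 7.6, 5.7, 63.8, 19.7, 12.4, 12.8,
]


def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y = b0 + b1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squared deviations of x, and sum of cross-products of deviations.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = fit_line(rate_2007, rate_2010)
print(f"2010 rate = {intercept:.2f} + {slope:.2f} x 2007 rate")
```

The two printed coefficients are the numbers that go in the blanks of part (a), rounded to two decimal places.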
(b) T TEST STATISTIC:
From the given R output, the value of the t test statistic and its p-value are read from the coefficient row for the predictor "Male unemployment rate in 2007".
T TEST FOR INDIVIDUAL SIGNIFICANCE OF PREDICTOR:
HYPOTHESIS:
The t test is used to test the significance of the individual predictor "Male unemployment rate in 2007".
The hypotheses, with $\beta_1$ denoting the population slope coefficient, are
$H_0: \beta_1 = 0$ (That is, the slope coefficient of the predictor "Male unemployment rate in 2007" is not statistically significant. In other words, there is no significant linear relationship between the independent variable "Male unemployment rate in 2007" and the dependent variable "Male unemployment rate in 2010".)
$H_1: \beta_1 \neq 0$ (That is, the slope coefficient of the predictor "Male unemployment rate in 2007" is statistically significant. In other words, there is a significant linear relationship between the independent variable "Male unemployment rate in 2007" and the dependent variable "Male unemployment rate in 2010".)
CONCLUSION:
Since the calculated p-value is less than the significance level α, we reject the null hypothesis and conclude that there is a significant linear relationship between the independent variable "Male unemployment rate in 2007" and the dependent variable "Male unemployment rate in 2010".
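The test in part (b) can also be reproduced by hand. The sketch below computes the slope's t statistic as slope divided by its standard error, with n - 2 degrees of freedom, and approximates the two-sided p-value by numerically integrating the t density with Simpson's rule. It is plain Python rather than the R used above, the function names are hypothetical, and the p-value is a numerical approximation, not the exact routine a statistics package would use.

```python
import math

# Male unemployment rates (%) for the 52 countries, copied from the table above.
rate_2007 = [
    24.1, 11.8, 8.7, 19.6, 26.8, 13.9, 16.4, 13.5, 12.1, 30.5, 9.5, 22.2, 8.5,
    14.9, 22.8, 28.4, 23.8, 25.3, 13.8, 18.6, 8.3, 15.2, 8.8, 19.1, 21.3, 11.2,
    11.3, 15.2, 22.2, 21.8, 6.8, 9.9, 12.3, 22.8, 19.7, 18.1, 38.3, 12.1, 22.6,
    22.1, 22.1, 34.6, 12.8, 19.7, 22.7, 16.3, 7.7, 5.1, 65.3, 19.8, 11.8, 12.3,
]
rate_2010 = [
    22.4, 10.7, 11.7, 17.7, 22.2, 13.2, 14.3, 10.7, 11.8, 29.5, 13.3, 18.8, 8.2,
    11.3, 15.1, 16.6, 4.2, 26.4, 26.1, 18.8, 4.3, 20.8, 8.2, 8.2, 16.8, 9.7,
    12.3, 11.5, 16.3, 20.9, 6.2, 8.5, 12.1, 17.2, 21.1, 15.3, 35.4, 14.3, 24.8,
    18.5, 21.6, 30.1, 10.4, 17.2, 20.8, 15.7, 7.6, 5.7, 63.8, 19.7, 12.4, 12.8,
]


def slope_t_statistic(x, y):
    """Return (t statistic for H0: slope = 0, degrees of freedom)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares around the fitted line.
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    df = n - 2
    se_slope = math.sqrt(sse / df / sxx)  # standard error of the slope
    return slope / se_slope, df


def t_density(v, df):
    """Density of Student's t distribution with df degrees of freedom at v."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(v * v / df))


def two_sided_p_value(t_stat, df, steps=20000):
    """Approximate P(|T| >= |t_stat|) using Simpson's rule on [0, |t_stat|]."""
    upper = abs(t_stat)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_density(0.0, df) + t_density(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    central_area = total * h / 3
    # Clamp at zero: for very large |t| the subtraction can round below zero.
    return max(0.0, 1.0 - 2.0 * central_area)


t_stat, df = slope_t_statistic(rate_2007, rate_2010)
p_value = two_sided_p_value(t_stat, df)
print(f"t = {t_stat:.2f}, df = {df}, p = {p_value:.3f}")
```

Comparing the printed p-value with the chosen significance level α gives the reject or fail-to-reject decision stated in the conclusion.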
(c) COEFFICIENT OF DETERMINATION:
From the R output, the coefficient of determination $R^2$ is the value reported as "Multiple R-squared". Multiplying it by 100 gives the percentage of the total variability in the dependent variable "Male unemployment rate in 2010" that the regression model accounts for.
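As a cross-check on part (c), the coefficient of determination can be computed directly as 1 - SSE/SST, where SSE is the residual sum of squares and SST is the total sum of squares of the 2010 rates. This is a sketch in plain Python with a hypothetical function name, not the R output referred to above.

```python
# Male unemployment rates (%) for the 52 countries, copied from the table above.
rate_2007 = [
    24.1, 11.8, 8.7, 19.6, 26.8, 13.9, 16.4, 13.5, 12.1, 30.5, 9.5, 22.2, 8.5,
    14.9, 22.8, 28.4, 23.8, 25.3, 13.8, 18.6, 8.3, 15.2, 8.8, 19.1, 21.3, 11.2,
    11.3, 15.2, 22.2, 21.8, 6.8, 9.9, 12.3, 22.8, 19.7, 18.1, 38.3, 12.1, 22.6,
    22.1, 22.1, 34.6, 12.8, 19.7, 22.7, 16.3, 7.7, 5.1, 65.3, 19.8, 11.8, 12.3,
]
rate_2010 = [
    22.4, 10.7, 11.7, 17.7, 22.2, 13.2, 14.3, 10.7, 11.8, 29.5, 13.3, 18.8, 8.2,
    11.3, 15.1, 16.6, 4.2, 26.4, 26.1, 18.8, 4.3, 20.8, 8.2, 8.2, 16.8, 9.7,
    12.3, 11.5, 16.3, 20.9, 6.2, 8.5, 12.1, 17.2, 21.1, 15.3, 35.4, 14.3, 24.8,
    18.5, 21.6, 30.1, 10.4, 17.2, 20.8, 15.7, 7.6, 5.7, 63.8, 19.7, 12.4, 12.8,
]


def r_squared(x, y):
    """Return R^2 = 1 - SSE/SST for the least-squares line of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # SSE: variation left over after the fit; SST: total variation in y.
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - sse / sst


r2 = r_squared(rate_2007, rate_2010)
print(f"R^2 = {r2:.3f}, i.e. {100 * r2:.1f}% of the variability in the 2010 rate")
```

For a simple linear regression this value equals the square of the correlation between the 2007 and 2010 rates.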