The following sample data contains the number of years of college and the current annual salary for a random sample of heavy equipment salespeople. A researcher believes that spending more years in college leads to higher annual income.
Annual Income (In Thousands) | Years in College
20 | 2
23 | 2
25 | 3
35 | 4
28 | 3
18 | 1
37 | 4
30 | 3
40 | 4
39 | 4
a) Which variable is the dependent variable? Which is the independent variable?
b) What is the estimated regression equation?
c) Interpret the meaning of the regression coefficient of the independent variable.
d) What is the value of the intercept? How should the intercept be interpreted?
e) What is the point estimate of the annual income of a salesperson with three years of college?
f) Test whether the number of years in college has a significant impact on income at α= .05. What are the hypotheses? How do you test the hypotheses? What is the conclusion?
g) What is the value of the correlation coefficient between the dependent variable and the independent variable? What is the implication of the correlation coefficient?
h) What is the R Square of this model? What does it mean? Use the information in the ANOVA table to calculate R Square.
a)
Dependent variable: Annual Income
Independent Variable: Years in College
b)
Running the regression in Excel gives the estimated regression equation:
Y-hat = 7.9 + 7.2 X
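The Excel result can be cross-checked by hand with the usual least-squares formulas (slope = Sxy / Sxx, intercept = Y-bar minus slope times X-bar). The following is a minimal sketch in plain Python; the variable names are illustrative and not part of the original solution.

```python
# Sample data from the problem statement
income = [20, 23, 25, 35, 28, 18, 37, 30, 40, 39]  # Y, annual income in thousands
years = [2, 2, 3, 4, 3, 1, 4, 3, 4, 4]             # X, years in college

n = len(years)
x_bar = sum(years) / n
y_bar = sum(income) / n

# Sums of cross-products and squares about the means
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(years, income))
s_xx = sum((x - x_bar) ** 2 for x in years)

slope = s_xy / s_xx                 # estimated beta1
intercept = y_bar - slope * x_bar   # estimated beta0

print(f"Y-hat = {intercept:.1f} + {slope:.1f} X")  # prints: Y-hat = 7.9 + 7.2 X
```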
c)
Meaning of the regression coefficient of the independent variable:
The slope is 7.2. Each additional year in college is associated with an estimated increase of 7.2 thousand in annual income, on average.
d)
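Value and meaning of the intercept:
Intercept = 7.9.
If the number of years in college is zero, the estimated annual income is 7.9 thousand. The sample contains no salesperson with zero years of college (the data range from 1 to 4 years), so this value is an extrapolation.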
e)
X = 3
Y-hat = 7.9 + 7.2*3 = 29.5, so the point estimate of annual income is 29.5 thousand
f)
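Hypotheses:
H0: beta1 = 0 (years in college has no linear effect on income)
H1: beta1 ≠ 0
The test uses the p-value for beta1 from the regression output table.
p-value ≈ 0. Since the p-value is less than 0.05, we reject the null hypothesis and conclude that beta1 is significantly different from zero, i.e. years in college has a significant impact on income.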
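The same conclusion can be reached without the Excel output by computing the t statistic for the slope and comparing it with a critical value. The sketch below assumes a two-sided test at α = .05 with n - 2 = 8 degrees of freedom, for which a standard t table gives 2.306; the variable names are illustrative.

```python
import math

income = [20, 23, 25, 35, 28, 18, 37, 30, 40, 39]  # Y, in thousands
years = [2, 2, 3, 4, 3, 1, 4, 3, 4, 4]             # X

n = len(years)
x_bar = sum(years) / n
y_bar = sum(income) / n
s_xx = sum((x - x_bar) ** 2 for x in years)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(years, income))

b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Residual sum of squares and mean squared error (df = n - 2)
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(years, income))
mse = sse / (n - 2)

se_b1 = math.sqrt(mse / s_xx)  # standard error of the slope
t_stat = b1 / se_b1            # test statistic for H0: beta1 = 0

t_crit = 2.306                 # assumed: two-sided critical value, alpha = .05, df = 8
reject_h0 = abs(t_stat) > t_crit

print(f"t = {t_stat:.2f}, reject H0: {reject_h0}")  # prints: t = 8.60, reject H0: True
```

A t statistic this far above the critical value is consistent with the near-zero p-value reported above.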
g)
correlation coefficient = 0.9499
The positive sign indicates that income tends to increase with years in college, and the value close to 1 indicates that the linear relationship is strong
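As a check on the reported value, the correlation coefficient can be computed directly from the sample as r = Sxy / sqrt(Sxx * Syy). This is a minimal sketch with illustrative variable names.

```python
import math

income = [20, 23, 25, 35, 28, 18, 37, 30, 40, 39]  # Y, in thousands
years = [2, 2, 3, 4, 3, 1, 4, 3, 4, 4]             # X

n = len(years)
x_bar = sum(years) / n
y_bar = sum(income) / n

s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(years, income))
s_xx = sum((x - x_bar) ** 2 for x in years)
s_yy = sum((y - y_bar) ** 2 for y in income)

# Pearson correlation coefficient
r = s_xy / math.sqrt(s_xx * s_yy)

print(f"r = {r:.4f}")  # prints: r = 0.9499
```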
h)
R² = 0.9023
It means that 90.23% of the variability in Y (annual income) is explained by X (years in college)
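The question asks for R Square from the ANOVA table, where R² = SSR / SST. Since the Excel ANOVA table is not reproduced here, the sketch below rebuilds the sums of squares from the sample data; the variable names are illustrative.

```python
income = [20, 23, 25, 35, 28, 18, 37, 30, 40, 39]  # Y, in thousands
years = [2, 2, 3, 4, 3, 1, 4, 3, 4, 4]             # X

n = len(years)
x_bar = sum(years) / n
y_bar = sum(income) / n
s_xx = sum((x - x_bar) ** 2 for x in years)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(years, income))

b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar
fitted = [b0 + b1 * x for x in years]

# ANOVA decomposition: SST = SSR + SSE
sst = sum((y - y_bar) ** 2 for y in income)               # total sum of squares
ssr = sum((f - y_bar) ** 2 for f in fitted)               # regression sum of squares
sse = sum((y - f) ** 2 for y, f in zip(income, fitted))   # residual sum of squares

r_squared = ssr / sst

print(f"SSR = {ssr:.1f}, SSE = {sse:.1f}, SST = {sst:.1f}")  # prints: SSR = 518.4, SSE = 56.1, SST = 574.5
print(f"R Square = {r_squared:.4f}")                         # prints: R Square = 0.9023
```

With a single predictor, R² is also the square of the correlation coefficient from part g).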