Rejecting the null hypothesis that the population slope is equal to zero (that is, that there is no linear relationship) and concluding that the relationship between x and y is significant does not enable one to conclude that a cause-and-effect relationship is present between x and y. Explain why.
The test focuses on the slope of the regression line
Y = Β0 + Β1X
where Β0 is a constant, Β1 is the slope (also called the regression coefficient), X is the value of the independent variable, and Y is the value of the dependent variable.
If we find that the slope of the regression line is significantly different from zero, we will conclude that there is a significant relationship between the independent and dependent variables.
Test Requirements
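The approach described is valid whenever the standard requirements for simple linear regression are met: the relationship between X and Y is linear, the observations are independent, and for each value of X the Y values are roughly normally distributed with the same standard deviation.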
The test procedure consists of four steps:
(1) state the hypotheses,
(2) formulate an analysis plan,
(3) analyze sample data, and
(4) interpret results.
State the Hypotheses
If there is a significant linear relationship between the independent variable X and the dependent variable Y, the slope will not equal zero.
Ho: Β1 = 0
Ha: Β1 ≠ 0
The null hypothesis states that the slope is equal to zero, and the alternative hypothesis states that the slope is not equal to zero.
Formulate an Analysis Plan
The analysis plan describes how to use sample data to decide whether to reject the null hypothesis. The plan should specify the significance level and the test method; here, the method is a t-test on the slope of the regression line.
Analyze Sample Data
Using sample data, find the standard error of the slope, the slope of the regression line, the degrees of freedom, the test statistic, and the P-value associated with the test statistic. The approach described in this section is illustrated in the sample problem at the end of this lesson.
Predictor | Coef | SE Coef | T | P
Constant | 76 | 30 | 2.53 | 0.01
X | 35 | 20 | 1.75 | 0.04
DF = n - 2
where n is the number of observations in the sample.
t = b1 / SE
where b1 is the slope of the sample regression line, and SE is the standard error of the slope.
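The quantities above can also be computed directly from raw data. The following is a minimal sketch in standard-library Python, not the output of any particular statistics package. The function name `slope_test` and the five data points are hypothetical, chosen only so the arithmetic is easy to follow. It assumes the usual least-squares formulas, with SE taken as the residual standard deviation divided by the square root of the sum of squared deviations of X.

```python
import math

def slope_test(xs, ys):
    """Return (b0, b1, se, df, t) for the least-squares line y = b0 + b1*x."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need at least 3 paired observations")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx                      # slope of the sample regression line
    b0 = y_bar - b1 * x_bar             # intercept (the "Constant" row)
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    df = n - 2                          # DF = n - 2
    se = math.sqrt(sse / df) / math.sqrt(sxx)  # standard error of the slope
    t = b1 / se                         # t = b1 / SE
    return b0, b1, se, df, t

# Hypothetical data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
b0, b1, se, df, t = slope_test(xs, ys)
print(f"b1 = {b1:.2f}, SE = {se:.4f}, DF = {df}, t = {t:.2f}")
```

For these five points the slope is 0.6, the standard error is about 0.283, and t is about 2.12 with 3 degrees of freedom. The same ratio applied to the X row of the output above gives 35 / 20 = 1.75, which matches the T column.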
Interpret Results
If the sample findings are unlikely, given the null hypothesis, the researcher rejects the null hypothesis. Typically, this involves comparing the P-value to the significance level, and rejecting the null hypothesis when the P-value is less than the significance level.
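The decision rule just described can be written as a one-line comparison. This is only a sketch: the helper name `reject_null` is hypothetical, and the two significance levels (0.05 and 0.01) are assumed values used to show both outcomes. The P-value of 0.04 is the one listed for X in the output above.

```python
def reject_null(p_value, alpha):
    """Return True when the P-value is below the significance level."""
    if not (0 <= p_value <= 1 and 0 < alpha < 1):
        raise ValueError("p_value must be in [0, 1] and alpha in (0, 1)")
    return p_value < alpha

p_x = 0.04  # P column, X row of the output above

# Assumed significance levels, for illustration only.
print(reject_null(p_x, alpha=0.05))  # True: reject the null hypothesis
print(reject_null(p_x, alpha=0.01))  # False: fail to reject the null hypothesis
```

The same P-value leads to different decisions at different significance levels, which is why the level should be fixed in the analysis plan before looking at the data.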
Test Your Understanding
Problem
The local utility company surveys 101 randomly selected customers. For each survey participant, the company collects the following: annual electric bill (in dollars) and home size (in square feet). Output from a regression analysis appears below.
Regression equation: Annual bill = 0.55 * Home size + 15
Predictor | Coef | SE Coef | T | P
Constant | 15 | 3 | 5.0 | 0.00
Home size | 0.55 | 0.24 | 2.29 | 0.01
Is there a significant linear relationship between annual bill and home size? Use a 0.05 level of significance.
Solution
The solution to this problem takes four steps: (1) state the hypotheses, (2) formulate an analysis plan, (3) analyze sample data, and (4) interpret results. We work through those steps below:
Ho: The slope of the regression line is equal to zero.
Ha: The slope of the regression line is not equal to zero.
If the relationship between home size and electric bill is significant, the slope will not equal zero.
We get the slope (b1) and the standard error (SE) from the regression output.
b1 = 0.55 SE = 0.24
We compute the degrees of freedom and the t statistic, using the following equations.
DF = n - 2 = 101 - 2 = 99
t = b1/SE = 0.55/0.24 = 2.29
where DF is the degrees of freedom, n is the number of observations in the sample, b1 is the slope of the regression line, and SE is the standard error of the slope.
Based on the t statistic and the degrees of freedom, we determine the P-value. The P-value is the probability that a t statistic having 99 degrees of freedom is more extreme than 2.29. Since this is a two-tailed test, "more extreme" means greater than 2.29 or less than -2.29. We use the t Distribution Calculator to find P(t > 2.29) = 0.0121 and P(t < -2.29) = 0.0121. Therefore, the P-value is 0.0121 + 0.0121, or 0.0242.
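The P-value can be checked without a calculator page. The sketch below is one possible approach, not the method used by the t Distribution Calculator: it integrates the Student's t density numerically with Simpson's rule, using only the Python standard library. The function names `t_pdf` and `two_tailed_p` and the step count are assumptions made for this illustration. The inputs (b1 = 0.55, SE = 0.24, n = 101, significance level 0.05) come from the problem.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def two_tailed_p(t, df, steps=2000):
    """Approximate P(|T| > |t|) with Simpson's rule over [0, |t|]."""
    if steps % 2:
        raise ValueError("steps must be even for Simpson's rule")
    a = abs(t)
    if a == 0:
        return 1.0
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3      # P(0 < T < |t|)
    return 1 - 2 * central       # both tails combined

# Values taken from the regression output and the problem statement.
b1, se, n, alpha = 0.55, 0.24, 101, 0.05
df = n - 2
t = b1 / se
p = two_tailed_p(t, df)
print(f"DF = {df}, t = {t:.2f}, two-tailed P = {p:.4f}")
print("P-value below significance level:", p < alpha)
```

Because the code uses the unrounded ratio 0.55 / 0.24 rather than 2.29, its result can differ from 0.0242 in the last digit; calling `two_tailed_p(2.29, 99)` uses the rounded statistic instead.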