Intro Discussion and Questions: Regression and Correlation
How does correlation analysis differ from regression analysis? What are the goals of each?
What does the correlation coefficient r measure? Be specific. What values can it take on? What do
the values indicate?
What values can the coefficient of determination take on? What does it measure?
What are some of the limitations of simple linear regression?
Describe the equation for multiple linear regression; be sure to clearly define all of its parts.
In a simple linear regression, the following regression equation is obtained:
? = ???−???+?.
?) Interpret the slope coefficient
?) Predict the response if ? = −??.
______________________________________
(a)
Correlation Analysis
(1) In correlation analysis, we calculate the correlation coefficient, which measures the degree of covariability between X and Y.
(2) Correlation is merely a tool to ascertain the degree of relationship between X and Y. It cannot establish a cause-and-effect relationship between X and Y.
Regression Analysis
(1) Regression analysis studies the nature of the relationship between X and Y, so that we can predict the value of one from the value of the other.
(2) In regression analysis we take X as the independent variable and Y as the dependent variable. This lets us study how Y responds to changes in X, although a fitted regression does not by itself prove that X causes Y.
(b) The correlation coefficient r measures the degree of covariability between X and Y. It indicates the strength and direction of the linear relationship between X and Y.
It always lies in the interval -1 ≤ r ≤ 1. r = -1 indicates a perfect negative linear correlation and r = 1 a perfect positive linear correlation between X and Y. r = 0 means that X and Y have no linear correlation, although a non-linear relationship may still exist. The closer |r| is to 1, the stronger the correlation between X and Y. The value of r alone does not let us assign a cause-and-effect relationship between X and Y.
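As a minimal sketch of the properties in (b), the following Python computes the sample correlation coefficient from its definition. The function name `pearson_r` and the three small data sets are made up for illustration and do not come from the question.

```python
import math

def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-deviations and sums of squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Illustrative (made-up) data sets.
x = [1, 2, 3, 4, 5]
r_pos = pearson_r(x, [2, 4, 6, 8, 10])                   # points on a rising line
r_neg = pearson_r(x, [10, 8, 6, 4, 2])                   # points on a falling line
r_sym = pearson_r([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])    # y = x^2, symmetric about 0

print(r_pos, r_neg, r_sym)
```

The third data set follows y = x^2 exactly, yet its r is zero, which illustrates that r only captures linear association.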
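(c) The coefficient of determination can assume any value between 0 and 1, inclusive. It measures the proportion of the variation in the dependent variable that the model explains, that is, how well the model predicts the dependent variable. In simple linear regression it equals the square of r.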
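The coefficient of determination in (c) can be illustrated with a short least-squares sketch. The helper names `fit_line` and `r_squared` and the five data points are assumptions chosen for illustration; the fit uses the usual closed-form formulas for slope and intercept.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def r_squared(xs, ys, intercept, slope):
    """Share of the variation in y explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    # Residual sum of squares: variation left unexplained by the line.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    # Total sum of squares: variation of y around its mean.
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Illustrative (made-up) data that lie close to a straight line.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = fit_line(x, y)
r2 = r_squared(x, y, intercept, slope)
print(intercept, slope, r2)
```

Because the points lie close to a line, the value of `r2` comes out close to 1; data with more scatter around the line would give a smaller value.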
(d)
The limitations of simple linear regression are:
(1) Linear regression is often used inappropriately to model non-linear relationships, usually because it is not understood when linear regression is applicable.
(2) Linear regression is limited to predicting numeric outputs. It gives no qualitative information.
(3) A regression is hard to interpret without knowing, at least roughly, what relationship to expect.
In making estimates from a regression equation, it is important to remember that the estimate assumes the relationship has not changed since the regression equation was computed. It is also important to realize that the relationship shown by the scatter diagram may not hold if the equation is extended beyond the range of values used to compute it. For example, there may be a close linear relationship between the yield of a crop and the amount of fertilizer used. It would not be logical, however, to extend this equation beyond the limits of the experiment, because it is quite likely that too much fertilizer would in fact reduce the yield of the crop.
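The fertilizer example can be sketched numerically. Everything below is hypothetical: the yield curve `true_yield` is an invented quadratic that rises and then falls, and the fertilizer amounts are arbitrary units. The point is only to show how a line fitted on a narrow range can mislead outside it.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def true_yield(fertilizer):
    """Hypothetical yield curve: rises at first, then falls with too much fertilizer."""
    return 20 + 8 * fertilizer - fertilizer ** 2

# The "experiment" only covers small fertilizer amounts (0 to 3 units).
amounts = [0, 1, 2, 3]
yields = [true_yield(a) for a in amounts]

intercept, slope = fit_line(amounts, yields)

# Extrapolate far beyond the experimental range.
far_amount = 8
predicted = intercept + slope * far_amount
actual = true_yield(far_amount)
print(predicted, actual)
```

Within the range 0 to 3 the fitted line tracks the invented curve fairly well, but at 8 units the line predicts a much higher yield than the curve gives, because the curve has already turned downward.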