Discussion Prompt 2: Correlation and Regression
Correlation and regression are two important terms in statistics. Select an area that interests you and use it to answer the following:
Correlation measures the strength and direction of the linear relationship between two quantitative variables. The correlation coefficient r lies between -1 and 1, and it is used to check whether the variables are linearly related to each other.
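To make the "linear" part concrete, here is a small sketch of the Pearson correlation coefficient in plain Python. The height and weight numbers are made up purely for illustration, and the function name pearson_r is my own choice. The second data set shows that two variables can be perfectly related (y = x^2) and still have a correlation of zero, because the relationship is not linear.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Made-up illustrative values, not real measurements.
heights_cm = [150, 160, 165, 172, 180, 185]
weights_kg = [52, 58, 63, 70, 77, 84]
r_linear = pearson_r(heights_cm, weights_kg)

# y depends exactly on x, but through a curve that is symmetric around 0.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [v * v for v in xs]
r_curved = pearson_r(xs, ys)

print(f"height vs weight: r = {r_linear:.3f}")
print(f"x vs x^2:         r = {r_curved:.3f}")
```

A value of r close to 1 or -1 suggests a strong linear relationship, while a value near 0 only rules out a linear one, not every kind of relationship.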
Regression is used when we want to find the relationship between a dependent variable and one or more independent variables. The relationship does not have to be linear for regression to be carried out. Here we want to find out how the variable of interest (the dependent variable) is affected by the independent (explanatory) variables.
There are many types of regression, such as:
1-Simple linear regression: This includes one dependent variable and one independent variable. As the name suggests, it assumes a linear relationship between them, so it is closely tied to correlation: the slope of the fitted line always has the same sign as the correlation coefficient.
For example, we would like to find how height and weight are related to each other. In general, they have a roughly linear relationship.
2-Multiple linear regression: This contains one dependent variable and multiple independent variables. We may come across situations where a single variable is not enough to predict the dependent variable, so we use several.
E.g., the price of a house depends on various factors such as area, locality, distance from the city, type of house, etc.
3-Polynomial regression: This covers simple and multiple polynomial regression. Simple polynomial regression is used when we have one dependent variable and one independent variable that enters the model in higher powers (X^2, X^3, and so on). It is used when X and Y are not linearly related.
Multiple polynomial regression has one dependent variable and many independent variables (in higher powers).
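The three types listed above can be sketched with the same least-squares machinery, where only the columns of the design matrix change. This is a minimal sketch using NumPy, not a full analysis. The helper names design_matrix and fit are my own, the height and weight values are made up, and the house-price and polynomial data are synthetic and noise-free (generated from coefficients I picked) so that the fitted values are easy to check.

```python
import numpy as np

def design_matrix(*columns):
    """Put an intercept column of ones in front of the predictor columns."""
    n = len(columns[0])
    return np.column_stack([np.ones(n), *columns])

def fit(X, y):
    """Ordinary least squares; returns [intercept, b1, b2, ...]."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# 1- Simple linear regression: weight on height (made-up values).
height_cm = np.array([150, 160, 165, 172, 180, 185], dtype=float)
weight_kg = np.array([52, 58, 63, 70, 77, 84], dtype=float)
simple_coef = fit(design_matrix(height_cm), weight_kg)
# Link to correlation: slope = r * (sd of Y / sd of X).
r = np.corrcoef(height_cm, weight_kg)[0, 1]
slope_from_r = r * weight_kg.std(ddof=1) / height_cm.std(ddof=1)

# 2- Multiple linear regression: synthetic house prices built from
#    assumed coefficients (50, 0.8, -2.0), with no noise added.
area_m2 = np.array([60, 80, 100, 120, 150, 200], dtype=float)
distance_km = np.array([5, 12, 3, 20, 8, 15], dtype=float)
price = 50 + 0.8 * area_m2 - 2.0 * distance_km
multi_coef = fit(design_matrix(area_m2, distance_km), price)

# 3- Polynomial regression: X enters as X and X^2, but the model is
#    still linear in its coefficients, so the same solver applies.
x = np.linspace(-3, 3, 13)
y = 1.0 + 2.0 * x + 0.5 * x**2
X_line = design_matrix(x)
X_quad = design_matrix(x, x**2)
line_coef = fit(X_line, y)
quad_coef = fit(X_quad, y)
# Sum of squared errors: the straight line cannot follow the curve.
sse_line = float(np.sum((X_line @ line_coef - y) ** 2))
sse_quad = float(np.sum((X_quad @ quad_coef - y) ** 2))

print("simple:    ", simple_coef, "slope from r:", slope_from_r)
print("multiple:  ", multi_coef)
print("polynomial:", quad_coef, "SSE line:", sse_line, "SSE quad:", sse_quad)
```

With real data there would be noise, so the fitted coefficients would only approximate the underlying ones, and the choice of polynomial degree would need checking against held-out data to avoid overfitting.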
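There are also other kinds of regression, such as ridge regression (used when the number of samples is less than the number of parameters, or when the independent variables are highly correlated with each other) and spline regression (used when the behaviour of Y changes at a certain point).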
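As a rough illustration of why ridge regression helps when there are fewer samples than parameters, here is a sketch of its closed-form solution in NumPy. The data are random numbers from a fixed seed, the function name ridge_fit and the penalty values are my own assumptions, and the intercept is left out to keep the example short. It covers only the ridge case, not spline regression.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge solution: solve (X^T X + lam * I) beta = X^T y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

# Synthetic data with fewer samples (5) than parameters (8).
rng = np.random.default_rng(0)
n_samples, n_features = 5, 8
X = rng.normal(size=(n_samples, n_features))
y = rng.normal(size=n_samples)

# rank(X^T X) equals rank(X), which cannot exceed n_samples, so X^T X is
# singular here and ordinary least squares has no unique solution.
rank_X = np.linalg.matrix_rank(X)

# Adding lam * I makes the system solvable; a larger lam shrinks the
# coefficients more strongly towards zero.
beta_mild = ridge_fit(X, y, lam=1.0)
beta_strong = ridge_fit(X, y, lam=10.0)

print("rank of X:", rank_X, "out of", n_features, "parameters")
print("coefficient norm, lam=1: ", np.linalg.norm(beta_mild))
print("coefficient norm, lam=10:", np.linalg.norm(beta_strong))
```

In practice the penalty lam is usually chosen by cross-validation rather than fixed by hand.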