a. Describe what the Pearson correlation coefficient is and what the coefficient of determination is.
Compare them using a real-world example or an example problem.
Pearson correlation coefficient: It is a statistic that measures the strength and direction of the linear relationship (association) between two continuous variables. When the relationship is used for prediction, one variable is treated as the dependent variable and the other as the independent variable.
We can consider
Y = dependent variable
X = independent variable
We predict the value of variable Y using the corresponding value of variable X.
The Pearson correlation coefficient is denoted by r and always lies between -1 and +1. Three types of correlation can occur between the two variables (Y and X):
1) If r = 0, there is no linear correlation between Y and X.
2) If r < 0, there is a negative correlation between Y and X (Y tends to decrease as X increases).
3) If r > 0, there is a positive correlation between Y and X (Y tends to increase as X increases).
The coefficient of determination is found by squaring the Pearson correlation coefficient (r):
coefficient of determination = $r^2$
For example: if the Pearson correlation coefficient is r = 0.8423,
then the coefficient of determination is $r^2 = (0.8423)^2 = 0.7095$ (rounded to four decimal places).
Here, r = 0.8423 and $r^2$ = 0.7095 = 70.95%.
Using the coefficient of determination we can interpret: 70.95% of the variation in Y is explained by its linear relationship with X.
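The squaring step above can be checked with a few lines of Python. This is only a minimal sketch: the function name `coefficient_of_determination` is a made-up helper for illustration, and the result is rounded to four decimal places for display.

```python
def coefficient_of_determination(r: float) -> float:
    """Return r squared for a Pearson correlation coefficient r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    return r * r


r = 0.8423  # value used in the example above
r_squared = coefficient_of_determination(r)

# Show r^2 both as a proportion and as a percentage.
print(f"r = {r}, r^2 = {r_squared:.4f} ({r_squared:.2%})")
# r = 0.8423, r^2 = 0.7095 (70.95%)
```

The unrounded value is 0.70946929, so it rounds to 0.7095 at four decimal places.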
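The formula for finding the Pearson correlation coefficient (r) from column sums is:
$$r = \frac{n\sum XY - \sum X \sum Y}{\sqrt{\left[n\sum X^2 - \left(\sum X\right)^2\right]\left[n\sum Y^2 - \left(\sum Y\right)^2\right]}}$$
where
n = number of (X, Y) pairs
$\sum X$, $\sum Y$ = sums of the X values and of the Y values
$\sum X^2$, $\sum Y^2$ = sums of the squared X values and of the squared Y values
$\sum XY$ = sum of the products of paired X and Y values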
For example:
Find the Pearson correlation coefficient (r) and the coefficient of determination ($r^2$) for the following data.
X | 1004 | 975 | 992 | 935 | 985 | 932 |
Y | 40 | 100 | 65 | 145 | 80 | 150 |
Solution:
We construct the table below, with the column totals in the last row.
X | Y | $X^2$ | $Y^2$ | XY |
1004 | 40 | 1008016 | 1600 | 40160 |
975 | 100 | 950625 | 10000 | 97500 |
992 | 65 | 984064 | 4225 | 64480 |
935 | 145 | 874225 | 21025 | 135575 |
985 | 80 | 970225 | 6400 | 78800 |
932 | 150 | 868624 | 22500 | 139800 |
$\sum X$ = 5823 | $\sum Y$ = 580 | $\sum X^2$ = 5655779 | $\sum Y^2$ = 65750 | $\sum XY$ = 556315 |
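The column totals in the table can be reproduced with a short Python sketch. The variable names (`sum_x`, `sum_x2`, and so on) are chosen here for illustration only; the data are the six (X, Y) pairs given in the problem.

```python
# Data from the problem statement.
x = [1004, 975, 992, 935, 985, 932]
y = [40, 100, 65, 145, 80, 150]

n = len(x)                                       # number of (X, Y) pairs
sum_x = sum(x)                                   # total of the X column
sum_y = sum(y)                                   # total of the Y column
sum_x2 = sum(xi * xi for xi in x)                # total of the X^2 column
sum_y2 = sum(yi * yi for yi in y)                # total of the Y^2 column
sum_xy = sum(xi * yi for xi, yi in zip(x, y))    # total of the XY column

print(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy)
# 6 5823 580 5655779 65750 556315
```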
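We find the Pearson correlation coefficient (r) by substituting n = 6 and the column totals from the table:
$$r = \frac{6(556315) - (5823)(580)}{\sqrt{\left[6(5655779) - (5823)^2\right]\left[6(65750) - (580)^2\right]}} = \frac{3337890 - 3377340}{\sqrt{(27345)(58100)}} = \frac{-39450}{39859.06} \approx -0.9897$$
1) Pearson correlation coefficient: r = -0.9897, so there is a strong negative correlation between variable Y and variable X.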
2) Coefficient of determination:
coefficient of determination = $r^2 = (-0.9897)^2 = 0.9795$
Here, r = -0.9897 and $r^2$ = 0.9795 = 97.95%.
Using the coefficient of determination we can interpret: 97.95% of the variation in Y is explained by its linear relationship with X.
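As a check on the worked solution, the whole calculation can be sketched in Python using only the standard library. The two function names are hypothetical helpers, not part of any library: one uses the sum-based formula, and the other uses deviations from the means as an independent cross-check.

```python
import math


def pearson_r_from_sums(x, y):
    """Pearson's r via the computational (sum-based) formula."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    sum_x, sum_y = sum(x), sum(y)
    sum_x2 = sum(xi * xi for xi in x)
    sum_y2 = sum(yi * yi for yi in y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    if denominator == 0:
        raise ValueError("r is undefined when X or Y is constant")
    return numerator / denominator


def pearson_r_from_deviations(x, y):
    """Pearson's r via deviations from the means, used as a cross-check."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


x = [1004, 975, 992, 935, 985, 932]
y = [40, 100, 65, 145, 80, 150]

r = pearson_r_from_sums(x, y)
r_check = pearson_r_from_deviations(x, y)
r_squared = r * r  # squared from the unrounded r

print(f"r = {r:.4f}")            # r = -0.9897
print(f"r^2 = {r_squared:.4f}")  # r^2 = 0.9796
```

Squaring the unrounded r gives 0.9796, while squaring the rounded value -0.9897 gives 0.9795 as in the worked solution. The difference comes only from rounding.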