The following bivariate data set contains an outlier.
x | y |
---|---|
62.8 | 256.5 |
61.9 | -39.3 |
47.2 | 765.4 |
54.3 | 1350.5 |
72.5 | 479.2 |
84.7 | 2508.9 |
43.9 | -2120.9 |
54 | -1687.9 |
45.6 | 1021.6 |
68.5 | 786.7 |
34.7 | -4360.1 |
51.3 | -1825.9 |
50.7 | -1365.3 |
77 | 488.1 |
234.7 | -236 |
What is the correlation coefficient with the outlier?
rw =
What is the correlation coefficient without the outlier?
rwo =
Would inclusion of the outlier change the evidence for or against a significant linear correlation?
1)
S.No | X | Y | (x-x̅)² | (y-y̅)² | (x-x̅)(y-y̅) |
---|---|---|---|---|---|
1 | 62.8 | 256.5 | 46.0588 | 272205.67 | -3540.8302 |
2 | 61.9 | -39.3 | 59.0848 | 51045.87 | -1736.6742 |
3 | 47.2 | 765.4 | 501.1628 | 1062205.07 | -23072.4449 |
4 | 54.3 | 1350.5 | 233.6822 | 2610594.20 | -24699.1769 |
5 | 72.5 | 479.2 | 8.4875 | 554180.99 | 2168.7824 |
6 | 84.7 | 2508.9 | 228.4128 | 7695815.75 | 41926.4018 |
7 | 43.9 | -2120.9 | 659.8048 | 3443498.78 | 47665.8911 |
8 | 54 | -1687.9 | 242.9442 | 2023980.44 | 22174.6311 |
9 | 45.6 | 1021.6 | 575.3602 | 1655940.03 | -30866.8422 |
10 | 68.5 | 786.7 | 1.1808 | 1106563.74 | -1143.1009 |
11 | 34.7 | -4360.1 | 1217.0795 | 16767933.02 | 142856.2484 |
12 | 51.3 | -1825.9 | 334.4022 | 2435680.44 | 28539.3911 |
13 | 50.7 | -1365.3 | 356.7062 | 1210146.67 | 20776.5924 |
14 | 77 | 488.1 | 54.9575 | 567511.11 | 5584.7111 |
15 | 234.7 | -236 | 27262.4128 | 854.59 | 4826.8131 |
Total | 1043.8 | -3978.5 | 31781.7373 | 41458156.37 | 231460.3933 |
Mean | 69.587 | -265.23 | Sxx | Syy | Sxy |
Correlation coefficient with the outlier: rw = Sxy/√(Sxx·Syy) = 231460.3933/√(31781.7373 × 41458156.37) = 0.2016
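Removing the outlier (234.7, -236) leaves n = 14 points, for which the sums recomputed about the new means are Sxx ≈ 2572.01, Syy ≈ 41457240.74 and Sxy ≈ 226288.81.
Correlation coefficient without the outlier: rwo = Sxy/√(Sxx·Syy) = 0.6930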
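Both coefficients can be reproduced with a short script. The following is a minimal sketch in plain Python that applies the same Sxy/√(Sxx·Syy) formula to the data from the question. The function name `pearson_r` and the variable names are illustrative choices and are not part of the original solution.

```python
import math

# Data copied from the table in the question; the last pair is the outlier.
x = [62.8, 61.9, 47.2, 54.3, 72.5, 84.7, 43.9, 54, 45.6, 68.5,
     34.7, 51.3, 50.7, 77, 234.7]
y = [256.5, -39.3, 765.4, 1350.5, 479.2, 2508.9, -2120.9, -1687.9,
     1021.6, 786.7, -4360.1, -1825.9, -1365.3, 488.1, -236]


def pearson_r(xs, ys):
    """Return Sxy / sqrt(Sxx * Syy) for paired samples xs, ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and of cross products about the means.
    sxx = sum((a - mean_x) ** 2 for a in xs)
    syy = sum((b - mean_y) ** 2 for b in ys)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


r_with = pearson_r(x, y)                # all 15 points
r_without = pearson_r(x[:-1], y[:-1])   # drop the outlier (234.7, -236)

print(f"rw  = {r_with:.4f}")
print(f"rwo = {r_without:.4f}")
```

Note that the means are recomputed after the outlier is dropped, so the deviations in the table above cannot simply be reused for the 14-point case.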
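Yes. Including the outlier changes the evidence regarding a linear correlation. Assuming the usual 0.05 significance level (the question does not state one), the critical value of r is about 0.514 for n = 15 and about 0.532 for n = 14. With the outlier, rw = 0.2016 falls below 0.514, so there is not sufficient evidence of a linear correlation. Without it, rwo = 0.6930 exceeds 0.532, so there is sufficient evidence of a linear correlation.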
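The conclusion can also be checked with the equivalent t-test for a correlation coefficient, t = r·√(n-2)/√(1-r²), with n-2 degrees of freedom. The sketch below assumes a two-tailed test at the 0.05 level, which the question does not state, and hard-codes the corresponding t-table critical values. The helper name `t_statistic` is illustrative.

```python
import math


def t_statistic(r, n):
    """Test statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


# Two-tailed critical t values at alpha = 0.05 (assumed level),
# keyed by degrees of freedom, taken from a standard t table.
T_CRIT = {13: 2.160, 12: 2.179}

t_with = t_statistic(0.2016, 15)      # rw from the solution, n = 15
t_without = t_statistic(0.6930, 14)   # rwo from the solution, n = 14

significant_with = abs(t_with) > T_CRIT[15 - 2]
significant_without = abs(t_without) > T_CRIT[14 - 2]

print(f"with outlier:    t = {t_with:.3f}, significant: {significant_with}")
print(f"without outlier: t = {t_without:.3f}, significant: {significant_without}")
```

Under these assumptions the test statistic is well below the critical value with the outlier and well above it without, which matches the answer given above.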