The following bivariate data set contains an outlier.
x | y |
---|---|
47.6 | 53.7 |
40.6 | 112.6 |
28.9 | 72.2 |
36.1 | 101.1 |
32.4 | 112.4 |
26.8 | 67.1 |
36.8 | 38.9 |
33.1 | 70.1 |
67.5 | 8.6 |
43.3 | -6.4 |
50.1 | 41.3 |
30.1 | 3.2 |
29.8 | 41.7 |
55 | -32.8 |
189.6 | 548.8 |
What is the correlation coefficient with the outlier?
rw =
What is the correlation coefficient without the outlier?
rwo =
Would inclusion of the outlier change the evidence for or against a significant linear correlation at 5% significance?
A) No. Including the outlier does not change the evidence regarding a linear correlation.
B) Yes. Including the outlier changes the evidence regarding a linear correlation.
Would you always draw the same conclusion with the addition of an outlier?
A) Yes, any outlier would result in the same conclusion.
B) No, a different outlier in a different problem could lead to a different conclusion.
Explain your answer.
a)
Using Excel's Data Analysis tool for regression on all 15 points, the following output is obtained:
Regression Statistics
Statistic | Value |
---|---|
Multiple R | 0.8680 |
R Square | 0.7534 |
Adjusted R Square | 0.7344 |
Standard Error | 70.1143 |
Observations | 15 |

ANOVA
Source | df | SS | MS | F | Significance F |
---|---|---|---|---|---|
Regression | 1 | 195208.35 | 195208.35 | 39.7087 | 0.0000 |
Residual | 13 | 63908.14 | 4916.01 | | |
Total | 14 | 259116.49 | | | |

Term | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% |
---|---|---|---|---|---|---|
Intercept | -63.9811 | 29.4216 | -2.1746 | 0.0487 | -127.543 | -0.420 |
X | 2.9319 | 0.4653 | 6.3015 | 0.0000 | 1.927 | 3.937 |
So the correlation coefficient with the outlier is rw = 0.8680 (the "Multiple R" value above; it is positive because the slope is positive).
------------------------------------------------------
Correlation coefficient without the outlier: rwo = -0.4853
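As a cross-check on the spreadsheet output, both coefficients can be recomputed directly from the data table. The following is a minimal Python sketch using only the standard library; the helper name `pearson_r` is an assumption made for illustration and is not part of the original answer.

```python
import math

# (x, y) pairs from the table; the last pair is the outlier.
data = [
    (47.6, 53.7), (40.6, 112.6), (28.9, 72.2), (36.1, 101.1), (32.4, 112.4),
    (26.8, 67.1), (36.8, 38.9), (33.1, 70.1), (67.5, 8.6), (43.3, -6.4),
    (50.1, 41.3), (30.1, 3.2), (29.8, 41.7), (55.0, -32.8), (189.6, 548.8),
]

def pearson_r(pairs):
    """Pearson correlation coefficient of a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy)

r_with = pearson_r(data)          # all 15 points
r_without = pearson_r(data[:-1])  # outlier (189.6, 548.8) dropped

print(f"rw  = {r_with:.4f}")
print(f"rwo = {r_without:.4f}")
```

Rounded to four decimal places, the two printed values should agree with the 0.8680 and -0.4853 reported in this answer.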
----------------------------------
With the outlier,
correlation hypothesis test
H0: ρ = 0
Ha: ρ ≠ 0
n = 15
alpha, α = 0.05
correlation, r = 0.8680
t-test statistic: t = r*√(n-2)/√(1-r²) = 6.3015
critical t-value (two-tailed, df = 13) = ±2.1604
p-value = 0.0000 (to four decimal places) < α = 0.05, so reject H0: a linear correlation exists at α = 0.05
-----------
Now without the outlier,
correlation hypothesis test
H0: ρ = 0
Ha: ρ ≠ 0
n = 14
alpha, α = 0.05
correlation, r = -0.4853
t-test statistic: t = r*√(n-2)/√(1-r²) = -1.9227
critical t-value (two-tailed, df = 12) = ±2.1788
p-value = 0.0786 > α = 0.05, so fail to reject H0: there is not sufficient evidence of a linear correlation at α = 0.05
Hence, the answer is B) Yes. Including the outlier changes the evidence regarding a linear correlation.
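The two tests can be reproduced with a few lines of Python. This is a sketch under stated assumptions: the r values are the rounded ones quoted in this answer, and the two-tailed critical values for the 5% level named in the question (2.1604 for df = 13, 2.1788 for df = 12) are copied from a standard t table rather than computed. The names `t_statistic`, `T_CRIT` and `results` are illustrative only.

```python
import math

def t_statistic(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Two-tailed critical values at alpha = 0.05, keyed by degrees of freedom.
# Assumption: looked up in a standard t table, not computed here.
T_CRIT = {13: 2.1604, 12: 2.1788}

# (r, n) for each case, using the rounded coefficients from the answer.
cases = {
    "with outlier": (0.8680, 15),
    "without outlier": (-0.4853, 14),
}

results = {}
for label, (r, n) in cases.items():
    t = t_statistic(r, n)
    reject = abs(t) > T_CRIT[n - 2]  # two-tailed decision rule
    results[label] = (t, reject)
    print(f"{label}: t = {t:.4f}, reject H0: {reject}")
```

Because the inputs are rounded to four decimals, the computed t values can differ from the spreadsheet figures in the third decimal place; the reject / fail-to-reject decisions are unaffected.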
-------------------------------------
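B) No, a different outlier in a different problem could lead to a different conclusion. The effect of an outlier depends on where it falls relative to the other points: here it lies far from the rest in both x and y and turns a non-significant negative correlation into a significant positive one, but an outlier placed elsewhere might leave the significance conclusion unchanged.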
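To illustrate this point, the sketch below compares the actual outlier with a hypothetical one. The point (39.9, 548.8) is invented for this illustration and does not appear in the problem: it keeps the extreme y value but places x near the mean of the other points. The critical values are again assumed from a standard t table (two-tailed, α = 0.05), and all function names are illustrative.

```python
import math

# The 14 non-outlier (x, y) pairs from the table.
base = [
    (47.6, 53.7), (40.6, 112.6), (28.9, 72.2), (36.1, 101.1), (32.4, 112.4),
    (26.8, 67.1), (36.8, 38.9), (33.1, 70.1), (67.5, 8.6), (43.3, -6.4),
    (50.1, 41.3), (30.1, 3.2), (29.8, 41.7), (55.0, -32.8),
]

def pearson_r(pairs):
    """Pearson correlation coefficient of a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy)

def is_significant(pairs, t_crit):
    """Two-tailed t test of H0: rho = 0 against a supplied critical value."""
    r = pearson_r(pairs)
    n = len(pairs)
    t = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)
    return abs(t) > t_crit

# Assumed table values: two-tailed critical t at alpha = 0.05.
T_CRIT_DF12 = 2.1788  # n = 14
T_CRIT_DF13 = 2.1604  # n = 15

actual_outlier = (189.6, 548.8)       # from the problem
hypothetical_outlier = (39.9, 548.8)  # hypothetical: x near the mean of the rest

baseline = is_significant(base, T_CRIT_DF12)
changed_by_actual = is_significant(base + [actual_outlier], T_CRIT_DF13) != baseline
changed_by_hypothetical = (
    is_significant(base + [hypothetical_outlier], T_CRIT_DF13) != baseline
)

print("actual outlier changes the conclusion:", changed_by_actual)
print("hypothetical outlier changes the conclusion:", changed_by_hypothetical)
```

The hypothetical point adds almost nothing to the covariance while inflating the spread in y, so it pulls r toward zero instead of reversing its sign.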