Which of the following factors can make the linear correlation observed in a sample differ in direction or pattern from the true correlation in the population? More than one answer may apply.
1. Presence of a non-linear relationship
2. Outliers that deviate from the main cluster of data
3. Different correlations for different subgroups within the data
4. Range restriction on the predictor and/or criterion variables
5. Unreliable measures of the predictor and/or the criterion variables
6. Standardizing the scores on the predictor and/or the criterion
The factors that can make the observed linear correlation in a sample differ in direction or pattern from the true correlation in the population are:
1) Presence of a non-linear relationship : The two variables may be related non-linearly in the population. The correlation coefficient measures only the linear component of a relationship, so the sample correlation can misrepresent the true association.
2) Different correlations for different subgroups within the data : If the subgroups have different correlations, the pooled sample correlation may match none of them and can even point in the opposite direction. The true pattern shows up only when the subgroups are analyzed separately, for example after clustering.
3) Unreliable measures of the predictor and/or the criterion : If the measurements or observations are unreliable, measurement error distorts the sample correlation, typically pulling it toward zero, so it is a poor estimate of the true correlation.
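The three factors above can be illustrated with a small numerical sketch. All data here are made up for illustration, the helper name pearson_r is hypothetical, and the seed and noise levels are arbitrary choices, so this is a demonstration under stated assumptions rather than a general proof.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient of two 1-D arrays."""
    return float(np.corrcoef(x, y)[0, 1])

rng = np.random.default_rng(0)  # fixed seed, arbitrary choice

# 1) Non-linear relationship: y is fully determined by x, but the
#    relationship is symmetric, so the linear correlation is about zero.
x = np.linspace(-3, 3, 61)
r_nonlinear = pearson_r(x, x ** 2)

# 2) Subgroups: within each group the trend is perfectly negative,
#    but group B sits higher on both variables, so pooling the groups
#    produces a strongly positive correlation.
xa = np.array([1.0, 2.0, 3.0])
ya = np.array([3.0, 2.0, 1.0])
xb, yb = xa + 10, ya + 10
r_group_a = pearson_r(xa, ya)  # -1 up to rounding
r_group_b = pearson_r(xb, yb)  # -1 up to rounding
r_pooled = pearson_r(np.concatenate([xa, xb]), np.concatenate([ya, yb]))

# 3) Unreliable measures: independent measurement error added to both
#    variables weakens the observed correlation.
n = 5000
true_x = rng.normal(size=n)
true_y = 0.8 * true_x + 0.6 * rng.normal(size=n)  # built to correlate near 0.8
noisy_x = true_x + rng.normal(size=n)  # error variance equal to true variance
noisy_y = true_y + rng.normal(size=n)
r_true = pearson_r(true_x, true_y)
r_noisy = pearson_r(noisy_x, noisy_y)

print(f"non-linear: r = {r_nonlinear:.3f}")
print(f"subgroups:  r_a = {r_group_a:.3f}, r_b = {r_group_b:.3f}, pooled = {r_pooled:.3f}")
print(f"unreliable: error-free r = {r_true:.3f}, with error r = {r_noisy:.3f}")
```

In the subgroup case the pooled correlation has the opposite sign from the correlation inside each group, which is the "different direction" the question asks about.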
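4) Outliers that deviate from the main cluster of data : The correlation coefficient is not robust to extreme values, so even a single outlier can inflate, deflate, or reverse the sample correlation.
5) Range restriction on the predictor and/or the criterion : If the sample covers only part of the range found in the population, the sample correlation is usually weaker than the population correlation.
Standardizing the scores does not affect the correlation. Converting to z-scores is a linear transformation with a positive scale factor, and that factor cancels between the covariance in the numerator and the standard deviations in the denominator, so the coefficient is unchanged.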
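The claim that standardizing leaves the correlation unchanged can be checked directly. The sketch below uses a small made-up data set and a hypothetical helper named zscore; it compares the coefficient on the raw scores, on the standardized scores, and on an arbitrarily rescaled predictor.

```python
import numpy as np

# Hypothetical scores, chosen only for illustration.
x = np.array([2.0, 4.0, 5.0, 7.0, 9.0])
y = np.array([1.0, 3.0, 2.0, 6.0, 8.0])

def zscore(v):
    """Standardize to mean 0 and sample standard deviation 1."""
    return (v - v.mean()) / v.std(ddof=1)

r_raw = float(np.corrcoef(x, y)[0, 1])
r_standardized = float(np.corrcoef(zscore(x), zscore(y))[0, 1])
# Any linear rescaling with a positive slope behaves the same way.
r_rescaled = float(np.corrcoef(100 * x + 5, y)[0, 1])

print(r_raw, r_standardized, r_rescaled)  # all three agree up to rounding
```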