
A researcher wants to determine the association between two continuous variables, X and Y. A sample of 100 individuals was taken from a population, and the values of X and Y were measured for each individual.

a) The correlation coefficient of X and Y was calculated from this sample, and its value is 0.2. What does this value imply?

b) What may go wrong if the researcher draws a conclusion about the association between X and Y based only on the correlation coefficient?

c) What should the researcher do to avoid the mistake that might occur in part (b)?

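a) The sample correlation between X and Y is 0.2, which indicates a weak positive linear relationship: larger values of X tend to go with slightly larger values of Y. Because 0.2 squared is 0.04, a straight-line fit would account for only about 4% of the variation in Y.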
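
To put the value 0.2 in context, the short sketch below works out two quantities from the numbers given in the question: the squared correlation, and the usual t statistic for testing whether the population correlation is zero. The t test is an addition for illustration, not part of the original answer, and it assumes a random sample and roughly bivariate normal data.

```python
import math

r = 0.2   # sample correlation from the question
n = 100   # sample size from the question

# Share of the variation in Y accounted for by a straight-line fit.
r_squared = r ** 2

# t statistic for H0: population correlation = 0, with n - 2 degrees of freedom.
t_stat = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

print(f"r^2 = {r_squared:.2f}")
print(f"t = {t_stat:.2f} on {n - 2} degrees of freedom")
```

The t statistic comes out near 2.02, while the two-sided 5% critical value for 98 degrees of freedom is about 1.98. Under the stated assumptions, the correlation is only just distinguishable from zero, and it remains weak in practical terms.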

b) The correlation coefficient measures only the strength of a linear relationship. If the relationship is non-linear, the correlation coefficient is not a good summary of it. It is therefore possible that the small correlation between X and Y reflects a strong non-linear relationship rather than a weak association, and a researcher who reports and interprets only the correlation coefficient may reach the wrong conclusion.
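
A small made-up example shows how this can happen. In the sketch below, Y is determined exactly by X through Y = X squared, yet the Pearson correlation comes out as zero because the X values are symmetric about zero. The data and the helper name pearson_r are illustrative assumptions, not taken from the question.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data: X symmetric about zero, Y a perfect quadratic function of X.
xs = list(range(-5, 6))
ys = [x ** 2 for x in xs]

r = pearson_r(xs, ys)
print(f"Pearson r = {r:.3f}")  # essentially zero despite the exact relationship
```

A correlation near zero here says nothing about how tightly Y depends on X, only that a straight line does not describe the dependence.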

c) The researcher should draw a scatter plot of Y against X to check for a non-linear relationship. If one exists, it should be analysed with regression, including quadratic, exponential, or other suitable terms in the model.
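
As a rough illustration of the regression step, the sketch below fits a straight line and a quadratic to synthetic data and compares how much variation each explains. Everything here is an assumption made for the example: the data are simulated, and fit_polynomial and explained_variation are hypothetical helpers written with the standard library only. The scatter plot itself is omitted to keep the code dependency-free; with matplotlib installed, plt.scatter(xs, ys) would draw it.

```python
import random

def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit; returns [c0, c1, ...] for c0 + c1*x + ..."""
    m = degree + 1
    # Normal equations A c = b, with A[i][j] = sum(x^(i+j)) and b[i] = sum(y * x^i).
    a = [[sum(x ** (i + j) for x in xs) for j in range(m)] for i in range(m)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(m)]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda row: abs(a[row][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, m):
            factor = a[row][col] / a[col][col]
            for k in range(col, m):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * m
    for i in reversed(range(m)):
        tail = sum(a[i][j] * coeffs[j] for j in range(i + 1, m))
        coeffs[i] = (b[i] - tail) / a[i][i]
    return coeffs

def explained_variation(xs, ys, coeffs):
    """R squared: 1 - (residual sum of squares / total sum of squares)."""
    mean_y = sum(ys) / len(ys)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    ss_res = sum(
        (y - sum(c * x ** k for k, c in enumerate(coeffs))) ** 2
        for x, y in zip(xs, ys)
    )
    return 1 - ss_res / ss_tot

# Synthetic data (assumed for illustration): a curved trend plus random noise.
rng = random.Random(42)
xs = [i / 10 for i in range(-20, 21)]
ys = [1 + 0.5 * x + 2 * x * x + rng.gauss(0, 0.3) for x in xs]

linear = fit_polynomial(xs, ys, 1)
quadratic = fit_polynomial(xs, ys, 2)
r2_linear = explained_variation(xs, ys, linear)
r2_quadratic = explained_variation(xs, ys, quadratic)

print(f"R^2, straight line: {r2_linear:.3f}")
print(f"R^2, quadratic:     {r2_quadratic:.3f}")
```

On data like these, the straight line should explain little of the variation while the quadratic model explains most of it, which is the pattern a scatter plot would already have suggested.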

