CHAPTER 6: CORRELATION
Key Terms
----------------------------------------------------------------------------------------------------------------------------
Positive relationship --- Occurs insofar as pairs of observations tend to occupy similar relative positions in their respective distributions.
Negative relationship --- Occurs insofar as pairs of observations tend to occupy dissimilar relative positions in their respective distributions.
Scatterplot --- A graph containing a cluster of dots that represents all pairs of observations.
Pearson correlation coefficient --- A number between –1 and +1 that describes the linear relationship between pairs of quantitative variables.
Linear relationship --- A relationship that can be described with a straight line.
Curvilinear relationship --- A relationship that can be described with a curved line.
Correlation coefficient (r) --- A number between –1 and +1 that describes the relationship between pairs of variables.
Correlation matrix --- Table showing correlations for all possible pairs of variables.
Text Review
In previous chapters, we examined individual data sets representing collections of records or observations of some characteristic that varied among individuals (e.g., height, weight, and IQ scores, or popping time for kernels of corn and average burning time for light bulbs). In statistics, since the values of these characteristics vary among individuals, they are commonly referred to as variables. In this chapter, we will examine relationships between two (dependent) variables. Therefore, we will see pairs of observations. Please refer to the chart of “Guidelines for Selecting the Appropriate Hypothesis Test,” which is located inside the book cover. To better understand the type(s) of data, please read Levels of Measurement in Appendix B (pp. 483 to 489).
When relatively high values of one variable are paired with relatively high values of the other variable, and low values are paired with low values, the relationship is (1)_____________________.
Another way to think of this is that values of one variable increase as values of the other increase, while values of one variable decrease as values of the other decrease. An example of a positive relationship between two variables would be study time and performance in statistics class. As study time increases, performance in class tends to increase as well. Thus, relatively high values of each variable are paired and relatively low values of each are paired.
When pairs of observations occupy dissimilar and opposite relative positions in their respective distributions, the relationship is (2) _______________________. An example of a negative relationship would be auto gas mileage and horsepower. As horsepower is increased, gas mileage should decrease. Conversely, when horsepower is decreased, gas mileage should increase.
It may occur to you that certain variables may exist which would not be related either positively or negatively. This happens to be true. Consider, for example, hat size and IQ. Pairs of these variables would not occupy either similar or dissimilar positions in their respective distributions. If the pairs were graphed on a scatterplot, no pattern would appear. These variables would be said to have no relationship. A calculated correlation coefficient for the two variables would be near zero.
The graph that shows the relationship between variables as a cluster of dots is called a (3)___________________. The pattern formed by the dot cluster is significant. If the cluster slopes from lower left to upper right, it depicts a (4)______________________ relationship. If the slope is from upper left to lower right, the relationship is (5)___________________________. A dot cluster that lacks any apparent slope reflects (6)__________________________. The more closely a dot cluster approximates a straight line, the (7)________________________ the relationship. When a relationship can be described with a straight line, it is described as (8)_________________________. When the dot cluster forms a curved line, the relationship is said to be (9)___________________________.
The relationship between two variables that represent quantitative data is described by a correlation coefficient and designated by the symbol (10)__________. The correlation coefficient ranges in value from (11)_________________ to (12)________________. The sign of r indicates whether the relationship is (13)_______________ or (14)___________________. The value of r indicates the (15)__________________________ of the relationship. The correlation coefficient is referred to as the (16)___________________ and was named after the British scientist Karl Pearson.
Interpretation of r is related to the direction and strength of the correlation. The direction, either (17)___________________ or (18)___________________, is indicated by the sign of the correlation coefficient. The strength is reflected by the (19)_____________________ of r. An r value of .50 or more in either direction is typical of important relationships in most areas of behavioral and educational research. The value of r cannot be interpreted as a proportion or percent of some perfect relationship.
The Pearson r can be calculated using the z score formula, but this is rarely done in practice, partly because of the extra effort required to convert the original data into z scores. The value of the z score formula lies more in aiding the understanding of correlation. The correlation coefficient is actually calculated using the computation formula.
One important concept to keep in mind is that a correlation coefficient never provides information about cause and effect. Cause and effect can only be proved by (20)_________________.
There are other types of correlation coefficients designed for use in various situations. For example, when the data consists of ranks, a (21)________________________ correlation is used. When one variable is quantitative and the other is qualitative, the result is a (22)__________________ correlation coefficient. If both variables represent ordered qualitative data, the resulting correlation coefficient is called (23)__________________________.
When every possible pairing of variables is reported, a correlation (24)___________________ is produced. A correlation matrix is particularly useful when many variables are being studied.
Answer:
When relatively high values of one variable are paired with relatively high values of the other variable, and low values are paired with low values, the relationship is (1) positive.
Another way to think of this is that values of one variable increase as values of the other increase, while values of one variable decrease as values of the other decrease. An example of a positive relationship between two variables would be study time and performance in statistics class. As study time increases, performance in class tends to increase as well. Thus, relatively high values of each variable are paired and relatively low values of each are paired.
When pairs of observations occupy dissimilar and opposite relative positions in their respective distributions, the relationship is (2) negative. An example of a negative relationship would be auto gas mileage and horsepower. As horsepower is increased, gas mileage should decrease. Conversely, when horsepower is decreased, gas mileage should increase.
It may occur to you that certain variables may exist which would not be related either positively or negatively. This happens to be true. Consider, for example, hat size and IQ. Pairs of these variables would not occupy either similar or dissimilar positions in their respective distributions. If the pairs were graphed on a scatterplot, no pattern would appear. These variables would be said to have no relationship. A calculated correlation coefficient for the two variables would be near zero.
The graph that shows the relationship between variables as a cluster of dots is called a (3) scatterplot. The pattern formed by the dot cluster is significant. If the cluster slopes from lower left to upper right, it depicts a (4) positive relationship. If the slope is from upper left to lower right, the relationship is (5) negative. A dot cluster that lacks any apparent slope reflects (6) no relationship. The more closely a dot cluster approximates a straight line, the (7) stronger the relationship. When a relationship can be described with a straight line, it is described as (8) linear. When the dot cluster forms a curved line, the relationship is said to be (9) curvilinear.
The relationship between two variables that represent quantitative data is described by a correlation coefficient and designated by the symbol (10) r. The correlation coefficient ranges in value from (11) –1 to (12) +1. The sign of r indicates whether the relationship is (13) positive or (14) negative. The value of r indicates the (15) strength of the relationship. The correlation coefficient is referred to as the (16) Pearson correlation coefficient and was named after the British scientist Karl Pearson.
Interpretation of r is related to the direction and strength of the correlation. The direction, either (17) positive or (18) negative, is indicated by the sign of the correlation coefficient. The strength is reflected by the (19) magnitude (absolute value) of r. An r value of .50 or more in either direction is typical of important relationships in most areas of behavioral and educational research. The value of r cannot be interpreted as a proportion or percent of some perfect relationship.
The Pearson r can be calculated using the z score formula, but this is rarely done in practice, partly because of the extra effort required to convert the original data into z scores. The value of the z score formula lies more in aiding the understanding of correlation. The correlation coefficient is actually calculated using the computation formula.
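To make the comparison between the two formulas concrete, here is a minimal Python sketch. It assumes the usual textbook forms, which the text above does not spell out: the z score formula as the sum of z score products divided by n – 1 (with sample standard deviations), and the computation formula as SPxy divided by the square root of SSx times SSy. The function names and the study-hours data are hypothetical and chosen only for illustration.

```python
from math import sqrt

def pearson_r_zscore(x, y):
    """Pearson r from z scores: sum(z_x * z_y) / (n - 1).

    Assumes sample standard deviations (n - 1 in the denominator).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sd_x = sqrt(sum((v - mean_x) ** 2 for v in x) / (n - 1))
    sd_y = sqrt(sum((v - mean_y) ** 2 for v in y) / (n - 1))
    z_x = [(v - mean_x) / sd_x for v in x]
    z_y = [(v - mean_y) / sd_y for v in y]
    return sum(a * b for a, b in zip(z_x, z_y)) / (n - 1)

def pearson_r_computational(x, y):
    """Pearson r from raw sums: SP_xy / sqrt(SS_x * SS_y)."""
    n = len(x)
    sp_xy = sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y) / n
    ss_x = sum(a * a for a in x) - sum(x) ** 2 / n
    ss_y = sum(b * b for b in y) - sum(y) ** 2 / n
    return sp_xy / sqrt(ss_x * ss_y)

# Hypothetical paired observations: study hours and exam scores.
study_hours = [1, 2, 3, 4, 5]
exam_scores = [52, 60, 57, 70, 71]

print(round(pearson_r_zscore(study_hours, exam_scores), 2))
print(round(pearson_r_computational(study_hours, exam_scores), 2))
```

Both functions give the same value (about .92 for this made-up data), but the computation formula works directly on the raw scores and skips the conversion to z scores.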
One important concept to keep in mind is that a correlation coefficient never provides information about cause and effect. Cause and effect can only be proved by (20) experimentation.
There are other types of correlation coefficients designed for use in various situations. For example, when the data consists of ranks, a (21) Spearman rank correlation is used. When one variable is quantitative and the other is qualitative, the result is a (22) point biserial correlation coefficient. If both variables represent ordered qualitative data, the resulting correlation coefficient is called (23) Goodman-Kruskal gamma.
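As an illustration of a correlation designed for ranked data, the following sketch computes a rank correlation by converting each variable to ranks and then applying the Pearson formula to those ranks. This is one common way to obtain the Spearman coefficient, not necessarily the formula used in the textbook. Tied values receive the average of the ranks they span. The helper names and the data are hypothetical.

```python
from math import sqrt

def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Pearson r from deviations about the means."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return sp / sqrt(ss_x * ss_y)

def spearman_rank_correlation(x, y):
    """Pearson r applied to the ranks of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical data: y always rises with x, but not along a straight line.
x = [10, 20, 30, 40, 50]
y = [1, 4, 9, 16, 100]

print(spearman_rank_correlation(x, y))  # ranks agree perfectly
print(pearson_r(x, y))                  # smaller, because the pattern is curved
```

Because only the ordering matters, the rank correlation is at its maximum here even though the Pearson r on the raw scores is not.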
When every possible pairing of variables is reported, a correlation (24) matrix is produced. A correlation matrix is particularly useful when many variables are being studied.
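A correlation matrix can be sketched in a few lines of Python by computing r for every pairing of variables. The variable names and values below are hypothetical, and the nested-dictionary layout is just one convenient way to hold the table.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson r from deviations about the means."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return sp / sqrt(ss_x * ss_y)

def correlation_matrix(data):
    """Return {row_name: {column_name: r}} for every pairing of variables."""
    names = list(data)
    return {a: {b: pearson_r(data[a], data[b]) for b in names} for a in names}

# Hypothetical observations for five students.
data = {
    "study_hours": [1, 2, 3, 4, 5],
    "exam_score": [52, 60, 57, 70, 71],
    "absences": [9, 7, 8, 3, 2],
}

matrix = correlation_matrix(data)

# Print the table with one row and one column per variable.
print(" " * 12 + "".join(f"{name:>12}" for name in matrix))
for row_name, row in matrix.items():
    print(f"{row_name:<12}" + "".join(f"{r:>12.2f}" for r in row.values()))
```

Each variable correlates perfectly with itself, so the diagonal entries are 1, and the table is symmetric because the r for X and Y equals the r for Y and X. For that reason, published matrices often show only the entries below the diagonal.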