In: Statistics and Probability
1. What is meant when we say that two variables have a strong positive (or negative) linear correlation? Is it possible that two variables could be strongly related but have a low linear correlation? Can you give an example?
2. Give a very general description of how the least-squares criterion is involved in the construction of the least squares line.
1) The correlation coefficient (ρ) is a measure that determines
the degree to which two variables' movements are associated. The
most common correlation coefficient, generated by the Pearson
product-moment correlation, may be used to measure the linear
relationship between two variables. However, in a non-linear
relationship, this correlation coefficient may not always be a
suitable measure of dependence.
Correlation coefficients are used to measure the strength of the
relationship between two variables.
Positive correlation is a relationship between two variables in
which both variables move in tandem—that is, in the same
direction.
Negative correlation or inverse correlation is a relationship
between two variables whereby they move in opposite
directions.
Negative correlation is a key concept in portfolio construction, as
it enables the creation of diversified portfolios that can better
withstand volatility and smooth out returns.
Understanding Correlation
The range of values for the correlation coefficient is -1.0 to 1.0;
the value cannot exceed 1.0 or fall below -1.0. A correlation of
-1.0 indicates a perfect negative correlation, and a correlation of
1.0 indicates a perfect positive correlation. Whenever the
correlation coefficient, denoted ρ (or r), is greater than zero,
the relationship is positive; whenever it is less than zero, the
relationship is negative. A value of zero indicates no linear
relationship between the two variables.
If the correlation coefficient of two variables is zero, it
signifies only that there is no linear relationship between them;
the variables may still be strongly related. For example, if
y = x^2 and x takes values symmetric about zero, y is completely
determined by x, yet the linear correlation between x and y is
zero, because the relationship is curvilinear rather than linear.
When the value of ρ is close to zero, generally between -0.1 and
+0.1, the variables are said to have no linear relationship, or a
very weak one. For example, suppose the prices of coffee and of
computers are observed and found to have a correlation of +0.0008;
this indicates essentially no linear relationship between the two
variables.
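To make the curvilinear case concrete, here is a small sketch (plain Python, no libraries) that computes the Pearson coefficient for y = x^2 over x-values symmetric about zero. The variables are perfectly related, yet the linear correlation comes out to zero:

```python
def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]  # y is completely determined by x
r = pearson_r(xs, ys)      # r = 0: no *linear* relationship
```

The covariance term vanishes because the positive and negative deviations cancel exactly over the symmetric range, so r = 0 despite the perfect functional dependence.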
Positive Correlation
A positive correlation, when the correlation coefficient is greater
than 0, signifies that both variables move in the same direction.
When ρ is +1, the two variables being compared have a perfect
positive linear relationship: when one variable moves higher or
lower, the other variable moves in the same direction by a
proportional amount.
The closer the value of ρ is to +1, the stronger the linear
relationship. For example, suppose oil prices are directly related
to the prices of airplane tickets, with a correlation coefficient
of +0.8. The relationship between oil prices and airfares has a
very strong positive correlation since the value is close to +1.
So if the price of oil decreases, airfares follow in tandem, and if
the price of oil increases, so do the prices of airplane tickets.
Negative Correlation
A negative (inverse) correlation occurs when the correlation
coefficient is less than 0, indicating that the variables move in
opposite directions. In short, any reading between 0 and -1 means
that the two securities tend to move in opposite directions. When
ρ is -1, the relationship is said to be perfectly negatively
correlated: if one variable increases, the other decreases
proportionally, and vice versa. However, the degree to which two
securities are negatively correlated can vary over time; they are
almost never exactly correlated all the time.
For example, suppose a study is conducted to assess the
relationship between outdoor temperature and heating bills. The
study concludes that there is a negative correlation between
heating bills and outdoor temperature, with a calculated
correlation coefficient of -0.96. This strong negative correlation
signifies that as the temperature outside decreases, heating bills
increase, and vice versa.
The list below shows what different correlation coefficient values
indicate:
Exactly –1. A perfect negative (downward sloping) linear
relationship
–0.70. A strong negative (downward sloping) linear
relationship
–0.50. A moderate negative (downward sloping) linear relationship
–0.30. A weak negative (downward sloping) linear relationship
0. No linear relationship
+0.30. A weak positive (upward sloping) linear relationship
+0.50. A moderate positive (upward sloping) linear
relationship
+0.70. A strong positive (upward sloping) linear relationship
Exactly +1. A perfect positive (upward sloping) linear
relationship
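These rule-of-thumb cutoffs can be encoded in a small helper. This is just a sketch of the thresholds listed above; the labels and boundaries are conventions, not universal standards:

```python
def describe_r(r):
    """Map a correlation coefficient to the rule-of-thumb labels above."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie in [-1, 1]")
    magnitude = abs(r)
    if magnitude < 0.3:
        return "no (or negligible) linear relationship"
    sign = "positive" if r > 0 else "negative"
    if magnitude == 1.0:
        strength = "perfect"
    elif magnitude >= 0.7:
        strength = "strong"
    elif magnitude >= 0.5:
        strength = "moderate"
    else:
        strength = "weak"
    return f"{strength} {sign} linear relationship"

print(describe_r(-0.96))  # strong negative linear relationship
```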
2) The least-squares criterion is a mathematical procedure for finding the best-fitting curve to a given set of points by minimizing the sum of the squares of the offsets ("the residuals") of the points from the curve. The sum of the squared offsets is used instead of the absolute values of the offsets because this allows the residuals to be treated as a continuous, differentiable quantity. However, because the offsets are squared, outlying points can have a disproportionate effect on the fit, a property which may or may not be desirable depending on the problem at hand.
The linear least squares fitting technique is the simplest and most commonly applied form of linear regression and provides a solution to the problem of finding the best fitting straight line through a set of points. In fact, if the functional relationship between the two quantities being graphed is known to within additive or multiplicative constants, it is common practice to transform the data in such a way that the resulting line is a straight line, say by plotting T vs. sqrt(l) instead of T vs. l in the case of analyzing the period T of a pendulum as a function of its length l. For this reason, standard forms for exponential, logarithmic, and power laws are often explicitly computed. The formulas for linear least squares fitting were independently derived by Gauss and Legendre.
For nonlinear least squares fitting to a number of unknown parameters, linear least squares fitting may be applied iteratively to a linearized form of the function until convergence is achieved. However, it is often also possible to linearize a nonlinear function at the outset and still use linear methods for determining fit parameters without resorting to iterative procedures. This approach does commonly violate the implicit assumption that the distribution of errors is normal, but often still gives acceptable results using normal equations, a pseudoinverse, etc. Depending on the type of fit and initial parameters chosen, the nonlinear fit may have good or poor convergence properties. If uncertainties (in the most general case, error ellipses) are given for the points, points can be weighted differently in order to give the high-quality points more weight.
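As a concrete sketch of the transform-then-fit idea described above: suppose data follow y = A e^{Bx}. Taking logarithms gives ln y = ln A + B x, which is linear in x, so ordinary linear least squares recovers B and ln A. The data and parameter values below are invented for illustration, and note that fitting in log space weights the errors differently than a direct nonlinear fit would:

```python
import math

# Synthetic data from y = A * exp(B * x) with A = 2.0, B = 0.5 (illustrative values)
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * math.exp(0.5 * x) for x in xs]

# Linearize: ln(y) = ln(A) + B * x, then fit a straight line by least squares
log_ys = [math.log(y) for y in ys]
n = len(xs)
mean_x = sum(xs) / n
mean_ly = sum(log_ys) / n
ss_xx = sum((x - mean_x) ** 2 for x in xs)
ss_xly = sum((x - mean_x) * (ly - mean_ly) for x, ly in zip(xs, log_ys))

B_fit = ss_xly / ss_xx                       # slope of the fitted line = B
A_fit = math.exp(mean_ly - B_fit * mean_x)   # intercept of the line = ln(A)
```

Because the synthetic data lie exactly on the curve, the fit recovers A and B to within floating-point precision; with noisy data the recovered parameters would only approximate the true values.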
Vertical least squares fitting proceeds by finding the sum of the squares of the vertical deviations R^2 of a set of n data points (x_1, y_1), ..., (x_n, y_n),

    R^2 = \sum_{i=1}^{n} [y_i - f(x_i, a_1, \ldots, a_m)]^2,

from a function f(x, a_1, \ldots, a_m) with m adjustable parameters. Note that this procedure does not minimize the actual deviations from the line (which would be measured perpendicular to the given function). In addition, although the unsquared sum of distances might seem a more appropriate quantity to minimize, use of the absolute value results in discontinuous derivatives which cannot be treated analytically. The squared deviations from each point are therefore summed, and the resulting residual is then minimized to find the best-fit line. This procedure results in outlying points being given disproportionately large weighting.
The condition for R^2 to be a minimum is that

    \frac{\partial (R^2)}{\partial a_i} = 0

for i = 1, ..., m. For a linear fit,

    f(a, b) = a + b x,

so

    R^2(a, b) = \sum_{i=1}^{n} [y_i - (a + b x_i)]^2.

These lead to the equations

    n a + b \sum_{i=1}^{n} x_i = \sum_{i=1}^{n} y_i

    a \sum_{i=1}^{n} x_i + b \sum_{i=1}^{n} x_i^2 = \sum_{i=1}^{n} x_i y_i.

In matrix form,

    \begin{pmatrix} n & \sum x_i \\ \sum x_i & \sum x_i^2 \end{pmatrix} \begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} \sum y_i \\ \sum x_i y_i \end{pmatrix},

so

    \begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} n & \sum x_i \\ \sum x_i & \sum x_i^2 \end{pmatrix}^{-1} \begin{pmatrix} \sum y_i \\ \sum x_i y_i \end{pmatrix}.

The 2x2 matrix inverse is

    \begin{pmatrix} n & \sum x_i \\ \sum x_i & \sum x_i^2 \end{pmatrix}^{-1} = \frac{1}{n \sum x_i^2 - (\sum x_i)^2} \begin{pmatrix} \sum x_i^2 & -\sum x_i \\ -\sum x_i & n \end{pmatrix},

so

    a = \frac{\sum y_i \sum x_i^2 - \sum x_i \sum x_i y_i}{n \sum x_i^2 - (\sum x_i)^2}

    b = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^2 - (\sum x_i)^2}.
These can be rewritten in a simpler form by defining the sums of squares

    ss_{xx} = \sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum x_i^2 - n \bar{x}^2

    ss_{yy} = \sum_{i=1}^{n} (y_i - \bar{y})^2 = \sum y_i^2 - n \bar{y}^2

    ss_{xy} = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) = \sum x_i y_i - n \bar{x} \bar{y}.

Here, ss_{xy}/n is the covariance of x and y, and ss_{xx}/n and ss_{yy}/n are the variances. Note that these quantities can also be interpreted as dot products of the mean-centered data vectors. In terms of the sums of squares, the regression slope b is given by

    b = \frac{ss_{xy}}{ss_{xx}},

and the intercept a is given in terms of b by

    a = \bar{y} - b \bar{x}.

The overall quality of the fit is then parameterized in terms of a quantity known as the correlation coefficient, defined by

    r^2 = \frac{ss_{xy}^2}{ss_{xx} \, ss_{yy}},

which gives the proportion of ss_{yy} that is accounted for by the regression. Let \hat{y}_i be the vertical coordinate of the best-fit line with x-coordinate x_i, so

    \hat{y}_i = a + b x_i;

then the error between the actual vertical point and the fitted point is given by

    e_i = y_i - \hat{y}_i.

Now define s^2 as an estimator for the variance in e_i,

    s^2 = \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n - 2}.

The standard errors for a and b are then

    SE(a) = s \sqrt{\frac{1}{n} + \frac{\bar{x}^2}{ss_{xx}}}, \qquad SE(b) = \frac{s}{\sqrt{ss_{xx}}}.
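The derivation above translates directly into code. The sketch below (plain Python; the sample data are invented for illustration) computes the slope b = ss_xy/ss_xx, the intercept a = ȳ − b·x̄, the coefficient of determination r^2, and the standard errors of a and b:

```python
def least_squares_line(xs, ys):
    """Fit y = a + b*x by vertical least squares; return a, b, r^2, SE(a), SE(b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    ss_xx = sum((x - mean_x) ** 2 for x in xs)
    ss_yy = sum((y - mean_y) ** 2 for y in ys)
    ss_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = ss_xy / ss_xx                   # slope
    a = mean_y - b * mean_x             # intercept
    r2 = ss_xy ** 2 / (ss_xx * ss_yy)   # proportion of ss_yy explained
    # s^2 estimates the residual variance (n - 2 degrees of freedom)
    s2 = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys)) / (n - 2)
    se_a = (s2 * (1 / n + mean_x ** 2 / ss_xx)) ** 0.5
    se_b = (s2 / ss_xx) ** 0.5
    return a, b, r2, se_a, se_b

# Invented sample data, roughly following y = 2x
a, b, r2, se_a, se_b = least_squares_line([1, 2, 3, 4, 5],
                                          [2.1, 3.9, 6.2, 8.0, 9.9])
```

For data lying nearly on a line, r^2 comes out close to 1 and the standard errors are small, matching the interpretation of r^2 as the proportion of the variation in y accounted for by the regression.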