Question

Discussion 1: Searching for Causes

This week examines how to use correlation and simple linear regression to test the relationship between two variables. In both of these tests you can use the data points in a scatterplot to draw a line of best fit; the closer the points are to the line, the stronger the association between the variables. It is important to recognize, however, that even the strongest correlation cannot prove causation.

For this Discussion, review this week’s Learning Resources and consider a true relationship between variables.

By Day 3

Post a brief explanation of when an observed correlation might represent a true relationship between variables and why. Be specific and provide examples.

Solutions

Expert Solution

With the simple linear regression model

$$y_i = \beta_0 + \beta_1 x_i + \epsilon_i$$

the observed value of the dependent variable $y_i$ is composed of a linear function $\beta_0 + \beta_1 x_i$ of the explanatory variable $x_i$, together with an error term $\epsilon_i$.

The error terms $\epsilon_1, \ldots, \epsilon_n$ are generally taken to be independent observations from a $N(0, \sigma^2)$ distribution, for some error variance $\sigma^2$. This implies that the values $y_1, \ldots, y_n$ are observations from the independent random variables

$$Y_i \sim N(\beta_0 + \beta_1 x_i, \sigma^2)$$
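As a rough illustration of this model, the sketch below simulates data from a straight line plus normal errors and then recovers the intercept and slope by ordinary least squares. The parameter values (BETA0, BETA1, SIGMA), the sample size, and the helper name fit_line are assumptions made up for the example; they do not come from the question.

```python
import random


def fit_line(xs, ys):
    """Ordinary least-squares estimates (b0, b1) for y = b0 + b1 * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx            # slope estimate
    b0 = y_bar - b1 * x_bar   # intercept estimate
    return b0, b1


# Hypothetical "true" parameters, chosen only for illustration.
BETA0, BETA1, SIGMA = 2.0, 0.5, 1.0

rng = random.Random(0)  # fixed seed so the run is repeatable
xs = [i / 10 for i in range(200)]
# y_i = beta0 + beta1 * x_i + error_i, with error_i drawn from N(0, sigma^2)
ys = [BETA0 + BETA1 * x + rng.gauss(0.0, SIGMA) for x in xs]

b0_hat, b1_hat = fit_line(xs, ys)
print(f"estimated intercept: {b0_hat:.3f}, estimated slope: {b1_hat:.3f}")
```

With a few hundred points the estimates should land close to the values used to generate the data, which is the sense in which the fitted line "recovers" the underlying relationship.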

INTERPRETATION depends on Significance F and P-values of regression model

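To check whether your results are reliable (statistically significant), look at Significance F, the p-value of the overall F-test => If this value is less than 0.05, the model is statistically significant at the 5% level. If Significance F is greater than 0.05, the chosen independent variables do not explain a significant share of the variation in y. In a model with several independent variables, one common remedy is to delete a variable with a high P-value (greater than 0.05) and rerun the regression until Significance F drops below 0.05. In simple linear regression there is only one independent variable, and the P-value of its slope equals Significance F.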
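To make the F-test concrete, here is a minimal sketch that computes the F statistic for a simple linear regression as the regression mean square divided by the residual mean square. The data are invented for illustration, and instead of computing Significance F exactly (which needs the F distribution), the sketch compares F with the tabulated 5% critical value for 1 and 8 degrees of freedom, about 5.32.

```python
def regression_f_statistic(xs, ys):
    """F statistic for simple linear regression: MSR / MSE with (1, n - 2) df."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    fitted = [b0 + b1 * x for x in xs]
    ss_regression = sum((f - y_bar) ** 2 for f in fitted)        # explained variation
    ss_residual = sum((y - f) ** 2 for y, f in zip(ys, fitted))  # unexplained variation
    return (ss_regression / 1) / (ss_residual / (n - 2))


# Hypothetical data (n = 10), invented for this example.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [2.1, 2.9, 4.2, 4.8, 6.1, 6.8, 8.3, 8.9, 10.2, 10.8]

F_CRITICAL_1_8 = 5.32  # tabulated 5% critical value of F with (1, 8) df

f_stat = regression_f_statistic(xs, ys)
print(f"F = {f_stat:.1f}; significant at the 5% level: {f_stat > F_CRITICAL_1_8}")
```

An F statistic above the critical value corresponds to a Significance F below 0.05.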

CORRELATION

When testing the relationship between two variables, it is a good idea to compute the Pearson correlation coefficient to determine how strong the linear relationship between those two variables is.

FORMULA AND INTERPRETATION ON COEFFICIENT VALUE

In order to determine how strong the relationship is between two variables, a formula must be followed to produce what is referred to as the coefficient value.

The coefficient value can range between -1.00 and 1.00.

1) If the coefficient value is in the negative range => then that means the relationship between the variables is negatively correlated, or as one value increases, the other decreases.

2) If the value is in the positive range => then that means the relationship between the variables is positively correlated, or both values increase or decrease together. Let's look at the formula for computing the Pearson correlation coefficient.

Example :-

You were analyzing the relationship between your participants' age and reported level of income.

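=> The formula for the Pearson correlation coefficient is

$$r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2} \, \sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}$$

where $x_i$ and $y_i$ are the age and income of participant $i$, and $\bar{x}$ and $\bar{y}$ are the sample means.

After conducting the test,

your Pearson correlation coefficient value is r = +.20.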

Therefore, you would have a weak positive correlation between the two variables:

the direction of the relationship is positive, but its strength is weak, not strong.

interpretation => there is a weak positive correlation between one's age and their income: as people grow older their income tends to increase slightly. Since r^2 = 0.04, age accounts for only about 4% of the variation in income, so this result does not support a confident conclusion, and a correlation of any size cannot by itself prove that age causes higher income.
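The coefficient can be computed directly from its definition. The sketch below does so for a handful of made-up age and income values; the numbers and the function name pearson_r are illustrative assumptions, not the participants' data from the example, so the printed r will not be +.20.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from deviations about the means."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical ages (years) and incomes (thousands of dollars).
ages = [23, 31, 38, 44, 52, 60]
incomes = [41, 38, 55, 47, 62, 50]

r = pearson_r(ages, incomes)
# r gives direction and strength; r squared is the share of variation explained.
print(f"r = {r:.2f}, r^2 = {r * r:.2f}")
```

Reporting r^2 alongside r helps keep the interpretation honest: a coefficient that sounds moderate can still explain only a small share of the variation.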

SCATTERPLOT

1) If the data show an uphill pattern as you move from left to right => this indicates a positive relationship between X and Y. As the X-values increase (move right), the Y-values tend to increase (move up).

2) If the data show a downhill pattern as you move from left to right => this indicates a negative relationship between X and Y. As the X-values increase (move right) the Y-values tend to decrease (move down).

3) If the data don’t seem to resemble any kind of pattern (even a vague one) => then there is no apparent relationship between X and Y.

EXAMPLE :-

These two scatter plots show the average income for adults based on the number of years of education completed (2006 data). 16 years of education means graduating from college. 21 years means landing a Ph.D.

What type of correlation does each graph represent?

=> Both graphs are positively correlated. As years of education increase, so does income.

Draw a line of best fit for each graph. Then, estimate and compare the earnings for each gender with 11 years of education completed.

Based on these plots it looks like a female who completes 11 years of school can expect to earn around $14,000/year while a male can expect to earn around $23,000/year.

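INTERPRETATION :- These graphs show two important things.

First, higher education is associated with a higher income in general. The plots show a correlation; on their own they cannot prove that additional schooling causes the higher income.

Second, there is a gender gap in income at the same level of education. While women have begun to close this discrepancy, there is more work to do.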

