CASE STUDY 2: Correlation and regression investigate the relationship between two continuous variables, such as height and weight, time and speed, or the concentration of an injected drug and heart rate.
a) In your opinion, discuss the importance of correlation and regression as tools for analysis.
b) Find an example of correlation and regression applied in business from an online source. From the data, you are required to:
i. State the Independent and Dependent Variable
ii. Draw the Scatter plot using Microsoft Excel.
iii. Find the Pearson Product Moment Correlation Coefficient
iv. Find the Regression line
v. Interpret the slope from (iv).
c) Discuss and state the conclusion you draw from the results in (b).
Objective:
1) To practise problem solving using a real situation
2) To gain practical knowledge of the theory and application of statistics
Plagiarism must be below 30 %
a) Any kind of analysis, whether qualitative or quantitative, across fields of study such as science, social science, business and the humanities, requires investigating the relationship between variables. The dependent variable is the variable being studied, while the independent variables capture the factors responsible for any change in the dependent variable. For example, when studying the gender wage gap, we need to find out whether gender plays a significant role in determining the wage rate; gender is therefore the independent variable and the wage rate is the dependent variable.
Correlation and regression are the most commonly used tools for studying the relationship between two variables. The former quantifies the strength of the linear relationship between a pair of variables, whereas the latter expresses the relationship in the form of an equation. The correlation coefficient is the same whether we correlate X with Y or Y with X, but this is not true of regression, where the value of the regression coefficient depends on which variable is chosen as dependent and which as independent.
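The symmetry of correlation and the asymmetry of regression can be checked numerically. The sketch below is only an illustration: the height and weight values are made up for the purpose, and the helper names pearson_r and ols_slope are hypothetical, not part of any library.

```python
from math import sqrt


def pearson_r(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    s_ab = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    s_aa = sum((p - mean_a) ** 2 for p in a)
    s_bb = sum((q - mean_b) ** 2 for q in b)
    return s_ab / sqrt(s_aa * s_bb)


def ols_slope(x, y):
    """Least-squares slope obtained when y is regressed on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((p - mean_x) * (q - mean_y) for p, q in zip(x, y))
    s_xx = sum((p - mean_x) ** 2 for p in x)
    return s_xy / s_xx


# Illustrative values only (assumed for this example, not real data).
height_cm = [150, 160, 170, 180, 190]
weight_kg = [52, 58, 69, 77, 90]

# Swapping the arguments leaves the correlation unchanged ...
r_hw = pearson_r(height_cm, weight_kg)
r_wh = pearson_r(weight_kg, height_cm)

# ... but swapping dependent and independent variables changes the slope.
slope_w_on_h = ols_slope(height_cm, weight_kg)  # kg per cm
slope_h_on_w = ols_slope(weight_kg, height_cm)  # cm per kg

print(f"r(height, weight) = {r_hw:.4f}, r(weight, height) = {r_wh:.4f}")
print(f"slope of weight on height = {slope_w_on_h:.4f}")
print(f"slope of height on weight = {slope_h_on_w:.4f}")
```

The two slopes differ because each is scaled by the variance of a different variable; their product equals the square of r.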
b) The following table gives the distance of a business from the city centre (X) and the amount of product sold per person (Y). We need to check whether the two variables are correlated, and whether distance from the city centre is a significant variable that affects the amount of product sold per person.
Sakau Market | distance/km (x) | mean cups per person (y) |
Upon the river | 3 | 5.18 |
Try me first | 13.5 | 3.93 |
At the bend | 14 | 3.19 |
Falling down | 15.5 | 2.62 |
i) Dependent variable: mean cups per person (y); independent variable: distance in km (x)
ii) [Scatter plot of mean cups per person (y) against distance in km (x)]
iii) The Pearson product-moment correlation coefficient r measures how closely the data fit a straight line. It can be calculated with the following spreadsheet formula:
=CORREL(y-values,x-values)
Pearson product-moment correlation coefficient r | -0.93 |
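As a cross-check outside the spreadsheet, the same coefficient can be computed directly from its definition. This is a minimal sketch using only the four data points in the table above; the variable names are chosen for the example.

```python
from math import sqrt

# Data from the table above: distance from the city centre (km) and
# mean cups sold per person.
distance_km = [3, 13.5, 14, 15.5]
mean_cups = [5.18, 3.93, 3.19, 2.62]

n = len(distance_km)
mean_x = sum(distance_km) / n
mean_y = sum(mean_cups) / n

# Sums of squares and cross-products about the means.
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(distance_km, mean_cups))
s_xx = sum((x - mean_x) ** 2 for x in distance_km)
s_yy = sum((y - mean_y) ** 2 for y in mean_cups)

# Pearson r is the cross-product scaled by both spreads.
r = s_xy / sqrt(s_xx * s_yy)
print(round(r, 2))  # -0.93
```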
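iv) The regression line can be shown in a spreadsheet by adding a linear trendline to the scatter plot. From the regression output below, the fitted line is y = 5.798 - 0.180x, where x is the distance in km and y is the mean cups per person.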
Regression output:
SUMMARY OUTPUT
Regression Statistics
Multiple R | 0.93221259 |
R Square | 0.86902032 |
Adjusted R Square | 0.80353047 |
Standard Error | 0.48999883 |
Observations | 4 |
ANOVA
Source | df | SS | MS | F | Significance F |
Regression | 1 | 3.18600228 | 3.18600228 | 13.2695437 | 0.06778741 |
Residual | 2 | 0.48019772 | 0.24009886 | | |
Total | 3 | 3.6662 | | | |
Term | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% | Lower 95.0% | Upper 95.0% |
Intercept | 5.79824873 | 0.61837767 | 9.37654928 | 0.01118357 | 3.13758434 | 8.45891312 | 3.13758434 | 8.45891312 |
Distance | -0.1798477 | 0.04937157 | -3.6427385 | 0.06778741 | -0.3922764 | 0.032581 | -0.3922764 | 0.032581 |
RESIDUAL OUTPUT
Observation | Predicted Y | Residuals |
1 | 5.25870558 | -0.0787056 |
2 | 3.37030457 | 0.55969543 |
3 | 3.28038071 | -0.0903807 |
4 | 3.01060914 | -0.3906091 |
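v) The slope coefficient (-0.18) indicates a negative relationship between the distance of a business from the city centre and the amount of product sold: each additional kilometre from the city centre is associated with a fall of about 0.18 in the mean cups sold per person.
c) The correlation is strong and negative (r = -0.93), and distance explains about 87% of the variation in mean cups per person (R Square = 0.87). However, with only four observations the p-value of the slope (0.068) is above 0.05, so distance is not statistically significant at the 5% level, and the conclusion should be treated as tentative.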
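The coefficients, R Square and residuals in the regression output can be reproduced from the four data points with the ordinary least-squares formulas. The sketch below assumes nothing beyond the table data and uses illustrative variable names; it does not compute the p-values, which need the t distribution.

```python
# Data from the table: distance from the city centre (km) and
# mean cups sold per person.
distance_km = [3, 13.5, 14, 15.5]
mean_cups = [5.18, 3.93, 3.19, 2.62]

n = len(distance_km)
mean_x = sum(distance_km) / n
mean_y = sum(mean_cups) / n

# Least-squares estimates: slope = S_xy / S_xx, intercept from the means.
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(distance_km, mean_cups))
s_xx = sum((x - mean_x) ** 2 for x in distance_km)
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x

# Fitted values and residuals (observed minus predicted).
predicted = [intercept + slope * x for x in distance_km]
residuals = [y - p for y, p in zip(mean_cups, predicted)]

# R Square = 1 - residual sum of squares / total sum of squares.
ss_res = sum(e ** 2 for e in residuals)
ss_tot = sum((y - mean_y) ** 2 for y in mean_cups)
r_squared = 1 - ss_res / ss_tot

print(f"y = {intercept:.4f} + ({slope:.4f}) * x")
print(f"R Square = {r_squared:.4f}")
for i, (p, e) in enumerate(zip(predicted, residuals), start=1):
    print(f"observation {i}: predicted = {p:.4f}, residual = {e:.4f}")
```

The printed values should agree with the Coefficients, R Square and RESIDUAL OUTPUT figures to the displayed precision.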