BMI | Height (cm) | Weight (lbs) |
14 | 170.1 | 90 |
15 | 172.7 | 100 |
16 | 157.4 | 90 |
17 | 177.8 | 110 |
18 | 185.4 | 140 |
19 | 170.1 | 120 |
21 | 172.7 | 140 |
22 | 162.5 | 130 |
24 | 170.1 | 150 |
25 | 152.4 | 130 |
26 | 177.8 | 180 |
27 | 172.7 | 180 |
28 | 180.3 | 200 |
29 | 172.7 | 190 |
31 | 187.9 | 250 |
32 | 149.8 | 160 |
33 | 160 | 190 |
34 | 165.1 | 210 |
36 | 170.1 | 230 |
39 | 167.6 | 240 |
Dependent variable: BMI (y) plus two independent variables, Height (x1) and Weight (x2).
a. State before doing any calculations whether you think they are positively or negatively correlated. What is your rationale?
Example: I test for a correlation between the quantity of coffee that people buy (Y) and both the price of coffee (X1) and household income (X2). I hypothesize that there is a negative correlation between quantity and price because people prefer to buy goods at lower rather than higher prices. I also hypothesize that there is a positive correlation between the quantity of coffee and household income because people can buy more coffee when their income increases.
b. Draw a graph of each of the two independent variables against the dependent variable using Excel. (Do this by inserting a Scatter chart.)
c. Use Excel to do the necessary regression. Give the values for the y-intercept, b1 and b2. Write out the equation. Also show R-square, the F-statistic and its p-value, and the t-statistics with their respective p-values. Use a confidence level of 95%.
d. Test for multicollinearity using the rule that the two independent variables are multicollinear if their correlation coefficient is .70 or greater (implying r-square is .49 or greater). If they are multicollinear, give a brief statement on why you think that is the case.
e. Pretend that this was an assignment from your manager and communicate your findings to the manager in 100 words or less. You should assume in preparing this memo:
a) Correlation
Height is slightly negatively correlated with BMI, whereas Weight
and BMI are strongly positively correlated: BMI increases as weight
increases.
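The two correlations described above can be checked directly from the table. The following is a minimal sketch in plain Python, not the Excel procedure the question asks for; the helper name `pearson_r` is an illustrative choice and the data are copied from the table in the question.

```python
import math

# Data copied from the table in the question.
bmi = [14, 15, 16, 17, 18, 19, 21, 22, 24, 25,
       26, 27, 28, 29, 31, 32, 33, 34, 36, 39]
height = [170.1, 172.7, 157.4, 177.8, 185.4, 170.1, 172.7, 162.5, 170.1, 152.4,
          177.8, 172.7, 180.3, 172.7, 187.9, 149.8, 160, 165.1, 170.1, 167.6]
weight = [90, 100, 90, 110, 140, 120, 140, 130, 150, 130,
          180, 180, 200, 190, 250, 160, 190, 210, 230, 240]


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


r_height_bmi = pearson_r(height, bmi)
r_weight_bmi = pearson_r(weight, bmi)
print(f"r(Height, BMI) = {r_height_bmi:.3f}")
print(f"r(Weight, BMI) = {r_weight_bmi:.3f}")
```

On this data the Height coefficient comes out small and negative (roughly -0.15) and the Weight coefficient large and positive (roughly 0.91), which agrees with the description above.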
b) Scatter plot
[Scatter plot: Height vs. BMI]
[Scatter plot: Weight vs. BMI]
The relationship between the variables can be seen clearly in the
plots: Height shows no clear pattern with BMI, while BMI shows a
clear increasing trend with Weight.
c)
The regression equation is
BMI = 51.2535 - 0.2980*Height + 0.1526*Weight
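The coefficients in this equation can be reproduced without Excel. The sketch below solves the least-squares normal equations for two predictors by hand (a 2x2 system on mean-centered sums); this is a substitute for Excel's regression tool, not its actual procedure, and the function name `fit_two_predictor_ols` is an illustrative choice. It reports only the coefficients and R-square, not the F- or t-statistics.

```python
# Data copied from the table in the question.
bmi = [14, 15, 16, 17, 18, 19, 21, 22, 24, 25,
       26, 27, 28, 29, 31, 32, 33, 34, 36, 39]
height = [170.1, 172.7, 157.4, 177.8, 185.4, 170.1, 172.7, 162.5, 170.1, 152.4,
          177.8, 172.7, 180.3, 172.7, 187.9, 149.8, 160, 165.1, 170.1, 167.6]
weight = [90, 100, 90, 110, 140, 120, 140, 130, 150, 130,
          180, 180, 200, 190, 250, 160, 190, 210, 230, 240]


def fit_two_predictor_ols(y, x1, x2):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2; returns (b0, b1, b2, R^2)."""
    n = len(y)
    my, m1, m2 = sum(y) / n, sum(x1) / n, sum(x2) / n
    # Mean-centered sums of squares and cross-products.
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    syy = sum((c - my) ** 2 for c in y)
    # Solve the 2x2 normal equations with Cramer's rule.
    det = s11 * s22 - s12 ** 2
    b1 = (s1y * s22 - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    b0 = my - b1 * m1 - b2 * m2
    r_squared = (b1 * s1y + b2 * s2y) / syy  # explained / total variation
    return b0, b1, b2, r_squared


b0, b1, b2, r_squared = fit_two_predictor_ols(bmi, height, weight)
print(f"BMI = {b0:.4f} + ({b1:.4f})*Height + ({b2:.4f})*Weight")
print(f"R-square = {r_squared:.4f}")
```

The fitted values should agree with the equation above to the reported precision.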
d)
Multicollinearity is checked using the variance inflation factor
(VIF) for the two independent variables, which can be seen in the
last column of the above output.
Auxiliary models are fitted using only the independent variables
(Height and Weight) to find the R2 value, and the VIF is then
calculated as 1/(1 - R2). The value is close to 1, so there appears
to be no multicollinearity between the independent variables.
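Both the correlation rule from the question and the VIF can be computed in a few lines. This is a minimal sketch assuming the table data; with only two independent variables, the R2 of regressing one on the other equals their squared correlation, so a single VIF value applies to both. The names `r_hw` and `vif` are illustrative.

```python
import math

# Independent variables copied from the table in the question.
height = [170.1, 172.7, 157.4, 177.8, 185.4, 170.1, 172.7, 162.5, 170.1, 152.4,
          177.8, 172.7, 180.3, 172.7, 187.9, 149.8, 160, 165.1, 170.1, 167.6]
weight = [90, 100, 90, 110, 140, 120, 140, 130, 150, 130,
          180, 180, 200, 190, 250, 160, 190, 210, 230, 240]

n = len(height)
mh, mw = sum(height) / n, sum(weight) / n
shw = sum((h - mh) * (w - mw) for h, w in zip(height, weight))
shh = sum((h - mh) ** 2 for h in height)
sww = sum((w - mw) ** 2 for w in weight)

r_hw = shw / math.sqrt(shh * sww)  # correlation between Height and Weight
r2_aux = r_hw ** 2                 # R2 of Height regressed on Weight (or vice versa)
vif = 1 / (1 - r2_aux)             # variance inflation factor

print(f"r(Height, Weight) = {r_hw:.3f}")
print(f"VIF = {vif:.3f}")
print("multicollinear by the .70 rule:", abs(r_hw) >= 0.70)
```

On this data the correlation is roughly 0.25, well below the .70 threshold, and the VIF is roughly 1.06, so both checks point to the same conclusion.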
e)
Correlation is a value between -1 and 1 that indicates the
relationship between variables, in either a positive or negative
direction. Here there is a strong positive correlation between
Weight and BMI.
The R2 value of the model is 0.98, which means the model variables
explain 98% of the variation in BMI. A higher R2 indicates a better
fit.
The model is BMI = 51.2535 - 0.2980*Height + 0.1526*Weight. Holding
the other variable constant, a one-unit (1 cm) increase in Height
decreases BMI by 0.2980 units, while a one-unit (1 lb) increase in
Weight increases BMI by 0.1526 units.
A small p-value (< 0.05) indicates that the model and its variables
are significant at the 5% significance level. There is also no
multicollinearity (correlation between the independent variables).
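To make the coefficient interpretation concrete, the sketch below plugs values into the fitted equation. It uses the coefficients exactly as reported in this answer rather than re-estimating them; the function name `predict_bmi` is illustrative, and the inputs are the first row of the table (170.1 cm, 90 lbs).

```python
# Coefficients as reported in the answer (not re-estimated here).
INTERCEPT = 51.2535
B_HEIGHT = -0.2980   # change in BMI per 1 cm of Height, Weight held fixed
B_WEIGHT = 0.1526    # change in BMI per 1 lb of Weight, Height held fixed


def predict_bmi(height_cm, weight_lbs):
    """Predicted BMI from the fitted linear model."""
    return INTERCEPT + B_HEIGHT * height_cm + B_WEIGHT * weight_lbs


base = predict_bmi(170.1, 90)      # first row of the table
taller = predict_bmi(171.1, 90)    # same weight, 1 cm taller
heavier = predict_bmi(170.1, 91)   # same height, 1 lb heavier

print(f"predicted BMI      = {base:.2f}")
print(f"effect of +1 cm    = {taller - base:+.4f}")
print(f"effect of +1 lb    = {heavier - base:+.4f}")
```

The prediction for the first row is about 14.3, against an observed BMI of 14, and the two differences recover the slope coefficients.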