Consider the following data set.
x | 1 | 2 | 3 | 4 | 5 | 6 |
y | 3.00 | 0.21 | 0.61 | 0.70 | 1.13 | 1.17 |
a) plot the data (y versus x). Are there any points that appear to be outliers? If there are, circle them and label as such.
b) produce a regression of y against x. Add the regression line to the plot in a). Do you think that the regression line captures the most important features of the data set reasonably well?
c) using calculations at a 5% significance level, can you say that there is a significant linear relationship between the x and y? That is, can you say with 95% confidence that y linearly depends on x? Does this result agree with the conclusion you made in b)?
d) testing at a 5% significance level, can you say that the intercept (β0) is not zero? How does this conclusion agree with the plot in b)?
e) Assume that the first data point is an outlier (e.g. the value was misrecorded). Remove the outlier, and redo the parts b)-d). Plot the data set and both regression lines (before and after the outlier was removed). Comment on the difference. Also comment on the difference between the results of the tests in c) and d), if any.
a) plot the data (y versus x). Are there any points that appear to be outliers? If there are, circle them and label as such.
Ans: Yes. The first data point, (1, 3.00), appears to be an outlier: it lies far above the other five points, which increase roughly linearly with x.
b) produce a regression of y against x. Add the regression line to the plot in a). Do you think that the regression line captures the most important features of the data set reasonably well?
Ans:
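We do not think that the regression line captures the most important features of the data set reasonably well. The point at x = 1 pulls the fitted line toward it, so the line has a negative slope (about -0.18) even though the other five points trend upward, and the line lies far from most of the data points.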
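The fit in b) can be reproduced with a short least-squares calculation. The sketch below uses only the data from the problem statement; the helper name `fit_line` is our own choice for illustration, and any statistics package should give the same line.

```python
# Data from the problem statement.
x = [1, 2, 3, 4, 5, 6]
y = [3.00, 0.21, 0.61, 0.70, 1.13, 1.17]


def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line ys ~ b0 + b1 * xs."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sums of squares and cross-products about the means.
    s_xx = sum((xi - x_bar) ** 2 for xi in xs)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(xs, ys))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


b0, b1 = fit_line(x, y)
print(f"fitted line: y = {b0:.4f} + ({b1:.4f}) x")
```

The slope comes out negative, which is the opposite of the trend shown by the last five points.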
c) using calculations at a 5% significance level, can you say that there is a significant linear relationship between the x and y? That is, can you say with 95% confidence that y linearly depends on x? Does this result agree with the conclusion you made in b)?
Ans:
At the 5% significance level, there is no significant linear relationship between x and y: the t-statistic for the slope is about -0.73 on 4 degrees of freedom (p-value about 0.5), well inside the critical values of ±2.776. So no, we cannot say with 95% confidence that y depends linearly on x. Yes, this result agrees with the conclusion we made in b).
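The calculation behind c) can be sketched as a t-test on the slope. We assume the usual simple linear regression model with normal errors, and we take the two-sided 5% critical value for 4 degrees of freedom (2.776) from a t-table rather than computing it; the variable names are our own.

```python
import math

# Data from the problem statement.
x = [1, 2, 3, 4, 5, 6]
y = [3.00, 0.21, 0.61, 0.70, 1.13, 1.17]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Least-squares estimates.
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Residual standard error on n - 2 degrees of freedom.
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s = math.sqrt(sse / (n - 2))

# Test H0: beta1 = 0 against H1: beta1 != 0.
se_b1 = s / math.sqrt(s_xx)
t_slope = b1 / se_b1

T_CRIT_4DF = 2.776  # assumption: two-sided 5% critical value, t-table, 4 df
slope_significant = abs(t_slope) > T_CRIT_4DF

print(f"t = {t_slope:.3f}, reject H0: {slope_significant}")
```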
d) testing at a 5% significance level, can you say that the intercept (β0) is not zero? How does this conclusion agree with the plot in b)?
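Ans: The p-value for the intercept (β0) is 0.1388, which is greater than 0.05. Hence we cannot reject H0: β0 = 0, that is, we cannot say that the intercept is not zero. Yes, this conclusion agrees with the plot in b): the points scatter widely around the fitted line, so the intercept is estimated too imprecisely to distinguish it from zero.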
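The intercept test in d) follows the same pattern as the slope test. The sketch below computes the t-statistic for β0 under the usual normal-error assumptions and compares it with a tabulated critical value (2.776 for 4 degrees of freedom, taken from a t-table); the variable names are our own.

```python
import math

# Data from the problem statement.
x = [1, 2, 3, 4, 5, 6]
y = [3.00, 0.21, 0.61, 0.70, 1.13, 1.17]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Least-squares estimates.
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Residual standard error on n - 2 degrees of freedom.
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s = math.sqrt(sse / (n - 2))

# Test H0: beta0 = 0 against H1: beta0 != 0.
se_b0 = s * math.sqrt(1 / n + x_bar ** 2 / s_xx)
t_intercept = b0 / se_b0

T_CRIT_4DF = 2.776  # assumption: two-sided 5% critical value, t-table, 4 df
intercept_significant = abs(t_intercept) > T_CRIT_4DF

print(f"t = {t_intercept:.3f}, reject H0: {intercept_significant}")
```

The t-statistic falls below the critical value, in line with the reported p-value of 0.1388 being above 0.05.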
e) Assume that the first data point is an outlier (e.g. the value was misrecorded). Remove the outlier, and redo the parts b)-d). Plot the data set and both regression lines (before and after the outlier was removed). Comment on the difference. Also comment on the difference between the results of the tests in c) and d), if any.
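Ans:
With the first point removed, the fitted line passes close to all five remaining data points and the R^2 value rises to 0.9398, so the model fits much better. The slope changes from negative to positive, and the test in c) now finds a significant linear relationship at the 5% level. The conclusion of the test in d) does not change: the intercept is still not significantly different from zero.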
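The comparison in e) can be sketched by running the same calculation on the full data and on the data without the first point. The helper `ols_summary` and its dictionary keys are our own names for illustration, and the critical values (2.776 for 4 degrees of freedom, 3.182 for 3) are taken from a t-table as assumptions rather than computed.

```python
import math


def ols_summary(xs, ys, t_crit):
    """Fit ys ~ b0 + b1 * xs and test each coefficient against zero."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in xs)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(xs, ys))
    s_yy = sum((yi - y_bar) ** 2 for yi in ys)

    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar

    # Residual sum of squares and residual standard error (n - 2 df).
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(xs, ys))
    s = math.sqrt(sse / (n - 2))

    t_b1 = b1 / (s / math.sqrt(s_xx))
    t_b0 = b0 / (s * math.sqrt(1 / n + x_bar ** 2 / s_xx))
    return {
        "intercept": b0,
        "slope": b1,
        "r2": 1 - sse / s_yy,
        "t_slope": t_b1,
        "t_intercept": t_b0,
        "slope_significant": abs(t_b1) > t_crit,
        "intercept_significant": abs(t_b0) > t_crit,
    }


# Data from the problem statement.
x = [1, 2, 3, 4, 5, 6]
y = [3.00, 0.21, 0.61, 0.70, 1.13, 1.17]

full = ols_summary(x, y, t_crit=2.776)              # all six points, 4 df
trimmed = ols_summary(x[1:], y[1:], t_crit=3.182)   # first point dropped, 3 df

for name, res in (("with outlier", full), ("without outlier", trimmed)):
    print(name)
    for key, value in res.items():
        print(f"  {key}: {value:.4f}" if isinstance(value, float) else f"  {key}: {value}")
```

Printing both summaries side by side makes it easy to see which quantities change when the suspect point is dropped.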