Please answer both parts of the question below.
Answer the following questions that summarize the chapter.
Part A-
How do confidence intervals and significance tests relate?
Part B-
What are the conditions (or assumptions) that must be checked when performing a significance test for the slope of a regression line?
a)
You can use either P-values or confidence intervals to determine whether your results are statistically significant, and if a hypothesis test produces both, the two will always agree.
The confidence level is equivalent to 1 minus the alpha (significance) level. So, if your significance level is 0.05, the corresponding confidence level is 95%.
The results always agree because both are built from the same sampling distribution and the same alpha: the null value falls outside the 95% confidence interval exactly when the two-sided P-value is less than 0.05.
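As a rough illustration, here is a minimal Python sketch; the sample data and the null value mu0 = 5 are made up purely for demonstration. It runs a two-sided one-sample t-test at alpha = 0.05 and builds the matching 95% confidence interval for the mean, then shows that the two reach the same conclusion.

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
sample = rng.normal(loc=5.4, scale=2.0, size=30)   # hypothetical sample data
mu0 = 5.0                                          # hypothetical null value
alpha = 0.05

# Significance test of H0: mu = mu0
t_stat, p_value = stats.ttest_1samp(sample, popmean=mu0)

# Confidence interval for the mean at level 1 - alpha
ci_low, ci_high = stats.t.interval(1 - alpha, df=len(sample) - 1,
                                   loc=sample.mean(), scale=stats.sem(sample))

reject_h0 = p_value < alpha
ci_excludes_mu0 = not (ci_low <= mu0 <= ci_high)
print(f"p = {p_value:.4f}, 95% CI = ({ci_low:.3f}, {ci_high:.3f})")
print("Test rejects H0:", reject_h0, "| CI excludes mu0:", ci_excludes_mu0)
# The two booleans always match, because the test and the interval are
# computed from the same t statistic and the same alpha / confidence level.

Rerunning the sketch with a different seed or a different mu0 can change whether H0 is rejected, but the test decision and the confidence interval never disagree.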
b)
Regression can be a very useful tool for finding patterns in data sets. However, your data can't always be fit with a regression line, so you should check the following assumptions and conditions before you run the test.
These conditions (quantitative variables, linearity, no influential outliers, independence of errors, equal variance or homoscedasticity, and normality of the errors) are explained point by point below:
1) You can only perform regression on quantitative variables.
2) Regression lines will be very misleading if your data isn't approximately linear. The best way to check this condition is to make a scatter plot of your data (see the sketch after this list). If the data looks like it could roughly fit a line, you can perform regression.
3) Outliers can have a dramatic effect on regression lines and the correlation coefficient you get when you run regression analysis. If you do have an outlier in your data, it’s a good idea to run regression analysis twice: Once with the outlier and once without.
4) If your points follow a clear pattern, it might indicate that the errors are influencing each other. The errors are the deviations of the observed values from the true function values. Compare two linear regression plots: in one, the points are scattered randomly around the line; in the other, the points are clearly influencing each other, following a pattern as x increases. If the errors are not independent, the standard error of the slope is unreliable, so the significance test and your predictions won't be accurate.
5) With homoscedasticity, you basically want your residuals to look like a tube instead of a cone. Heteroscedasticity means the spread of the errors changes as you move along the x-values: instead of staying in an even band, the residuals fan out (or narrow) into a cone shape. Unlike non-independent errors, the errors aren't influencing each other; their spread is simply growing or shrinking. In such a plot, the errors get steadily larger and the scatter around the line is cone-like. Running linear regression on data that shows heteroscedasticity will give you poor results, because the standard errors and P-values for the slope become unreliable.
6) At any point along your x-values, the data points should be normally distributed around the regression line. Your values should be fairly close to the line and evenly distributed, with only a few outliers; the residual histogram in the sketch below gives a quick check.
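A minimal Python sketch of how these checks might look in practice follows; the simulated x and y values are assumptions made purely for illustration, not data from the original question. It draws a scatter plot for linearity (condition 2), a residuals-versus-fitted plot for independence of errors and homoscedasticity (conditions 4 and 5), and a residual histogram plus a Shapiro-Wilk test for normality (condition 6), and it prints the P-value of the significance test for the slope.

import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

rng = np.random.default_rng(1)
x = rng.uniform(0, 10, size=60)                  # quantitative explanatory variable (assumed)
y = 2.0 + 0.8 * x + rng.normal(0, 1.0, size=60)  # quantitative response (assumed)

# Fit the least-squares line and get the P-value for the slope
result = stats.linregress(x, y)
fitted = result.intercept + result.slope * x
residuals = y - fitted
print(f"slope = {result.slope:.3f}, P-value for the slope = {result.pvalue:.4f}")

fig, axes = plt.subplots(1, 3, figsize=(12, 3.5))

# Condition 2 (linearity): the scatter plot should look roughly like a line.
axes[0].scatter(x, y)
axes[0].plot(np.sort(x), result.intercept + result.slope * np.sort(x), color="red")
axes[0].set_title("Scatter plot: check linearity")

# Conditions 4 and 5 (independence, equal variance): the residuals should form
# a random horizontal band ("tube"), with no pattern and no cone shape.
axes[1].scatter(fitted, residuals)
axes[1].axhline(0, color="red")
axes[1].set_title("Residuals vs fitted: check independence / spread")

# Condition 6 (normality): the residuals should look roughly normal around zero.
axes[2].hist(residuals, bins=10)
axes[2].set_title("Residual histogram: check normality")

plt.tight_layout()
plt.show()

# A numeric companion to the histogram: a small Shapiro-Wilk P-value would
# suggest the residuals are not normally distributed.
print("Shapiro-Wilk P-value for residuals:", stats.shapiro(residuals).pvalue)

If the residual plot showed a cone or a clear pattern, or the histogram were badly skewed, the conditions above would be violated and the P-value for the slope should not be trusted.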