*** Indicate whether each statement is true (T) or false (F), and explain why. ***
(a) A 95% prediction interval for a future observation at x0 is wider than the 95% confidence interval for the mean response at x0.
(b) For a simple linear regression model y = β0 + β1x + ε, with a 95% confidence interval for the slope β1 of (-0.0416, 0.8145), we can conclude at the 0.1 significance level that x and y are not significantly linearly related to each other.
(c) The coefficient of determination R^2 is always a good measure of comparison between two models.
(d) The estimator σ̂^2 = MSE has a normal distribution.
(e) A 95% confidence interval for the slope β1 will be wider if we have a sample size of n = 11 instead of n = 7.
(f) In a simple linear regression model, where the errors are independent and normally distributed, the least squares estimator β̂0 also has a normal distribution.
(g) The prediction is trustworthy even if we are in the region where the values of X are extrapolated.
(h) The residual is the difference between the observed value of the dependent variable and the predicted value of the dependent variable.
(i) If the p-value for testing H0 : β1 = 0 vs H1 : β1 ≠ 0 is less than the significance level α, then we reject the null hypothesis and conclude that there is no significant linear relationship between x and y.
Ans (a)- True
To illustrate the difference, imagine that we could get perfect estimates of the β coefficients. Then our estimate of E[y∣x0] would be exact, and the confidence "interval" for the mean response would shrink to a point. We still would not know y itself, because a future observation also includes the error term ε. The prediction interval accounts for both sources of uncertainty (estimating the mean and the error term), so its standard error carries an extra "1 +" under the square root.
Hence, at the same x0 and the same confidence level, the prediction interval is always wider than the confidence interval.
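As a rough numerical check, the two half-widths can be computed from the standard simple-regression formulas. The sketch below is illustrative only: the data, the point x0 = 4, and the hard-coded critical value t_{0.975, 6} ≈ 2.447 (for n = 8) are assumptions, not part of the question.

```python
import math

# Hypothetical data (not from the question), n = 8 points.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 2.9, 4.2, 4.8, 6.1, 7.2, 7.8, 9.1]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Least squares estimates of slope and intercept.
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

# MSE = SSE / (n - 2) estimates the error variance.
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
mse = sse / (n - 2)

t_crit = 2.447  # t_{0.975, 6} from a standard t table (assumed, n = 8)
x0 = 4.0
leverage = 1 / n + (x0 - x_bar) ** 2 / sxx

ci_half = t_crit * math.sqrt(mse * leverage)        # mean response at x0
pi_half = t_crit * math.sqrt(mse * (1 + leverage))  # future observation at x0
print(ci_half, pi_half)
```

Because the only difference is the extra 1 inside the square root, pi_half exceeds ci_half for any data set and any x0.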
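Ans (b)- False
You can use either p-values or confidence intervals to decide whether a result is statistically significant, and the two agree as long as the confidence level equals 1 - α. A 95% interval therefore corresponds to a significance level of 0.05, not 0.1.
The interval (-0.0416, 0.8145) contains 0, so at α = 0.05 we fail to reject H0 : β1 = 0. That does not carry over to α = 0.1: the matching 90% interval is narrower than the 95% one and may exclude 0, so the stated conclusion does not follow from the interval given.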
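The reported interval can also be rescaled to a 90% interval. This is a minimal sketch, assuming the interval is the usual t-based one (estimate ± t × standard error). The degrees of freedom are not given in the question, so several values are tried, with critical values taken from a standard t table and rounded to three decimals.

```python
lower, upper = -0.0416, 0.8145   # 95% interval from the question
estimate = (lower + upper) / 2   # point estimate of the slope
half_95 = (upper - lower) / 2    # equals t_{0.975, df} * SE

# (t_{0.95, df}, t_{0.975, df}) from a standard t table.
# The df values are assumptions; the question does not state n.
t_table = {
    1: (6.314, 12.706),
    2: (2.920, 4.303),
    5: (2.015, 2.571),
    10: (1.812, 2.228),
    30: (1.697, 2.042),
}

ci_90 = {}
for df, (t90, t95) in t_table.items():
    half_90 = half_95 * t90 / t95  # rescale the half-width to 90%
    ci_90[df] = (estimate - half_90, estimate + half_90)
    print(df, ci_90[df])
```

For each df tried, the 90% interval lies entirely above 0. Under these assumptions a test at α = 0.1 would reject H0 : β1 = 0, which is the opposite of what statement (b) claims.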
Ans (c)- False
Not always. R-squared cannot reveal whether the coefficient estimates and predictions are biased, which is why you must also assess the residual plots. In addition, R-squared never decreases when a predictor is added, so it favors the larger model whenever the two models have different numbers of terms.
R-squared does not indicate whether a regression model is adequate. You can have a low R-squared value for a good model, or a high R-squared value for a model that does not fit the data.
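A small example of a high R-squared for a model that does not fit: the sketch below fits a straight line to made-up data that are exactly quadratic. The data are hypothetical and chosen only to make the pattern obvious.

```python
# Hypothetical data: y is exactly quadratic in x, but we fit a straight line.
x = list(range(1, 11))
y = [xi ** 2 for xi in x]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
syy = sum((yi - y_bar) ** 2 for yi in y)

# Least squares line and its residuals.
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

# R^2 = 1 - SSE / SST
r_squared = 1 - sum(e ** 2 for e in residuals) / syy

print(round(r_squared, 3))
print([round(e, 1) for e in residuals])
```

R-squared comes out at about 0.95, yet the residuals are positive at both ends and negative in the middle. That systematic curve shows the straight line is the wrong model, and only the residuals reveal it.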
Ans (d)- False
MSE = SSE/(n - 2) is a sum of squared residuals divided by its degrees of freedom, so it can never be negative and its distribution is skewed to the right. When the errors are independent and normally distributed, (n - 2)·MSE/σ^2 has a chi-square distribution with n - 2 degrees of freedom, not a normal distribution.
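Under normal errors, (n - 2)·MSE/σ^2 follows a chi-square distribution with n - 2 degrees of freedom, and this can be checked by simulation. The sketch below is one way to do so. The x values, true line, σ, seed, and number of replications are all arbitrary assumptions.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical setup: n = 7 fixed x values, true line 1 + 2x, sigma = 1.
x = [1, 2, 3, 4, 5, 6, 7]
n = len(x)
sigma = 1.0
x_bar = sum(x) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)

def simulate_mse():
    """Draw one sample from the model and return MSE = SSE / (n - 2)."""
    y = [1.0 + 2.0 * xi + random.gauss(0.0, sigma) for xi in x]
    y_bar = sum(y) / n
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    return sse / (n - 2)

# Scaled statistic (n - 2) * MSE / sigma^2 over many simulated samples.
scaled = [(n - 2) * simulate_mse() / sigma ** 2 for _ in range(5000)]

mean = sum(scaled) / len(scaled)
var = sum((s - mean) ** 2 for s in scaled) / len(scaled)
skewness = sum((s - mean) ** 3 for s in scaled) / len(scaled) / var ** 1.5
print(mean, skewness)
```

A chi-square variable with 5 degrees of freedom has mean 5 and clearly positive skewness, whereas a normal distribution has skewness 0. The simulated values should sit near the former.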
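Ans (e)- False
All else being equal, increasing the sample size makes the confidence interval narrower: the standard error of the slope, s/√Sxx, shrinks as Sxx grows with more observations, and the t critical value with n - 2 degrees of freedom also decreases.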
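The half-width of the slope interval is t_{0.975, n-2} · s/√Sxx, so both factors can be compared for n = 7 and n = 11. The sketch below assumes evenly spaced x values 1, ..., n and the same residual standard deviation s in both cases. Neither is given in the question, and the critical values come from a standard t table.

```python
import math

# Assumptions for illustration: x = 1, ..., n evenly spaced and the same
# residual standard deviation s for both sample sizes.
s = 1.0
t_crit = {7: 2.571, 11: 2.262}  # t_{0.975, n-2} from a standard t table

half_width = {}
for n, t in t_crit.items():
    x = range(1, n + 1)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    half_width[n] = t * s / math.sqrt(sxx)  # half-width of the 95% CI for β1
    print(n, half_width[n])
```

Under these assumptions the interval for n = 11 is less than half as wide as the one for n = 7.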
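Ans (f)- True
The least squares estimator of β0 is a linear combination of the observations y1, ..., yn. When the errors are independent and normally distributed, the yi are independent normal random variables, and any linear combination of independent normal variables is itself normal.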
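Ans (g)- False
The regression model is fitted only to the observed range of x, so it is "by construction" an interpolation model. The data give no evidence that the same linear relationship holds outside that range, so predictions in the extrapolated region should not be trusted unless this is properly justified.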
Ans (h)- True
The difference between the observed value of the dependent variable (y) and the predicted value (ŷ) is called the residual (e = y - ŷ). Each data point has one residual. When the model includes an intercept, both the sum and the mean of the residuals are equal to zero.
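The definition can be checked on a small made-up data set. The numbers below are hypothetical and serve only to show the residuals of a least squares line with an intercept summing to zero, up to floating-point rounding.

```python
# Hypothetical data (not from the question).
x = [1, 2, 3, 4, 5]
y = [2.2, 4.1, 5.9, 8.3, 9.8]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Least squares line with an intercept.
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

# Residual = observed value minus predicted value, one per data point.
predicted = [b0 + b1 * xi for xi in x]
residuals = [yi - yhat for yi, yhat in zip(y, predicted)]

print(residuals)
print(sum(residuals))
```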
Ans (i)- False
If the p-value for testing H0 : β1 = 0 vs H1 : β1 ≠ 0 is less than the significance level α, then we reject the null hypothesis and conclude that there is a significant linear relationship between x and y. The statement gets the rejection rule right but draws the opposite conclusion.