We are interested in examining the relationship between the number of calories people consume and their weight. We randomly selected 200 people and recorded the data, with the explanatory (x) variable being the calories consumed and the response (y) variable being weight.
Regression analysis suits this problem because it helps us find trends in the data, quantify them, and make predictions. It is appropriate as long as the requirements for inference on the least-squares regression model are met. The usual requirements are: the relationship between x and y is linear, the observations are independent, the residuals are approximately normally distributed, and the residuals have constant variance across the values of x.
We want to model the Y value because the fitted relationship between X and Y over the values we observed allows us to predict Y at values of X where we have no measurement.
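As a rough illustration of the fitting and prediction step, here is a minimal sketch of a least-squares line in plain Python. The six data points, the names `fit_line` and `predict`, and the units (calories per day, pounds) are assumptions made up for the example; they are not the study's 200 observations.

```python
# Hypothetical illustration data (NOT the study's sample):
# daily calories consumed (x) and weight in pounds (y).
calories = [1800, 2000, 2200, 2400, 2600, 2800]
weights = [130, 142, 150, 163, 170, 181]


def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squared deviations of x, and sum of cross-products of deviations.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(calories, weights)


def predict(x_new):
    """Predicted weight for a calorie value, using the fitted line."""
    return intercept + slope * x_new


# Predict weight at a calorie level that was not measured but lies inside the data range.
print(f"slope = {slope:.4f}, intercept = {intercept:.2f}")
print(f"predicted weight at 2500 calories: {predict(2500):.1f}")
```

The least-squares line always passes through the point (mean of x, mean of y), which is a quick sanity check on any fit.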
If we want to predict Y for a value of X beyond the highest observed value of X, we can use our regression analysis as long as we can assume the relationship continues to be linear.
Discuss the meaning of the standard error of the estimate and how it affects the predicted values of Y for that analysis.
If we assume that the relationship continues to be linear, then we can predict values of Y outside the observed range of x. This is known as extrapolation. Remember, however, that a prediction made for an x outside the data range is less reliable than one made within the range used to estimate the regression equation.
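One way to see why extrapolated predictions are less reliable is the standard error of an individual prediction, s * sqrt(1 + 1/n + (x0 - mean_x)^2 / Sxx), which grows as x0 moves away from the mean of x. The sketch below assumes the linear model holds, so it only captures sampling uncertainty and not the additional risk that the relationship stops being linear outside the data. The data and the names `is_extrapolation` and `prediction_se` are hypothetical.

```python
import math

# Hypothetical illustration data (NOT the study's sample).
calories = [1800, 2000, 2200, 2400, 2600, 2800]
weights = [130, 142, 150, 163, 170, 181]

n = len(calories)
mean_x = sum(calories) / n
mean_y = sum(weights) / n
sxx = sum((x - mean_x) ** 2 for x in calories)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(calories, weights))
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Standard error of the estimate: sqrt(SSE / (n - 2)).
sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(calories, weights))
s = math.sqrt(sse / (n - 2))


def is_extrapolation(x0):
    """True when x0 lies outside the range of x used to fit the line."""
    return x0 < min(calories) or x0 > max(calories)


def prediction_se(x0):
    """Standard error of an individual predicted y at x0 (assumes linearity holds)."""
    return s * math.sqrt(1 + 1 / n + (x0 - mean_x) ** 2 / sxx)


for x0 in (2300, 4000):
    label = "extrapolation" if is_extrapolation(x0) else "within range"
    y_hat = intercept + slope * x0
    print(f"x = {x0} ({label}): predicted weight {y_hat:.1f}, "
          f"prediction SE {prediction_se(x0):.2f}")
```

The prediction at 4000 calories carries a noticeably larger standard error than the one at 2300, even before questioning whether a straight line is still appropriate that far out.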
The standard error measures how precisely a sample estimate approximates the population parameter we are interested in. Technically, it is the standard deviation of the sampling distribution of the estimate.
The standard error of the regression (the standard error of the estimate) is an absolute measure of the typical distance that the data points fall from the regression line, expressed in the units of Y. In other words, it indicates how well the model fits the data.
A smaller standard error is better, as it tells you that on average your predicted values deviate less from the actual data.
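For simple linear regression the standard error of the estimate is commonly computed as s = sqrt(SSE / (n - 2)), where SSE is the sum of squared residuals. The following is a minimal sketch under that definition; the data and the function name `standard_error_of_estimate` are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical illustration data (NOT the study's sample).
calories = [1800, 2000, 2200, 2400, 2600, 2800]
weights = [130, 142, 150, 163, 170, 181]


def standard_error_of_estimate(x, y):
    """Fit a least-squares line and return sqrt(SSE / (n - 2))."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual = observed y minus the value predicted by the fitted line.
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    sse = sum(r ** 2 for r in residuals)
    # n - 2 degrees of freedom: two parameters (slope, intercept) were estimated.
    return math.sqrt(sse / (n - 2))


s = standard_error_of_estimate(calories, weights)
print(f"standard error of the estimate: {s:.2f} (same units as weight)")
```

Because s is in the units of Y, it can be read directly: a value near 1.4 here would mean the observed weights typically fall about 1.4 pounds from the fitted line.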