Explain what a regression line is, in your own words. Explain, in general terms, how a regression line is determined. In two paragraphs.
A regression line is a straight line that describes how a response variable y changes as an explanatory variable x changes. We often use a regression line to predict the value of y for a given value of x.
Alternatively:
Consider two variables, x and y. If y depends on x, the relationship can be modeled with simple regression. The variables are named as follows:
y – Regressand, Dependent Variable, or Explained Variable
x – Independent Variable, Predictor, or Explanatory Variable
In a simple linear regression model where y depends on x, the regression line of y on x is:
y = a + bx
where a is the y-intercept and b is the slope.
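In statistics, a regression line is most useful when the scatterplot of the two variables shows a linear pattern and the correlation between them is strong (for example, r = 0.98). A line can be computed for any set of paired data, but it describes the data well only when the pattern is roughly linear. A regression line is simply the single line that best fits the data. Statisticians call the technique for finding this best-fitting line simple linear regression analysis using the least squares method: the line is chosen so that the sum of the squared vertical distances between the data points and the line is as small as possible.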
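To make the least squares idea concrete, here is a minimal sketch in Python. It assumes the standard closed-form result for simple linear regression: the slope is the sum of products of the x and y deviations from their means, divided by the sum of squared x deviations, and the intercept makes the line pass through the point of means. The function name fit_line and the sample data are hypothetical, chosen only for illustration.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least squares line for paired data."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")

    x_mean = sum(xs) / n
    y_mean = sum(ys) / n

    # Sum of products of deviations, and sum of squared x deviations.
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")

    slope = sxy / sxx
    # The least squares line always passes through (x_mean, y_mean).
    intercept = y_mean - slope * x_mean
    return slope, intercept


# Made-up sample data with a roughly linear pattern.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
slope, intercept = fit_line(xs, ys)
print(f"y = {intercept:.2f} + {slope:.2f}x")
```

In practice a statistics library would usually do this calculation, but the arithmetic above is all that the least squares method requires for one explanatory variable.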
The formula for the best-fitting line (or regression line) is also commonly written y = mx + b, where m is the slope of the line and b is the y-intercept. (In this form b is the intercept, whereas in y = a + bx it is the slope.) This is the same equation used to describe a line in algebra; but remember, in statistics the points don't lie perfectly on a line. The line is a model around which the data lie if a strong linear pattern exists.
The slope of a line is the change in y over the change in x. For example, a slope of 10/3 means that as the x-value increases (moves right) by 3 units, the y-value moves up by 10 units on average.
The y-intercept is the value at which the line crosses the y-axis. For example, in the equation y = 2x – 6, the line crosses the y-axis at b = –6. The coordinates of this point are (0, –6); when a line crosses the y-axis, the x-value is always 0.
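The slope and intercept readings above can be checked with a short Python sketch. The helper name predict is hypothetical; it simply evaluates y = mx + b for a given x. Fraction is used so that the slope 10/3 is represented exactly rather than as a rounded decimal.

```python
from fractions import Fraction


def predict(x, slope, intercept):
    """Evaluate the line y = slope * x + intercept at x."""
    return slope * x + intercept


# Intercept: for y = 2x - 6, the line crosses the y-axis where x = 0.
y_at_zero = predict(0, 2, -6)
print("y at x = 0:", y_at_zero)

# Slope: with slope 10/3, moving 3 units to the right changes the
# predicted y by 3 * (10/3) = 10 units. The intercept cancels out
# of the difference, so 0 is used here for simplicity.
slope = Fraction(10, 3)
rise = predict(3, slope, 0) - predict(0, slope, 0)
print("change in y over 3 units of x:", rise)
```

For real data the "on average" qualifier still matters: the line gives the predicted change in y, and individual points scatter around it.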