Here is some data from the 2013 California test takers comparing their family income and their average aggregate SAT score.
Test-taker Family Income ----- Total Mean SAT Score
$0 – $20,000 ----- 1326
$20,000 – $40,000 ----- 1402
$40,000 – $60,000 ----- 1461
$60,000 – $80,000 ----- 1497
$80,000 – $100,000 ----- 1535
$100,000 – $120,000 ----- 1569
$120,000 – $140,000 ----- 1581
$140,000 – $160,000 ----- 1604
More than $200,000 ----- 1714
(a) Enter the data into your graphing utility and determine a line of best fit, that is, a line that comes closest to most of the points. You will not be able to find a line that goes through every point. What equation seemed best? Note: The incomes are given in ranges. Use the midpoint of each range as your input value. For example, for $80,000 – $100,000 you would use 90,000.
(b) Are there points with coordinates that deviate more than others from the equation? Why do you think there are larger deviations at those places?
(c) Do you think that SAT scores correlate well with family income? In other words, can you approximately predict one from the other? Why should or shouldn't this be?
First, find the midpoint of each income range. Then enter the SAT scores in one column of an Excel sheet and the income midpoints in another column.
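The midpoint step can also be sketched in Python. This is only an illustration and not part of the original answer: the names `brackets` and `midpoint` are made up for the example, and the open-ended top bracket is left as `None` because the table gives it no upper bound.

```python
# Income brackets from the table as (low, high); high is None when open-ended.
brackets = [
    (0, 20_000), (20_000, 40_000), (40_000, 60_000), (60_000, 80_000),
    (80_000, 100_000), (100_000, 120_000), (120_000, 140_000),
    (140_000, 160_000), (200_000, None),
]
scores = [1326, 1402, 1461, 1497, 1535, 1569, 1581, 1604, 1714]


def midpoint(low, high):
    """Return the midpoint of a closed range, or None if it has no upper bound."""
    if high is None:
        return None
    return (low + high) / 2


# One midpoint per bracket, in the same order as the scores.
midpoints = [midpoint(low, high) for low, high in brackets]

for mid, score in zip(midpoints, scores):
    print(mid, score)
```

Whatever income value is used for the "More than $200,000" row has to be chosen by hand, so it is worth stating that choice next to the result.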
a. First, draw the scatter plot, then fit a linear regression line to the data set and check how well it fits.
Steps for Excel:
Select the scores and midpoints columns together and go to Insert. Click the scatter plot option and select the first chart type, "Scatter," to get the scatter plot. Next, click any one of the points, right-click, and select Add Trendline. A window opens with various options: select Linear, and of the three options at the bottom tick the second one, "Display Equation on chart." Close the window, and the chart will show the scatter plot together with the linear equation.
[Chart: scatter plot of mean SAT score against income midpoint, with the linear trendline and its equation]
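For readers without Excel, the same kind of linear trendline can be sketched with an ordinary least-squares fit in plain Python. This is a rough sketch under one loudly stated assumption: the open-ended "More than $200,000" bracket is left out because it has no midpoint, so the numbers need not match a trendline that includes that row. The helper name `fit_line` is made up for the example.

```python
# Midpoints of the eight closed income ranges and their mean SAT scores.
# Assumption: the open-ended "More than $200,000" bracket is left out.
incomes = [10_000, 30_000, 50_000, 70_000, 90_000, 110_000, 130_000, 150_000]
scores = [1326, 1402, 1461, 1497, 1535, 1569, 1581, 1604]


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared x-deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(incomes, scores)
print(f"score = {slope:.6f} * income + {intercept:.1f}")
```

Under these assumptions the slope comes out at roughly 1.9 SAT points per $1,000 of family income, with an intercept near 1344.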
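b. Yes, some points deviate more from the line than others. One likely reason is that each score is an average over a whole income range, and the variation within a range is not the same everywhere. The top bracket, "More than $200,000," is also open-ended, so it has no true midpoint and its position on the income axis is only an estimate.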
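To see which points deviate most, one can compute the residual (actual score minus predicted score) for each point. The sketch below is illustrative only: it again assumes the open-ended top bracket is left out, and `fit_line` is a made-up helper name.

```python
# Midpoints of the eight closed income ranges and their mean SAT scores.
# Assumption: the open-ended "More than $200,000" bracket is left out.
incomes = [10_000, 30_000, 50_000, 70_000, 90_000, 110_000, 130_000, 150_000]
scores = [1326, 1402, 1461, 1497, 1535, 1569, 1581, 1604]


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


slope, intercept = fit_line(incomes, scores)

# Residual = observed score minus the score predicted by the line.
residuals = [y - (slope * x + intercept) for x, y in zip(incomes, scores)]

for x, res in zip(incomes, residuals):
    print(f"income {x:>7}: residual {res:+.1f}")

# Income midpoint whose point lies farthest from the line.
worst_income = max(zip(incomes, residuals), key=lambda pair: abs(pair[1]))[0]
print("largest deviation at income", worst_income)
```

With only the eight closed brackets, the largest gap falls at the lowest income bracket, where the actual score sits below the line; the middle brackets sit slightly above it.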
c. Yes, according to the scatter plot above, SAT scores correlate well with family income: the points rise steadily and the linear regression line passes close to almost all of them. We can say there is a positive correlation between the two variables, since mean SAT scores increase as family income increases.
Therefore, we can approximately predict one from the other, since there is a positive relationship between both the variables.
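The strength of this relationship can be checked with the Pearson correlation coefficient. The following is a minimal sketch rather than the method used in the original answer; it assumes the open-ended top bracket is left out, and `pearson_r` is a made-up helper name.

```python
import math

# Midpoints of the eight closed income ranges and their mean SAT scores.
# Assumption: the open-ended "More than $200,000" bracket is left out.
incomes = [10_000, 30_000, 50_000, 70_000, 90_000, 110_000, 130_000, 150_000]
scores = [1326, 1402, 1461, 1497, 1535, 1569, 1581, 1604]


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(incomes, scores)
print(f"r = {r:.3f}")
```

A value of r close to 1 indicates a strong positive linear relationship between the bracket midpoints and the mean scores.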