Question

Here is data with y as the response variable.

(x,y): 59.2 48.5, 64.5 46.8, 47.4 13.8, -37.3 10, 71.4 84.9, 71.6 18.3, 64.4 28.3, 71.1 89.6, 56.2 6.2, 56.7 11.7, 69.1 23.8

Make a scatter plot of this data. Which point is an outlier? Enter as an ordered pair, for example (a,b), with parentheses.

1. Find the regression equation for the data set without the outlier. Enter as an equation of the form y = a + bx, rounded to three decimal places. Do not include the hat in y-hat.

2. Find the regression equation for the data set with the outlier. Enter as an equation of the form y = a + bx, rounded to three decimal places. Do not include the hat in y-hat.

3. Is this outlier an influential point?

a. Yes, the outlier appears to be an influential point.

b. No, the outlier does not appear to be an influential point.

Solutions

Expert Solution

Scatter plot: [figure not available; x on the horizontal axis, y on the vertical axis]

Outlier: (-37.3, 10). Its x-value lies far to the left of the other ten points, whose x-values range from 47.4 to 71.6.
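The visual judgment can be backed up with a simple numeric screen on the x-values. The sketch below is not part of the original solution. It assumes a rule of thumb that flags any point whose x-value lies more than three median absolute deviations (MAD) from the median x. The cutoff of 3 and the function name `flag_x_outliers` are illustrative choices, not a standard the question prescribes.

```python
from statistics import median

# Data from the question, as (x, y) pairs.
points = [
    (59.2, 48.5), (64.5, 46.8), (47.4, 13.8), (-37.3, 10), (71.4, 84.9),
    (71.6, 18.3), (64.4, 28.3), (71.1, 89.6), (56.2, 6.2), (56.7, 11.7),
    (69.1, 23.8),
]

def flag_x_outliers(pts, cutoff=3.0):
    """Return the points whose x-value is more than `cutoff` MADs from the median x.

    The cutoff is an assumed rule of thumb, not part of the original question.
    """
    xs = [x for x, _ in pts]
    center = median(xs)
    # Median absolute deviation: a spread measure that the outlier barely affects.
    mad = median([abs(x - center) for x in xs])
    return [(x, y) for x, y in pts if abs(x - center) / mad > cutoff]

flagged = flag_x_outliers(points)
print(flagged)
```

With this data the median x is 64.4 and the MAD is about 7, so the point at x = -37.3 sits roughly 14.5 MADs away while every other point stays under 3.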

Without Outlier:

X      Y      XY        X²        Y²
59.2   48.5   2871.2    3504.64   2352.25
64.5   46.8   3018.6    4160.25   2190.24
47.4   13.8   654.12    2246.76   190.44
71.4   84.9   6061.86   5097.96   7208.01
71.6   18.3   1310.28   5126.56   334.89
64.4   28.3   1822.52   4147.36   800.89
71.1   89.6   6370.56   5055.21   8028.16
56.2   6.2    348.44    3158.44   38.44
56.7   11.7   663.39    3214.89   136.89
69.1   23.8   1644.58   4774.81   566.44
Σx = 631.6
Σy = 371.9
Σxy = 24765.55
Σx² = 40486.88
Σy² = 21846.65
Sample size, n = 10
x̅ = Σx/n = 631.6/10 = 63.16
y̅ = Σy/n = 371.9/10 = 37.19
SSxx = Σx² - (Σx)²/n = 40486.88 - (631.6)²/10 = 595.024
SSyy = Σy² - (Σy)²/n = 21846.65 - (371.9)²/10 = 8015.689
SSxy = Σxy - (Σx)(Σy)/n = 24765.55 - (631.6)(371.9)/10 = 1276.346

Slope, b = SSxy/SSxx = 1276.346/595.024 = 2.145032805

y-intercept, a = y̅ - b·x̅ = 37.19 - (2.145032805)(63.16) ≈ -98.29027

Regression equation:

y = -98.290 + 2.145x
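The hand calculation above can be checked with a few lines of code. This is a minimal sketch that uses the same computational formulas (SSxx, SSxy, then b = SSxy/SSxx and a = y̅ - b·x̅). The function name `fit_line` is an illustrative choice and not from the original solution.

```python
def fit_line(xs, ys):
    """Least-squares intercept a and slope b for y = a + b*x, via SSxx and SSxy."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    ss_xx = sum(x * x for x in xs) - sum_x ** 2 / n
    ss_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    b = ss_xy / ss_xx
    a = sum_y / n - b * sum_x / n
    return a, b

# The ten points that remain after dropping the outlier (-37.3, 10).
xs = [59.2, 64.5, 47.4, 71.4, 71.6, 64.4, 71.1, 56.2, 56.7, 69.1]
ys = [48.5, 46.8, 13.8, 84.9, 18.3, 28.3, 89.6, 6.2, 11.7, 23.8]

a, b = fit_line(xs, ys)
print(f"y = {a:.3f} + {b:.3f}x")
```

Carrying full precision through the slope before rounding, as the code does, avoids the small drift that comes from plugging a rounded b into the intercept formula.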

--------------------

With Outlier:

X       Y      XY        X²        Y²
59.2    48.5   2871.2    3504.64   2352.25
64.5    46.8   3018.6    4160.25   2190.24
47.4    13.8   654.12    2246.76   190.44
-37.3   10     -373      1391.29   100
71.4    84.9   6061.86   5097.96   7208.01
71.6    18.3   1310.28   5126.56   334.89
64.4    28.3   1822.52   4147.36   800.89
71.1    89.6   6370.56   5055.21   8028.16
56.2    6.2    348.44    3158.44   38.44
56.7    11.7   663.39    3214.89   136.89
69.1    23.8   1644.58   4774.81   566.44
Σx = 594.3
Σy = 381.9
Σxy = 24392.55
Σx² = 41878.17
Σy² = 21946.65
Sample size, n = 11
x̅ = Σx/n = 594.3/11 = 54.02727273
y̅ = Σy/n = 381.9/11 = 34.71818182
SSxx = Σx² - (Σx)²/n = 41878.17 - (594.3)²/11 = 9769.761818
SSyy = Σy² - (Σy)²/n = 21946.65 - (381.9)²/11 = 8687.776364
SSxy = Σxy - (Σx)(Σy)/n = 24392.55 - (594.3)(381.9)/11 = 3759.534545

Slope, b = SSxy/SSxx = 3759.534545/9769.761818 = 0.384813327

y-intercept, a = y̅ - b·x̅ = 34.71818182 - (0.384813327)(54.02727273) ≈ 13.92777

Regression equation:

y = 13.928 + 0.385x

-------------------------

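3. Answer: a. Yes, the outlier appears to be an influential point. Including it changes the slope from 2.145 to 0.385 and the intercept from -98.290 to 13.928, so this single point substantially alters the fitted line.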
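The influence judgment can also be illustrated by fitting the line twice, once with and once without the suspect point, and comparing the coefficients. This is an informal sketch under that assumption. It does not compute a formal influence measure such as Cook's distance, and the names `fit_line` and `OUTLIER` are illustrative.

```python
def fit_line(pts):
    """Least-squares intercept a and slope b for y = a + b*x from (x, y) pairs."""
    n = len(pts)
    sum_x = sum(x for x, _ in pts)
    sum_y = sum(y for _, y in pts)
    ss_xx = sum(x * x for x, _ in pts) - sum_x ** 2 / n
    ss_xy = sum(x * y for x, y in pts) - sum_x * sum_y / n
    b = ss_xy / ss_xx
    return sum_y / n - b * sum_x / n, b

points = [
    (59.2, 48.5), (64.5, 46.8), (47.4, 13.8), (-37.3, 10), (71.4, 84.9),
    (71.6, 18.3), (64.4, 28.3), (71.1, 89.6), (56.2, 6.2), (56.7, 11.7),
    (69.1, 23.8),
]
OUTLIER = (-37.3, 10)

a_with, b_with = fit_line(points)
a_without, b_without = fit_line([p for p in points if p != OUTLIER])

print(f"with outlier:    y = {a_with:.3f} + {b_with:.3f}x")
print(f"without outlier: y = {a_without:.3f} + {b_without:.3f}x")
print(f"slope ratio (without / with): {b_without / b_with:.2f}")
```

The slope without the outlier is more than five times the slope with it, which is consistent with calling the point influential.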

