Question



Observation	x1	y1	x2	y2	x3	y3	x4	y4
1	10	8.04	10	9.14	10	7.46	8	6.58
2	8	6.95	8	8.14	8	6.77	8	5.76
3	13	7.58	13	8.74	13	12.74	8	7.71
4	9	8.81	9	8.77	9	7.11	8	8.84
5	11	8.33	11	9.26	11	7.81	8	8.47
6	14	9.96	14	8.1	14	8.84	8	7.04
7	6	7.24	6	6.13	6	6.08	8	5.25
8	4	4.26	4	3.1	4	5.39	19	12.5
9	12	10.84	12	9.13	12	8.15	8	5.56
10	7	4.82	7	7.26	7	6.42	8	7.91
11	5	5.68	5	4.74	5	5.73	8	6.89

Fit a simple linear regression model to each set of (x, y) data, i.e., one model fit to (x₁, y₁), one model fit to (x₂, y₂), one model fit to (x₃, y₃), and one model fit to (x₄, y₄).
Write down the estimated regression equation for each fitted model, together with the values of the coefficient of determination, r², and the standard error of the estimate, s = √MSE.
For each set of (x, y) data, create a scatterplot of y (vertical) versus x (horizontal) with the estimated regression line added to the plot.
For each set of (x, y) data, create a scatterplot of the residuals (vertical) versus x (horizontal). Based on each plot, do the zero mean and constant variance assumptions about the simple linear regression model error seem reasonable?
For each set of (x, y) data, create a normal probability plot of the standardized residuals. Based on each plot, does the normality assumption about the simple linear regression model error seem reasonable?
For each set of (x, y) data, are there any outliers?
For each set of (x, y) data, are there any high leverage points?
For each set of (x, y) data, are there any influential points?

Post a summary of your group’s analysis. What important “big picture” conclusions can you draw from your analysis?

Expert Solution

I used R software to solve this question.

For data (x1,y1)

R code and output:

> x1=scan('clipboard');x1
Read 11 items
[1] 10 8 13 9 11 14 6 4 12 7 5
> y1=scan('clipboard');y1
Read 11 items
[1] 8.04 6.95 7.58 8.81 8.33 9.96 7.24 4.26 10.84 4.82 5.68
> plot(x1,y1)
> fit=lm(y1~x1)
> summary(fit)

Call:
lm(formula = y1 ~ x1)

Residuals:
     Min       1Q   Median       3Q      Max
-1.92127 -0.45577 -0.04136  0.70941  1.83882

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   3.0001     1.1247   2.667  0.02573 *
x1            0.5001     0.1179   4.241  0.00217 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.237 on 9 degrees of freedom
Multiple R-squared: 0.6665, Adjusted R-squared: 0.6295
F-statistic: 17.99 on 1 and 9 DF, p-value: 0.00217

> res=fit$residuals
> plot(x1,res)
> std_res=rstandard(fit) # standardized residuals
> qqnorm(std_res)
> qqline(std_res)
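The session above reads the data from the clipboard, so it cannot be rerun as written. A minimal sketch of a reproducible variant follows. It assumes R's built-in `anscombe` data frame (columns `x1`..`x4`, `y1`..`y4`) holds the same values as the table in the question, and it fits all four models in one pass. The names `results`, `d`, and `k` are only illustrative and do not come from the original answer.

```r
# Built-in copy of Anscombe's data; assumed to match the table in the question
data(anscombe)

# Fit y_k ~ x_k for k = 1..4 and collect the summaries the question asks for
results <- do.call(rbind, lapply(1:4, function(k) {
  d <- data.frame(x = anscombe[[paste0("x", k)]],
                  y = anscombe[[paste0("y", k)]])
  fit <- lm(y ~ x, data = d)
  sm <- summary(fit)
  data.frame(set       = k,
             intercept = unname(coef(fit)[1]),
             slope     = unname(coef(fit)[2]),
             r2        = sm$r.squared,  # coefficient of determination
             s         = sm$sigma)      # residual standard error, sqrt(MSE)
}))

print(round(results, 4))
```

For the plots, calling `plot(d$x, d$y)` followed by `abline(fit)` inside the loop would add the fitted line to each scatterplot, which the single `plot(x1,y1)` call in the session does not do.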

Estimated regression equation:

ŷ₁ = 3.0001 + 0.5001 x₁

Coefficient of determination: r² = 0.6665

Standard error of the estimate: s = 1.237
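As a cross-check, the three reported numbers can be recomputed from the usual least-squares formulas rather than read off `summary()`. This is only a sketch. The data are typed in from the (x1, y1) columns of the table, and the variable names (`b0`, `b1`, `sse`, `sst`, `s_est`) are my own.

```r
# (x1, y1) values copied from the table in the question
x1 <- c(10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5)
y1 <- c(8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68)
n  <- length(x1)

# Least-squares slope and intercept
b1 <- sum((x1 - mean(x1)) * (y1 - mean(y1))) / sum((x1 - mean(x1))^2)
b0 <- mean(y1) - b1 * mean(x1)

# Error and total sums of squares
sse <- sum((y1 - (b0 + b1 * x1))^2)
sst <- sum((y1 - mean(y1))^2)

r2    <- 1 - sse / sst        # coefficient of determination
s_est <- sqrt(sse / (n - 2))  # s = sqrt(MSE), with n - 2 degrees of freedom

print(round(c(b0 = b0, b1 = b1, r2 = r2, s = s_est), 4))
```

The divisor `n - 2` matches the "9 degrees of freedom" line in the R output above.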

Scatter plot:

Residual plot:

The residual plot shows a random scatter around zero with no clear pattern, so the zero mean and constant variance assumptions seem reasonable.

Normal probability plot for standardized residuals:

The points do not lie on a straight line, hence the normality assumption is not satisfied.
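The question also asks about outliers, high leverage points, and influential points, which the session above does not compute. One possible sketch for the (x1, y1) data uses base R's `rstandard()`, `hatvalues()`, and `cooks.distance()`. The cut-offs (|standardized residual| > 2, leverage > 2p/n, Cook's distance > 1) are common rules of thumb that I am assuming here, not thresholds given in the question, and the table name `diag_tab` is illustrative.

```r
# (x1, y1) values copied from the table in the question
x1 <- c(10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5)
y1 <- c(8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68)
fit <- lm(y1 ~ x1)

n <- length(x1)
p <- length(coef(fit))  # number of estimated coefficients (2 here)

diag_tab <- data.frame(
  obs      = seq_len(n),
  x1       = x1,
  y1       = y1,
  std_res  = rstandard(fit),       # standardized residuals
  leverage = hatvalues(fit),       # diagonal of the hat matrix
  cooks_d  = cooks.distance(fit)   # Cook's distance
)

# Rule-of-thumb flags (assumed cut-offs, adjust to your course's conventions)
diag_tab$outlier       <- abs(diag_tab$std_res) > 2
diag_tab$high_leverage <- diag_tab$leverage > 2 * p / n
diag_tab$influential   <- diag_tab$cooks_d > 1

print(diag_tab, digits = 3)
```

Replacing `x1` and `y1` with another pair of columns repeats the check for the other data sets.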

orchestra answered 1 year ago
