Question

Predict Y based on the datasets in X.

X   Y
31  65
39  55
41  32
44  60
47  78
48  59
55  61
65  60
15  23
19  52

a) Construct a scatter plot. Describe the relation between the two variables.

b) Calculate and interpret the correlation coefficient value.

c) Find the equation of the least-squares regression line.

d) What value of Y would you predict when the independent variable X is 25?

e) Find and interpret the value of r^2.

Solutions

Solution-a:

Use R: read.table enters the data, plot draws the scatter plot, lm fits the regression of y on x, and coefficients extracts the fitted coefficients.

R code:

# Enter the data as a data frame with columns x and y
df1 <- read.table(header = TRUE, text = "
x y
31 65
39 55
41 32
44 60
47 78
48 59
55 61
65 60
15 23
19 52
")
df1

# Scatter plot of y against x
plot(y ~ x, data = df1, xlab = "x", ylab = "y", pch = 16)

The scatter plot shows a positive linear relationship between x and y: as x increases, y tends to increase, and as x decreases, y tends to decrease.

b) Calculate and interpret the correlation coefficient value.

i    x   y  xbar  ybar  x-xbar  y-ybar  (x-xbar)(y-ybar)  (x-xbar)^2  (y-ybar)^2
1   31  65  40.4  54.5    -9.4    10.5             -98.7       88.36      110.25
2   39  55  40.4  54.5    -1.4     0.5              -0.7        1.96        0.25
3   41  32  40.4  54.5     0.6   -22.5             -13.5        0.36      506.25
4   44  60  40.4  54.5     3.6     5.5              19.8       12.96       30.25
5   47  78  40.4  54.5     6.6    23.5             155.1       43.56      552.25
6   48  59  40.4  54.5     7.6     4.5              34.2       57.76       20.25
7   55  61  40.4  54.5    14.6     6.5              94.9      213.16       42.25
8   65  60  40.4  54.5    24.6     5.5             135.3      605.16       30.25
9   15  23  40.4  54.5   -25.4   -31.5             800.1      645.16      992.25
10  19  52  40.4  54.5   -21.4    -2.5              53.5      457.96        6.25

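r = sum((x-xbar)*(y-ybar)) / sqrt(sum((x-xbar)^2) * sum((y-ybar)^2))

Summing the last three columns of the table gives sum((x-xbar)*(y-ybar)) = 1180, sum((x-xbar)^2) = 2126.4 and sum((y-ybar)^2) = 2290.5, so

r = 1180 / sqrt(2126.4 * 2290.5)

r = 0.534680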
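The hand calculation can be checked in R. The following is a minimal sketch that rebuilds the three sums from the deviations and compares the result with the built-in cor(); the names sxy, sxx, syy and r_manual are illustrative and not part of the original solution.

```r
# Data from the question
x <- c(31, 39, 41, 44, 47, 48, 55, 65, 15, 19)
y <- c(65, 55, 32, 60, 78, 59, 61, 60, 23, 52)

# Sum of cross-products and sums of squared deviations
sxy <- sum((x - mean(x)) * (y - mean(y)))   # 1180
sxx <- sum((x - mean(x))^2)                 # 2126.4
syy <- sum((y - mean(y))^2)                 # 2290.5

# Pearson correlation computed from its definition
r_manual <- sxy / sqrt(sxx * syy)
r_manual                                    # about 0.5347

# Should agree with the built-in function
all.equal(r_manual, cor(x, y))
```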

Interpretation:

There is a moderate positive linear relationship between x and y.

As x increases, y tends to increase, and vice versa.

Solution-c:

c) Find the equation of the least-squares regression line.

R code:

# Enter the data as a data frame with columns x and y
df1 <- read.table(header = TRUE, text = "
x y
31 65
39 55
41 32
44 60
47 78
48 59
55 61
65 60
15 23
19 52
")
df1

# Scatter plot and correlation coefficient
plot(y ~ x, data = df1, xlab = "x", ylab = "y", pch = 16)
cor(df1$x, df1$y)

# Least-squares regression of y on x
linreg <- lm(y ~ x, data = df1)
coefficients(linreg)

# Add the fitted line to the scatter plot
abline(linreg)

# Round the coefficients for a tidier label
cf <- round(coef(linreg), 4)

# Choose the sign explicitly so the label never reads "+ -" for a negative slope
eq <- paste0("y = ", cf[1],
             ifelse(cf[2] >= 0, " + ", " - "), abs(cf[2]), " x")

# Print the equation inside the top margin of the plot
mtext(eq, side = 3, line = -2)

Output from regression of y on x

summary(linreg)

Call:
lm(formula = y ~ x, data = df1)

Residuals:
     Min       1Q   Median       3Q      Max
-22.8330  -6.5139   0.7797   7.9072  19.8375

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  32.0809    13.3185   2.409   0.0426 *
x             0.5549     0.3101   1.790   0.1113
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 14.3 on 8 degrees of freedom
Multiple R-squared: 0.2859,   Adjusted R-squared: 0.1966
F-statistic: 3.203 on 1 and 8 DF, p-value: 0.1113

From the coefficients(linreg) output:

(Intercept)          x
 32.0808879  0.5549285

The least-squares regression line is

y^ = 32.0808879 + 0.5549285 * x
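The same coefficients follow from the closed-form least-squares formulas: slope = sum((x-xbar)*(y-ybar)) / sum((x-xbar)^2) and intercept = ybar - slope * xbar. The sketch below applies them and compares the result with lm; the names b0, b1 and fit are illustrative choices.

```r
# Data from the question
x <- c(31, 39, 41, 44, 47, 48, 55, 65, 15, 19)
y <- c(65, 55, 32, 60, 78, 59, 61, 60, 23, 52)

# Closed-form least-squares estimates
b1 <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)  # slope = 1180 / 2126.4
b0 <- mean(y) - b1 * mean(x)                                     # intercept = 54.5 - b1 * 40.4
c(intercept = b0, slope = b1)

# Should match the coefficients returned by lm
fit <- lm(y ~ x)
all.equal(unname(coef(fit)), c(b0, b1))
```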

Solution-d:

d) What value of Y would you predict when the independent variable X is 25?

Substitute x = 25 into the fitted line:

y^ = 32.0808879 + 0.5549285 * 25

= 45.9541

The predicted value of y at x = 25 is about 45.95.
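In R the substitution can be left to predict. This is a small sketch assuming the same data frame and model names as the solution code (df1, linreg); new_x and y_hat are illustrative names.

```r
# Data from the question and the fitted model
df1 <- data.frame(
  x = c(31, 39, 41, 44, 47, 48, 55, 65, 15, 19),
  y = c(65, 55, 32, 60, 78, 59, 61, 60, 23, 52)
)
linreg <- lm(y ~ x, data = df1)

# predict() needs a data frame whose column name matches the predictor
new_x <- data.frame(x = 25)
y_hat <- predict(linreg, newdata = new_x)
y_hat            # about 45.9541

# x = 25 lies inside the observed range of x, so this is interpolation
range(df1$x)     # 15 65
```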

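e) Find and interpret the value of r^2.

r^2 = 0.2859 (the Multiple R-squared in the regression output, equal to 0.534680^2).

About 28.59% of the variation in y is explained by the linear regression on x.

This is not a good model: it leaves about 71% of the variation in y unexplained, and the slope's p-value of 0.1113 is not significant at the 5% level.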
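As a check, r^2 can be computed as one minus the ratio of residual to total sum of squares. The sketch below compares that value with the squared correlation and with the value stored in the model summary; sse, sst and r_sq are illustrative names.

```r
# Data from the question
x <- c(31, 39, 41, 44, 47, 48, 55, 65, 15, 19)
y <- c(65, 55, 32, 60, 78, 59, 61, 60, 23, 52)
fit <- lm(y ~ x)

# Residual and total sums of squares
sse <- sum(residuals(fit)^2)
sst <- sum((y - mean(y))^2)     # 2290.5

# Proportion of variation in y explained by the line
r_sq <- 1 - sse / sst
r_sq                            # about 0.2859

# For simple linear regression this equals the squared correlation
all.equal(r_sq, cor(x, y)^2)
all.equal(r_sq, summary(fit)$r.squared)
```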

