EX1 | EX2 | EX3 | FINAL |
--- | --- | --- | --- |
73 | 80 | 75 | 152 |
93 | 88 | 93 | 185 |
89 | 91 | 90 | 180 |
96 | 98 | 100 | 196 |
73 | 66 | 70 | 142 |
53 | 46 | 55 | 101 |
69 | 74 | 77 | 149 |
47 | 56 | 60 | 115 |
87 | 79 | 90 | 175 |
79 | 70 | 88 | 164 |
69 | 70 | 73 | 141 |
70 | 65 | 74 | 141 |
93 | 95 | 91 | 184 |
79 | 80 | 73 | 152 |
70 | 73 | 78 | 148 |
93 | 89 | 96 | 192 |
78 | 75 | 68 | 147 |
81 | 90 | 93 | 183 |
88 | 92 | 86 | 177 |
78 | 83 | 77 | 159 |
82 | 86 | 90 | 177 |
86 | 82 | 89 | 175 |
78 | 83 | 85 | 175 |
76 | 83 | 71 | 149 |
96 | 93 | 95 | 192 |
The following data give three exam scores and one final exam score for each student. Using the data, create a multiple linear regression model to predict final exam scores.
#### a
What is the correct model for all three tests?
#### b
How accurate is the model? Interpret the R-squared value.
#### c
Check the conditions for a linear regression.
#### d
Interpret the intercept and discuss its implications.
#### e
Interpret the EX3 estimate in context.
#### f
Write a results sentence reporting your findings.
The given data contain three exam scores and a final exam score for each of 25 students. We can fit a multiple regression model with the final score as the dependent (response) variable and the exam 1, 2, and 3 scores as the independent variables.
To fit the regression model and predict the final exam score, we first import the scores data into R.
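> #import the data in R (fileEncoding drops the byte-order mark that otherwise renames the first column to ï..final)
> score=read.csv(file="C:/Users/shree/Desktop/score.csv",header = T,fileEncoding="UTF-8-BOM")
> #fit the model with final as the response and the three exams as predictors
> model=lm(final~ex1+ex2+ex3, data=score)
> summary(model)
Call:
lm(formula = final ~ ex1 + ex2 + ex3, data = score)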
Residuals:
Min 1Q Median 3Q Max
-3.7452 -1.6328 -0.2984 0.8046 7.3111
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -4.3361 3.7642 -1.152 0.26230
ex1 0.3559 0.1214 2.932 0.00796 **
ex2 0.5425 0.1008 5.379 2.46e-05 ***
ex3 1.1674 0.1030 11.333 2.08e-10 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 2.614 on 21 degrees of freedom
Multiple R-squared: 0.9897, Adjusted R-squared: 0.9882
F-statistic: 670.1 on 3 and 21 DF, p-value: < 2.2e-16
In the table above, all three predictors are significant at the 5% level of significance.
The adjusted R-squared is 0.9882 (about 98.8%), which indicates that the model fits the data very well.
a)
The correct model for all three tests is
Y = b0 + b1 x1 + b2 x2 + b3 x3,
where Y is the final score and x1, x2, x3 are the scores on exams 1, 2, and 3.
That is,
final_score = -4.34 + 0.356*ex1 + 0.543*ex2 + 1.167*ex3
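As a quick check of how the fitted equation is used, the sketch below plugs the first student in the table (EX1 = 73, EX2 = 80, EX3 = 75, observed FINAL = 152) into the coefficients reported by `summary(model)`. The vector names `b` and `student` are illustrative and not part of the original answer.

```r
# Coefficients copied from the summary(model) output above
b <- c(intercept = -4.3361, ex1 = 0.3559, ex2 = 0.5425, ex3 = 1.1674)

# First row of the data table: three exam scores and the observed final score
student <- c(ex1 = 73, ex2 = 80, ex3 = 75)
observed <- 152

# Predicted final score = intercept + sum of (slope * exam score)
predicted <- unname(b["intercept"] + sum(b[names(student)] * student))
residual <- observed - predicted

print(round(predicted, 2))  # 152.6
print(round(residual, 2))   # -0.6
```

The prediction is within about one point of the observed score, which is in line with the residual standard error of 2.614.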
b)
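The p-value of the overall F-test is less than 0.05 (p < 2.2e-16), so the model as a whole is statistically significant. R-squared is 0.9897 and adjusted R-squared is 0.9882, which indicates that about 99% of the variation in the response variable (final_score) is explained by the independent variables (ex1, ex2, and ex3). The residual standard error is 2.614, so predictions are typically off by roughly 2.6 points on the final score.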
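The link between the two R-squared values in the output can be verified by hand. The sketch below assumes n = 25 students (consistent with 21 residual degrees of freedom and 4 estimated coefficients) and p = 3 predictors; the variable names are illustrative.

```r
# Values taken from the summary(model) output
r2 <- 0.9897   # Multiple R-squared
n  <- 25       # number of students (21 residual df + 4 coefficients)
p  <- 3        # number of predictors: ex1, ex2, ex3

# Adjusted R-squared penalises R-squared for the number of predictors
adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - p - 1)

print(round(adj_r2, 4))  # 0.9882
```

The result agrees with the adjusted R-squared printed by R, so the small gap between the two values reflects only the penalty for using three predictors.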
c)
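The conditions for a linear regression are:
The relationship between the response and each independent variable is linear.
The errors are uncorrelated and have constant variance.
The errors are approximately normally distributed.
No severe multicollinearity is present in the data (that is, the independent variables are not highly correlated with one another).
The summary output alone does not verify these conditions; they should be checked with residual plots and the correlations among ex1, ex2, and ex3.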
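One possible way to check these conditions in R is sketched below. It re-enters the 25 rows from the table so that it runs without the CSV file; the names `scores`, `fit`, `pred_cor`, and `vif` are illustrative, and the variance inflation factors are computed by hand rather than with an add-on package. Independence of the errors is mostly a question of how the data were collected and is not tested here.

```r
# Data typed in from the table in the question (25 students)
scores <- data.frame(
  ex1   = c(73, 93, 89, 96, 73, 53, 69, 47, 87, 79, 69, 70, 93,
            79, 70, 93, 78, 81, 88, 78, 82, 86, 78, 76, 96),
  ex2   = c(80, 88, 91, 98, 66, 46, 74, 56, 79, 70, 70, 65, 95,
            80, 73, 89, 75, 90, 92, 83, 86, 82, 83, 83, 93),
  ex3   = c(75, 93, 90, 100, 70, 55, 77, 60, 90, 88, 73, 74, 91,
            73, 78, 96, 68, 93, 86, 77, 90, 89, 85, 71, 95),
  final = c(152, 185, 180, 196, 142, 101, 149, 115, 175, 164, 141, 141, 184,
            152, 148, 192, 147, 183, 177, 159, 177, 175, 175, 149, 192)
)

fit <- lm(final ~ ex1 + ex2 + ex3, data = scores)
res <- residuals(fit)

# Linearity and constant variance: look for curvature or a funnel shape
plot(fitted(fit), res, xlab = "Fitted final score", ylab = "Residual")
abline(h = 0, lty = 2)

# Normality of the errors: Q-Q plot plus a Shapiro-Wilk test
qqnorm(res)
qqline(res)
shapiro_p <- shapiro.test(res)$p.value

# Multicollinearity: pairwise correlations and hand-computed VIFs,
# where VIF_j = 1 / (1 - R^2 of predictor j regressed on the others)
predictors <- c("ex1", "ex2", "ex3")
pred_cor <- cor(scores[, predictors])
vif <- sapply(predictors, function(v) {
  aux <- lm(reformulate(setdiff(predictors, v), response = v), data = scores)
  1 / (1 - summary(aux)$r.squared)
})

print(round(pred_cor, 2))
print(round(vif, 2))
print(shapiro_p)
```

Exam scores for the same students tend to move together, so the correlations and VIFs deserve a close look; a common rule of thumb treats VIF values above about 5 to 10 as a sign of problematic multicollinearity.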
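d)
The intercept in the model is -4.34, which is negative. It is the expected final score when all independent variables (ex1, ex2, and ex3) are set to zero. A negative final score is impossible, and no student in the data scored anywhere near zero on the three exams (the lowest scores are 47, 46, and 55), so the intercept is an extrapolation with no practical meaning. It is also not significantly different from zero (p = 0.262).
e)
The ex3 estimate is 1.1674. Holding the ex1 and ex2 scores fixed, each additional point on exam 3 is associated with an average increase of about 1.17 points in the final score (p = 2.08e-10).
f)
Scores on exams 1, 2, and 3 together significantly predict the final score, F(3, 21) = 670.1, p < 0.001, R-squared = 0.99. All three exams are significant predictors at the 5% level, and exam 3 has the largest estimated effect (1.17 points on the final per exam point).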
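To put the intercept and the ex3 slope in context, 95% confidence intervals can be built from the estimates and standard errors in the summary table. This is a minimal sketch using the reported numbers and the 21 residual degrees of freedom; with the fitted object available, `confint(model)` should give the same intervals. The helper `ci95` is illustrative.

```r
# Residual degrees of freedom from the summary output
df <- 21
t_crit <- qt(0.975, df)  # two-sided 95% critical value

# Confidence interval: estimate +/- t_crit * standard error
ci95 <- function(estimate, se) estimate + c(-1, 1) * t_crit * se

ci_intercept <- ci95(-4.3361, 3.7642)  # roughly -12.2 to 3.5
ci_ex3       <- ci95(1.1674, 0.1030)   # roughly 0.95 to 1.38

print(round(ci_intercept, 2))
print(round(ci_ex3, 2))
```

The interval for the intercept contains zero, which matches its non-significant p-value, while the interval for ex3 lies entirely above zero.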
R-Code
#import the data in R
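score=read.csv(file="C:/Users/shree/Desktop/score.csv",header = T,fileEncoding="UTF-8-BOM") #fileEncoding drops the byte-order mark that otherwise renames the first column to ï..final
str(score) #check structure of data
head(score)
summary(score)
#fit the model with all three exams as predictors
model=lm(final~ex1+ex2+ex3,data=score)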
summary(model)