Consider the following data to be used in a regression.
xi yi
1 0
2 10
3 25
4 30
5 35
(a) Find the values of b0 and b1.
(b) Find the Coefficient of Determination.
(c) Find the estimated standard deviation of b1 and the corresponding t-statistic. At the 1% level of significance, can you reject the null hypothesis? Make sure you state the null and alternative hypotheses.
(d) Find the F-statistic. Is the equation significant at the 1% level? Make sure you state the null and alternative hypotheses. Use the p-value approach.
We will use R to obtain the regression output.
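The regression model is
y = b0 + b1 x
where the estimates of b0 and b1 are given by
$\hat{b}_1 = \mathrm{cov}(x, y) / \mathrm{var}(x)$
$\hat{b}_0 = \bar{y} - \hat{b}_1 \bar{x}$
where $\bar{y}$ is the mean of the yi and $\bar{x}$ is the mean of the xi.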
From R
First we will import data into R
> x=c(1,2,3,4,5)
> y=c(0,10,25,30,35)
> data.frame(x,y)
x y
1 1 0
2 2 10
3 3 25
4 4 30
5 5 35
# to fit the regression model in R we use the command "lm()"
> fit=lm(y~x)
> summary(fit)
Call:
lm(formula = y ~ x)
Residuals:
1 2 3 4 5
-2 -1 5 1 -3
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   -7.000      3.830  -1.828   0.1650
x              9.000      1.155   7.794   0.0044 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 3.651 on 3 degrees of freedom
Multiple R-squared: 0.9529, Adjusted R-squared: 0.9373
F-statistic: 60.75 on 1 and 3 DF, p-value: 0.004395
(a) Find the values of b0 and b1.
From the Estimate column of the R output, the estimate of b0 is -7.0
and the estimate of b1 is 9.0, so the fitted line is y = -7 + 9x.
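As a cross-check, the two estimates can be reproduced by hand from the covariance and mean formulas. The following is a minimal sketch; the variable names b0 and b1 are chosen here only for illustration.

```r
# Data from the problem statement
x <- c(1, 2, 3, 4, 5)
y <- c(0, 10, 25, 30, 35)

# Slope: sample covariance of x and y divided by the sample variance of x
b1 <- cov(x, y) / var(x)

# Intercept: makes the fitted line pass through (mean(x), mean(y))
b0 <- mean(y) - b1 * mean(x)

print(c(b0 = b0, b1 = b1))
```

Both values should agree with the Estimate column printed by summary(fit).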
(b) Find the Coefficient of Determination.
The coefficient of determination, R2, is the proportion of the total variation in y that is explained by the regression on x.
From the R output:
Multiple R-squared: 0.9529
Thus the Coefficient of Determination is 0.9529, i.e. about 95.29% of the variation in y is explained by x.
{
For a simple linear regression, the coefficient of determination R2 can also be computed as
$R^2 = r^2$
where $r = \mathrm{cov}(x, y) / (s_x s_y)$
Here $s_x$ is the standard deviation of the xi and $s_y$ is the standard deviation of the yi.
> r = cov(x,y)/(sd(x)*sd(y))  # correlation between xi and yi
> r
[1] 0.9761871
> R2 = r^2  # Coefficient of Determination
> R2
[1] 0.9529412
}
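The same value can be obtained from the sums of squares, which shows directly what "proportion of variation explained" means. This is a sketch under the assumption that the model is fitted with lm() as above; the names sse, sst and r2 are illustrative.

```r
x <- c(1, 2, 3, 4, 5)
y <- c(0, 10, 25, 30, 35)
fit <- lm(y ~ x)

# Residual (unexplained) sum of squares
sse <- sum(resid(fit)^2)

# Total sum of squares of y about its mean
sst <- sum((y - mean(y))^2)

# Share of the total variation that the regression explains
r2 <- 1 - sse / sst
print(r2)
```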
(c) Find the estimated standard deviation of b1 and the corresponding t-statistic. At the 1% level of significance, can you reject the null hypothesis?
From the R output we obtained:
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   -7.000      3.830  -1.828   0.1650
x              9.000      1.155   7.794   0.0044 **
The estimated standard deviation of b1 is its standard error, which is read directly from the Std. Error column in the row for x:
SE(b1) = 1.155
It equals $s / \sqrt{S_{xx}}$, where s = 3.651 is the residual standard error and $S_{xx} = \sum (x_i - \bar{x})^2 = 10$:
SE(b1) = 3.651 / $\sqrt{10}$ = 1.155
Thus the estimated standard deviation of b1 is 1.155.
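The standard error that summary() reports for the slope can be reproduced from the residuals. The sketch below assumes the usual simple-regression formula SE(b1) = s / sqrt(Sxx); the names sxx, s, se_b1 and t_stat are illustrative.

```r
x <- c(1, 2, 3, 4, 5)
y <- c(0, 10, 25, 30, 35)
n <- length(x)

# Least-squares estimates
sxx <- sum((x - mean(x))^2)
b1 <- sum((x - mean(x)) * (y - mean(y))) / sxx
b0 <- mean(y) - b1 * mean(x)

# Residual standard error on n - 2 degrees of freedom
res <- y - (b0 + b1 * x)
s <- sqrt(sum(res^2) / (n - 2))

# Estimated standard deviation (standard error) of the slope, and its t-statistic
se_b1 <- s / sqrt(sxx)
t_stat <- b1 / se_b1
print(c(se_b1 = se_b1, t_stat = t_stat))
```

The two printed values should match the Std. Error and t value entries for x up to rounding.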
The null and alternative hypotheses are
H0 : b1 = 0 (the slope term does not contribute significantly to the model)
H1 : b1 ≠ 0 (the slope term contributes significantly to the model)
The test statistic is TS = b1 / SE(b1) = 9.000 / 1.155 = 7.792208
(R reports 7.794 because it uses the unrounded standard error.)
Thus the test statistic is 7.792208.
We reject the null hypothesis if the absolute value of the test statistic, |TS|, is greater than $t_{\alpha/2,\, n-2}$.
Here $t_{\alpha/2,\, n-2} = t_{0.01/2,\, 5-2} = t_{0.005,\, 3}$,
i.e. the critical value of the t-distribution with n-(1+1) = 5-2 = 3 degrees of freedom at the 1% level of significance.
This t-table value can be obtained from a statistics textbook or from software such as R.
From R
> qt(1-0.01/2,3)  # t-distribution t-table value
[1] 5.840909
Here the test statistic is |TS| = 7.792208
and the t-table value is 5.840909.
Thus |TS| > t-table value.
So at the 1% level of significance we reject the null hypothesis and conclude that the slope term b1 contributes significantly to the model.
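The critical-value decision can be cross-checked with the two-sided p-value of the t-test. This is a minimal sketch that pulls the t-statistic from the fitted model; the names t_stat, df and p_value are illustrative.

```r
x <- c(1, 2, 3, 4, 5)
y <- c(0, 10, 25, 30, 35)
fit <- lm(y ~ x)

# t-statistic for the slope and the residual degrees of freedom
t_stat <- coef(summary(fit))["x", "t value"]
df <- df.residual(fit)

# Two-sided p-value: probability of a |t| at least this large under H0
p_value <- 2 * pt(abs(t_stat), df, lower.tail = FALSE)
print(p_value)

# TRUE means H0 is rejected at the 1% level
print(p_value < 0.01)
```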
(d) Find the F-statistic. Is the equation significant at the 1% level? Make sure you state the null and alternative hypotheses. Use the p-value approach.
The null and alternative hypotheses are
H0 : b1 = 0 (the model is not significant)
H1 : b1 ≠ 0 (the model is significant)
We obtained the ANOVA table in R as follows:
> anova(fit)
Analysis of Variance Table
Response: y
          Df Sum Sq Mean Sq F value   Pr(>F)
x          1    810  810.00   60.75 0.004395 **
Residuals  3     40   13.33
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
The F-statistic is given by
F = MSR / MSRes = (810 / 1) / (40 / 3) = 810 / 13.33 = 60.75
Thus
F-statistic = 60.75
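The F-statistic can be rebuilt from the sums of squares, and in a regression with a single predictor it equals the square of the slope's t-statistic. The sketch below illustrates both points; the names ssr, sse, f_stat and t_stat are illustrative.

```r
x <- c(1, 2, 3, 4, 5)
y <- c(0, 10, 25, 30, 35)
fit <- lm(y ~ x)

# Regression and residual sums of squares
ssr <- sum((fitted(fit) - mean(y))^2)
sse <- sum(resid(fit)^2)

# F = MSR / MSRes, with 1 and n - 2 degrees of freedom
f_stat <- (ssr / 1) / (sse / df.residual(fit))
print(f_stat)

# With one predictor, F is the square of the slope's t-statistic
t_stat <- coef(summary(fit))["x", "t value"]
print(isTRUE(all.equal(f_stat, t_stat^2)))
```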
Using the p-value approach
The p-value is 0.004395376 { Pr(>F) = 0.004395 ** }
We reject the null hypothesis if the p-value is less than 0.01
{ 0.01 because we are given the 1% level of significance }
Here p-value = 0.004395376 < 0.01
Thus we reject the null hypothesis at the 1% level of significance
and conclude that the model is significant.
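The p-value in the ANOVA table is the upper-tail probability of the F distribution with 1 and 3 degrees of freedom. As a sketch, it can be recomputed from the F-statistic alone; f_stat, alpha, p_value and f_crit are illustrative names, and the critical value is included only for comparison with the p-value approach.

```r
# F-statistic and degrees of freedom taken from the ANOVA table above
f_stat <- 60.75
alpha <- 0.01

# Upper-tail probability of F(1, 3) beyond the observed statistic
p_value <- pf(f_stat, df1 = 1, df2 = 3, lower.tail = FALSE)
print(p_value)

# Equivalent decision via the critical value of F(1, 3) at the 1% level
f_crit <- qf(1 - alpha, df1 = 1, df2 = 3)
print(c(reject_by_p = p_value < alpha, reject_by_crit = f_stat > f_crit))
```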