Describe how to use a simple (bivariate) regression model to carry out a difference in the means test, to estimate a descriptive statistic, and to estimate an unbiased (or less biased) causal effect.
Solution: Simple bivariate regression
Here we use a case study to illustrate bivariate regression analysis in SPSS.
We have two variables, one independent and one dependent:
Years = independent variable (number of years in employment)
Salary = dependent variable (salary in rupees)
H0: the independent variable has no significant linear effect on the dependent variable (the regression slope is zero).
H1: the independent variable has a significant linear effect on the dependent variable (the regression slope is not zero).
After entering the data in SPSS, set the following in Variable View:
Name = years, salary
Type = Numeric
Measure = Scale
To list the data in the SPSS output, choose
Analyze - Reports - Case Summaries,
then move both variables, years and salary, into the Variables box and click OK.
We get:
Case Processing Summary (a)

| Variable | Included N | Included Percent | Excluded N | Excluded Percent | Total N | Total Percent |
|---|---|---|---|---|---|---|
| Number of years in employment | 50 | 100.0% | 0 | 0.0% | 50 | 100.0% |
| Salary in rupees | 50 | 100.0% | 0 | 0.0% | 50 | 100.0% |

a. Limited to first 100 cases.
Case Summaries (a)

| Case | Number of years in employment | Salary in rupees |
|---|---|---|
| 1 | 1.00 | 35000.00 |
| 2 | 1.00 | 15000.00 |
| 3 | 1.00 | 26000.00 |
| 4 | 1.00 | 37000.00 |
| 5 | 1.00 | 38000.00 |
| 6 | 2.00 | 27000.00 |
| 7 | 2.00 | 45000.00 |
| 8 | 2.00 | 50000.00 |
| 9 | 2.00 | 36000.00 |
| 10 | 2.00 | 40000.00 |
| 11 | 3.00 | 45000.00 |
| 12 | 3.00 | 40000.00 |
| 13 | 3.00 | 38000.00 |
| 14 | 3.00 | 6000.00 |
| 15 | 3.00 | 46000.00 |
| 16 | 4.00 | 30000.00 |
| 17 | 4.00 | 32000.00 |
| 18 | 4.00 | 62000.00 |
| 19 | 4.00 | 45000.00 |
| 20 | 4.00 | 21000.00 |
| 21 | 4.00 | 55000.00 |
| 22 | 4.00 | 47000.00 |
| 23 | 5.00 | 50000.00 |
| 24 | 5.00 | 39000.00 |
| 25 | 5.00 | 46000.00 |
| 26 | 5.00 | 50000.00 |
| 27 | 6.00 | 44000.00 |
| 28 | 6.00 | 46000.00 |
| 29 | 6.00 | 3000.00 |
| 30 | 7.00 | 55000.00 |
| 31 | 7.00 | 56000.00 |
| 32 | 7.00 | 46000.00 |
| 33 | 8.00 | 57000.00 |
| 34 | 8.00 | 40000.00 |
| 35 | 9.00 | 55000.00 |
| 36 | 9.00 | 53000.00 |
| 37 | 9.00 | 44000.00 |
| 38 | 10.00 | 80000.00 |
| 39 | 10.00 | 65000.00 |
| 40 | 12.00 | 69000.00 |
| 41 | 12.00 | 65000.00 |
| 42 | 13.00 | 82000.00 |
| 43 | 14.00 | 85000.00 |
| 44 | 14.00 | 80000.00 |
| 45 | 14.00 | 65000.00 |
| 46 | 15.00 | 57000.00 |
| 47 | 17.00 | 59000.00 |
| 48 | 19.00 | 70000.00 |
| 49 | 20.00 | 96000.00 |
| 50 | 22.00 | 95000.00 |
| Total N | 50 | 50 |

a. Limited to first 100 cases.
Assumptions:
(1) Linearity: the relationship between the dependent and independent variables should be linear.
(2) The error terms should have constant variance.
(3) The error terms should be independent of each other.
(4) The error terms should be normally distributed.
First, we check the linearity of the data with a scatterplot of salary against number of years.
The plot shows a positive correlation between salary and number of years.
Next, we run the regression (in SPSS: Analyze - Regression - Linear, with salary as the dependent variable and years as the independent variable)
and interpret the SPSS output.
Recall that
the regression equation is: Y = a + b X
where Y = dependent variable,
a = intercept, b = slope, X = independent variable.
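As a cross-check on the SPSS output, the intercept a and slope b of Y = a + b X can also be computed directly from the usual least-squares formulas. The following is only a minimal sketch in plain Python, not part of the SPSS workflow; the data list is transcribed from the Case Summaries listing, and the names `data` and `fit_line` are made up for this illustration.

```python
# (years in employment, salary in rupees) for the 50 cases,
# transcribed from the Case Summaries listing.
data = [
    (1, 35000), (1, 15000), (1, 26000), (1, 37000), (1, 38000),
    (2, 27000), (2, 45000), (2, 50000), (2, 36000), (2, 40000),
    (3, 45000), (3, 40000), (3, 38000), (3, 6000), (3, 46000),
    (4, 30000), (4, 32000), (4, 62000), (4, 45000), (4, 21000),
    (4, 55000), (4, 47000), (5, 50000), (5, 39000), (5, 46000),
    (5, 50000), (6, 44000), (6, 46000), (6, 3000), (7, 55000),
    (7, 56000), (7, 46000), (8, 57000), (8, 40000), (9, 55000),
    (9, 53000), (9, 44000), (10, 80000), (10, 65000), (12, 69000),
    (12, 65000), (13, 82000), (14, 85000), (14, 80000), (14, 65000),
    (15, 57000), (17, 59000), (19, 70000), (20, 96000), (22, 95000),
]


def fit_line(pairs):
    """Return (a, b) for the least-squares line Y = a + b X."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    # Sum of cross-products and sum of squares about the means.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    s_xx = sum((x - mean_x) ** 2 for x, _ in pairs)
    b = s_xy / s_xx            # slope
    a = mean_y - b * mean_x    # intercept: the line passes through the means
    return a, b


a, b = fit_line(data)
print(f"a = {a:.3f}, b = {b:.3f}")
```

With the 50 cases as listed, the result should agree with the intercept and slope that SPSS reports in its Coefficients table, up to rounding.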
Descriptive Statistics

| Variable | Mean | Std. Deviation | N |
|---|---|---|---|
| Salary in rupees | 49360.0000 | 20003.83637 | 50 |
| Number of years in employment | 7.0400 | 5.40959 | 50 |
Correlations

| Statistic | Variable | Salary in rupees | Number of years in employment |
|---|---|---|---|
| Pearson Correlation | Salary in rupees | 1.000 | .780 |
| Pearson Correlation | Number of years in employment | .780 | 1.000 |
| Sig. (1-tailed) | Salary in rupees | . | .000 |
| Sig. (1-tailed) | Number of years in employment | .000 | . |
| N | Salary in rupees | 50 | 50 |
| N | Number of years in employment | 50 | 50 |
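In a bivariate regression the slope is tied to the Pearson correlation by b = r * (s_Y / s_X), and the fitted line passes through the point of means. The short sketch below uses only the means, standard deviations and correlation reported above; the variable names are made up for the illustration. Because r is printed to three decimals (.780), the slope and intercept obtained this way are only approximate.

```python
# Values copied from the Descriptive Statistics and Correlations tables.
mean_salary, sd_salary = 49360.0, 20003.83637
mean_years, sd_years = 7.04, 5.40959
r = 0.780  # Pearson correlation, rounded to three decimals in the output

# Slope from the correlation and the two standard deviations.
slope = r * sd_salary / sd_years
# Intercept from the fact that the line passes through (mean_years, mean_salary).
intercept = mean_salary - slope * mean_years

print(f"approximate slope     = {slope:.1f} rupees per year")
print(f"approximate intercept = {intercept:.1f} rupees")
```

The slope comes out near 2884 rupees per year, close to the unstandardized coefficient SPSS reports for years; the small gap is the rounding in r.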
Model Summary (b)

| Model | R | R Square | Adjusted R Square | Std. Error of the Estimate |
|---|---|---|---|---|
| 1 | .780 (a) | .608 | .599 | 12660.07854 |

a. Predictors: (Constant), Number of years in employment
b. Dependent Variable: Salary in rupees
ANOVA (a)

| Model | Source | Sum of Squares | df | Mean Square | F | Sig. |
|---|---|---|---|---|---|---|
| 1 | Regression | 11914195742.022 | 1 | 11914195742.022 | 74.335 | .000 (b) |
| 1 | Residual | 7693324257.978 | 48 | 160277588.708 | | |
| 1 | Total | 19607520000.000 | 49 | | | |

a. Dependent Variable: Salary in rupees
b. Predictors: (Constant), Number of years in employment
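The Model Summary figures can be recovered from the ANOVA sums of squares, which shows how the two tables relate. The sketch below is illustrative only; it uses the standard definitions (F = MS regression / MS residual, R Square = SS regression / SS total, and the usual adjusted R Square formula), and the variable names are made up for the example.

```python
import math

# Sums of squares and degrees of freedom copied from the ANOVA table.
ss_regression, df_regression = 11914195742.022, 1
ss_residual, df_residual = 7693324257.978, 48
ss_total = ss_regression + ss_residual
n = df_regression + df_residual + 1  # number of cases (50)

# Mean squares are sums of squares divided by their degrees of freedom.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

f_stat = ms_regression / ms_residual            # overall F test
r_square = ss_regression / ss_total             # share of variance explained
adj_r_square = 1 - (1 - r_square) * (n - 1) / df_residual
std_error_estimate = math.sqrt(ms_residual)     # residual standard deviation

print(f"F = {f_stat:.3f}")
print(f"R Square = {r_square:.3f}, Adjusted R Square = {adj_r_square:.3f}")
print(f"Std. Error of the Estimate = {std_error_estimate:.3f}")
```

These values agree with F = 74.335, R Square = .608, Adjusted R Square = .599 and Std. Error of the Estimate = 12660.07854 in the SPSS output.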
Coefficients (a)

| Model | Term | B (unstandardized) | Std. Error (unstandardized) | Beta (standardized) | t | Sig. |
|---|---|---|---|---|---|---|
| 1 | (Constant) | 29067.173 | 2957.252 | | 9.829 | .000 |
| 1 | Number of years in employment | 2882.504 | 334.329 | .780 | 8.622 | .000 |

a. Dependent Variable: Salary in rupees
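From the Coefficients table above, we have
a = 29067.173
b = 2882.504
and the p-value for the slope is reported as .000 (that is, p < 0.001), which is below 0.05, so we reject H0.
This means that number of years in employment has a statistically significant linear relationship with salary.
With R Square = .608, years in employment explains about 61% of the variation in salary, so it is a useful (though not complete) predictor of the dependent variable.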
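The t values in the Coefficients table are each estimate divided by its standard error. The sketch below recomputes them and adds an approximate 95% confidence interval, which the SPSS output shown here does not include. The critical value 2.0106 (two-tailed, 5% level, 48 degrees of freedom) is an assumption taken from a standard t table, and the names are made up for the illustration.

```python
# Estimates and standard errors copied from the Coefficients table.
coefficients = {
    "(Constant)": (29067.173, 2957.252),
    "years": (2882.504, 334.329),
}

# Assumed value: two-tailed 5% critical value of t with 48 df (t table lookup).
T_CRIT_48 = 2.0106

results = {}
for name, (estimate, std_error) in coefficients.items():
    t_value = estimate / std_error      # test of H0: coefficient = 0
    margin = T_CRIT_48 * std_error      # half-width of the 95% interval
    results[name] = (t_value, estimate - margin, estimate + margin)
    print(f"{name}: t = {t_value:.3f}, "
          f"95% CI = ({estimate - margin:.1f}, {estimate + margin:.1f})")
```

Both t values exceed the critical value by a wide margin, which is consistent with Sig. = .000. Under these assumptions the interval for the slope runs from roughly 2,210 to 3,555 rupees per year and does not include zero.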
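Y = 29067.173 + 2882.504 * (number of years)
We can now predict the salary for any given number of years in employment: each additional year is associated with an increase of about 2,882.50 rupees in predicted salary. Predictions are most reliable within the observed range of 1 to 22 years.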
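As an illustration of using the fitted equation, the sketch below wraps it in a small function. The function name `predict_salary` is hypothetical; the coefficients are the ones read from the Coefficients table.

```python
INTERCEPT = 29067.173  # a, in rupees
SLOPE = 2882.504       # b, in rupees per year of employment


def predict_salary(years):
    """Predicted salary in rupees for a given number of years in employment."""
    return INTERCEPT + SLOPE * years


# Predictions at a few values inside the observed range (1 to 22 years).
for years in (1, 5, 10, 22):
    print(f"{years:>2} years -> {predict_salary(years):.2f} rupees")
```

The predictions at 1 and 22 years correspond to the smallest and largest predicted values for this sample, since those are the extremes of the observed years.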
Residuals Statistics (a)

| Statistic | Minimum | Maximum | Mean | Std. Deviation | N |
|---|---|---|---|---|---|
| Predicted Value | 31949.6758 | 92482.2578 | 49360.0000 | 15593.16683 | 50 |
| Residual | -43362.19531 | 22107.78906 | .00000 | 12530.22815 | 50 |
| Std. Predicted Value | -1.117 | 2.765 | .000 | 1.000 | 50 |
| Std. Residual | -3.425 | 1.746 | .000 | .990 | 50 |

a. Dependent Variable: Salary in rupees
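To see where the extreme residuals come from, the sketch below computes the residual (observed minus predicted salary) and the standardized residual (residual divided by the Std. Error of the Estimate) for three cases taken from the data listing. The function name and the cut-off of 3 in absolute value are assumptions; that cut-off is only a common rule of thumb for flagging possible outliers.

```python
INTERCEPT = 29067.173             # a, from the Coefficients table
SLOPE = 2882.504                  # b, from the Coefficients table
STD_ERROR_ESTIMATE = 12660.07854  # from the Model Summary table


def standardized_residual(years, salary):
    """Return (residual, standardized residual) for one case."""
    predicted = INTERCEPT + SLOPE * years
    residual = salary - predicted
    return residual, residual / STD_ERROR_ESTIMATE


# case number: (years in employment, salary in rupees), from the data listing
cases = {14: (3, 6000), 29: (6, 3000), 38: (10, 80000)}

for case, (years, salary) in cases.items():
    residual, z = standardized_residual(years, salary)
    # Rule-of-thumb flag (assumed threshold, not an SPSS default shown here).
    note = "  <- possible outlier" if abs(z) > 3 else ""
    print(f"case {case}: residual = {residual:.2f}, std. residual = {z:.3f}{note}")
```

Cases 29 and 38 reproduce the minimum and maximum residuals in the table. Case 29 (6 years, 3,000 rupees) has a standardized residual below -3, so it is worth checking for a data-entry error before relying on the fitted line.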