To test whether extracurricular activity is a good predictor of college success, a college administrator records whether students participated in extracurricular activities during high school and their subsequent college freshman GPA.
Extracurricular Activity | College Freshman GPA |
---|---|
Yes | 3.57 |
Yes | 3.32 |
Yes | 3.86 |
Yes | 3.72 |
No | 2.93 |
No | 3.88 |
No | 3.46 |
No | 2.71 |
No | 3.86 |
No | 2.84 |
(a) Code the dichotomous variable and then compute a point-biserial correlation coefficient. (Round your answer to three decimal places.)
Y: extracurricular activity (dichotomous variable)
X: college freshman GPA (continuous variable)
Code the dichotomous variable Y as 1 for Yes and 0 for No. The coded data are:
Y: Extracurricular | Coded Y | X: College Freshman GPA |
---|---|---|
Yes | 1 | 3.57 |
Yes | 1 | 3.32 |
Yes | 1 | 3.86 |
Yes | 1 | 3.72 |
No | 0 | 2.93 |
No | 0 | 3.88 |
No | 0 | 3.46 |
No | 0 | 2.71 |
No | 0 | 3.86 |
No | 0 | 2.84 |
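The coding step can be sketched in a few lines of Python. This is only an illustration of the 1/0 coding described above; the variable names `responses`, `gpa`, and `coded` are assumptions chosen for this example.

```python
# Responses and GPAs in the same order as the table above.
responses = ["Yes", "Yes", "Yes", "Yes", "No", "No", "No", "No", "No", "No"]
gpa = [3.57, 3.32, 3.86, 3.72, 2.93, 3.88, 3.46, 2.71, 3.86, 2.84]

# Code the dichotomous variable: Yes -> 1, No -> 0.
coded = [1 if r == "Yes" else 0 for r in responses]

n1 = sum(coded)           # number of students coded 1
n0 = len(coded) - n1      # number of students coded 0
print(coded)
print(n1, n0)
```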
The dichotomous variable Y now takes only the values 0 and 1. Divide the data set into two groups: group 1, the students coded 1 on Y, and group 2, the students coded 0 on Y.
The point-biserial correlation coefficient is

$$r_{pb} = \frac{M_1 - M_0}{s_n}\sqrt{\frac{n_1 n_0}{n^2}}$$

where $s_n$ is the standard deviation used when data are available for every member of the population:

$$s_n = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2}$$

Here $M_1$ is the mean value of the continuous variable X for all data points in group 1, and $M_0$ is the mean value of X for all data points in group 2. Further, $n_1$ is the number of data points in group 1, $n_0$ is the number of data points in group 2, and $n$ is the total sample size.
n = 10 (total sample size)
Y: Extracurricular | Coded Y | X: College Freshman GPA |
---|---|---|
Yes | 1 | 3.57 |
Yes | 1 | 3.32 |
Yes | 1 | 3.86 |
Yes | 1 | 3.72 |
No | 0 | 2.93 |
No | 0 | 3.88 |
No | 0 | 3.46 |
No | 0 | 2.71 |
No | 0 | 3.86 |
No | 0 | 2.84 |
Total | | 34.15 |
X | x − x̄ | (x − x̄)² |
---|---|---|
3.57 | 0.155 | 0.024025 |
3.32 | -0.095 | 0.009025 |
3.86 | 0.445 | 0.198025 |
3.72 | 0.305 | 0.093025 |
2.93 | -0.485 | 0.235225 |
3.88 | 0.465 | 0.216225 |
3.46 | 0.045 | 0.002025 |
2.71 | -0.705 | 0.497025 |
3.86 | 0.445 | 0.198025 |
2.84 | -0.575 | 0.330625 |
Total: 34.15 | | 1.80325 |
Here x̄ = 34.15 / 10 = 3.415 is the mean of X, so sn = sqrt(1.80325 / 10) ≈ 0.4246.
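The deviation table can be checked numerically. The sketch below recomputes the mean, the sum of squared deviations, and the population standard deviation (dividing by n, not n − 1); the names `mean_x`, `ss`, and `s_n` are assumptions made for this example.

```python
import math

gpa = [3.57, 3.32, 3.86, 3.72, 2.93, 3.88, 3.46, 2.71, 3.86, 2.84]

n = len(gpa)
mean_x = sum(gpa) / n                      # mean of X
ss = sum((x - mean_x) ** 2 for x in gpa)   # sum of squared deviations
s_n = math.sqrt(ss / n)                    # population SD: divide by n

print(round(mean_x, 3), round(ss, 5), round(s_n, 4))
```

The standard library's `statistics.pstdev(gpa)` computes the same population standard deviation and could be used in place of the manual formula.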
Group 1 (Y = 1)
Y: Extracurricular | Coded Y | X |
---|---|---|
Yes | 1 | 3.57 |
Yes | 1 | 3.32 |
Yes | 1 | 3.86 |
Yes | 1 | 3.72 |
Total | | 14.47 |
n1, the number of data points in group 1, is 4.
M1, the mean value of the continuous variable X for all data points in group 1, is 14.47 / 4 = 3.6175.
Group 2 (Y = 0)
Coded Y | X |
---|---|
0 | 2.93 |
0 | 3.88 |
0 | 3.46 |
0 | 2.71 |
0 | 3.86 |
0 | 2.84 |
Total | 19.68 |
n0, the number of data points in group 2, is 6.
M0, the mean value of the continuous variable X for all data points in group 2, is 19.68 / 6 = 3.28.
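Substituting into the point-biserial correlation coefficient:

$$r_{pb} = \frac{M_1 - M_0}{s_n}\sqrt{\frac{n_1 n_0}{n^2}} = \frac{3.6175 - 3.28}{0.4246}\sqrt{\frac{4 \times 6}{10^2}} \approx 0.3894$$

Point-biserial correlation coefficient = 0.3894, which is 0.389 rounded to three decimal places.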
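As a cross-check, the whole calculation can be sketched in Python. This is a minimal illustration of the formula used in this solution, not a library routine; the function name `point_biserial` and its arguments are assumptions made for the example.

```python
import math

def point_biserial(coded, x):
    """Point-biserial r using the population SD (divide by n)."""
    n = len(x)
    group1 = [xi for ci, xi in zip(coded, x) if ci == 1]
    group0 = [xi for ci, xi in zip(coded, x) if ci == 0]
    n1, n0 = len(group1), len(group0)
    m1 = sum(group1) / n1                  # mean of X in group 1
    m0 = sum(group0) / n0                  # mean of X in group 2
    mean_x = sum(x) / n
    s_n = math.sqrt(sum((xi - mean_x) ** 2 for xi in x) / n)
    return (m1 - m0) / s_n * math.sqrt(n1 * n0 / n ** 2)

coded = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
gpa = [3.57, 3.32, 3.86, 3.72, 2.93, 3.88, 3.46, 2.71, 3.86, 2.84]

r_pb = point_biserial(coded, gpa)
print(round(r_pb, 3))  # 0.389
```

Because the point-biserial coefficient is the Pearson correlation between the 0/1 coded variable and X, a Pearson routine applied to `coded` and `gpa` should give the same value.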