The data set EF1.SAV contains ejection fraction percentages measured on 34 subjects in an imaginary clinical study. The ejection fraction (EF) was measured before and after treatment and is stored in the Baseline and Post variables, while ID is the patient number.
1- Check if Baseline is a normally distributed variable.
2- Calculate average and standard deviation for Baseline, and find a 95% confidence interval for the Baseline average.
3- Repeat Points 1 and 2 for the Post variable.
4- Is there a significant difference between the Baseline and Post averages? Base the analysis on both a confidence interval and a p-value.
5- Write a short summary of what you found.
A researcher who has not taken any course in statistics analyzes the same data using a two-sample method. He treats the data from before and after treatment as if they came from two independent random samples of patients.
Here you can use the data set ef2.sav. The data represent the ejection fraction before and after a treatment. The measurements are the same as in ef1.sav, but the data structure is slightly different and the ID variable has been removed.
6- Perform the test the researcher performed. What conclusion do you reach? Compare it with the conclusion you reached when analyzing the data in point 4.
7- What is wrong with this method? Why do you get different conclusions when using the two different methods? Consider this in terms of the effect estimates and the uncertainty of the effect estimates you get when using the two methods.
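EF1 SPSS FILE
| ID | BASELINE | Post baseline |
|---|---|---|
| 1 | 55 | 60 |
| 2 | 54 | 54 |
| 3 | 57 | 59 |
| 4 | 47 | 48 |
| 5 | 54 | 54 |
| 6 | 57 | 59 |
| 7 | 62 | 64 |
| 8 | 54 | 53 |
| 9 | 51 | 52 |
| 10 | 51 | 50 |
| 11 | 59 | 61 |
| 12 | 47 | 45 |
| 13 | 54 | 54 |
| 14 | 54 | 55 |
| 15 | 53 | 54 |
| 16 | 54 | 57 |
| 17 | 56 | 57 |
| 18 | 59 | 62 |
| 19 | 55 | 57 |
| 20 | 57 | 57 |
| 21 | 58 | 59 |
| 22 | 51 | 55 |
| 23 | 42 | 42 |
| 24 | 58 | 61 |
| 25 | 55 | 57 |
| 26 | 61 | 64 |
| 27 | 54 | 56 |
| 28 | 58 | 59 |
| 29 | 55 | 57 |
| 30 | 57 | 60 |
| 31 | 54 | 55 |
| 32 | 60 | 59 |
| 33 | 48 | 49 |
| 34 | 55 | 55 |
THESE THREE SPSS VARIABLES, ID, BASELINE AND POST BASELINE, ARE PART OF THE EF1 FILE.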
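EF2 SPSS FILE
| TIME | EF (ejection fraction) |
|---|---|
| 1 | 55 |
| 1 | 54 |
| 1 | 57 |
| 1 | 47 |
| 1 | 54 |
| 1 | 57 |
| 1 | 62 |
| 1 | 54 |
| 1 | 51 |
| 1 | 51 |
| 1 | 59 |
| 1 | 47 |
| 1 | 54 |
| 1 | 54 |
| 1 | 53 |
| 1 | 54 |
| 1 | 56 |
| 1 | 59 |
| 1 | 55 |
| 1 | 57 |
| 1 | 58 |
| 1 | 51 |
| 1 | 42 |
| 1 | 58 |
| 1 | 55 |
| 1 | 61 |
| 1 | 54 |
| 1 | 58 |
| 1 | 55 |
| 1 | 57 |
| 1 | 54 |
| 1 | 60 |
| 1 | 48 |
| 1 | 55 |
| 2 | 60 |
| 2 | 54 |
| 2 | 59 |
| 2 | 48 |
| 2 | 54 |
| 2 | 59 |
| 2 | 64 |
| 2 | 53 |
| 2 | 52 |
| 2 | 50 |
| 2 | 61 |
| 2 | 45 |
| 2 | 54 |
| 2 | 55 |
| 2 | 54 |
| 2 | 57 |
| 2 | 57 |
| 2 | 62 |
| 2 | 57 |
| 2 | 57 |
| 2 | 59 |
| 2 | 55 |
| 2 | 42 |
| 2 | 61 |
| 2 | 57 |
| 2 | 64 |
| 2 | 56 |
| 2 | 59 |
| 2 | 57 |
| 2 | 60 |
| 2 | 55 |
| 2 | 59 |
| 2 | 49 |
| 2 | 55 |
THESE TWO VARIABLES, TIME AND EF (ejection fraction), ARE PART OF THE EF2 SPSS FILE.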
Result:
Answered for part 1
1- Check if Baseline is a normally distributed variable.
Tests of Normality
| Variable | Kolmogorov-Smirnov(a) Statistic | Kolmogorov-Smirnov(a) df | Kolmogorov-Smirnov(a) Sig. | Shapiro-Wilk Statistic | Shapiro-Wilk df | Shapiro-Wilk Sig. |
|---|---|---|---|---|---|---|
| Baseline | 0.209 | 34 | 0.001 | 0.935 | 34 | 0.044 |
a. Lilliefors Significance Correction
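The Kolmogorov-Smirnov statistic is 0.209 with p = 0.001, and the Shapiro-Wilk statistic is 0.935 with p = 0.044. Both p-values are below the 0.05 level, so the tests are significant and the hypothesis of normality is rejected: Baseline does not appear to be normally distributed.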
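The Kolmogorov-Smirnov statistic in the table can be checked outside SPSS. The following is a minimal Python sketch, not part of the original SPSS analysis, that computes the largest gap between the empirical distribution of Baseline and a normal distribution fitted with the sample mean and standard deviation. The function name `ks_statistic_normal` is hypothetical, and the sketch only reproduces the statistic; the Lilliefors-corrected p-value needs special tables or statistical software.

```python
from statistics import NormalDist, mean, stdev

# Baseline values copied from the EF1 listing (34 patients).
baseline = [55, 54, 57, 47, 54, 57, 62, 54, 51, 51, 59, 47, 54, 54, 53, 54, 56,
            59, 55, 57, 58, 51, 42, 58, 55, 61, 54, 58, 55, 57, 54, 60, 48, 55]


def ks_statistic_normal(values):
    """Largest distance between the empirical CDF and a normal CDF
    whose mean and SD are estimated from the same sample."""
    xs = sorted(values)
    n = len(xs)
    fitted = NormalDist(mean(xs), stdev(xs))
    d = 0.0
    for i, x in enumerate(xs, start=1):
        cdf = fitted.cdf(x)
        # Empirical CDF just after x is i/n, just before x it is (i - 1)/n.
        d = max(d, i / n - cdf, cdf - (i - 1) / n)
    return d


d_baseline = ks_statistic_normal(baseline)
print(round(d_baseline, 3))  # should agree with the SPSS statistic for Baseline
```

Passing the Post values to the same function gives the corresponding statistic for the Post variable.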
2- Calculate average and standard deviation for Baseline, and find a 95% confidence interval for the Baseline average.
Descriptives: Baseline
| Measure | Statistic | Std. Error |
|---|---|---|
| Mean | 54.59 | 0.723 |
| 95% Confidence Interval for Mean, Lower Bound | 53.12 | |
| 95% Confidence Interval for Mean, Upper Bound | 56.06 | |
| 5% Trimmed Mean | 54.78 | |
| Median | 55.00 | |
| Variance | 17.765 | |
| Std. Deviation | 4.215 | |
| Minimum | 42 | |
| Maximum | 62 | |
| Range | 20 | |
| Interquartile Range | 4 | |
| Skewness | -0.911 | 0.403 |
| Kurtosis | 1.429 | 0.788 |
Mean = 54.59, standard deviation = 4.215, 95% CI for the mean = (53.12, 56.06).
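The interval above follows from mean ± t × (SD / √n). As a rough check, here is a minimal Python sketch using only the standard library. It is not part of the original SPSS analysis, and the critical value 2.0345 (two-sided 95% point of Student's t with 33 degrees of freedom) is an assumption taken from a t table rather than computed.

```python
from math import sqrt
from statistics import mean, stdev

# Baseline values copied from the EF1 listing (34 patients).
baseline = [55, 54, 57, 47, 54, 57, 62, 54, 51, 51, 59, 47, 54, 54, 53, 54, 56,
            59, 55, 57, 58, 51, 42, 58, 55, 61, 54, 58, 55, 57, 54, 60, 48, 55]

# Assumed table value: t(0.975) with n - 1 = 33 degrees of freedom.
T_CRIT_DF33 = 2.0345

n = len(baseline)
avg = mean(baseline)
sd = stdev(baseline)   # sample SD, with n - 1 in the denominator
se = sd / sqrt(n)      # standard error of the mean
ci = (avg - T_CRIT_DF33 * se, avg + T_CRIT_DF33 * se)

print(f"mean={avg:.2f} sd={sd:.3f} se={se:.3f} ci=({ci[0]:.2f}, {ci[1]:.2f})")
```

The t distribution is used rather than the normal value 1.96 because the standard deviation is estimated from only 34 observations.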
3- Repeat Points 1 and 2 for the Post variable.
Descriptives: Post baseline
| Measure | Statistic | Std. Error |
|---|---|---|
| Mean | 55.88 | 0.848 |
| 95% Confidence Interval for Mean, Lower Bound | 54.16 | |
| 95% Confidence Interval for Mean, Upper Bound | 57.61 | |
| 5% Trimmed Mean | 56.13 | |
| Median | 57.00 | |
| Variance | 24.471 | |
| Std. Deviation | 4.947 | |
| Minimum | 42 | |
| Maximum | 64 | |
| Range | 22 | |
| Interquartile Range | 5 | |
| Skewness | -0.859 | 0.403 |
| Kurtosis | 1.037 | 0.788 |
Mean = 55.88, standard deviation = 4.947, 95% CI for the mean = (54.16, 57.61).
Tests of Normality
| Variable | Kolmogorov-Smirnov(a) Statistic | Kolmogorov-Smirnov(a) df | Kolmogorov-Smirnov(a) Sig. | Shapiro-Wilk Statistic | Shapiro-Wilk df | Shapiro-Wilk Sig. |
|---|---|---|---|---|---|---|
| Post baseline | 0.146 | 34 | 0.064 | 0.945 | 34 | 0.089 |
a. Lilliefors Significance Correction
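The Kolmogorov-Smirnov statistic is 0.146 with p = 0.064, and the Shapiro-Wilk statistic is 0.945 with p = 0.089. Both p-values are above the 0.05 level, so the tests are not significant and normality is not rejected: the Post data can be treated as approximately normally distributed.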
4- Is there a significant difference between the Baseline and Post averages? Base the analysis on both a confidence interval and a p-value.
Paired Samples Statistics
| Pair 1 | Mean | N | Std. Deviation | Std. Error Mean |
|---|---|---|---|---|
| Baseline | 54.59 | 34 | 4.215 | 0.723 |
| Post baseline | 55.88 | 34 | 4.947 | 0.848 |
Paired Samples Test (the first five columns are the Paired Differences)
| Pair 1 | Mean | Std. Deviation | Std. Error Mean | 95% Confidence Interval of the Difference, Lower | 95% Confidence Interval of the Difference, Upper | t | df | Sig. (2-tailed) |
|---|---|---|---|---|---|---|---|---|
| Baseline - Post baseline | -1.294 | 1.528 | 0.262 | -1.827 | -0.761 | -4.938 | 33 | 0.000 |
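The paired test works on the 34 within-patient differences (Baseline minus Post), so its figures can be reproduced by treating those differences as a single sample. The following is a minimal Python sketch of that calculation, not the original SPSS procedure. The critical value 2.0345 for 33 degrees of freedom is an assumed t-table value, and the sketch compares |t| with it instead of computing an exact p-value.

```python
from math import sqrt
from statistics import mean, stdev

# Values copied from the EF1 listing, in patient order (ID 1..34).
baseline = [55, 54, 57, 47, 54, 57, 62, 54, 51, 51, 59, 47, 54, 54, 53, 54, 56,
            59, 55, 57, 58, 51, 42, 58, 55, 61, 54, 58, 55, 57, 54, 60, 48, 55]
post = [60, 54, 59, 48, 54, 59, 64, 53, 52, 50, 61, 45, 54, 55, 54, 57, 57,
        62, 57, 57, 59, 55, 42, 61, 57, 64, 56, 59, 57, 60, 55, 59, 49, 55]

# Assumed table value: t(0.975) with n - 1 = 33 degrees of freedom.
T_CRIT_DF33 = 2.0345

# One difference per patient, matching "Baseline - Post baseline".
diffs = [b - p for b, p in zip(baseline, post)]
n = len(diffs)
mean_diff = mean(diffs)
sd_diff = stdev(diffs)
se_diff = sd_diff / sqrt(n)
t_stat = mean_diff / se_diff
ci = (mean_diff - T_CRIT_DF33 * se_diff, mean_diff + T_CRIT_DF33 * se_diff)
significant = abs(t_stat) > T_CRIT_DF33  # two-sided test at the 5% level

print(f"mean diff={mean_diff:.3f} sd={sd_diff:.3f} se={se_diff:.3f}")
print(f"t={t_stat:.3f} df={n - 1} ci=({ci[0]:.3f}, {ci[1]:.3f}) significant={significant}")
```

The confidence interval and the test lead to the same decision: the interval excludes 0 exactly when |t| exceeds the critical value.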
5- Write a short summary of what you found.
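A paired-samples t test was used. The mean difference (Baseline minus Post) is -1.294, with a 95% confidence interval of (-1.827, -0.761) that does not contain 0. The test gives t = -4.938 with 33 degrees of freedom and p < 0.001 (shown by SPSS as 0.000), which is below the 0.05 level, so H0 is rejected. We conclude that there is a significant difference between the Baseline and Post averages: the Post average is about 1.3 percentage points higher than the Baseline average.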
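A researcher who has not taken any course in statistics analyzes the same data using a two-sample method. He treats the data from before and after treatment as if they came from two independent random samples of patients.
Since the data are paired, a dependent-samples method must be used. Treating the two sets of measurements as independent samples ignores that each patient contributes both values, so the variation between patients inflates the standard error of the difference. The paired-samples t test is the appropriate method for comparing the Baseline and Post averages.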
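To see how much the choice of method matters, the researcher's two-sample analysis can be sketched next to the paired one. The following Python sketch assumes he used the pooled-variance (equal variances) independent-samples t test, which the source does not state. The critical value 1.997 for 66 degrees of freedom is an assumed t-table value. The sketch is an illustration under those assumptions, not SPSS output.

```python
from math import sqrt
from statistics import mean, stdev, variance

# Same measurements as in EF1/EF2: TIME = 1 (before) and TIME = 2 (after).
baseline = [55, 54, 57, 47, 54, 57, 62, 54, 51, 51, 59, 47, 54, 54, 53, 54, 56,
            59, 55, 57, 58, 51, 42, 58, 55, 61, 54, 58, 55, 57, 54, 60, 48, 55]
post = [60, 54, 59, 48, 54, 59, 64, 53, 52, 50, 61, 45, 54, 55, 54, 57, 57,
        62, 57, 57, 59, 55, 42, 61, 57, 64, 56, 59, 57, 60, 55, 59, 49, 55]

# Assumed table value: t(0.975) with n1 + n2 - 2 = 66 degrees of freedom.
T_CRIT_DF66 = 1.997

n1, n2 = len(baseline), len(post)
mean_diff = mean(baseline) - mean(post)

# Two-sample (pooled) standard error: uses the spread between patients.
pooled_var = ((n1 - 1) * variance(baseline) + (n2 - 1) * variance(post)) / (n1 + n2 - 2)
se_two_sample = sqrt(pooled_var * (1 / n1 + 1 / n2))
t_two_sample = mean_diff / se_two_sample
ci_two_sample = (mean_diff - T_CRIT_DF66 * se_two_sample,
                 mean_diff + T_CRIT_DF66 * se_two_sample)

# Paired standard error for comparison: uses the spread of the differences only.
diffs = [b - p for b, p in zip(baseline, post)]
se_paired = stdev(diffs) / sqrt(n1)

print(f"mean diff={mean_diff:.3f}")
print(f"two-sample: se={se_two_sample:.3f} t={t_two_sample:.3f} "
      f"ci=({ci_two_sample[0]:.3f}, {ci_two_sample[1]:.3f})")
print(f"paired:     se={se_paired:.3f}")
```

Both methods estimate the same mean difference, but the two-sample standard error is roughly four times the paired one, so the two-sample interval includes 0 and the test is not significant at the 5% level. The effect estimate is unchanged; only its estimated uncertainty differs.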