The p-value was slightly above the conventional threshold, but was described as “rapidly approaching significance” (i.e., p = .06). An independent samples t test was used to determine whether student satisfaction levels in a quantitative reasoning course differed between the traditional classroom and online environments. The samples consisted of students in four face-to-face classes at a traditional state university (n = 65) and four online classes offered at the same university (n = 69). Students reported their level of satisfaction on a five-point scale, with higher values indicating higher levels of satisfaction. Since the study was exploratory in nature, levels of significance were relaxed to the .10 level. The test was significant t(132) = 1.8, p = .074, wherein students in the face-to-face class reported lower levels of satisfaction (M = 3.39, SD = 1.8) than did those in the online sections (M = 3.89, SD = 1.4). We therefore conclude that on average, students in online quantitative reasoning classes have higher levels of satisfaction. The results of this study are significant because they provide educators with evidence of what medium works better in producing quantitatively knowledgeable practitioners.
How do I evaluate the sample size?
How do I evaluate the statements for meaningfulness?
How do I evaluate the statistical significance?
How do I provide an explanation of the implications for social change?
Sample size: A common rule of thumb treats a sample of about five percent of the whole population as enough for the analysis. In this case, the population of the school is not given, so this cannot be checked directly. However, it is reasonable to presuppose that the 134 students sampled (n = 65 face-to-face and n = 69 online, which gives the 132 degrees of freedom reported) represent at least 5% of the population. If the study is hypothesis-driven, both groups are well above 30, so the samples are not too small for the t-test that was used. At this size the t distribution is very close to the normal distribution, so a Z-test may be used instead of a t-test and would give nearly the same result.
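To make the point about the Z-test concrete, here is a minimal sketch that recomputes the comparison from the summary statistics in the scenario using the normal distribution. It assumes the sample standard deviations can stand in for the population values, which is the usual large-sample approximation; the function name `two_sample_z` is just an illustrative choice.

```python
from math import sqrt
from statistics import NormalDist

# Summary statistics as reported in the scenario
n_f2f, mean_f2f, sd_f2f = 65, 3.39, 1.8
n_online, mean_online, sd_online = 69, 3.89, 1.4


def two_sample_z(m1, s1, n1, m2, s2, n2):
    """Large-sample z statistic and two-sided p-value.

    Assumption: sample SDs are used in place of population SDs.
    """
    se = sqrt(s1**2 / n1 + s2**2 / n2)      # standard error of the mean difference
    z = (m2 - m1) / se                       # online minus face-to-face
    p = 2 * (1 - NormalDist().cdf(abs(z)))   # two-sided p-value
    return z, p


z_stat, p_z = two_sample_z(mean_f2f, sd_f2f, n_f2f,
                           mean_online, sd_online, n_online)
print(f"z = {z_stat:.2f}, two-sided p = {p_z:.3f}")
```

With groups of this size the z-based p-value should land very close to the reported p = .074, which is why the choice between the two tests makes little practical difference here.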
Statements for meaningfulness: Instead of describing the result as “rapidly approaching significance”, it is more appropriate to say that the p-value of .06 is on the edge of significance, slightly above the conventional .05 threshold. The original wording suggests that the p-value is decreasing, and the word “rapidly” makes it sound even worse. A p-value from a completed test is a fixed number and does not move toward anything, so the phrase is not statistically correct.
Statistical significance: The t-test is meant to determine whether there was a significant difference between the traditional classroom (control group) and the online environment (experimental group). However, the study is described as exploratory, and hypothesis testing is not necessary in exploratory work. No hypothesis statements were given, and without them the p-values are not really meaningful and serve only as a rough guide. P-values and their critical values are used in descriptive and correlational forms of research. Exploratory analyses may be run on samples drawn under a thorough, purposive sampling design, but such a design is not compulsory. In exploratory research, the researcher is only interested in providing details where very little is known about the phenomenon, and may use trial studies, interviews, group discussions, experiments or other tactics to gain information. Therefore the statement, “Since the study was exploratory in nature, levels of significance were relaxed to the .10 level”, is not statistically sound: the significance level should be fixed before the data are analyzed, and relaxing it raises the chance of a false positive. The statement, “The test was significant t (132) = 1.8, p = .074, wherein students in the face-to-face class reported lower levels of satisfaction (M = 3.39, SD = 1.8) than did those in the online sections (M = 3.89, SD = 1.4)”, is confusing and misleading, because p = .074 is significant only at the relaxed .10 level and not at the conventional .05 level. A simple t-test is also questionable here, as each of the two groups is further subdivided into four classes. Samples are to be taken from each subgroup, and an Analysis of Variance (ANOVA) may be performed. Furthermore, the traditional threshold of 0.05 is generally recommended.
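As a rough check on the reported figures, the sketch below recomputes the pooled-variance t statistic from the summary statistics in the scenario and compares the p-value against both the conventional .05 and the relaxed .10 thresholds. It assumes equal variances (the classic independent-samples t-test, which matches the reported df of 132) and ignores the class-level subgroups. The helper names `pooled_t` and `two_sided_p` are illustrative, and the p-value is obtained by numerically integrating the t density with Simpson's rule so that only the standard library is needed.

```python
from math import sqrt, lgamma, exp, log, pi


def t_pdf(x, df):
    """Density of Student's t distribution, computed on the log scale."""
    log_c = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_c - (df + 1) / 2 * log(1 + x * x / df))


def two_sided_p(t, df, steps=2000):
    """Two-sided p-value via Simpson's rule on the density from 0 to |t|."""
    b = abs(t)
    h = b / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3               # P(0 < T < |t|)
    return max(0.0, 1 - 2 * area)


def pooled_t(m1, s1, n1, m2, s2, n2):
    """Pooled-variance t statistic and degrees of freedom (assumes equal variances)."""
    df = n1 + n2 - 2
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
    se = sqrt(sp2 * (1 / n1 + 1 / n2))
    return (m2 - m1) / se, df


# Face-to-face: M = 3.39, SD = 1.8, n = 65; online: M = 3.89, SD = 1.4, n = 69
t_stat, df = pooled_t(3.39, 1.8, 65, 3.89, 1.4, 69)
p_value = two_sided_p(t_stat, df)
print(f"t({df}) = {t_stat:.2f}, two-sided p = {p_value:.3f}")
for alpha in (0.05, 0.10):
    verdict = "significant" if p_value < alpha else "not significant"
    print(f"alpha = {alpha:.2f}: {verdict}")
```

The verdict changes with the threshold, which is the heart of the objection: the result counts as "significant" only because the level was relaxed to .10.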
Implications for social change: This scenario illustrates how p-values are commonly misunderstood and misused. With more emphasis on the subject matter, these errors can be avoided in the future. Positive social change depends on appreciating and understanding the true issue, which requires results that are reported accurately. The result of the study is important in determining how the investigator can best explain or describe the variation in the data set through data exploration. The investigator may find patterns in the students’ answers that offer educators evidence on which medium works more effectively in the classroom.