My question is: should I accept or reject the null hypothesis?
Variable 1: General Health / Ordinal
Variable 2: Body Mass Index / Continuous
Null hypothesis: There is no statistically significant relationship between general health and body mass index. (In other words, the correlation coefficient is equal or close to zero.)
Research/Alternative hypothesis: There is a statistically significant relationship between general health and body mass index. (In other words, the correlation coefficient is different/far from zero.)
Statistical analysis used: Spearman’s (rank-order) correlation analysis
Key statistics:
Correlation coefficient: r = 0.248 (weak positive)
Statistical significance: p = 0.000 (statistically significant at level 0.01)
Coefficient of determination/Shared variance/Effect size/Practical significance (r²) = 0.248 × 0.248 = 0.062 (small)
0.062 × 100 = 6.2%. This means that 6.2% of the variance/variability in variable #1 (General Health) is accounted for by the variance/variability in variable #2 (Body Mass Index). In other words, these two variables share 6.2% of their variance. This also means that 93.8% of the variance in each of these variables remains unexplained/unaccounted for.
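For reference, the key statistics above can be reproduced with a short sketch. It assumes Spearman's coefficient is computed as a Pearson correlation on tie-averaged ranks (the usual definition when ties are present, as they will be for an ordinal variable). The toy `health` and `bmi` lists are hypothetical and are not the data from this question; only the value 0.248 comes from the post.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of tied values that starts at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def spearman(x, y):
    """Spearman's rank-order correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical toy data (NOT the data from the question).
health = [1, 2, 2, 3, 3, 4, 5, 5]
bmi = [21.5, 24.0, 30.2, 23.1, 27.8, 26.4, 33.0, 29.5]
r_s = spearman(health, bmi)

# Shared variance for the coefficient reported in the question.
r_reported = 0.248
shared = r_reported ** 2
print(round(shared, 3), f"{shared:.1%}")  # prints: 0.062 6.2%
```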
Assumptions:
1. Outliers: The scatterplot and boxplot show some significant outliers in the data.
2. Normality: Normality was assessed using the Shapiro-Wilk test, and the distributions of the two variables are statistically significantly different from a normal distribution (p = 0.000, p < .05).
3. Linearity: The scatterplot doesn’t demonstrate any linear relationship between these two variables (General Health and Body Mass Index). It shows a vertical line of points at each category.
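As an illustration of the outlier check, a boxplot typically flags points lying more than 1.5 interquartile ranges beyond the quartiles. The sketch below assumes that rule and uses the "inclusive" quartile method from Python's `statistics` module; statistical packages may compute quartiles slightly differently, and the BMI values are hypothetical.

```python
from statistics import quantiles

def iqr_outliers(values, k=1.5):
    """Return the values outside [Q1 - k*IQR, Q3 + k*IQR] (boxplot-style rule)."""
    # Assumption: "inclusive" quartiles; other software may use other definitions.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical BMI values with one extreme observation.
bmi_sample = [22, 23, 24, 25, 26, 27, 28, 60]
print(iqr_outliers(bmi_sample))  # prints: [60]
```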
Accept or reject the null: Accept the null hypothesis and reject the research hypothesis. Although the p value shows that the result is statistically significant, the correlation coefficient is weak and the effect size is small. Moreover, the data didn’t meet any of the assumptions.
Designing a hypothesis and testing it for statistical significance depends largely on factors involving the data, such as the nature of the data, population homogeneity, sample homogeneity, sample size, sampling technique, the objective of the analysis, and so on.
But more often than not, the data are secondary and other conditions are not favorable (as in this case).
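Assuming that the sample and the process used for the above analysis cannot be altered, we conclude that, in light of the given data, we REJECT the null hypothesis at the 0.01 level of significance, because the observed p value is below 0.01. A weak correlation and a small effect size describe the strength of the relationship, not whether it is statistically significant, so they are not grounds for accepting the null.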
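To make the distinction concrete, here is a minimal sketch that keeps the significance decision and the effect-size label as two separate steps. The function names are hypothetical. The labels assume Cohen's conventional cut-offs for a correlation (0.1 small, 0.3 medium, 0.5 large), and the reported "p = 0.000" is treated as "p < 0.001", which is an assumption about how the software rounded it.

```python
def decide(p_value, alpha=0.01):
    """Significance decision: depends only on the p value and alpha."""
    return "reject H0" if p_value < alpha else "fail to reject H0"

def effect_size_label(r):
    """Strength of a correlation, using Cohen's conventional cut-offs."""
    size = abs(r)
    if size >= 0.5:
        return "large"
    if size >= 0.3:
        return "medium"
    if size >= 0.1:
        return "small"
    return "negligible"

# Values from the question; 0.001 is an assumed upper bound for "p = 0.000".
p_upper_bound = 0.001
r_s = 0.248

print(decide(p_upper_bound, alpha=0.01))  # prints: reject H0
print(effect_size_label(r_s))             # prints: small
```

The two outputs answer different questions: the first says whether a relationship was detected, the second says how strong it is.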
If changes to the data and the analysis are allowed, I would recommend using a bigger sample and including non-parametric regression analysis as part of the overall analysis before arriving at a conclusion.
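As one possible illustration of non-parametric regression, the sketch below uses a Nadaraya-Watson kernel estimator with a Gaussian kernel. This is only one of several such methods and is my choice for the example, not a specific prescription. The function name, the bandwidth, the choice of General Health as the predictor, and the toy data are all hypothetical.

```python
from math import exp

def kernel_regression(x_train, y_train, x_query, bandwidth=1.0):
    """Nadaraya-Watson estimate: Gaussian-weighted average of y near x_query."""
    weights = [exp(-0.5 * ((x_query - x) / bandwidth) ** 2) for x in x_train]
    total = sum(weights)  # can underflow to 0 if x_query is far from all x_train
    return sum(w * y for w, y in zip(weights, y_train)) / total

# Hypothetical toy data (NOT the data from the question).
health = [1, 2, 2, 3, 3, 4, 5, 5]
bmi = [21.5, 24.0, 30.2, 23.1, 27.8, 26.4, 33.0, 29.5]

# Smoothed BMI estimate at each health category, with no linearity assumption.
for category in range(1, 6):
    estimate = kernel_regression(health, bmi, category, bandwidth=1.0)
    print(category, round(estimate, 1))
```

Because the estimate is a weighted average of the observed values, it makes no assumption of linearity or normality, which is why this kind of method suits data like those described in the question.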
If you feel this answer is not sufficient, please let me know your specific queries and I will explain further.