Explain how missing data in a randomised controlled trial will affect the conclusion, and what methods can be used to minimise the effect. (max. 200 words)
An RCT relies on randomisation to avoid selection bias. Missing data can undermine this advantage, because the participants who remain may no longer be comparable across arms. If the analysis is restricted to participants with complete data, the sample is smaller, so the estimates are less precise, the confidence intervals are wider, and statistical power (the probability of rejecting the null hypothesis when it is false) is reduced. If the missing values are instead filled in using naive assumptions, the estimates and standard errors may be biased, which in turn gives misleading p-values and confidence intervals. Hence, missing data may distort the conclusions and threaten the validity of the trial.
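The two effects described above (bias and loss of precision) can be illustrated with a small simulation. This is only a sketch under stated assumptions: the outcome scores, the dropout rule (participants scoring below 45 are lost half the time) and the helper `ci_half_width` are all hypothetical and chosen purely for illustration.

```python
import math
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical outcome scores for 2000 participants in one trial arm.
full = [rng.gauss(50, 10) for _ in range(2000)]

# Assumed dropout rule: participants scoring below 45 are lost half the time,
# so missingness depends on the outcome itself.
observed = [y for y in full if not (y < 45 and rng.random() < 0.5)]

def ci_half_width(sd, n, z=1.96):
    """Half-width of an approximate 95% confidence interval for a mean."""
    return z * sd / math.sqrt(n)

# Bias: the mean of the remaining participants drifts upwards.
print(f"full data:     n={len(full)}, mean={statistics.mean(full):.2f}")
print(f"after dropout: n={len(observed)}, mean={statistics.mean(observed):.2f}")

# Precision: for the same standard deviation, fewer observations
# give a wider confidence interval.
print(f"CI half-width, n=200: {ci_half_width(10.0, 200):.2f}")
print(f"CI half-width, n=120: {ci_half_width(10.0, 120):.2f}")
```

Because only low scores are removed, the observed mean is higher than the full-data mean, and the interval for n=120 is wider than for n=200 by a factor of sqrt(200/120).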
Assuming that the missing data can't be avoided, the following are a few methods to minimize its effects:
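1. Complete case analysis: Exclusion of every participant (row) that has any missing value. This is simple, but it reduces the sample size and is generally unbiased only if the data are missing completely at random.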
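2. Imputation: Replacement of the missing data with estimated values. Options include mean substitution, regression imputation and multiple imputation. Multiple imputation is usually preferred because it reflects the uncertainty in the imputed values.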
3. Maximum likelihood: Assuming that the observed data are a sample drawn from a multivariate normal distribution, the required parameters are estimated from all available observations (for example with the EM algorithm). If needed, the missing values can then be predicted from these parameters.
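The first two methods in the list can be sketched in a few lines. This is a minimal illustration, not a recommended analysis: the `records` data (baseline score, follow-up score) are made up, and multiple imputation and maximum likelihood are not shown because in practice they would normally be run with a dedicated statistical package.

```python
import statistics

# Hypothetical records: (baseline score, follow-up score).
# None marks a missing follow-up measurement.
records = [(1.0, 2.1), (2.0, 3.9), (3.0, None),
           (4.0, 8.2), (5.0, None), (6.0, 11.8)]

# 1. Complete case analysis: keep only rows with no missing value.
complete = [(x, y) for x, y in records if y is not None]

# 2a. Mean substitution: replace each missing value with the observed mean.
mean_y = statistics.mean(y for _, y in complete)
mean_imputed = [(x, mean_y if y is None else y) for x, y in records]

# 2b. Regression imputation: fit y = intercept + slope * x by least squares
# on the complete cases, then predict the missing follow-up scores.
xs = [x for x, _ in complete]
ys = [y for _, y in complete]
mx, my = statistics.mean(xs), statistics.mean(ys)
slope = sum((x - mx) * (y - my) for x, y in complete) / sum((x - mx) ** 2 for x in xs)
intercept = my - slope * mx
reg_imputed = [(x, intercept + slope * x if y is None else y) for x, y in records]

print("complete cases:    ", complete)
print("mean substitution: ", mean_imputed)
print("regression imputed:", reg_imputed)
```

Mean substitution gives every missing participant the same value and so understates the variability, whereas regression imputation uses the baseline score to give each participant a different prediction. Neither single-imputation approach accounts for the uncertainty in the filled-in values.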