Scientific Reasoning with Philosophy and Statistics
We learned about what is commonly referred to as classical or frequentist statistics and discussed Bayesian statistics in the Romero paper, the Aeon article, and the Papineau article. These three papers provided reasons for favoring Bayesian statistics over frequentist statistics; briefly summarize some of those reasons from these discussions.
Both approaches rely on likelihoods; the Bayesian approach simply augments the likelihood with a prior distribution over the parameters. I will give some pros and cons as I see them:
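To make "augments the likelihood" concrete, here is the standard relationship between prior, likelihood, and posterior (a textbook identity, not something specific to the assigned readings):

$$
p(\theta \mid y) \;=\; \frac{p(y \mid \theta)\, p(\theta)}{\int p(y \mid \theta)\, p(\theta)\, d\theta} \;\propto\; p(y \mid \theta)\, p(\theta)
$$

Frequentist inference works from the likelihood p(y | θ) alone; the Bayesian posterior multiplies it by the prior p(θ) and renormalizes.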
Pros: Many frequentist methods are special cases of Bayesian methods; as an important and widely used concrete example, the Mann-Whitney-Wilcoxon two-sample test is a limiting case of a Bayesian nonparametric test based on the Dirichlet process (Ferguson, 1973). The Bayesian approach has the added benefit of assuming a valid, coherent probability model. Bayesian methods can handle generalized additive mixed models with ease; the dimensionality of the random-effects vector hardly matters to MCMC, whereas frequentist methods rely on approximations or numerical quadrature and are not easily generalized to non-normal random effects. Similarly, any nonlinear hierarchical model (with non-normal random effects) can be fit with essentially the same machinery. Experts have real prior information! Doctors have an excellent idea of the prognosis of certain diseases from firsthand experience, and ignoring this information can bias results. Every probability sample is imperfect and has implicit bias, and we only get one sample. An expert can help improve inferences, determine whether a sample is really bad, or at least suggest taking another sample to be sure (a small sketch of this point follows below). P-values have also come under scrutiny lately, and Bayesian arguments have helped motivate the proposal to tighten the “scientific standard” for Type I error from 0.05 to 0.005; see “Redefine statistical significance.”
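Here is a minimal sketch of the "experts have real prior information" point, using a conjugate beta-binomial update in Python. The numbers (7 remissions out of 10 patients, a Beta(12, 8) expert prior) are hypothetical and not taken from any of the readings.

```python
# Minimal sketch (hypothetical numbers): a conjugate beta-binomial update showing
# how an expert's prior can sharpen inference from a single small, imperfect sample.
from scipy import stats

# Hypothetical data: 7 remissions observed in 10 patients.
successes, n = 7, 10

# A flat prior "pleading complete ignorance" vs. an expert prior centered near 0.6
# and worth roughly 20 prior patients.
priors = {"flat": (1, 1), "expert": (12, 8)}   # Beta(a, b) hyperparameters

for label, (a, b) in priors.items():
    post = stats.beta(a + successes, b + n - successes)   # conjugate posterior
    lo, hi = post.ppf([0.025, 0.975])                      # central 95% interval
    print(f"{label:>6} prior: posterior mean {post.mean():.3f}, "
          f"95% interval ({lo:.3f}, {hi:.3f})")

# For comparison, the frequentist MLE is 7/10 = 0.70 with a wide interval at n = 10;
# the expert prior narrows the interval without overwhelming the data.
```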
Cons: I don’t agree with the subjectivity-of-priors argument (choosing a model in and of itself is subjective, different variable selection techniques give different models, and frequentists essentially assume an asinine flat prior pleading complete ignorance). Although MCMC can take a while to complete (but how long did it take to collect the data?), newer methods such as INLA (an R package using clever Laplace approximations…but only applicable to hierarchical models with normal random effects) and variational Bayes (available in the STAN software for free) are as fast as the usual frequentist approaches. Actually, VB in STAN is more accurate than the frequentist approach in SAS’ PROC GLIMMIX with only a very small increase in computing time. Also, BayesX and the R package R2BayesX can compute GAMM estimates incredibly fast using a mixed-model representation of penalized B-splines and REML.
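For a sense of what the MCMC machinery referred to above actually does, here is a minimal random-walk Metropolis sampler in Python for the same hypothetical beta-binomial posterior used in the earlier sketch; real software such as STAN, INLA, or BayesX uses far more efficient schemes, so this is illustrative only.

```python
# Minimal sketch of MCMC: a random-walk Metropolis sampler for the posterior of a
# remission probability theta, with the hypothetical Beta(12, 8) expert prior and
# 7 successes out of 10 from the earlier example.
import numpy as np

rng = np.random.default_rng(0)
successes, n = 7, 10
a, b = 12, 8                                   # Beta(a, b) prior hyperparameters

def log_post(theta):
    """Log prior + log likelihood, up to an additive constant."""
    if not 0.0 < theta < 1.0:
        return -np.inf
    return ((a - 1 + successes) * np.log(theta)
            + (b - 1 + n - successes) * np.log(1.0 - theta))

theta, draws = 0.5, []
for _ in range(20_000):
    proposal = theta + 0.1 * rng.standard_normal()         # symmetric random walk
    # Accept with probability min(1, posterior ratio); compare on the log scale.
    if np.log(rng.uniform()) < log_post(proposal) - log_post(theta):
        theta = proposal
    draws.append(theta)

burned = draws[2_000:]                                     # discard burn-in
print("MCMC posterior mean:", np.mean(burned))             # analytic answer: 19/30 ≈ 0.633
```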
Honestly, I don’t see a lot of cons anymore except that Minitab, SAS, and R mostly ship with built-in frequentist estimates and diagnostics. STAN and R2BayesX are helping to make fast, accurate Bayesian approaches available to the masses, and ultimately those users will decide which approach is “better.”