In: Statistics and Probability
Describe the deviance information criterion used to fit a Bayesian model.
Just as the Akaike Information Criterion (AIC) is used to select among linear models, the Deviance Information Criterion (DIC) is used for Bayesian model selection.
As the number of features in a Bayesian model increases, the number of possible models grows as well, which makes selecting the best model difficult. The DIC helps select the best model based on Markov Chain Monte Carlo (MCMC) output.
Deviance
D(theta) = - 2 log ( P ( y | theta )) + C
Where,
y = data
theta = parameters
P ( y | theta ) = likelihood function
C = constant
The effective number of parameters
Pd = expected(D(theta)) - D(expected(theta))
The larger the effective number of parameters, the easier it is for the model to fit the data, so the deviance needs to be penalized accordingly.
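As a concrete sketch of these two quantities, the snippet below computes the deviance at each posterior draw and the effective number of parameters Pd. The normal model, sample sizes, and variable names are illustrative assumptions, with simulated draws standing in for real MCMC output.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative setup (not from the text): y ~ Normal(mu, 1) with known unit
# variance; draws of mu approximate the posterior, standing in for MCMC output.
y = rng.normal(loc=2.0, scale=1.0, size=50)
mu_draws = rng.normal(loc=y.mean(), scale=1.0 / np.sqrt(len(y)), size=4000)

def deviance(mu, y):
    """D(theta) = -2 log P(y | theta) for a Normal(mu, 1) likelihood (C = 0)."""
    mu = np.atleast_1d(mu)
    log_lik = -0.5 * np.log(2 * np.pi) - 0.5 * (y[None, :] - mu[:, None]) ** 2
    return -2.0 * log_lik.sum(axis=1)

D_draws = deviance(mu_draws, y)              # D(theta) at each posterior draw
D_bar = D_draws.mean()                       # expected(D(theta))
D_at_mean = deviance(mu_draws.mean(), y)[0]  # D(expected(theta))
p_d = D_bar - D_at_mean                      # effective number of parameters
```

For this one-parameter model, `p_d` comes out close to 1, as expected: the effective number of parameters estimates how many parameters the model effectively uses.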
Deviance Information Criteria
DIC = Pd + expected(D(theta)) or, equivalently,
DIC = 2*Pd + D(expected(theta))
(Substituting Pd = expected(D(theta)) - D(expected(theta)) shows that both expressions equal 2*expected(D(theta)) - D(expected(theta)).)
Models with smaller DIC should be preferred to models with larger DIC. Models are penalized both by the value of the deviance, which favors a good fit, and (similar to AIC) by the effective number of parameters Pd. Since the deviance decreases as the number of parameters in a model increases, Pd compensates for this effect by favoring models with fewer parameters.
An advantage of DIC over other criteria for Bayesian model selection is that it is easily calculated from the output of a Markov Chain Monte Carlo simulation.
AIC requires the likelihood at its maximum over theta, which is not readily available from an MCMC simulation. To calculate DIC, by contrast, one simply computes the deviance at each draw and DIC follows.
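Continuing the same illustrative sketch (the normal model and all names below are assumptions, not part of the text), DIC needs nothing beyond the deviance at each draw and the deviance at the posterior mean, both of which fall out of the MCMC output directly:

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative data and posterior draws (stand-ins for real MCMC output).
y = rng.normal(loc=2.0, scale=1.0, size=50)
mu_draws = rng.normal(loc=y.mean(), scale=1.0 / np.sqrt(len(y)), size=4000)

def deviance(mu, y):
    """D(theta) = -2 log P(y | theta) for a Normal(mu, 1) likelihood."""
    mu = np.atleast_1d(mu)
    log_lik = -0.5 * np.log(2 * np.pi) - 0.5 * (y[None, :] - mu[:, None]) ** 2
    return -2.0 * log_lik.sum(axis=1)

D_bar = deviance(mu_draws, y).mean()         # expected(D(theta))
D_at_mean = deviance(mu_draws.mean(), y)[0]  # D(expected(theta))
p_d = D_bar - D_at_mean                      # effective number of parameters

dic_1 = p_d + D_bar                # DIC = Pd + expected(D(theta))
dic_2 = 2 * p_d + D_at_mean        # DIC = 2*Pd + D(expected(theta))
assert np.isclose(dic_1, dic_2)    # the two forms are algebraically equal
```

Given DIC values for several candidate models fit to the same data, the model with the smallest DIC would be preferred.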