Question

How does one apply the Integrated Nested Laplace Approximation (INLA) to survey data in small area estimation?

Expert Solution

Integrated Nested Laplace Approximation (INLA):

INLA is a deterministic approach to Bayesian inference in latent Gaussian models (LGMs), introduced by Rue et al. (2009).

INLA relies on a combination of analytical approximations and efficient numerical integration schemes to achieve highly accurate deterministic approximations to posterior quantities of interest.

The main benefit of using INLA instead of Markov chain Monte Carlo (MCMC) techniques for LGMs is computational: INLA is fast even for large, complex models. Moreover, because it is a deterministic algorithm, INLA avoids the slow convergence and poor mixing that can affect MCMC samplers.
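To make the idea of a Laplace approximation concrete, the following toy sketch (not INLA itself, and not part of the original answer) approximates a one-parameter posterior by a Gaussian centred at its mode and compares the result with brute-force numerical integration. The model (binomial data with a normal prior on the logit scale), the function names, and the example numbers are all assumptions chosen for illustration.

```python
import math

def log_post(theta, y, n, sigma):
    # Unnormalised log posterior: binomial likelihood on the logit scale
    # plus a N(0, sigma^2) prior on theta (assumed toy model).
    return y * theta - n * math.log1p(math.exp(theta)) - theta ** 2 / (2 * sigma ** 2)

def laplace_approx(y, n, sigma, iters=50):
    # Newton iterations locate the posterior mode; the negative inverse
    # Hessian at the mode gives the variance of the Gaussian approximation.
    theta = 0.0
    for _ in range(iters):
        p = 1.0 / (1.0 + math.exp(-theta))
        grad = y - n * p - theta / sigma ** 2
        hess = -n * p * (1.0 - p) - 1.0 / sigma ** 2
        theta -= grad / hess
    p = 1.0 / (1.0 + math.exp(-theta))
    hess = -n * p * (1.0 - p) - 1.0 / sigma ** 2
    return theta, -1.0 / hess

def numeric_moments(y, n, sigma, lo=-10.0, hi=10.0, m=20001):
    # Reference answer: posterior mean and variance from a Riemann sum on a fine grid.
    step = (hi - lo) / (m - 1)
    grid = [lo + i * step for i in range(m)]
    logs = [log_post(t, y, n, sigma) for t in grid]
    top = max(logs)
    w = [math.exp(v - top) for v in logs]
    z = sum(w)
    mean = sum(t * wi for t, wi in zip(grid, w)) / z
    var = sum((t - mean) ** 2 * wi for t, wi in zip(grid, w)) / z
    return mean, var

if __name__ == "__main__":
    print("Laplace (mode, variance):", laplace_approx(7, 20, 2.0))
    print("Numerical (mean, variance):", numeric_moments(7, 20, 2.0))
```

INLA nests approximations of this kind inside a numerical integration over the hyperparameters; this sketch only shows the single-parameter building block.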

Small Area Estimation (SAE):

In small area estimation (SAE) one investigates how to obtain area-specific characteristics from survey data that cover more than just the area of interest, typically by using spatial smoothing methods.

The goal of SAE here is to obtain reliable estimates of health outcomes for areas or domains where few or no samples are available.

Here we describe a spatial predictive model-based approach to SAE for a binary health outcome in a complex survey with given sampling weights.

Let us assume that the sampling weights on the sampled individuals are the only information available about the survey design.

The goal is to estimate the prevalence of the health outcome for all small areas in the spatial domain.

A hierarchical Bayesian model is used in which the health outcomes are regressed on the sampling weights.

The regression on the weights is nonparametric, which reduces the bias that could arise from misspecifying the form of the regression function.

Additionally, both unstructured and structured spatial random effects are introduced to model the geographical distribution of the health outcomes.

The population distribution of the sampling weights is also unknown, because weights are observed only for sampled individuals. Hence we must model the weights themselves to be able to make predictions.

Notation:

Let Y_ik be a binary health outcome for individual i in small area k (i = 1, …, N_k and k = 1, …, K) with N_k the population size in area k.

Let us assume that N_k is known for each area. A sample of size n_k is drawn from each area k, where some of the n_k could be zero. Denote the sampled values by y_ik.

Let $N = \sum_{k=1}^{K} N_k$ and $n = \sum_{k=1}^{K} n_k$ represent the total population and sample size, respectively. We shall focus on estimating the true prevalence, P_k, in each area k, namely

$$P_k = \frac{1}{N_k} \sum_{i=1}^{N_k} Y_{ik}. \qquad (1)$$

Let R_ik denote the binary variable indicating whether the ith individual in area k is sampled (R_ik = 1) or not (R_ik = 0). We use s_k to indicate the set of sampled individuals in area k and s'_k for those that are not sampled.

To reflect the sampling design, a weight w_ik is attached to each respondent's outcome. The weights are proportional to the inverse probability of inclusion in the sample for unit i in area k. They can reflect the complex survey design, post-stratification adjustments, or a combination of both.

Let us further assume that all sampled individuals respond to the survey. A typical dataset will have the structure presented in Table 1. Throughout this article, we use the normalized weights, denoted by $\tilde{w}_{ik}$, defined by

$$\tilde{w}_{ik} = \frac{n_k \, w_{ik}}{\sum_{j \in s_k} w_{jk}}. \qquad (2)$$

The weights are called normalized because they sum up to the sample size n_k in area k.
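The normalization can be sketched in a few lines of Python: within each area, every weight is rescaled so that the area's weights add up to its sample size n_k. The function name and the toy data are hypothetical and serve only to illustrate the rescaling.

```python
from collections import defaultdict

def normalize_weights(areas, weights):
    """Rescale raw weights so that, within each area, they sum to that area's sample size."""
    totals = defaultdict(float)   # sum of raw weights per area
    counts = defaultdict(int)     # sample size n_k per area
    for k, w in zip(areas, weights):
        totals[k] += w
        counts[k] += 1
    return [counts[k] * w / totals[k] for k, w in zip(areas, weights)]

if __name__ == "__main__":
    # Toy data (assumed): three respondents in area 1, two in area 2.
    areas = [1, 1, 1, 2, 2]
    raw = [10.0, 30.0, 20.0, 5.0, 15.0]
    print(normalize_weights(areas, raw))
```

Only the relative sizes of the weights within an area matter after this step, so weights that are known only up to a proportionality constant can still be used.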

Table 1

Structure of datasets used in this article.

Response	Area	Sample weight
y₁₁	1	w₁₁
y₂₁	1	w₂₁
⋮	⋮	⋮
y₁₂	2	w₁₂
⋮	⋮	⋮

Proposed Methods:

The Bayesian hierarchical model for the outcomes and the multinomial model are fitted using the integrated nested Laplace approximations (INLA) approach of Rue et al. (2009).

Hierarchical model:

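A predictive model-based approach proposed by Royall (1970) is used to specify an estimator for P_k. The estimator is given by

$$\hat{P}_k = \frac{1}{N_k} \left( \sum_{i \in s_k} y_{ik} + \sum_{i \in s'_k} \hat{y}_{ik} \right), \qquad (3)$$

where $\hat{y}_{ik}$ denotes the model-based prediction for unsampled individual i in area k.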
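A minimal sketch of a predictive estimator of this kind is given below: the observed outcomes of the sampled individuals are added to model-based predictions for the unsampled individuals, and the total is divided by the area population size N_k. The function name and the numbers are hypothetical, and the predictions are simply supplied as inputs rather than produced by a fitted model.

```python
def predictive_prevalence(y_sampled, pred_unsampled, N_k):
    """Predictive estimate of the prevalence in one area.

    y_sampled      : observed 0/1 outcomes for the sampled individuals
    pred_unsampled : predicted outcomes (or probabilities) for the unsampled individuals
    N_k            : known population size of the area
    """
    if len(y_sampled) + len(pred_unsampled) != N_k:
        raise ValueError("sampled and unsampled individuals must add up to N_k")
    return (sum(y_sampled) + sum(pred_unsampled)) / N_k

if __name__ == "__main__":
    # Toy area (assumed): 10 people, 3 sampled, 7 predicted at probability 0.4 each.
    print(predictive_prevalence([1, 0, 1], [0.4] * 7, 10))
```

For an area with no sampled individuals the first sum is empty, so the estimate rests entirely on the model predictions.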

We now extend these ideas to small area estimation.

The normalized sampling weights are used as a covariate in the model for the observed outcomes y_ik. We employ Bayesian hierarchical models consisting of three stages.
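The three stages are not spelled out here, so the following is only one plausible specification consistent with the description above, not necessarily the exact model intended. It assumes a Bernoulli likelihood with logit link, a smooth function f of the normalized weight (written $\tilde{w}_{ik}$), an unstructured area effect v_k, and a structured ICAR area effect u_k, where j ~ k means that areas j and k are neighbours and m_k is the number of neighbours of area k.

$$
\begin{aligned}
\text{Stage 1:}\quad & y_{ik} \mid p_{ik} \sim \text{Bernoulli}(p_{ik}), \quad i \in s_k, \\
\text{Stage 2:}\quad & \text{logit}(p_{ik}) = \beta_0 + f(\tilde{w}_{ik}) + u_k + v_k, \\
& v_k \sim N(0, \sigma_v^2), \qquad u_k \mid u_{-k} \sim N\!\left( \frac{1}{m_k} \sum_{j \sim k} u_j, \; \frac{\sigma_u^2}{m_k} \right), \\
\text{Stage 3:}\quad & \text{priors on } \beta_0, \ \sigma_v^2, \ \sigma_u^2 \text{ and the smoothing parameter of } f.
\end{aligned}
$$

All symbols other than y_ik, s_k and the weights are introduced here for illustration. A model of this form is a latent Gaussian model, which is the class that INLA is designed for.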

Implementation:

INLA yields a computationally convenient alternative to Markov chain Monte Carlo (MCMC) techniques. This method combines Laplace approximations and numerical integration in a very efficient manner to carry out a Bayesian analysis.

Sampling of the structured spatial effects under a constraint is achieved by using the intrinsic Gaussian Markov random field representation of the ICAR model, on which a linear constraint (typically sum-to-zero) is additionally imposed.
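As a rough illustration of why a linear constraint is needed, the sketch below builds the ICAR structure matrix Q = D - W for a hypothetical map of four areas in a row. Adding the same constant to every area effect leaves the quadratic form x'Qx unchanged, so the overall level of the effects is not identified until a constraint such as sum-to-zero is imposed. The neighbourhood list and helper names are assumptions made for this example.

```python
def icar_structure(neighbours):
    """Structure matrix Q = D - W: number of neighbours on the diagonal, -1 for each neighbour pair."""
    K = len(neighbours)
    Q = [[0.0] * K for _ in range(K)]
    for i, nbrs in enumerate(neighbours):
        Q[i][i] = float(len(nbrs))
        for j in nbrs:
            Q[i][j] = -1.0
    return Q

def quad_form(Q, x):
    # x'Qx, which equals the sum of squared differences between neighbouring areas.
    K = len(x)
    return sum(x[i] * Q[i][j] * x[j] for i in range(K) for j in range(K))

def centre(x):
    # Impose the sum-to-zero constraint by subtracting the mean.
    m = sum(x) / len(x)
    return [xi - m for xi in x]

if __name__ == "__main__":
    # Hypothetical map: areas 0-1-2-3 on a line, each adjacent to the next.
    Q = icar_structure([[1], [0, 2], [1, 3], [2]])
    x = [0.3, -0.1, 0.5, 1.2]
    shifted = [xi + 5.0 for xi in x]
    print(quad_form(Q, x), quad_form(Q, shifted))  # the two values agree up to rounding
    print(sum(centre(x)))                          # zero up to rounding
```

Every row of Q sums to zero, which is the matrix form of the same statement: the constant vector lies in the null space of Q, so the ICAR prior is improper without the constraint.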

Sampling from the posterior distributions obtained from INLA is done via the inla.posterior.sample() function.
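Once posterior draws are available, they can be turned into a posterior distribution for the prevalence in each area. The sketch below assumes that each draw supplies a linear predictor (on the logit scale) for every unsampled individual in one area; here those draws are simulated with Python's random module purely as a stand-in for output exported from inla.posterior.sample(). The function names, the simulated values, and the independence of the simulated draws are all illustrative assumptions.

```python
import math
import random

def expit(eta):
    return 1.0 / (1.0 + math.exp(-eta))

def prevalence_draws(y_sampled, eta_draws, N_k):
    """One prevalence value per posterior draw: observed outcomes plus predicted probabilities."""
    observed = sum(y_sampled)
    out = []
    for eta in eta_draws:
        if len(y_sampled) + len(eta) != N_k:
            raise ValueError("sampled and unsampled individuals must add up to N_k")
        out.append((observed + sum(expit(e) for e in eta)) / N_k)
    return out

def summarise(draws, level=0.95):
    # Posterior mean and a simple empirical equal-tailed interval.
    s = sorted(draws)
    lo = s[int((1.0 - level) / 2.0 * (len(s) - 1))]
    hi = s[int((1.0 + level) / 2.0 * (len(s) - 1))]
    return sum(s) / len(s), lo, hi

if __name__ == "__main__":
    random.seed(1)
    # Simulated stand-in: 1000 draws for 7 unsampled individuals in an area of 10.
    eta_draws = [[random.gauss(-0.6, 0.3) for _ in range(7)] for _ in range(1000)]
    draws = prevalence_draws([1, 0, 1], eta_draws, 10)
    print(summarise(draws))
```

Summarising the draws rather than a single point prediction carries the model's uncertainty through to the area-level prevalence.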

orchestra answered 3 years ago
