Bayesian Spatial Modeling

Introduction to Bayesian Spatial Modeling

Bayesian methodology is a long-established approach to statistical inference. However, its applications were limited until advances in computation and simulation methods made it practical (Congdon, 2001). Bayesian spatial modeling refers to the application of Bayesian methodology to spatial models, such as spatial autoregressive models and conditional autoregressive models. The concept underlying Bayesian spatial modeling is Bayes’ theorem, which considers both the distribution of the data and the distributions of the unknown parameters (LeSage and Pace, 2009). In its simplest form, Bayes’ theorem expresses the relationship between the conditional probabilities of two events (A and B) as follows (Carlin and Louis, 2000):

$$P(A|B) = \frac{P(B|A)\,P(A)}{P(B)} \qquad (1)$$

where P(A|B) is the conditional probability, or posterior probability, of event A given that event B is observed; P(B|A) is the conditional probability of event B given event A; and P(A) and P(B) are the prior probabilities, or marginal probabilities, of events A and B, respectively. Note that P(B|A) is not necessarily equal to P(A|B). Equation (1) can be applied to a spatial modeling framework. Replacing B with D, which represents the observed spatial data and spatial weight matrix, and A with Θ, which represents all parameters to be estimated in a spatial model, Equation (1) can be rewritten as:

$$P(\Theta|D) = \frac{P(D|\Theta)\,P(\Theta)}{P(D)} \qquad (2)$$

Several points must be noted: (1) Bayesian methodology assumes that each parameter in Θ has a prior distribution, P(Θ), that captures uncertainty and reflects existing knowledge before the data are observed. If researchers have little knowledge about a parameter, a vague (diffuse) prior distribution is typically used. (2) P(D|Θ) is the likelihood of obtaining data D under the spatial model with parameters Θ. (3) In practice, P(D) is usually treated as a normalizing constant because it does not involve any parameters in Θ (LeSage and Pace, 2009), so the posterior is proportional to the likelihood multiplied by the prior. (4) P(Θ|D) is the posterior distribution of Θ, which combines the empirical data with the prior uncertainty.
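To make the four points concrete, the following minimal sketch evaluates the posterior on a grid of parameter values. It is only an illustration under assumed values: the single-area Poisson-Gamma model (a common building block in disease mapping), the counts, and the prior settings are hypothetical and are not taken from the sources cited here. The sum of the unnormalized weights plays the role of P(D).

```python
import math

# Toy setup (all values are illustrative assumptions, not from the text):
# one area with Y observed cases and EXPECTED expected cases;
# theta is the area's relative risk, with a Gamma(A, B) prior (shape A, rate B).
Y, EXPECTED, A, B = 12, 8.0, 2.0, 2.0


def grid_posterior(grid, y=Y, expected=EXPECTED, a=A, b=B):
    """Normalized posterior weights P(theta | D) on a grid of theta values."""
    log_unnorm = []
    for theta in grid:
        log_prior = (a - 1.0) * math.log(theta) - b * theta   # log P(theta) + const
        log_like = y * math.log(theta) - expected * theta     # log P(D | theta) + const
        log_unnorm.append(log_prior + log_like)
    peak = max(log_unnorm)                    # subtract the max for numerical stability
    weights = [math.exp(v - peak) for v in log_unnorm]
    total = sum(weights)                      # plays the role of P(D)
    return [w / total for w in weights]


grid = [0.001 * i for i in range(1, 10001)]   # theta from 0.001 to 10
posterior = grid_posterior(grid)
grid_mean = sum(t * p for t, p in zip(grid, posterior))
exact_mean = (A + Y) / (B + EXPECTED)         # mean of the conjugate Gamma(A + Y, B + EXPECTED)
print(f"grid posterior mean:  {grid_mean:.3f}")
print(f"exact posterior mean: {exact_mean:.3f}")
```

Because the Gamma prior is conjugate to the Poisson likelihood, the exact posterior is available here and serves as a check on the grid result. Realistic spatial models have many correlated parameters, and a grid of this kind quickly becomes infeasible.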

Spatial modeling is essentially concerned with three issues: estimation and inference of parameters, model specification and comparison, and prediction. It has been demonstrated that Bayesian spatial modeling, as expressed in Equation (2), can address these issues and has more attractive features than the conventional (i.e., frequentist) approach to spatial modeling (Banerjee, Carlin, and Gelfand, 2003; LeSage and Pace, 2009; Wasserman, 2003). For example, the Bayesian spatial modeling approach offers a more solid foundation because the uncertainties and/or existing knowledge about unknown parameters are taken into account. In addition, statistical inference based on the posterior distribution is more intuitive and corresponds directly to the concept of probability. The major difference between the frequentist and Bayesian approaches is how the unknown parameters are treated. Specifically, the frequentist approach assumes that the observed data come from a specific likelihood model and that the unknown parameters are “fixed and unknowable” (Carlin and Louis, 2000; Congdon, 2001). By contrast, the Bayesian approach assumes that the unknown parameters follow prior distributions and combines these priors with the data to obtain the posterior distributions of the unknown parameters.

Despite the advantages discussed above, computational challenges long prevented the Bayesian approach, and Bayesian spatial modeling in particular, from being widely adopted. Specifically, obtaining the posterior distributions of the parameters in Equation (2) requires integrations that are generally not tractable in closed form and must be approximated numerically, which may not be easy (Banerjee et al., 2003). As discussed previously, the development of computation and simulation methods has helped researchers to overcome this challenge. More specifically, Markov chain Monte Carlo (MCMC) integration methods, such as the Metropolis-Hastings algorithm (Hastings, 1970; Metropolis et al., 1953) and the Gibbs sampler (Gelfand and Smith, 1990; Geman and Geman, 1984), have been incorporated into readily available software programs such as WinBUGS (Lunn et al., 2000), making it simpler to implement Bayesian spatial models. As a result, recent decades have seen rapid growth in the application of Bayesian spatial modeling to epidemiology (Best, Richardson, and Thomson, 2005), demography (Borgoni and Billari, 2003; Sparks, Sparks, and Campbell, 2012), and environmental health research (Best, Ickstadt, and Wolpert, 2000), among other disciplines.
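The idea behind MCMC can be sketched with a random-walk Metropolis sampler, the simplest special case of the Metropolis-Hastings algorithm. The example below is a hedged illustration, not how WinBUGS or any package cited here is implemented. The target is an assumed toy posterior for one area's relative risk theta (a Poisson count of 12 cases with 8 expected, and a Gamma(2, 2) prior); the data, prior, step size, and chain length are all hypothetical choices. The sampler needs only the unnormalized posterior, so P(D) never has to be computed.

```python
import math
import random


def log_posterior(theta, y=12, expected=8.0, a=2.0, b=2.0):
    """Unnormalized log posterior: Poisson likelihood times Gamma(a, b) prior."""
    if theta <= 0.0:
        return float("-inf")                  # relative risk must be positive
    return (a - 1.0 + y) * math.log(theta) - (b + expected) * theta


def metropolis(n_iter, step=0.5, start=1.0, seed=0):
    """Random-walk Metropolis with a symmetric Gaussian proposal."""
    rng = random.Random(seed)
    theta, current = start, log_posterior(start)
    draws = []
    for _ in range(n_iter):
        proposal = theta + rng.gauss(0.0, step)
        candidate = log_posterior(proposal)
        # Accept with probability min(1, posterior ratio); P(D) cancels in the ratio.
        if rng.random() < math.exp(min(0.0, candidate - current)):
            theta, current = proposal, candidate
        draws.append(theta)
    return draws


burn_in = 5000
draws = metropolis(30000)[burn_in:]           # discard early draws taken before convergence
post_mean = sum(draws) / len(draws)
ordered = sorted(draws)
lower = ordered[int(0.025 * len(ordered))]    # 2.5th percentile of the draws
upper = ordered[int(0.975 * len(ordered)) - 1]  # 97.5th percentile of the draws
print(f"posterior mean: {post_mean:.2f}")
print(f"95% credible interval: ({lower:.2f}, {upper:.2f})")
```

For this toy model the exact posterior is Gamma(14, 10) with mean 1.4, so the sampled mean should land close to that value. The interval is read directly as a probability statement about theta given the data, which is the intuitive interpretation of Bayesian inference described earlier.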

Bayesian spatial modeling embraces most, if not all, spatial models in the literature, such as the spatial lag model, the spatial error model, and geographically weighted regression, as long as the statistical model can be estimated with Bayesian methodology. More importantly, Bayesian spatial modeling does not require a Gaussian spatial process and is more flexible in generalized linear modeling (Banerjee et al., 2003). The goal of this document is to provide an introduction for readers who are interested in learning Bayesian spatial modeling. Useful links, papers, and books that describe Bayesian methodology and spatial modeling in more detail are provided below.

Further Reading:

1. Banerjee, S., B.P. Carlin, and A.E. Gelfand. 2003. Hierarchical Modeling and Analysis for Spatial Data. Boca Raton: Chapman & Hall/CRC. Chapter 4, pp. 99–125.

2. LeSage, J. and R.K. Pace. 2009. Introduction to Spatial Econometrics. Boca Raton: Chapman & Hall/CRC. Chapter 5, pp. 123–154.

3. Lawson, A.B. 2009. Bayesian Disease Mapping: Hierarchical Modeling in Spatial Epidemiology. Boca Raton: Chapman & Hall/CRC.

4. Wasserman, L. 2003. All of Statistics: A Concise Course in Statistical Inference. New York: Springer.

Software programs for Bayesian spatial modeling:

1. WinBUGS: http://www.mrc-bsu.cam.ac.uk/bugs/winbugs/contents.shtml

2. GeoBUGS: http://www.mrc-bsu.cam.ac.uk/bugs/winbugs/geobugs.shtml

3. spBayes, a package in R: http://cran.r-project.org/web/packages/spBayes/index.html

4. geoR, a package in R: http://cran.r-project.org/web/packages/geoR/index.html

5. arm, a package in R: http://cran.r-project.org/web/packages/arm/

6. ngspatial, a package in R: http://cran.r-project.org/web/packages/ngspatial/

There are many more packages in R that allow users to implement Bayesian spatial models.

References

Banerjee, S., B.P. Carlin, and A.E. Gelfand. 2003. Hierarchical Modeling and Analysis for Spatial Data. Boca Raton: Chapman & Hall/CRC.

Best, N., S. Richardson, and A. Thomson. 2005. A comparison of Bayesian spatial models for disease mapping. Statistical Methods in Medical Research 14(1): 35–59.

Best, N.G., K. Ickstadt, and R.L. Wolpert. 2000. Spatial Poisson regression for health and exposure data measured at disparate resolutions. Journal of the American Statistical Association 95(452): 1076–1088.

Borgoni, R. and F.C. Billari. 2003. Bayesian spatial analysis of demographic survey data: An application to contraceptive use at first sexual intercourse. Demographic Research 8(3): 61–92.

Carlin, B.P. and T.A. Louis. 2000. Bayes and Empirical Bayes Methods for Data Analysis. Boca Raton: Chapman and Hall/CRC Press.

Congdon, P. 2001. Bayesian Statistical Modelling. New York: Wiley.

Gelfand, A.E. and A.F.M. Smith. 1990. Sampling-based approaches to calculating marginal densities. Journal of the American Statistical Association 85(410): 398–409.

Geman, S. and D. Geman. 1984. Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Transactions on Pattern Analysis and Machine Intelligence 6(6): 721–741.

Hastings, W.K. 1970. Monte Carlo sampling methods using Markov chains and their applications. Biometrika 57(1): 97–109.

LeSage, J. and R.K. Pace. 2009. Introduction to Spatial Econometrics. Boca Raton: Chapman & Hall/CRC.

Lunn, D.J., A. Thomas, N. Best, and D. Spiegelhalter. 2000. WinBUGS - a Bayesian modelling framework: concepts, structure, and extensibility. Statistics and Computing 10(4): 325–337.

Metropolis, N., A.W. Rosenbluth, M.N. Rosenbluth, A.H. Teller, and E. Teller. 1953. Equation of state calculations by fast computing machines. The Journal of Chemical Physics 21(6): 1087–1092.

Sparks, J.P., C.S. Sparks, and J.J.A. Campbell. 2012. An application of Bayesian spatial statistical methods to the study of racial and poverty segregation and infant mortality rates in the US. GeoJournal: 1–17.

Wasserman, L. 2003. All of Statistics: A Concise Course in Statistical Inference. New York: Springer.