Geographically Weighted Regression

Introduction to Geographically Weighted Regression

Conventional spatial analysis techniques (e.g., spatial econometric modeling) use a single equation to assess the overall relationships between the dependent and independent variables across space, which is known as a global analytic approach. One important assumption underlying this approach is that the relationships of interest are spatially stationary (homogeneous). While the global perspective is effective in handling spatial dependence and generates less biased estimates than non-spatial modeling, it cannot explore spatial non-stationarity (heterogeneity) or identify place-specific associations. To fill this gap, Fotheringham et al. (1997) developed geographically weighted regression (GWR), which incorporates locational information and smoothing techniques into regression models. In contrast to the global approach, GWR has proved to be a useful local spatial analysis tool that helps researchers add nuanced insights to the existing literature (Brunsdon et al., 1998; Fotheringham et al., 2002).

There are several reasons to adopt a local spatial analysis perspective (Fotheringham, 1997; Brunsdon et al., 1998): (1) From the analytic perspective, it is inevitable that random sampling variations will contribute to the spatial associations estimated by global models. While most researchers are not interested in this source of spatial non-stationarity, random sampling variations affect the significance testing of the coefficient estimates and, in turn, may lead to incorrect conclusions. (2) Some associations between the dependent and independent variables may inherently vary over space due to local culture, values, attitudes, and behaviors. This echoes the emerging interest in studying the impacts of locality and residential neighborhood on social and human outcomes (Matthews et al., 2009; Casagrande et al., 2009). (3) Extending the previous points, a global model may be misspecified, and the explanatory variables in the analysis may not be represented by the correct functional form (Fotheringham, 1997). GWR has been designed to address these issues and to detect spatial non-stationarity (Fotheringham et al., 2002).

GWR is a refinement of classical regression modeling, namely ordinary least squares (OLS) regression (Fotheringham et al., 1998; Foody, 2003). The fundamental framework of GWR for continuous dependent variables can be summarized as follows. Let $Y_i$, $i = 1, \ldots, n$, be the response observed at location $i$ in space. The corresponding covariate vector is $X_i = (1, X_{i1}, X_{i2}, \ldots, X_{ip})^t$, of dimension $(p+1)$, including the constant 1 for the intercept. A standard GWR can be defined as equation (1):

$$Y_i = X_i^t \, \beta(u_i, v_i) + \varepsilon_i, \quad i = 1, \ldots, n \tag{1}$$

where $\beta(u_i, v_i)$ is the vector of location-specific parameters, $(u_i, v_i)$ represents the geographic coordinates of location $i$ in space, and $\varepsilon_i$ is the error term with mean zero and common variance $\sigma^2$. Note that removing the dependence on the geographic coordinates $(u_i, v_i)$ reduces equation (1) to a multiple regression model. The GWR analytic framework has also been extended to discrete dependent variables (i.e., binary and count data).
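To make the idea of location-specific coefficients concrete, the following Python sketch simulates data whose slope drifts from west to east and compares one global OLS fit with separate fits on each half of the region. Everything in it is an assumption made for illustration: the synthetic data, the linear drift in the slope, the unit-square study region, and the helper names `simulate_gwr_data` and `ols`. None of it is taken from the cited papers.

```python
import numpy as np

def simulate_gwr_data(n=400, seed=0):
    """Draw synthetic data from Y_i = b0 + b1(u_i, v_i) * X_i1 + e_i."""
    rng = np.random.default_rng(seed)
    coords = rng.uniform(0.0, 1.0, size=(n, 2))  # (u_i, v_i) in the unit square
    x1 = rng.normal(size=n)                      # a single covariate X_i1
    b0 = np.ones(n)                              # intercept held constant over space
    b1 = 1.0 + 2.0 * coords[:, 0]                # slope rises from 1 (west) to 3 (east)
    eps = rng.normal(scale=0.5, size=n)          # mean zero, common variance
    y = b0 + b1 * x1 + eps
    X = np.column_stack([np.ones(n), x1])        # design matrix with a leading 1
    return coords, X, y

def ols(X, y):
    """Global least-squares fit: one coefficient vector for the whole region."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

coords, X, y = simulate_gwr_data()
west = coords[:, 0] < 0.5                        # western half of the region
print("global slope:", ols(X, y)[1])
print("west slope:  ", ols(X[west], y[west])[1])
print("east slope:  ", ols(X[~west], y[~west])[1])
```

In this toy setting the global slope summarizes the whole region with one number, while the two regional fits differ from each other. That difference is the kind of spatial non-stationarity a single global equation cannot describe.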

GWR uses kernel-based, geographically weighted least squares on a point-wise basis to estimate these parameters. That is, for a given location $(u_0, v_0)$, the $\beta$'s are computed locally by minimizing equation (2):

$$\sum_{i=1}^{n} K\!\left(\frac{d_{i0}}{h}\right) \left[ Y_i - X_i^t \, \beta(u_0, v_0) \right]^2 \tag{2}$$

where $K$ is a kernel function, usually a symmetric probability density function, and $h$ is the bandwidth, which controls the smoothness of the estimates. $K(d_{i0}/h)$ is the geographical weight assigned locally to the values of $(X_i, Y_i)$ for location $i$, and it depends on the distance $d_{i0}$ between the given location $(u_0, v_0)$ and the $i$th observation location $(u_i, v_i)$. Explicitly, the weight is determined by a kernel function that places more weight on observations closer to $(u_0, v_0)$ than on those farther away. The coefficient estimates can therefore vary by location and reflect spatial heterogeneity (Fotheringham et al., 2002; Chen and Yang, 2012).
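The local estimation step can be sketched in a few lines of Python. A weighted least-squares criterion of this kind has the usual closed-form minimizer (X^t W X)^(-1) X^t W Y, where W is the diagonal matrix of kernel weights for the target location. The sketch below assumes a Gaussian kernel, Euclidean distances, and synthetic data; the function names `gaussian_kernel` and `gwr_local_fit` are hypothetical and do not come from any GWR package.

```python
import numpy as np

def gaussian_kernel(z):
    """Gaussian kernel K(z) = exp(-z^2 / 2); one common choice, assumed here."""
    return np.exp(-0.5 * z ** 2)

def gwr_local_fit(coords, X, y, point, h, kernel=gaussian_kernel):
    """Estimate the coefficients at `point` by geographically weighted least squares."""
    d = np.linalg.norm(coords - np.asarray(point), axis=1)  # distances d_i0
    w = kernel(d / h)                                        # weights K(d_i0 / h)
    XtW = X.T * w                                            # X^t W without building diag(w)
    return np.linalg.solve(XtW @ X, XtW @ y)                 # solves (X^t W X) beta = X^t W y

# Synthetic data (an assumption): the slope rises from 1 in the west to 3 in the east.
rng = np.random.default_rng(1)
n = 500
coords = rng.uniform(0.0, 1.0, size=(n, 2))
x1 = rng.normal(size=n)
X = np.column_stack([np.ones(n), x1])
y = 1.0 + (1.0 + 2.0 * coords[:, 0]) * x1 + rng.normal(scale=0.3, size=n)

beta_west = gwr_local_fit(coords, X, y, point=(0.1, 0.5), h=0.1)
beta_east = gwr_local_fit(coords, X, y, point=(0.9, 0.5), h=0.1)
print("local slope near the western edge:", beta_west[1])
print("local slope near the eastern edge:", beta_east[1])
```

With a very large bandwidth all weights approach one, so the local fit collapses to the global OLS fit. This mirrors the remark that dropping the geographic coordinates turns GWR into an ordinary multiple regression.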

Two types of weighting schemes can be used to calculate the spatial weights in GWR. One is the fixed-kernel routine, which assumes that the bandwidth, h, is constant at every location across the geographical region of interest. The other is the adaptive-kernel technique, where the bandwidth is selected so that the number of observations with nonzero weights is the same at each location. In principle, the adaptive-kernel technique should be used when data points are sparsely distributed in certain areas of the research region. Because the bandwidth controls the smoothness of the estimates, choosing h is a necessary and crucial step in GWR. The common selection process is to search for the bandwidth that minimizes a chosen criterion. Specifically, the procedure first defines a pre-determined set of values for h, fits the corresponding GWR for each value in this set, calculates the criterion value for each fit, and finally selects the value of h that minimizes the criterion. Widely used criteria include the cross-validation (CV) score, the Akaike Information Criterion (AIC), the corrected Akaike Information Criterion (AICc), and the Bayesian Information Criterion (BIC) (Fotheringham et al., 2002).
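The grid search over bandwidths and the adaptive-kernel idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of any particular software package: the criterion is a leave-one-out CV score, the fixed kernel is Gaussian, the adaptive kernel is a bisquare function whose bandwidth is the distance to the (k+1)-th nearest observation, and the data and candidate grid are synthetic. All function names are hypothetical.

```python
import numpy as np

def local_beta(X, y, w):
    """Weighted least-squares coefficients for one vector of weights w."""
    XtW = X.T * w
    return np.linalg.solve(XtW @ X, XtW @ y)

def cv_score(coords, X, y, h):
    """Leave-one-out CV score for a fixed Gaussian kernel with bandwidth h."""
    score = 0.0
    for i in range(len(y)):
        d = np.linalg.norm(coords - coords[i], axis=1)
        w = np.exp(-0.5 * (d / h) ** 2)
        w[i] = 0.0                                   # leave observation i out
        score += (y[i] - X[i] @ local_beta(X, y, w)) ** 2
    return score

def select_bandwidth(coords, X, y, candidates):
    """Evaluate the criterion on a preset grid and return the minimizing h."""
    scores = [cv_score(coords, X, y, h) for h in candidates]
    return candidates[int(np.argmin(scores))], scores

def adaptive_bisquare_weights(coords, point, k):
    """Bisquare weights with a local bandwidth chosen so that k weights are nonzero."""
    d = np.linalg.norm(coords - np.asarray(point), axis=1)
    h = np.sort(d)[k]                                # distance to the (k+1)-th nearest point
    return np.where(d < h, (1.0 - (d / h) ** 2) ** 2, 0.0)

# Synthetic data (an assumption): the slope varies from 1 to 3 across the region.
rng = np.random.default_rng(2)
n = 200
coords = rng.uniform(0.0, 1.0, size=(n, 2))
x1 = rng.normal(size=n)
X = np.column_stack([np.ones(n), x1])
y = 1.0 + (1.0 + 2.0 * coords[:, 0]) * x1 + rng.normal(scale=0.3, size=n)

candidates = [0.1, 0.2, 0.4, 0.8, 5.0]               # pre-determined set of values for h
best_h, scores = select_bandwidth(coords, X, y, candidates)
w_adaptive = adaptive_bisquare_weights(coords, point=(0.5, 0.5), k=30)
print("selected bandwidth:", best_h)
print("nonzero adaptive weights:", np.count_nonzero(w_adaptive))
```

The largest candidate behaves almost like a global fit, so when the relationship truly varies over space the CV score tends to favor a smaller bandwidth. Swapping the CV score for AIC, AICc, or BIC would change only the `cv_score` function; the search loop stays the same.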

Further Reading

GWR has been increasingly used in demographic research on topics such as health and crime to obtain a more comprehensive picture of how population phenomena vary over space. The following materials provide more detailed discussions:

1. Fotheringham, A.S., C. Brunsdon, and M.E. Charlton. 2002. Geographically Weighted Regression: The Analysis of Spatially Varying Relationships. Wiley: New York.

2. Wheeler, D., and M. Tiefelsdorf. 2005. Multicollinearity and correlation among local regression coefficients in geographically weighted regression. Journal of Geographical Systems 7: 161–187.

3. Nakaya, T., A.S. Fotheringham, C. Brunsdon, and M.E. Charlton. 2005. Geographically weighted Poisson regression for disease association mapping. Statistics in Medicine 24: 2695–2717.

4. Griffith, D.A. 2008. Spatial-filtering-based contributions to a critique of geographically weighted regression (GWR). Environment and Planning A 40: 2751–2769.

Software Programs for GWR

The following websites provide information on software programs that implement GWR:

1. GWR standalone program: http://www.st-andrews.ac.uk/geoinformatics/gwr/gwr-software/

2. spgwr package in R: http://cran.r-project.org/web/packages/spgwr/index.html

3. GWR SAS® macro: http://sas-for-gwglm.blogspot.com/

References

Brunsdon, C., A.S. Fotheringham, and M.E. Charlton. 1998. Spatial nonstationarity and autoregressive models. Environment and Planning A 30(6): 957–973.

Casagrande, S.S., M.C. Whitt-Glover, K.J. Lancaster, A.M. Odoms-Young, and T.L. Gary. 2009. Built environment and health behaviors among African Americans: A systematic review. American Journal of Preventive Medicine 36(2): 174–181.

Chen, V.Y.J., and T.C. Yang. 2012. SAS macro programs for geographically weighted generalized linear modeling with spatial point data: Applications to health research. Computer Methods and Programs in Biomedicine 107: 262–273.

Foody, G.M. 2003. Geographical weighting as a further refinement to regression modelling: An example focused on the NDVI-rainfall relationship. Remote Sensing of Environment 88(3): 283–293.

Fotheringham, A.S. 1997. Trends in quantitative methods I: Stressing the local. Progress in Human Geography 21(1): 88–96.

Fotheringham, A.S., C. Brunsdon, and M.E. Charlton. 2002. Geographically Weighted Regression: The Analysis of Spatially Varying Relationships. Wiley: New York.

Fotheringham, A.S., M.E. Charlton, and C. Brunsdon. 1998. Geographically weighted regression: A natural evolution of the expansion method for spatial data analysis. Environment and Planning A 30: 1905–1927.

Fotheringham, A.S., M.E. Charlton, and C. Brunsdon. 1997. Two techniques for exploring non-stationarity in geographical data. Geographical Systems 4: 59–82.

Matthews, S.A., A.V. Moudon, and M. Daniel. 2009. Work group II: Using geographic information systems for enhancing research relevant to policy on diet, physical activity, and weight. American Journal of Preventive Medicine 36(4, Supplement 1): S171–S176.