1 Background

According to the March of Dimes Birth Defects Foundation, a birth defect is any functional or structural anomaly, whether inherited or acquired, that presents in infancy or later in life and is induced by events preceding birth. Varying from minor cosmetic irregularities to life-threatening disorders, birth defects are the major cause of infant mortality and a leading cause of disability (Carmona 2005). Neural tube defects (NTDs) are one of the most common forms of birth defects, often occurring between the 3rd and 4th weeks of gestation. They result in structural defects that can occur anywhere along the neuroaxis from the developing brain to the sacrum and often result in the exposure of neural tissue (Frey and Hauser 2003).

Birth defects have a substantial impact on public health in terms of mortality, morbidity, disability, and health care costs. Fortunately, they can often be prevented or their effects minimized with early intervention. Yet the etiology of the vast majority of birth defects, specifically NTDs, is still unknown. Advances in geographical information systems (GIS) technology and risk assessment methods now provide opportunities for people to more easily analyze spatial relationships and disease risk factors to facilitate policy planning and implementation (Wiwanitkit 2008; Canales and Leckie 2007). It can be used to visualize spatial patterns in the geographical distribution of disease, usually for explorative or descriptive purposes, and can provide information for further research. It can also be used to gain important clues about the etiology of a disease (Sankoh et al. 2002).

Numerous risk assessment methods of birth defects from simple to complicated ones have been proposed (Wang et al. 2010). Initially an epidemiological measure is often used as a measure of relative risk, and in particular the standardized morbidity ratio. However, disease rates can not be calculated when birth data are missing or no births have occurred in many of the geographical areas. Some methods of mapping relative risk are based on regression models of relative risk and use information about geographical locations and established risk factors. However, spatial epidemiological investigations of NTDs were mostly exploratory, with limited knowledge about the putative risk factors. Moreover, often the primary motivation of the analysis is to identify unknown risk factors of NTDs that may vary by geographical location (Berke 2005). Geostatistical smoothing methods such as Kriging (Carrat and Valleron 1992) have also been applied to interpolating and predicting continuous surfaces of disease risks. But those methods do not involve mechanisms that account for the relevant physical laws, and because they are not based on deductively valid principles they often lead to nonsensical inferences and uninformative maps (Christakos et al. 2002). Some intelligent algorithms such as rough set theory also were used to explore spatial decision rules in NTDs and identify the NTDs risk in villages yet unseen (Bai et al. 2010). Because of the limitation of the algorithms, there were probably undefined cases in the identified result. In addition, the result is only the ordinal data about risk levels. It is therefore important to understand the hypotheses and theory behind alternative methods and their applicability to mapping disease risk before applying these methods in real-world situations.

Hierarchical Bayesian model (Wu et al. 2004; Ismaila et al. 2007; Hemmi 2008) and spatial filtering method (Rushton and Lolonis 1996; Liao et al. 2009) are two of the most commonly used methods for assessing risk of NTDs. Hierarchical Bayesian model involves a multiple, iterative analysis and produces parameter estimates for each individual unit of analysis by “borrowing” information from all other units (Li et al. 2006a). Spatial filtering can be used to estimate unknown risk or relative risk without making distributional assumptions (Anselin et al. 2006). In this method, the risk in a given location is determined by computing the crude disease rate of a series of circulars that cover the study area.

The purpose of the present study was to compare and contrast the assumptions and performance of these two methods and to make recommendations as to which method to use and when, and how best to interpret the outcome. Heshun county which has the highest rate of NTDs in China, was selected as the region of interest. Both methods were used to produce a risk map of NTDs for rural region of Heshun for 1998–2001. The results indicated that the methods differed in their ability to map the spatial relative risk of NTDs. The risk map from hierarchical Bayesian model did not remove all the random spatial noise in the rude disease rate, whereas that from the spatial filtering method smoothed the crude rate and revealed some spatial clusters of NTDs in Heshun.

2 Methods

2.1 Study site

China has a high occurrence of NTDs, which account for up to one-third of stillbirths and a quarter to a third of neonatal deaths (Li et al. 2006b). Data collected from a hospital-based surveillance system revealed a prevalence of NTDs of approximately 27.4 per 10,000 births in 1987. Shanxi province in the north has the highest rate of NTDs: 105.5 per 10,000 births in 1987 and 60.88 per 10,000 births in 1996–2002 (Gu et al. 2007). Thus, we selected Heshun, a county in Shanxi province, as the study area. Heshun is also one of the pilot regions for the National Birth Defect Intervention Project launched by the State Family Planning Commission.

Heshun lies in the Tai Hang Mountain Region at 37°03′E and 113°05′N (see Fig. 1). It consists of 326 administrative villages and has an area of 2,250 km2. Most resident are farmers whose living environments seldom change, and there has been no large-scale movement of inhabitants in the history of the region. Most types of birth defects designated by the World Health Organization can be found in Heshun, and NTDs predominate (Wu et al. 2004). During 1998–2001, 97 of the 3,706 births in rural Heshun involved NTDs. The inherited and congenital causes are similar among the NTDs in this region, and these factors explain only a small fraction of all NTDs.

Fig. 1
figure 1

Heshun

2.2 Data sources

The present analysis included all live births and stillbirths occurring in Heshun from January 1, 1998, to December 31, 2001. Births occurred at the hospital or at home, and mothers were residents of the county during that time period. Also included were all therapeutic abortions performed on residents of the area whose estimated delivery date fell within the time period of interest. All NTD cases, regardless of pregnancy outcome, were verified by doctors in the hospital. Records of NTD cases were collected from the local family planning department. NTDs included anencephaly, spina bifida, encephalocele, holoprosencephaly, and hydrocephalus, among others.

The local planning department declined to provide identifiers to link substantiated NTD cases to births, so we were unable to conduct the study at the individual level. Instead, we aggregated the NTD cases by villages of mother’s residence. The 326 villages were located using ArcGIS 9.2i. Because there were no births in 24 villages during the study period, we changed the denominator in the NTD ratio to reflect the overall population and not births. In 2001, the total population of the 326 villages was 104,655.

2.3 Hierarchical Bayesian model

Hierarchical Bayesian model was originally developed for use in pattern recognition but it has since been used in ecological studies and in spatial epidemiology for mapping disease. Unlike other Bayesian model, hierarchical Bayesian model considers spatial dependence. The estimate for the parameter of interest for any given sub-area in the study region in the region is obtained by “borrowing” strength from the other sub-areas in the region. In addition, observed outcomes are modeled as conditional on a set of parameters that are themselves given a probabilistic in terms of other parameters—hence the term hierarchical (Haining 2003).

We used hierarchical Bayesian model and WinBUGS software to calculate the ratio of NTDs. In the model, the observed number of NTD cases in each village was treated as a binomial random variable with parameter P i . P i was the probability of a live birth in village i having a birth defect. The standard rate (i.e., the number of observed birth defect cases divided by all births) was the maximum likelihood estimate of P i . As environmental conditions and socioeconomic status were similar, P i was assumed to be constant within a village (Wu et al. 2004). The model decomposed the log odds of P i into an intercept μ and two heterogeneity parameters, one displaying spatial structure (v i ) and the other displaying unstructured heterogeneity (ε i ) a priori:

$$ \log it\left( {P_{i} } \right) = \log \left[ {{{P_{i} } \mathord{\left/ {\vphantom {{P_{i} } {\left( {1.0 - P_{i} } \right)}}} \right. \kern-\nulldelimiterspace} {\left( {1.0 - P_{i} } \right)}}} \right] = \mu + v_{i} + \varepsilon_{i} $$
(1)

The spatially structured parameter v i was an intrinsic Gaussian autoregression (Haining 2003). More specifically, the conditional distribution of v i , given v j , i ≠ j, was assumed to be normal, with mean equal to the average of the v j in adjacent units and variance equal to an unknown variance parameter σ 2 v divided by the number of adjacent regions of unit i. Spatial units were considered adjacent if they shared a border or corner. The unstructured effect ε i was assumed to be a prior independent with mean 0 and variance σ 2 ε . For both the spatial and the non-spatial heterogeneity variance parameters σ 2 ε and σ 2 v , the model used commonly applied inverse gamma priors (Besag et al. 1995) with a high prior massed on very small values, but highly dispersed. Noted that σ 2 v was a spatial smoothing parameter, determining the variability of the spatial heterogeneity component v i (Staubach et al. 2002). In this study, we assumed that the hyperparameters σ 2 ε and σ 2 v were independently distributed a priori as gamma distributions: σ 2 ε  ∼ Gamma (0.5, 0.0005), σ 2 v  ∼ Gamma (0.5, 0.0005).

A computational challenge in applying a Bayesian model is that, for most realistic problems, the integrations required to make inferences are generally not tractable in closed form and thus must be approximated numerically (Short et al. 2002). Here, a modern Monte Carlo Markov Chain (MCMC) method based on Gibbs sampling was used to obtain the marginal posterior distributions for the parameters without analytical approximations. Before performing statistical inference in WinBUGS, we input the model, specified the data and initialized values for the MCMC method. And then the parameters needed to make inferences were set as monitoring parameters in WinBUGS. The inference process was based on a sample of 1,000,000 values after a burn-in of 10,000 iterations from a single chain. For each parameter, we calculated the posterior mean, standard deviation, Monte Carlo error, median, 2.5th percentile, and 97.5th percentile (with the last two items being the 95% credible interval). The posterior mean of a parameter value would be a sensible point estimate of this parameter.

2.4 Spatial filtering method

Spatial filtering method, which uses non-parametric statistical techniques for exploratory spatial data analysis tool, can estimate unknown risk or relative risk without making distributional assumptions (Anselin et al. 2006). This method has been used to study spatial interpolation and clusters of congenital anomalies, infant mortality, and other forms of birth morbidity (Esra et al. 2005; Rushton and Lolonis 1996; Talbot et al. 2000). In the present study, a spatial filtering method was used to map spatial variations in NTD morbidity.

The statistical software DMAP (www.uiowa.edu/~gishlth/DMAP4/) was used in the analysis; it is a publicly available software package that uses the Rushton spatial filtering method. This method first creates a series of spatial filter circles centered on point locations on a regular grid that covers the study area (see Fig. 2). These circles represent the area from which the estimate of the NTD rate is made. For each of the circles, the rate is computed as the crude rate of NTD events to all births or, alternatively, as the ratio of cases to controls. After this rate is estimated for the entire grid of points, the NTD rate can be interpolated as a continuous spatial distribution. Neighboring grid points share overlapping circular patterns and thus observations. Isarithmic maps with a constant range of values were constructed in GIS after the estimated rates were assigned to grid points. Given the births at these point locations, the Monte Carlo method was then used to test how likely the spatial distribution of NTDs was due to chance alone. Areas with significantly high NTD rates were regarded as hot spots.

Fig. 2
figure 2

Regular lattice grid and spatial filter areas used to measure NTD rates in Heshun

DMAP requires four files for input: the grid, numerator (case) events, denominator (all births/controls) events and event probability. The grid file was automatically created using the endpoints of the rectangle and the distance between each grid point. The numerator events file was created from relative information from villages with NTD cases during the study period. Similarly, the denominator events file was created from information from all villages in Heshun. In addition, the maximum window radius needed to be input during the course of analysis. DMAP provided information on NTD rate and the statistical significance of the observed rates in each grid. Grids whose significance was greater than a specified value were selected as hot spots.

In this study, by matching addresses from birth records to points in the village, we were able to compute the NTD rate for each location on a grid covering the entire Heshun county at a resolution of approximately 4.8 km. As shown in Fig. 2, the 4.8-km interval between grid intersections resulted in 204 grid points. In order to make general conclusions about the results of spatial filtering, we selected 4 km as the filter size. Then we estimated meaningful NTD rates for grid points that had at least 40 persons.

For each village, we generated a random number from a uniform distribution in the range of 1–999. For each of the 204 grid points, 999 Monte Carlo simulations were made. The simulated and real rates were then rank-ordered. The percentage of the simulated rates at each grid point that were less than the observed rate for that same grid point were computed and the levels of statistical significance portrayed as isolines. Because testing the rates against 999 simulations is a form of exploratory spatial analysis, the methods for representing the results are discretionary, and the investigator can adjust the results based on the level of significance.

3 Results

3.1 Risk map from hierarchical Bayesian model

The distribution of NTD risk as revealed by hierarchical Bayesian model (see Fig. 3b) was basically consistent with the real distribution of NTDs in rural Heshun. High-incidence areas were located mainly in the middle and eastern areas of the county. However, there were no zero incidences after rates were adjusted.

Fig. 3
figure 3

Crude NTDs rate map (a) and relative risk maps from hierarchical Bayesian model (b) and spatial filtering method (c, d)

3.2 Risk map from spatial filtering method

Figure 3c shows the NTDs risk map obtained using spatial filtering method. Figure 3d shows the statistical significance of the adjusted NTDs rate in each village. Several clusters of NTDs are obvious.

4 Discussion

The crude NTD rate for a village is an unbiased estimator of the positive area-specific relative risk of NTDs in that village. However, because the crude rate treats each village independently, its precision depends on the size of the at-risk population in that village. Thus, the interpretation of maps is difficult when different villages have very different population sizes (Haining 2003). Villages with smaller populations, and correspondingly fewer births, have larger variances for calculating the ratio of NTD occurrence. Simply dividing the number of NTDs by the number of all population may cause an error in risk mapping (Wu et al. 2004). A better estimator would be one with a more uniform—and preferably higher—level of precision across all villages (Haining 2003).

Hierarchical Bayesian model assumes that risks factors for a disease are random variables and that each risk factor can be drawn from a prior probability distribution. It divides the random effects of relative risk into spatially structured and unstructured random variations (Mollie 1995). These two types of random effects can be used to “stabilize” maps of relative risk by “borrowing” strength from estimates elsewhere on the map. The information from other sub-areas on the map is formalized as specified by the prior distribution. Thus, a risk map from hierarchical Bayesian model considers both the spatial structure of the data and the uncertainty about disease prevalence. The map in Fig. 3b implies a rather strong smoothing effect of the observed NTDs rate.

MCMC method based on Gibbs sampling was used to obtain the marginal posterior distributions for the parameters. This inference method produces an ensemble of (statistically dependent) realizations from the target posterior distribution. It relies on simulating from a Markov chain that has been carefully designed so that its stationary distribution coincides with the target posterior distribution. It follows that, after a burn-in, the generated realizations of the chain comprise draws from the posterior distribution (Cressie et al. 2007). Its advantages include ease of implementation and flexibility. It can avoid numerical integration, a hurdle for inference in random-effects models. It may also greatly improve estimator precision.

Hierarchical Bayesian model has two primary advantages over other methods: (1) It coherently manages uncertainty in the framework of probability theory, and (2) it makes explicit the assumptions on which the model relies (Demichelis et al. 2006). However, much expertise is needed to determine the distribution of hyperparameters in the model as this affects the accuracy of the adjusted rates. Moreover, the type of hierarchical Bayesian model used in the present study cannot be used to predict relative risk in areas with no data.

Unlike hierarchical Bayesian model, spatial filtering method does not make any distributional assumptions. It presumes that the disease rate of the whole study area is continuously distributed in space. The input data are simply the number and location of cases and the population. Spatial filtering method is comparatively easy to perform. It involves first creating a series of circular moving windows or disks centered on point locations on a regular grid that covers the study area. For each of the disks, the rate is computed as the crude rate of events to population or, alternatively, as the ratio of cases to controls. The resulting values of each grid point are then represented in smooth form by means of contour plots, a three-dimensional surface, or similar techniques (Rushton and Lolonis 1996). Moreover, the significance levels of each adjusted NTD rate can be obtained using MCMC methods.

Spatial filtering method has three main advantages: (1) For diseases whose etiology is imperfectly understood, it incorporates characteristics of individual cases in a GIS database so that expert investigators can “walk the street” in a GIS environment and examine the characteristics of cases that occur in close geographical proximity; (2) it is suitable for mapping rare diseases such as birth defects because it can evaluate the risk of a specified grid based on its neighbors; and (3) the isarithmic maps of disease rates together with spatial analyses of their patterns, can provide valuable decision-making clues for further investigations. The grid interval and filter size are particularly relevant to computing birth defect rates. The window radius should not be so small that no births will be counted, but it should be large enough to ensure the sharing of observations between neighboring grid locations (Rushton and Lolonis 1996).

However, spatial filtering method is prone to over-smoothing and can hide meaningful spatial heterogeneity. The relative risk maps from both hierarchical Bayesian model and spatial filtering method revealed NTD hot spots in the middle and east regions of Heshun. Although the regional risk levels in the hierarchical Bayesian model map are obvious, the map did not remove all the random spatial noise in the rude disease rate. In contrast, the spatial filtering method map does not show the differences in relative risk among most villages, especially those villages in the middle region. Some spatial clusters of NTDs appeared on the map. Thus, spatial filtering method may be used to detect clusters of events.

5 Conclusion

Mapping incidence rates or mortality rates (relative risks) from disease is of prime importance in the field of epidemiology. The usual procedure is to map the crude disease rate across different geographical regions. But the crude rate may not be meaningful, particularly for small areas, as it does not take into account spatial patterns or the high variability for different population sizes (Maiti 1998). In this study, hierarchical Bayesian model and spatial filtering method were used to map the relative risk of NTDs in Heshun. The study had some limitations: It did not measure the uncertainty associated with the estimates of relative risk, nor did it compare the efficiency of the two models in multiple contexts. Despite these limitations, our study provides useful information about using hierarchical Bayesian modeling and spatial filtering to map relative risk of disease. We conclude that different factors, such as distributional assumption of relative risk and the target of the risk assessment, should be taken into consideration when choosing which method to use.