Introduction

Type II diabetes, or non-insulin-dependent diabetes mellitus [MIM 125853], has undergone an explosive increase in prevalence during the past two decades and is now considered one of the main threats to human health worldwide.1 The diabetes epidemic is primarily owing to the spread of a sedentary lifestyle and obesity, which become pronounced under modernization, urbanization and industrialization.2 Although this epidemic is apparent worldwide, it is most pronounced in some recently modernized traditional societies such as Native Americans,3 Pacific Islanders4 and Australian Aborigines.5

The prevalence of type II diabetes in several Pacific populations ranks among the highest in the world.4 For example, on the Pacific island of Nauru type II diabetes was virtually unknown 50 years ago but is now present in 40% of adults.6, 7 This is the second highest prevalence recorded in the literature, after that of the Pima Indians.3 This heightened susceptibility, however, is not equally severe across all Pacific populations and it has been suggested that the extreme susceptibility genotype was introduced by the Austronesian-speaking ancestors of present-day Polynesians,4, 8 who likely originated in Taiwan sometime before 3000 bp and reached the furthest islands of Polynesia by 800 bp.9 For example, Austronesian-speakers from Fiji,10 East New Britain11 and coastal Papua New Guinea12 have moderate to high susceptibilities to type II diabetes, whereas there is a notable absence of type II diabetes in both traditional living and partly urbanized non-Austronesians from highland Papua New Guinea8, 11 and the Solomon Islands.13 These latter groups are thought to be primarily derived from earlier migrations.14, 15, 16 In the Solomon Islands and New Caledonia, where Austronesian and non-Austronesian populations share the same environment, type II diabetes prevalence is higher in Austronesians than in non-Austronesians,17, 18 suggesting that the varying genetic susceptibility to type II diabetes in this region of the world may be ascribed to the relative contribution of Austronesian genetic admixture.

The loci that contribute to within-population phenotypic variation may be different from those that are responsible for between-population phenotypic variation and thus traditional within-population association and linkage analyses may have limited power to detect the loci that contribute to phenotypes that differ greatly between populations. For example, although variation in the MC1R gene is associated with skin pigmentation variation within Europeans,19, 20, 21 allele frequencies at this locus do not differ greatly between Europeans and other populations and it is, therefore, unlikely that this locus contributes to differences in skin pigmentation between Europeans and other populations.22 The alleles that underlie large between-population phenotypic differences are expected to show large frequency differences between populations. Thus, from a set of type II diabetes-associated SNPs, those at which the susceptibility allele is at unusually high frequency in Polynesians, compared to neighboring populations, are candidates to account for the high prevalence of type II diabetes in Polynesians. To identify such candidates, we genotyped and calculated Fst, a measure of genetic differentiation, for 10 type II diabetes-associated SNPs in three human populations (Polynesians, Han Chinese and highland Papua New Guineans) and compared these values to a distribution of Fst values from 100 000 SNPs genotyped in these same three populations.

Materials and methods

DNA samples included 23 Polynesians (nine Cook Islanders, eight Western Samoans, four Tongans and two Niue Islanders), 23 highland New Guineans and 19 Han Chinese; samples were collected with ethical approval from the participating institutions and DNA was extracted according to standard protocols. Samples from Polynesia and New Guinea are described elsewhere,14, 23 whereas the Han Chinese are from Beijing. Type II diabetes-associated alleles were defined as variants with replicated evidence for association with type II diabetes (in ABCC8, ADRB2, CAPN10 (2 SNPs), GYS1, IRS1, KCNJ11, PPARG, PPARGC1A, SLC2A1) as suggested by two meta-analyses.24, 25 The recently described variants in TCF7L2 were not typed because this discovery was made after genotyping had been initiated for this project.26, 27 Genomic regions surrounding the SNPs of interest were amplified by PCR and restriction enzyme analyses were used to detect alleles at each SNP site. For SNPs that did not present a natural restriction site, we employed PIRA PCR.28 For the SNP in GYS1 (rs8103451), all individuals were monomorphic for the allele that is not recognized by the restriction enzyme and this was confirmed by sequencing.

An unbiased estimator of Fst was calculated for each population pairwise comparison for each type II diabetes-associated SNP and for each SNP from the Affymetrix GeneChip® Human Mapping 100 K Set according to equation (1) in Weir and Cockerham.29 The genotyping of samples with the 100 K SNP set followed previously described methods30 and included genotype data for 116 197 SNPs. However, a SNP that is monomorphic in the two populations being compared receives an undefined Fst value according to Weir and Cockerham,29 and SNPs that contained genotype information from fewer than 50% of the individuals in a population were omitted in that population. Thus, the numbers of Fst values from the 100 K set for each population pairwise comparison were as follows: China-New Guinea=92783, China-Polynesia=93294, New Guinea-Polynesia=89012.
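As an illustration only, the single-locus Weir and Cockerham estimator for one biallelic SNP can be sketched as follows, using the standard variance components a, b and c with Fst estimated as a/(a+b+c). The function name, the 0/1/2 genotype coding and the choice to return None for a monomorphic SNP are assumptions made for this sketch and are not taken from the original analysis.

```python
from typing import Optional, Sequence


def weir_cockerham_fst(pops: Sequence[Sequence[int]]) -> Optional[float]:
    """Single-locus Weir & Cockerham (1984) Fst for one biallelic SNP.

    `pops` holds one genotype list per population; each genotype is the
    number of copies (0, 1 or 2) of an arbitrarily chosen allele.
    Returns None when the estimator is undefined (SNP monomorphic in all
    populations compared). This hypothetical helper ignores missing data.
    """
    r = len(pops)
    sizes = [len(g) for g in pops]
    if r < 2 or min(sizes) < 2:
        raise ValueError("need at least two populations of two or more individuals")

    # Per-population allele frequency and observed heterozygote frequency.
    freqs = [sum(g) / (2 * len(g)) for g in pops]
    hets = [sum(1 for x in g if x == 1) / len(g) for g in pops]

    n_total = sum(sizes)
    n_bar = n_total / r
    n_c = (n_total - sum(n * n for n in sizes) / n_total) / (r - 1)
    p_bar = sum(n * p for n, p in zip(sizes, freqs)) / n_total
    h_bar = sum(n * h for n, h in zip(sizes, hets)) / n_total
    s2 = sum(n * (p - p_bar) ** 2 for n, p in zip(sizes, freqs)) / ((r - 1) * n_bar)

    if p_bar in (0.0, 1.0):
        return None  # monomorphic in every population: a + b + c = 0

    # Variance components: among populations (a), among individuals
    # within populations (b), and within individuals (c).
    pq = p_bar * (1 - p_bar)
    a = (n_bar / n_c) * (s2 - (pq - (r - 1) / r * s2 - h_bar / 4) / (n_bar - 1))
    b = (n_bar / (n_bar - 1)) * (pq - (r - 1) / r * s2 - (2 * n_bar - 1) / (4 * n_bar) * h_bar)
    c = h_bar / 2

    denom = a + b + c
    return None if denom == 0 else a / denom
```

Two populations fixed for alternative alleles give an estimate of 1, whereas populations with identical allele frequencies can give slightly negative values, a known property of this unbiased estimator.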

To determine whether our geography-based assignment of samples to populations reflects the underlying genetic structure of the samples, we examined population structure using STRUCTURE v 2.1.31 Owing to program limitations, 10 000 SNPs were randomly selected for the analyses. Two groups were assumed under the admixture model without any prior population assignment; 10 000 burn-in cycles and 50 000 replicates were used in each run. All runs were performed under the λ=1 option and repeated five times. The ln of the probability of observing the data was −335837.0 for K=2. Chinese and New Guinea populations were clearly differentiated whereas the Polynesian population appeared as intermediate between them. The 95% credible intervals of the admixture proportions overlap in all Polynesian individuals. However, we observed significantly higher Chinese ancestry (13%) in one New Guinea individual when compared to other New Guineans (0.2%) and this individual was therefore excluded from further analyses (data not shown).

The samples from the 100 K set were identical to the samples typed for the type II diabetes-associated SNPs, except that seven Cook Islanders and six Han Chinese from the 100 K set were replaced by seven different Cook Islanders and five different Han Chinese for the diabetes SNP typing. Results were similar and conclusions unchanged when analyses were conducted on the subset of samples typed for both the 100 K set and the diabetes-associated SNPs (data not shown). The type II diabetes-associated SNP in PPARG was not typed by RFLP because it was included in the 100 K SNP set. Empirical P-values were generated by comparing the observed value to the Fst distribution from the 100 K set. To correct for multiple comparisons, for each population pairwise comparison we randomly sampled 10 SNPs 10 000 times from the 100 K set and generated corrected P-values by counting the number of times out of 10 000 that one or more of the 10 Fst values was equal to or greater than the observed Fst value for each diabetes-associated SNP.
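The empirical and corrected P-values described above can be sketched as follows. This is a minimal illustration rather than the code used for the analysis: the function names, the use of sampling without replacement and the seeded random number generator are assumptions.

```python
import random
from typing import Optional, Sequence


def empirical_p(observed: float, background: Sequence[float]) -> float:
    """Fraction of background Fst values equal to or greater than `observed`."""
    hits = sum(1 for f in background if f >= observed)
    return hits / len(background)


def corrected_p(
    observed: float,
    background: Sequence[float],
    n_snps: int = 10,
    n_draws: int = 10_000,
    seed: Optional[int] = None,
) -> float:
    """Resampling correction for testing `n_snps` candidate SNPs.

    Draws `n_snps` background Fst values `n_draws` times (without
    replacement within a draw, an assumption of this sketch) and returns
    the fraction of draws in which at least one value is equal to or
    greater than `observed`.
    """
    rng = random.Random(seed)
    pool = list(background)
    hits = 0
    for _ in range(n_draws):
        if max(rng.sample(pool, n_snps)) >= observed:
            hits += 1
    return hits / n_draws
```

In use, `background` would hold the genome-wide Fst values for one population pairwise comparison and `observed` the Fst value of one diabetes-associated SNP in that same comparison.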

Results and discussion

The sites of interest, the primers and the restriction enzymes are listed in Table 1. None of the 10 type II diabetes-associated SNPs was significantly out of Hardy-Weinberg equilibrium (data not shown). The frequencies of the type II diabetes-susceptibility alleles in the Chinese, New Guineans and Polynesians and the Fst values for each population pairwise comparison are provided in Table 2.

Table 1 Information on the nine type II diabetes-associated SNPs typed in the present study
Table 2 Frequencies in the three populations and population pairwise Fst values for 10 type II diabetes-associated SNPs. Fst values of particular interest are in bold with P-values in parentheses

The susceptibility allele at SLC2A1 is at high frequency in Polynesians when compared to the Chinese (Fst=0.254, P=0.052), but not when compared to the New Guineans (Fst=0.059, P=0.500). The prevalence of type II diabetes in China is presently at 6–7%32, 33 and with increasing economic development the prevalence is on the rise.34, 35, 36 However, it is not known whether, under similar environmental conditions, type II diabetes prevalence in the Chinese is substantially lower than the prevalence in Polynesians. Thus, it remains unclear whether the high frequency of the SLC2A1 susceptibility allele contributes to the high prevalence of diabetes in Polynesians. The similar frequency of this allele between New Guineans and Polynesians could reflect genetic contributions from the former to the latter, as has been hypothesized to have occurred during the migrations of the Polynesian ancestors along the northern coast of New Guinea.37, 38

The PPARGC1A SNP exhibits the largest allele frequency difference between the populations in this study (Table 2). Especially striking is the high frequency of the susceptibility allele in Polynesians (0.72), compared to its complete absence in the New Guineans. Figure 1 shows the Fst values of the PPARGC1A SNP compared to the empirical distribution from the 100 K SNP set for all three population pairwise comparisons. In the Polynesia – New Guinea comparison, the Fst value for the PPARGC1A SNP is 0.703, which lies in the top 0.7% of the empirical Fst distribution: only 608 out of the 89012 SNPs from the 100 K set have greater Fst values. After correcting for multiple comparisons, this observation remains unusual (P=0.0655). Compared to the frequency in the Chinese (0.37), the frequency of the PPARGC1A susceptibility allele is relatively high in Polynesians: the Polynesia – China Fst is 0.194 (P=0.091). The Fst value for the China – New Guinea comparison is 0.369 (P=0.115). Although the Chinese have a frequency of the PPARGC1A susceptibility allele that is intermediate between the New Guineans and Polynesians, it is unclear how this may relate to the susceptibility to type II diabetes in the Chinese because comparisons between studies of the prevalence of type II diabetes in different populations under different environmental conditions are difficult to interpret. Nevertheless, as Polynesians have a much higher prevalence of type II diabetes than New Guineans living under similar environmental conditions, the unusually large difference in frequency between these two groups for the type II diabetes-susceptibility allele in the PPARGC1A gene supports the notion that this SNP, or SNPs in close LD with this SNP, play an important role in type II diabetes etiology in this area of the world.
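As a rough consistency check on the Polynesia – New Guinea figures, the rank fraction and an analytical approximation to the corrected P-value can be computed directly. The approximation assumes 10 independent draws from the empirical distribution and ignores ties, so it is not the resampling procedure itself; the variable names are illustrative.

```python
# Counts reported for the Polynesia - New Guinea comparison.
n_greater = 608      # 100 K SNPs with a greater Fst than the PPARGC1A SNP
n_total = 89012      # Fst values available for this comparison
n_candidates = 10    # diabetes-associated SNPs tested

# Fraction of the empirical distribution above the observed value.
rank_fraction = n_greater / n_total

# Probability that at least one of 10 independent random SNPs exceeds
# the observed value (assumption: independent draws, no ties).
p_any = 1 - (1 - rank_fraction) ** n_candidates

print(f"rank fraction: {rank_fraction:.4f}")
print(f"approximate corrected P: {p_any:.4f}")
```

The rank fraction is just under 0.7% and the approximation gives a value near 0.066, in line with the resampling-based P=0.0655.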

Figure 1 Fst distributions for each population pairwise comparison generated from the 100 K SNP set. The Fst value of the PPARGC1A susceptibility allele in each comparison is indicated by a dotted line and is shown in a box along with the respective P-value.

PPARGC1A, or peroxisome proliferator-activated receptor-γ coactivator 1α, is a transcriptional co-activator that regulates the transcription of genes involved in adaptive thermogenesis, adipogenesis and oxidative metabolism39 and also regulates hepatic glucose output through the control of gluconeogenesis.40, 41 The susceptibility allele at PPARGC1A changes a glycine to serine at codon 482 and transfection assays have demonstrated that this substitution affects the protein's efficiency as a coactivator on the Tfam promoter, which may result in altered mitochondrial function and insulin resistance.42

Linkage analyses have also identified the PPARGC1A genomic region as a candidate locus for type II diabetes in the Pima Indians.43 However, as the frequency of the susceptibility allele in the Pima is only 0.18,44 which is lower than in Europeans (0.37),45 it is unlikely that this allele accounts for the observed high prevalence of type II diabetes in the Pima compared to European-Americans.3

It has previously been suggested that the Polynesians underwent strong selection pressures for energetic efficiency during their settlement of the Pacific, which required long open ocean voyages in the face of cold stress and starvation.46, 47, 48, 49 According to the thrifty gene hypothesis,50 such conditions may have led to positive selection for a thrifty metabolism in Polynesians and driven type II diabetes-susceptibility alleles to high frequency. Thus, past positive selection may at least partially explain the high prevalence of type II diabetes among contemporary Polynesian populations when compared to their neighbors in New Guinea without the heightened susceptibility. Several studies have demonstrated that a useful approach to detecting the signature of positive selection in populations under different selective pressures is to identify SNPs with high Fst values.51, 52, 53, 54 Thus, the unusually high frequency of the PPARGC1A susceptibility allele in Polynesians may reflect past positive selection.

Regardless of whether drift or selection is responsible for the high frequency of this allele in Polynesians, the fact remains that Polynesians exhibit both an unusually high prevalence of type II diabetes and an unusually high frequency of a known type II diabetes susceptibility allele. Together, these two observations indicate that the PPARGC1A susceptibility allele is a strong candidate for explaining the high prevalence of type II diabetes in Polynesians and merits further investigation. Moreover, our results indicate that searching in candidate genes for alleles that exhibit large frequency differences between populations is a useful approach for identifying potential candidates for the genetic basis of phenotypic traits that vary greatly between populations.