SNPs and haplotypes analyzed in previous studies were taken from the HapMap databases available at the time of study (http://www.hapmap.org). These predated the release of the more extensive and detailed DNA diversity information from complete genomic sequences now available through the 1000 Genomes Project (http://www.1000genomes.org), which in particular provides an enriched sampling of SNPs that are rare in Europe compared to the reported HapMap SNPs, and is therefore especially useful for our purposes. Accordingly, upon availability of the March 2010 release of the 1000 Genomes Project, we analyzed 119 whole genome sequences, of which 60 are of European origin (CEU) and 59 are of West African origin (YRI), yielding a total of 7,479 SNPs in the 1.55 Mbp chromosome 22 interval surrounding MYH9 and spanning nucleotide positions 34,000,000–35,550,000 (NCBI36). We then applied filtering criteria to identify candidates for further consideration and analysis based on (1) low allele frequency in CEU but not in YRI and (2) linkage disequilibrium (LD) patterns with the previously identified leading MYH9 risk variants (see supplementary material). Of the 250 variants that met these criteria, four are nonsynonymous coding-region mutations (Table 1), none of which was reported in HapMap. The first two SNPs (rs73885319 and rs60910145) are missense mutations (S342G and I384M) in the last exon of APOL1, the neighboring gene located 14 kbp 3′ downstream from MYH9. A third SNP (rs11089781) is a nonsense mutation (Q58X) in the APOL3 gene, located 110 kbp further 3′ downstream. The fourth SNP (rs56767103) is a missense mutation (R71C) in the gene FOXRED2, located 100 kbp upstream, on the 5′ side of MYH9 (Supplementary Fig. 1). Of note, the two APOL1 variants, located 128 bp apart, are in almost perfect LD (237 out of 238 chromosomes from the 1000 Genomes Project).
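The two filtering criteria described above can be illustrated with a minimal sketch. The actual thresholds are given only in the supplementary material, so the cut-offs below (`max_ceu`, `min_yri`, `min_r2`), the `Snp` record, and the example SNP names are hypothetical placeholders, not the values or data used in the study. The sketch also shows one common way to compute the LD statistic r² from the four haplotype counts of a pair of biallelic SNPs.

```python
from dataclasses import dataclass


@dataclass
class Snp:
    """Hypothetical per-SNP summary; field names are illustrative only."""
    name: str
    freq_ceu: float      # candidate-allele frequency in CEU
    freq_yri: float      # candidate-allele frequency in YRI
    r2_with_lead: float  # r^2 with a leading MYH9 risk variant


def ld_r2(n11: int, n10: int, n01: int, n00: int) -> float:
    """r^2 between two biallelic SNPs from haplotype counts.

    n11: both candidate alleles; n10: candidate allele at SNP 1 only;
    n01: candidate allele at SNP 2 only; n00: neither.
    """
    n = n11 + n10 + n01 + n00
    p1 = (n11 + n10) / n          # candidate-allele frequency at SNP 1
    p2 = (n11 + n01) / n          # candidate-allele frequency at SNP 2
    d = n11 / n - p1 * p2         # disequilibrium coefficient D
    denom = p1 * (1 - p1) * p2 * (1 - p2)
    if denom == 0:
        raise ValueError("r^2 is undefined when either SNP is monomorphic")
    return d * d / denom


def passes_filter(snp: Snp, max_ceu: float = 0.05,
                  min_yri: float = 0.10, min_r2: float = 0.2) -> bool:
    """Criterion (1): rare in CEU but not in YRI; criterion (2): in LD
    with a leading risk variant. All thresholds are assumed values."""
    return (snp.freq_ceu <= max_ceu
            and snp.freq_yri >= min_yri
            and snp.r2_with_lead >= min_r2)


# Hypothetical candidates, not data from the study.
candidates = [
    Snp("snp_a", freq_ceu=0.00, freq_yri=0.30, r2_with_lead=0.45),
    Snp("snp_b", freq_ceu=0.20, freq_yri=0.25, r2_with_lead=0.60),
    Snp("snp_c", freq_ceu=0.01, freq_yri=0.35, r2_with_lead=0.05),
]
kept = [s.name for s in candidates if passes_filter(s)]
```

In this sketch only `snp_a` survives: `snp_b` is too common in CEU and `snp_c` is not in appreciable LD with the lead variant. When two SNPs always occur together on the same chromosomes, `ld_r2` returns 1; a single discordant chromosome lowers it slightly.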

Table 1 Association with nondiabetic ESKD of nonsynonymous SNPs in APOL1, APOL3, and FOXRED2 in the MALD peak and comparison with leading MYH9 SNPs

These four variants were genotyped in a previously reported composite sample set of 955 subjects taken from two different populations, namely African American and Hispanic American cases and controls (Behar et al. 2010) (supplementary material). In this composite sample set, subjects with ESKD etiologies designated as MYH9-associated nephropathies as defined above were designated as cases (n = 430); this notably excludes diabetic nephropathy, which was not previously found to be associated with MYH9 (Behar et al. 2010; Kao et al. 2008; Kopp et al. 2008). Subjects without known kidney disease and with a creatinine concentration below 1.7 at age 55 or greater were designated as controls (n = 525). Associations of the candidate mutations with these ESKD phenotypes, previously attributed to MYH9, were determined using logistic regression, with correction for global and local ancestry, and considering three modes of inheritance as previously reported (Behar et al. 2010). It should be noted that the four variants were genotyped for the two population sources using two different techniques, each conducted in a separate laboratory, and with sequence validation of the genotyping method (Supplementary Fig. 6). We chose to use two different population sources, African American and Hispanic American, with markedly differing degrees of global African ancestry admixture (Bedoya et al. 2006; Behar et al. 2010), rather than two different collections from the same population, in order to test whether associations were robust to differing global background genomic composition.
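To make the "modes of inheritance" concrete, the sketch below shows how a genotype is typically recoded as a single regression covariate under each mode. The text does not name the three modes at this point, so the sketch assumes the conventional additive, dominant, and recessive codings; the function name `encode_genotype` is hypothetical, and the ancestry covariates used in the actual logistic regression are not modeled here.

```python
def encode_genotype(risk_allele_count: int, mode: str) -> int:
    """Recode a genotype (0, 1, or 2 risk alleles) as a covariate.

    Assumed conventional codings:
      additive  -> number of risk alleles (0, 1, 2)
      dominant  -> 1 if at least one risk allele, else 0
      recessive -> 1 only if both alleles are risk alleles, else 0
    """
    if risk_allele_count not in (0, 1, 2):
        raise ValueError("risk_allele_count must be 0, 1, or 2")
    if mode == "additive":
        return risk_allele_count
    if mode == "dominant":
        return int(risk_allele_count >= 1)
    if mode == "recessive":
        return int(risk_allele_count == 2)
    raise ValueError(f"unknown mode of inheritance: {mode!r}")


# Codings for the three possible genotypes under each assumed mode.
codings = {
    mode: [encode_genotype(g, mode) for g in (0, 1, 2)]
    for mode in ("additive", "dominant", "recessive")
}
```

Each coding would then enter the regression as the genotype term alongside the ancestry covariates, and the fits under the three codings can be compared to see which mode best describes the data.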

Table 1 shows the association results for the combined dataset of the African American and Hispanic American cohorts. For comparison, we include the results from the two previously reported leading MYH9 risk variants as noted above (S-1 rs5750250, F-1 rs11912763) (Nelson et al. 2010). We found that the APOL1 missense variants (rs73885319 and rs60910145) are more strongly associated with ESKD risk than the leading MYH9 risk variants, both in terms of OR and p values (Table 1). In contrast, the lower allele frequency and OR, with a higher p value, for the APOL3 nonsense variant, and the weak association for the FOXRED2 missense mutation, make these variants very unlikely candidates to explain the risk attributed to this genomic region. For the highly associated APOL1 missense mutations, the risk allele frequency in the African American control cohort is 21% in contrast to 37% in the cases (corresponding values for Hispanic Americans are 6 and 23%, respectively). The most striking difference is in the frequency of the homozygous risk state, with only 3% in controls compared to 18% in cases for African Americans (corresponding values for Hispanic Americans are 0.5 and 11%, respectively) (see also Supplementary Fig. 5). We also show that the results for combined and meta-analysis of the two separate cohort-based results are congruent (Supplementary Table 1). As also evident from Table 1 and Supplementary Table 1, the modes of inheritance for which the associations of the disease risk phenotype with the APOL1 missense mutations are significant are consistent with an additive effect, wherein carrying the missense mutations on a single parental allele is sufficient to confer significantly increased risk, but with a still greater jump in risk conferred by carrying the missense mutations on both parental alleles.
As noted above, the two missense mutations are in nearly perfect LD, and therefore, on a population genetics basis alone they can be considered as designating a “missense risk haplotype”. Examination of the predicted effect on protein structure and functional studies with artificial constructs which separate the two missense mutations will determine whether disease risk relates to either or both together as a functional missense haplotype. Therefore, for further analysis in the current study, we go on to use APOL1 SNP rs73885319 as tagging the “missense risk haplotype”. Analysis of deviance of the combined logistic regression indicates that LD with APOL1 SNP rs73885319 accounted for most of the statistical association previously attributed to the leading MYH9 variants with ESKD (supplementary material). In this regard, we also examined two noncoding variants in the APOL1 region which are in high LD with the APOL1 missense mutations, and as expected, both showed significant disease risk association (Supplementary Fig. 2; Supplementary Table 1).
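As a rough arithmetic check on the frequencies reported above for African Americans, crude odds ratios can be computed directly from the case and control percentages. This is only an illustrative sketch: these unadjusted values are not the ancestry-corrected logistic regression estimates in Table 1, and the helper `odds_ratio` is a hypothetical name introduced here.

```python
def odds_ratio(freq_cases: float, freq_controls: float) -> float:
    """Crude (unadjusted) odds ratio from two frequencies in (0, 1)."""
    for f in (freq_cases, freq_controls):
        if not 0.0 < f < 1.0:
            raise ValueError("frequencies must lie strictly between 0 and 1")
    odds_cases = freq_cases / (1.0 - freq_cases)
    odds_controls = freq_controls / (1.0 - freq_controls)
    return odds_cases / odds_controls


# African American frequencies quoted in the text.
or_allele = odds_ratio(0.37, 0.21)      # risk allele: 37% cases vs 21% controls
or_homozygous = odds_ratio(0.18, 0.03)  # homozygous risk state: 18% vs 3%
```

With these inputs the crude allelic odds ratio is about 2.2 and the crude odds ratio for the homozygous risk state is about 7.1, in line with the observation that the largest case-control contrast lies in the homozygous state.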

HIVAN has been considered the most prominent of the nondiabetic forms of kidney disease within what has been termed the MYH9-associated nephropathies (Kopp et al. 2008; Winkler et al. 2010). We have reported the absence of HIVAN in HIV-infected Ethiopians, and attributed this to host genomic factors (Behar et al. 2006). Therefore, we examined the allele frequencies of the APOL1 missense mutations in a sample set of 676 individuals from 12 African populations, including 304 individuals from four Ethiopian populations (Supplementary Table 2). We coupled this with the corresponding distributions for the African ancestry leading MYH9 S-1 and F-1 risk alleles. A pattern of reduced frequency of the APOL1 missense mutations and also of the MYH9 risk variants was noted in northeastern Africa, in contrast to most central, western, and southern African populations examined (Supplementary Fig. 3). Especially striking was the complete absence of the APOL1 missense mutations in Ethiopia. This combination of the reported lack of HIVAN and the observed absence of the APOL1 missense mutations is consistent with APOL1 being the functionally relevant gene for HIVAN risk and likely for the other forms of kidney disease previously associated with MYH9.

The APOL1 gene encodes apolipoprotein L-1, whose known activities include powerful trypanosome lysis (Vanhamme et al. 2003), autophagic cell death (Wan et al. 2008), lipid metabolism, cellular senescence, as well as vascular and other biological activities (Monajemi et al. 2002; Vanhollebeke and Pays 2006). Of note, in humans, APOL1 is one of six closely spaced and related APOL genes, which encode apolipoproteins L-1 through L-6, respectively (Duchateau et al. 1997, 2001). While APOL3 is thought to have arisen first in genomic evolution of the region, with the others having arisen as a result of duplication events (Monajemi et al. 2002), only apolipoprotein L-1 has a signal peptide, which enables it to be both a circulating and an intracellular protein (Vanhollebeke and Pays 2006). This latter capacity is of crucial importance to the protective and lytic activity of human serum against many species of Trypanosoma. Trypanosoma brucei rhodesiense, transmitted by tsetse flies, causes human African trypanosomiasis as a result of the expression of serum resistance associated protein (SRA), which interacts with the C-terminal domain of apolipoprotein L-1 and inactivates its lytic function (Lecordier et al. 2009). Apolipoprotein L-1 protein structure is divided into three distinctive structural and functional domains: (1) an anionic pore-forming domain, which is thought to be involved in organellar permeation and cell death, (2) a membrane-addressing domain consisting of a pH-sensitive hairpin bridging two alpha helices, which facilitates association with the circulating HDL particle at neutral pH and intracellular organellar localization at acidic pH, and (3) a C-terminus amphipathic alpha helix with a leucine zipper for protein–protein interaction (Vanhollebeke and Pays 2006).
It should be noted that the APOL1 S342G variant, powerfully associated with kidney disease risk in the current study, is predicted to modify the binding site of the C-terminus domain of the APOL1 gene product (Supplementary Fig. 4).

With respect to kidney disease risk, apolipoprotein L-1 is also prominently involved in autophagic pathways (Zhaorigetu et al. 2008), and a recent study has provided compelling evidence for the role of well-preserved autophagy in the integrity of renal glomerular podocytes (Hartleben et al. 2010). It is thus possible that variation in the C-terminus domain of endogenously expressed or endocytosed apolipoprotein L-1 modifies interaction with an as yet unidentified renal intracellular protein, which regulates the availability of apolipoprotein L-1 in its pore-forming or other functions. Given the numerous known functions of apolipoprotein L-1 noted above, a number of other mechanisms for kidney disease risk are possible, including those related to lipid metabolism or vascular integrity (Monajemi et al. 2002). The involvement of other classes of apolipoproteins in nephropathy has been well documented (Takemura et al. 1993). Moreover, apolipoproteins have also been identified as circulating inhibitors of glomerular proteinuria (Candiano et al. 2001). Remarkably, in this latter study, the amino acid sequence of the apolipoprotein L fraction isolated and studied corresponds exactly to what we now know to be apolipoprotein L-1.

Functional assays of the effect of the APOL1 missense variants described herein in appropriate experimental model systems will be needed to link the strong and biologically plausible association to a functional pathogenic pathway in kidney disease and to the possible selective factors which contributed to the observed African allele frequency distribution.

It is noteworthy that this entire region of the genome shows a high degree of LD and evidence of strong evolutionary selection, which may well explain why MYH9 yielded such a strong association signal in prior studies due to a hitchhiking effect of MYH9 with variants in the APOL gene family that have conferred adaptive advantage (Grossman et al. 2010; Smith and Malik 2009; Stephan et al. 2006). It should also be noted that the expression of apolipoproteins L in vascular and immune cells is greatly increased by viral infections, interferons, and inflammatory mediators (Vanhollebeke and Pays 2006), any or all of which might constitute the basis for a second “trigger” needed to induce the actual clinical manifestation of nephropathy in a genotypically “at risk” individual. Moreover, given the close evolutionary relationship of the six members of the human apolipoprotein L family and their manifold and overlapping functions, it is conceivable that additional rare or common variants might also interact and be involved in kidney disease risk (Supplementary Table 3).

The current findings strongly suggest that the intensive efforts (http://www3.niddk.nih.gov/fund/other/MYH9KidneyDisease/) currently underway to identify the causative risk variant for the ESKD phenotype in the chromosome 22 MALD peak should be extended beyond the MYH9 locus. In particular, emphasis should also be placed on APOL1 mutations that show strong association and functional plausibility.

Methods and supplementary information and any associated references are available on the Human Genetics website: http://www.springer.com/biomed/human+genetics/journal/439.