Introduction

The major histocompatibility complex (MHC) represents a multigene family that plays a crucial role in the generation of adaptive immune responses in vertebrate species. A key feature of the system is that some of its genes display abundant polymorphism at the population level. In addition, the number of Mhc class I or II genes may differ significantly between species, as well as between individuals of a species (Kelley et al. 2005). MHC polymorphisms may have a profound impact on several features such as disease susceptibility, organ transplantation, and reproductive success (Lagaaij et al. 1989; Goulder and Watkins 2004; Bontrop and Watkins 2005; Ziegler et al. 2005; Smith et al. 2006). The MHC systems of various primate species, including humans, have been studied extensively (Watkins 1995; Antunes et al. 1998; Bontrop et al. 1999; Adams and Parham 2001; de Groot et al. 2002; Lafont et al. 2004; Middleton et al. 2004; Marsh et al. 2005; Penedo et al. 2005; Abbott et al. 2006; Huchard et al. 2006). For example, the MHC of the rhesus macaque (MhcMamu), an Old World primate species, has been shown to share many similarities with the human leukocyte antigen (HLA) system. The evolutionary orthologs of the HLA-DP, -DQ, and -DR genes are also present in rhesus monkeys, and as in humans, these loci are polymorphic (Bontrop et al. 1999). The number of Mamu-DRB region configurations appears to be expanded in comparison with humans, and some of these regions seem to harbor an extended number of genes (Doxiadis et al. 2000). Subsequent cDNA studies have illustrated, however, that in humans and rhesus macaques, comparable numbers of DRB genes are transcribed (de Groot et al. 2004).

Apart from Mamu-E (Knapp et al. 1998), -F (Otting and Bontrop 1993), -G (Boyson et al. 1996a; Castro et al. 1996), and -AG (Boyson et al. 1997), rhesus macaques also possess a B-like sequence, designated Mamu-I, which appears to be present in each haplotype (Urvater et al. 2000). As observed in humans, these nonclassical genes display low levels of polymorphism. An ortholog of the HLA-C gene appears to be absent in the rhesus macaque, whereas evolutionary equivalents of the classical HLA-A and -B genes have been described (Boyson et al. 1996b). The Mamu-A and -B genes have been subjected to several rounds of duplication as was shown by genomic sequencing (Daza-Vamenta et al. 2004; Kulski et al. 2004). Analysis of a panel of rhesus macaques, mainly originating from the Indian subcontinent, illustrated that the number and combination of Mamu-A and -B genes that are expressed per haplotype may differ at the population level (Otting et al. 2005). In the same study, marked differences in expression levels were shown. At this stage it is not known whether rhesus macaques originating from other geographic areas have unique Mhc class I alleles and/or region configurations. Comparative studies have illustrated that many Mhc loci and lineages predate speciation events. The sharing of Mhc alleles between two primate species seems to be a rare event, and only a few cases have been documented (Cooper et al. 1998; Evans et al. 1998). An exception is provided by rhesus and cynomolgus macaques, which seem to share a high number of Mhc class II alleles as was defined by exon 2 sequencing (Blancher et al. 2006; Doxiadis et al. 2006). Whether this sharing is because of introgression or purifying selection remains to be elucidated. Thus far, about 50 Mafa-A sequences have been published for the cynomolgus macaque (Uda et al. 2004; Krebs et al. 2005). The absence of pedigreed material did not allow us to define Mafa-A loci or region configurations. For that reason, a panel of pedigreed cynomolgus macaques was incorporated in the present study. Comparison with rhesus macaque Mhc class I sequences obtained from different populations enabled us to draw conclusions on the evolutionary stability of A region variations.

Materials and methods

Animals and cell lines

The Biomedical Primate Research Centre houses a self-sustaining colony of approximately 1,000 rhesus macaques, mainly of Indian origin, that have been pedigreed based on the segregation of serologically defined MHC allotypes (Bontrop et al. 1999; Penedo et al. 2005; Doxiadis et al. 2006). Furthermore, a large collection of DNA samples as well as B-cell lines is available. In this study, B-cells derived from rhesus macaques of Chinese origin and from cynomolgus monkeys were used to isolate RNA. Most cynomolgus monkey samples originate from a pedigreed group housed at the campus of the University of Utrecht.

cDNA cloning and sequencing

RNA was isolated from B cells (RNeasy kit, Qiagen) and subjected to one-step reverse transcriptase polymerase chain reaction (RT-PCR), as recommended by the supplier (Qiagen). The primers used, (5′MAS) AATTCATGGCGCCCCGAACCCTCCTCCTGG and (3′MAS) CTAGACCACACAAGGCGGCTGTCTCAC, are specific for class I A transcripts in macaques. The final elongation step was extended to 30 min to generate a 3′dA overhang. The RT-PCR products were cloned using the InsT/Aclone kit (Fermentas). After transformation, 16 to 32 colonies were picked for plasmid isolations. The separate Mamu-A loci show differences in expression levels, as indicated by the numbers of clones picked from each animal. The alleles with high expression levels (majors) may be associated with serotypes and thus involved in classical antigen presentation, whereas those with low expression levels (minors) are considered nonclassicals and may exhibit more specialized types of functions (Otting et al. 2005).
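As a rough illustration of how clone counts can serve as a proxy for relative expression, the following Python sketch tallies the sequences recovered from one animal and labels each as major or minor. The function names, the placeholder sequence labels, and in particular the 25% cutoff are assumptions made for this example; the study does not state a numerical threshold.

```python
from collections import Counter


def clone_fractions(clone_calls):
    """Return {sequence label: fraction of picked clones} for one animal."""
    counts = Counter(clone_calls)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def label_by_abundance(clone_calls, major_cutoff=0.25):
    """Label each sequence 'major' or 'minor' by its share of clones.

    major_cutoff is a hypothetical threshold chosen for illustration only.
    """
    return {
        label: ("major" if fraction >= major_cutoff else "minor")
        for label, fraction in clone_fractions(clone_calls).items()
    }


# Invented example: 24 colonies from one animal (within the 16 to 32 picked).
example_clones = ["seq_A"] * 14 + ["seq_B"] * 7 + ["seq_C"] * 3
labels = label_by_abundance(example_clones)
print(labels)
```

Clone counts are only a semi-quantitative readout, so any cutoff of this kind would need to be checked against an independent measure of transcription.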

Sequencing reactions were performed using the BigDye terminator cycle sequencing kit, and samples were run on an automated capillary sequencing system (ABI Genetic Analyzer 3100) as has been described previously (Otting et al. 2005).

Locus-specific PCR reactions

To test for the presence of Mamu- and Mafa-A4*14 alleles, 1 μl of the RT-PCR samples was taken for a sequence-specific primer PCR (SSP-PCR) reaction in advance of the cloning step. The relevant primers, (5′A*14) GGGACCCGACGGGCGCCTCCAA and (3′A*14) GGCCCTCCAGGTAGACTCTGTC, have annealing sites in exon 3. Amplifications were carried out starting with 2 min at 94°C, followed by 25 cycles at 94°C, 65°C, and 72°C for 1 min each. The PCR products were subjected to direct sequencing, and the reactions were performed as described above.

Phylogenetic analysis and nomenclature

The sequences were analyzed with the Sequence Navigator Software version 1.0.1 (Applied Biosystems), and alleles are based on at least three clones with identical full-length sequences. To define loci and lineages, alignments of the sequences were made using MacVector™ version 8.1.1 (Oxford Molecular Group), followed by manual adjustments. Phylogenetic analyses on the full-length (1,065 bp) sequences were also performed with the MacVector software. Neighbor-joining trees were constructed with the Kimura two-parameter method. Bootstrap analyses were performed based on 1,000 replications.
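For readers unfamiliar with the Kimura two-parameter model, the pairwise distance that feeds a neighbor-joining tree can be sketched in Python as follows. With P the proportion of transitions and Q the proportion of transversions between two aligned sequences, the distance is d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)]. This is a minimal illustration, not the MacVector implementation; the function name and the choice to skip gapped or ambiguous positions are assumptions.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # assumption: ignore gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1  # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable positions")
    p = transitions / compared
    q = transversions / compared
    term1 = 1 - 2 * p - q
    term2 = 1 - 2 * q
    if term1 <= 0 or term2 <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(term1 * math.sqrt(term2))


# Toy 10-bp alignment with a single transition (A -> G at the last site).
d = k2p_distance("ACGTACGTAA", "ACGTACGTAG")
print(round(d, 4))
```

A distance matrix built from such pairwise values is the input to neighbor joining; bootstrap support is then obtained by resampling alignment columns and rebuilding the tree for each replicate.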

In total, 130 unreported Mamu-A and Mafa-A sequences were submitted to the European Bioinformatics Institute and European Molecular Biology Laboratory (EBI-EMBL) database. Relevant information, such as the accession number and a reference cell line, is provided (Table 1). Moreover, all novel Mamu-A and Mafa-A sequences are named in accordance with a generally accepted nomenclature proposal (Klein et al. 1990; Robinson et al. 2003; Ellis et al. 2006). For example, Mamu-A1*0101 defines an Mhc allele in the rhesus macaque, which is encoded by one of the class I loci: namely, A1. The first two digits after the asterisk define the lineage, whereas the third and fourth digits define the allele number. These allele numbers are arbitrary, as they reflect the order in which the alleles were discovered. A fifth and sixth digit are used to mark a synonymous basepair difference between two sequences.
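The naming convention described above can be made concrete with a small Python parser. This is a sketch under stated assumptions: the regular expression, the `MhcAllele` type, and the `parse_allele` function are illustrative inventions, the parser accepts only four- or six-digit names, and lineage-level designations such as Mamu-A1*26 are deliberately rejected.

```python
import re
from typing import NamedTuple, Optional


class MhcAllele(NamedTuple):
    species: str               # four-letter species prefix, e.g. "Mamu"
    locus: str                 # e.g. "A1"
    lineage: str               # first two digits after the asterisk
    allele: str                # third and fourth digits
    synonymous: Optional[str]  # optional fifth and sixth digits


# Hypothetical pattern written for this example only.
_ALLELE_RE = re.compile(
    r"^(?P<species>[A-Za-z]{4})-(?P<locus>[A-Z]+\d*)\*"
    r"(?P<lineage>\d{2})(?P<allele>\d{2})(?P<synonymous>\d{2})?$"
)


def parse_allele(name):
    """Split a name such as 'Mamu-A1*0101' into its components."""
    match = _ALLELE_RE.match(name.strip())
    if match is None:
        raise ValueError(f"not a four- or six-digit allele name: {name!r}")
    return MhcAllele(**match.groupdict())


parsed = parse_allele("Mamu-A1*0101")
print(parsed)
# The six-digit name below is made up to show the synonymous-variant field.
print(parse_allele("Mamu-A1*010102"))
```

Keeping the digit groups as strings preserves leading zeros, which matters because lineage 01 and allele 01 are labels rather than quantities.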

Table 1 Summary of Mamu-A and Mafa-A alleles detected in this study

Results and discussion

Polymorphism and diversity of the Mamu-A region: comparison of Chinese and Indian rhesus macaques

In a previous communication, five different Mamu-A region configurations were defined in a population of Indian rhesus macaques (Otting et al. 2005). These configurations display diversity with regard to the number and combination of distinct Mamu-A genes present per chromosome. The loci have been designated Mamu-A1, -A2, -A3, and -A4, respectively (Fig. 1). As can be seen, each region configuration comprises a Mamu-A1 gene characterized by high transcription levels (major) combined with one or two other Mamu-A genes characterized by lower transcription levels (minors). The Mamu-A1 gene is probably responsible for executing the classical antigen presentation function (Evans et al. 1999; Sidney et al. 2000; Sette et al. 2005), whereas the others can be considered as nonclassicals and may interact, for instance, with the KIR gene family present on NK cells. Two complete MHC haplotypes in the rhesus macaque have been sequenced (Daza-Vamenta et al. 2004; Kulski et al. 2004), and they represent two region configurations. One haplotype contains the -A1/-A2 combination, whereas in the other, the -A1/-A4 pair is observed. Both region configurations are present in the animals of Indian origin (Fig. 1).

Fig. 1 Schematic representation of different Mamu-A region configurations observed in Indian rhesus macaques. The exact order and physical distances of the loci on the genome are still unknown. The relative levels of polymorphism and transcription of the loci are indicated. Two Mamu-A region configurations (1 and 5) are confirmed by the sequencing of the complete rhesus MHC region (Daza-Vamenta et al. 2004; Kulski et al. 2004).

Within the present panel of 42 Chinese animals, 59 different full-length Mamu-A cDNAs were detected that can be grouped into various loci and lineages (Fig. 2). A complete listing of the alleles detected in the Chinese macaques is provided as electronic supplementary material (Table 5). Because of codominant expression, rhesus macaques can be heterozygous for the Mamu-A1 gene and thus can express up to two allotypes. Nine animals were found to express three different Mamu-A1-like sequences. In each of these animals, one of the three sequences falls into a separate clade in the phylogenetic tree (Fig. 2; -A5, -A6, and -A7). The triplets of three of these animals are indicated by asterisks. Because the three alleles obtained from one animal all cluster in clades distinct from -A2, -A3, and -A4, and a single locus can contribute at most two alleles per animal, one allele must belong to a separate locus. The newly detected loci are named Mamu-A5, -A6, and -A7, according to the generally accepted nomenclature system (Klein et al. 1990; Robinson et al. 2003; Ellis et al. 2006). Representatives of two loci, A5 and A6, are also observed in cynomolgus macaques that express more than two A1-like sequences.

Fig. 2 Phylogenetic tree of Mamu-A and Mafa-A genes/alleles detected in this study. The -A2 locus is represented by four alleles only. The tree is based on full-length sequences, although analyses on only exons 2 and 3 showed no significant differences. The nomenclature of the new alleles is based on a more extended tree (not shown) containing all the Mamu-A and Mafa-A alleles now available. Asterisk, three alleles of animal Ri081, of which one represents the locus Mamu-A5. Double asterisks, three alleles of animal Ri145, of which one represents the locus Mamu-A6. Triple asterisks, three alleles of animal Ri078, of which one represents the locus Mamu-A7.

Thirty-three new Mamu-A1 alleles were detected in the Chinese rhesus macaques. Although some of the alleles in this population group into lineages that are also present in Indian animals, sharing of alleles for the Mamu-A1 locus is rare and was observed only once. The Mamu-A1*26 allele is detected in both populations, whereas for Mamu-A1*03 and -A1*04, the alleles are highly similar and differ by one synonymous basepair substitution. Hence, it is concluded that most allelic polymorphism observed for the Mamu-A1 locus was probably generated after the rhesus macaque populations were separated. Sequence comparisons illustrated that most of the variations map at the contact residues of the peptide-binding site (data not shown). Thus, polymorphism at the highly divergent Mamu-A1 locus, characterized by high expression levels, must have resulted from positive Darwinian selection (Hughes and Nei 1988; Borghans et al. 2004).

The Mamu-A2 gene displays differential haplotype distribution in the Indian population of rhesus macaques (Fig. 1). This also appears to be the case for Chinese animals, as Mamu-A2 cDNAs were detected in 39 out of 42 of them. In this panel, 16 alleles that differ from the ones earlier detected in Indian animals were defined. In the Indian population, low expression levels characterize the Mamu-A2 gene, which appears also to be the case in Chinese animals.

Five Chinese macaques possess the Mamu-A3*1307 sequence, which is always observed in combination with Mamu-A1*1102. In concordance with the Indian animals, the number of Mamu-A3 clones detected also reflects low expression levels. A full-length Mamu-A4*1404 cDNA was detected in only three animals. In contrast to the Mamu-A4 allele in the Indian macaque, the Chinese allele has a stop codon at the last triplet of exon 5, which encodes the transmembrane part of the class I protein. This suggests that the corresponding gene product may settle in the membrane but that signal transduction is impaired. As was observed earlier in Indian animals, this locus is characterized by extremely low expression levels (Fig. 1). For that reason, the presence of Mamu-A4 cDNAs in the other animals was tested with SSP-PCR on the RT-PCR samples, and 17 out of 42 Chinese animals appeared to be positive for this locus. The Mamu-A5, -A6, and -A7 genes are also characterized by low expression levels and are considered to represent minors.

In contrast to the previously studied Indian population, the Chinese animals were selected randomly from a large population. Because of the lack of pedigree data and segregation profiles, it was not possible to firmly establish the segregation of different Mamu-A genes present per chromosome. Haplotypes can only be deduced based on sharing of sequences between animals, and such haplotypes are listed in Table 2. At this stage, it is impossible to determine in which region configurations the Mamu-A5, -A6, and -A7 genes are present.

Table 2 Chinese Mamu-A haplotypes deduced by sharing of alleles, grouped into two region configurations

Mafa-A region polymorphism and diversity in cynomolgus monkeys

Sixty-two alleles were detected in the 96 cynomolgus macaques analyzed. As observed in the rhesus macaques, most animals had two polymorphic Mafa-A1 sequences, in combination with orthologs of the Mamu-A2, -A3, -A4, -A5, and/or -A6 genes. In total, 38 Mafa-A1 alleles were detected, illustrating the high level of polymorphism of this locus. These alleles were compared with sequences reported earlier by two other research groups (Uda et al. 2004; Krebs et al. 2005) that have used different nomenclature systems. Only four alleles detected in our panel (-A1*3101, -A1*3201, -A1*3801, and -A5*3001) are identical to the earlier described sequences A*310101, A*320101, A*380101, and A*300101, respectively.

A total of 79 out of 96 animals have at least one Mafa-A2 sequence, and 18 alleles could be distinguished (Table 1). The oligomorphic A2 locus is also present in the pigtailed macaque (Macaca nemestrina) as was recently described (Pratt et al. 2006; Lafont et al. 2007). In these studies, 90% of the animals possess the gene, and lower transcription levels were also observed in comparison to the other Mane-A alleles. The Mafa-A3 gene is observed in only two animals, whereas the Mafa-A4 locus is detected in 41 individuals. Two Mafa-A4 alleles were found that differ by only one (nonsynonymous) basepair substitution (Fig. 2). As in the Chinese rhesus macaques, the Mafa-A4 alleles have a stop codon at the end of exon 5. The Mafa-A5 and -A6 alleles were detected in eight and four animals, respectively, and display low levels of polymorphism. Phylogenetic analyses illustrate that, apart from loci, rhesus and cynomolgus monkeys also share lineages (Fig. 2). As found in rhesus macaques, the Mafa-A1 alleles are characterized by high transcription levels, whereas the other loci display moderate or low expression levels.

Because most of the cynomolgus macaques are pedigreed, the combination of sequences that are inherited on one chromosome could be defined. As an example, the kinship tree of one breeding group is provided (Fig. 3). The haplotypes are grouped based on the combination of loci (Table 3). Seven combinations of loci or region configurations are recognized (Fig. 4), and four of them (2 to 5) are also observed in the Indian rhesus macaques (Fig. 1), which indicates that they predate speciation. Mafa-A5*3001 segregates in combination with Mafa-A1*6001 and -A1*7201 but not exclusively. The exact region configurations of Mafa-A6-containing haplotypes are not yet known.

Fig. 3 Pedigree of a cynomolgus macaque family showing segregation of Mafa-A alleles. The animals analyzed are indicated by shading. A question mark indicates that the sire has not been identified.

Table 3 Mafa-A haplotypes as defined by segregation, grouped into seven region configurations

Fig. 4 Schematic representation of different Mafa-A region configurations in cynomolgus macaques. Four configurations (2–5) are shared with the Indian rhesus macaques (Fig. 1). One region configuration lacks the Mafa-A1 gene. It is not yet known whether the locus is really absent or was missed because of PCR failure. Should a Mafa-A1 gene be found in extended studies, then this region configuration has to be deleted from the list.

Sharing of -A1 and -A2 sequences between different macaque species

A comparison of all available Mamu-A and Mafa-A sequences showed that six full-length cDNA transcripts are shared between the two species of macaque. This is the case for five -A1 pairs and one -A2 pair (Table 4). One of these cynomolgus monkey alleles is identical to an Indian rhesus allele, whereas the other five are shared with animals of Chinese origin. This observation was unexpected as the two rhesus macaque populations investigated share only one allele. The fact that cynomolgus macaques share more alleles with Chinese rhesus macaques than with Indian individuals may be explained by the overlap in the geographic areas inhabited by both species in Indochina (eastern Asia). It is known that rhesus macaques and cynomolgus monkeys can interbreed and produce offspring (Tosi et al. 2002). Sharing of alleles was not observed for the minors controlled by the A3 to A7 loci.
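A cross-species comparison of this kind amounts to finding sequences that are identical over their full length in both data sets. The Python sketch below illustrates the idea under stated assumptions: the function name, the dictionary layout, and the toy sequences and labels are invented for the example and do not correspond to real alleles or to the software used in the study.

```python
def shared_sequences(species_a, species_b):
    """Return sorted (name_a, name_b) pairs whose sequences are identical.

    Each argument maps an allele name to its full-length cDNA sequence.
    """
    names_by_sequence = {}
    for name, seq in species_b.items():
        names_by_sequence.setdefault(seq.upper(), []).append(name)

    pairs = []
    for name, seq in species_a.items():
        for other in names_by_sequence.get(seq.upper(), []):
            pairs.append((name, other))
    return sorted(pairs)


# Toy data with placeholder names and very short, made-up sequences.
rhesus = {"rhesus_1": "ATGGCGCCC", "rhesus_2": "ATGGCGCCA"}
cynomolgus = {"cyno_1": "atggcgccc", "cyno_2": "ATGGTGCCC"}

pairs = shared_sequences(rhesus, cynomolgus)
print(pairs)
```

Exact string identity is a strict criterion: sequences that differ by a single synonymous substitution are not reported as shared, which matches the way identical alleles are counted in the text.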

Table 4 Identical Mhc class I sequences detected in the two species of macaque

Different modes of selection operating on the Mhc class I and II genes in macaques

Although Mamu-A and Mafa-A sequences are interspersed in the phylogenetic tree, the vast majority of the alleles are species unique, and for rhesus macaque, most of them appear to be population specific. This is in sharp contrast to the situation observed for the macaque class II region, where sharing for different exon 2 sequences at the Mhc-DR, but especially the -DQ and -DP genes, is far more common (Blancher et al. 2006; Doxiadis et al. 2006). About half of the Mafa-DPB1, -DQA1, and -DQB1 and one third of the Mafa-DRB sequences are identical to rhesus orthologs. As animals of the same populations have been used to study Mhc class I and II sequences, the present results exclude the possibility that this high sharing of Mhc class II sequences is because of introgression. A more likely explanation is that in macaques, the exon 2 sequences of the Mhc class II genes, which encode the peptide-binding site, have been subjected to purifying selection. As a consequence, many Mhc class II alleles in these two macaque species predate speciation processes. A databank search indicated that the phenomenon also extends to other macaque species (Robinson et al. 2003).

The Mhc class I alleles in macaques are largely unique, illustrating that selection has favored diversity. Mhc class I proteins are involved in the presentation of intracellular pathogens such as viruses and parasites that are known to evolve at high mutation rates. Recognition of Mhc class I molecules combined with a foreign peptide may result in the lysis of an infected cell. The Mhc class II molecules select peptides, originating from extracellular pathogens such as bacteria and fungi, for binding. Mhc class II-mediated activation may result in antibody production and/or providing help to cytotoxic T cells. The present results illustrate that, probably because of coevolution with intracellular pathogens, macaques have generated a highly complex and divergent Mhc class I repertoire. The fact that Mhc class I alleles evolve quickly, even within a species, has been documented in humans for instance (Belich et al. 1992; Watkins et al. 1992). The high degree of sharing of Mhc class II sequences seems to be unique for macaques and has not been observed in any other group of vertebrate species.