Main

The yersiniae are Gram-negative bacteria that belong to the family Enterobacteriaceae. They consist of 11 species that have traditionally been distinguished by DNA–DNA hybridization and biochemical analyses1,2,3,4. Three of them are pathogenic to humans: Yersinia pestis and the ENTEROPATHOGENIC yersiniae, Yersinia pseudotuberculosis and Yersinia enterocolitica. All three species target the lymph tissues during infection and carry a 70-kb virulence plasmid (pYV), which is essential both for infection in these tissues and for overcoming host defence mechanisms1,2,3,4.

Elegant population genetics studies have indicated that Y. pestis recently evolved from Y. pseudotuberculosis — just 1,500–20,000 years ago5. However, Y. pestis has a very different ecology and epidemiology to the enteropathogenic yersiniae, and causes a markedly different disease. From a mammalian enteropathogen that is widely found in the environment, it has rapidly transformed into a blood-borne pathogen that is also able to parasitize insects and cause systemic disease. The varied ecology, pathogenicity and host range of these related species, together with the availability of mouse models for both gastroenteritis and plague that mimic human disease, and the relative ease of constructing defined mutants, make the yersiniae a model genus in which to study the genetics and evolution of bacterial pathogens.

This review covers emerging themes from the genome sequences of the pathogenic yersiniae and discusses how this information is guiding hypotheses on the evolution of this genus.

Yersinia pestis — the causative agent of plague

Y. pestis has been responsible for three human PANDEMICS: the Justinian plague (fifth to seventh centuries), the Black Death (thirteenth to fifteenth centuries) and modern plague (1870s to the present day)1,3. The Black Death alone is estimated to have claimed the lives of one-third of the European population, and has shaped the development of modern civilization. Plague is still with us, circulating in various mammalian species on most continents (Fig. 1). The recent identification of multidrug-resistant strains6, and the possibility that Y. pestis could be used as an agent of biological warfare, mean that plague still poses a significant threat to human health. An effective vaccine that induces long-lived immunity against bubonic and pneumonic plague is still not available.

Figure 1: World distribution of plague, 1998.

Modified from Ref. 5 © (1999) National Academy of Sciences.

Y. pestis has a complex life cycle involving a mammalian reservoir (primarily rodents) and a flea vector1,3 (Fig. 2). The bacterium ensures transmission by forming a cohesive aggregate that blocks the foregut of infected fleas7. As a result, a blocked flea makes futile attempts to feed on a new host and regurgitates infected blood back into the bite site, effectively injecting the bacteria under the animal's skin7. On infection, some of the bacilli are engulfed by macrophages and carried to the regional lymph nodes that drain the infection site. The bulk of the bacteria, protected by plasminogen activator, penetrate tissue and eventually reach the lymphatic vessels that lead to the regional lymph nodes8, where they multiply, giving rise to the classical symptoms of infection and then BACTERAEMIA1,3. This is known as bubonic plague. If the infection then progresses to the lungs, pneumonic plague develops, which is highly infectious and rapidly fatal.

Figure 2: Steps in the transmission of the pathogenic yersiniae in humans.

Y. pestis has a rodent reservoir. The rodent's fleas acquire Y. pestis from a meal of infected blood, and transmit the bacterium primarily to other rodents, or occasionally to humans — causing bubonic plague in humans. Human-to-human transmission can occur through human fleas. Pneumonic plague is transmitted from human to human through respiratory droplets, or possibly by artificially generated Y. pestis aerosols. Y. enterocolitica and Y. pseudotuberculosis are ingested, and in contrast to Y. pestis, enter the lymphatic system through the M cells of the small intestine.

There are three recognized subgroups, or BIOVARS, of Y. pestis: Antiqua, Mediaevalis and Orientalis. They are differentiated by their abilities to ferment glycerol and to reduce nitrate, but these differences do not seem to correlate with virulence. Based on epidemiological observations and historical records, each biovar has been associated with one of the three pandemics. Biovar Antiqua is resident in Africa and is descended from bacteria that caused the Justinian plague; biovar Mediaevalis is resident in central Asia and is descended from bacteria that caused the Black Death; and biovar Orientalis, which at present is widespread, is associated with modern plague9. The factors influencing the rise and fall of plague EPIDEMICS remain obscure, but it is possible that severe epidemics could be preceded by subtle genetic changes in Y. pestis that result in a highly virulent strain. One of the aims of the Y. pestis CO92 biovar Orientalis and Y. pestis KIM10 biovar Mediaevalis genome projects10,11 was to identify these key genetic changes.

Enteropathogenic yersiniae

The enteropathogenic yersiniae are found widely in the environment (for example, in soil) and are a common cause of animal infections, affecting several mammalian and avian species2,4. In humans, the infection causes gastroenteritis after the consumption of contaminated food or water. After ingestion, the bacteria pass into the small intestine, where they translocate across the intestinal epithelium through Peyer's patches (Fig. 2). They then migrate to the mesenteric lymph nodes and are subsequently found in the liver and spleen, where they replicate outside the host cells12. After multiplication, rapid inflammation ensues, which gives rise to the symptoms that are associated with gastroenteritis, such as mesenteric lymphadenitis and terminal ileitis.

Y. enterocolitica comprises a biochemically and genetically heterogeneous collection of organisms that has been divided into six biogroups — known as 1A, 1B, 2, 3, 4 and 5 — that can be differentiated by biochemical tests. These can be placed into three lineages: a non-pathogenic group (biogroup 1A); a weakly pathogenic group that is unable to kill mice (biogroups 2 to 5); and a highly pathogenic, mouse-lethal group (biogroup 1B). Biogroup 1A lacks the Yersinia virulence plasmid pYV and seems to be distantly related to the other biogroups, whereas biogroup 1B forms a geographically distinct group of strains13 that are frequently isolated in North America (the so-called 'New-World' strains) and biogroups 2 to 5 are predominantly isolated in Europe and Japan ('Old-World' strains). Y. pseudotuberculosis is subgrouped into 21 different serological groups based on variations in the O-ANTIGEN of its lipopolysaccharide (LPS). All Y. pestis strains fail to express an O-antigen.

The Yersinia pestis genome recipe

The first Yersinia strain to be sequenced was Y. pestis CO92 biovar Orientalis10. The CO92 strain was originally isolated in 1992 from a veterinarian in Colorado, USA, who was infected by a sneezing cat. The genome consists of a 4.65-Mb chromosome and three plasmids — pMT1 or pFra (96.2 kb), pYV or pCD (70.3 kb), and pPla or pPCP1 (9.6 kb) (Table 1). Close inspection of the chromosome sequence indicates that it has undergone fierce genetic flux. It is scarred by genes that have been acquired from other organisms, large sections are jumbled and it seems to be in the early stages of decay. These main features of this highly dynamic genome can be seen as representing three key steps towards the evolution of Y. pestis: add DNA, stir and reduce.

Table 1 Plasmids important in the virulence of pathogenic yersiniae

Add DNA. As the nucleotide sequences of more bacterial genomes become available, it is evident that they are mosaics of DNA sequences from different origins, owing to the lateral exchange of large mobile genetic elements, such as plasmids, phage or transposons. The genetic material that is acquired often contributes to an organism's virulence, for example by broadening its host range or by improving its ability to overcome host defences or to cause tissue damage. The process of LATERAL GENE TRANSFER allows microbial pathogens to evolve extremely rapidly, in 'quantum leaps'.

Plasmid acquisition seems to be a key element in the evolutionary jump of Y. pestis from an enteric pathogen to a flea-transmitted systemic pathogen. In addition to the virulence plasmid pYV that is common to all pathogenic yersiniae, most Y. pestis strains have two further plasmids: pPla, which encodes the plasminogen activator Pla14, and pMT1, which encodes the putative murine toxin Ymt and the F1 capsule. The precise role of these determinants in host adaptation and virulence is unclear, but there are several indications that they are involved in transmission. Pla, for example, is important for dissemination of Y. pestis after subcutaneous injection into a mammalian host15, and although capsule-deficient mutants can still cause disease in humans16, Y. pestis strains that lack the entire pMT1 plasmid are unable to colonize fleas17. Recently, Hinnebusch et al. have shown that the murine toxin Ymt acts as an intracellular phospholipase D and is required for the survival of Y. pestis in the flea midgut compartment, but not in the proventriculus compartment18. In addition, the unstable chromosomal haemin storage (hms) locus, which encodes outer-surface proteins, is also required for flea-borne transmission. Conversely, the hms locus is required to infect the proventriculus and not the midgut of the flea19. Deletion of the hms locus in Y. pestis results in changes in blood-feeding behaviour and less efficient transmission of plague. So overall, the acquisition of two plasmids (pPla and pMT1) by horizontal gene transfer, along with the pre-existing chromosomal hms locus, helps to explain the rapid evolutionary transition of Y. pestis to flea-borne transmission.

Although the existence of the pPla and pMT1 plasmids and the hms locus was known before the CO92 genome sequence became available, little was known about the rest of the 4.65-Mb chromosome. One characteristic of loci that have been acquired by lateral gene transfer is an atypical G+C content relative to the rest of the genome. This means that the complete genome sequence can be screened for recently acquired genes or cassettes of genes — often referred to as pathogenicity islands — by looking for 'spikes' of G+C variation. G+C analysis of the Y. pestis CO92 genome identified at least 21 such regions, including the 102-kb unstable element that contains the hms locus10 (Table 2). Among these regions were several genes that seem to have come from other insect pathogens. Sequences related to the parasitism of insects include homologues of insecticidal toxin complexes (Tcs) from Photorhabdus luminescens, Serratia entomophila and Xenorhabdus nematophilus20. The toxins are complexes of the products of three different gene families — namely, tcaA/tcaB/tcdA, tcaC/tcdB and tccC. In addition, a predicted coding sequence showing similarity to an insect virus-like enhancin protein — a proteolytic enzyme that can damage insect gut membranes — was also identified in a region of low G+C content10. The sequence was flanked by transposase fragments, which indicates horizontal acquisition.
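The G+C screening idea described above can be illustrated with a minimal sketch. It assumes a simple sliding window and a z-score cutoff; the window size, step, cutoff and the toy sequence are illustrative choices, not the parameters used in the CO92 analysis.

```python
from statistics import mean, pstdev


def gc_content(seq: str) -> float:
    """Return the fraction of G and C bases in seq (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def atypical_windows(genome: str, window: int = 1000, step: int = 500,
                     z_cutoff: float = 2.0):
    """Flag windows whose G+C content deviates from the mean of all windows.

    Returns a list of (start, gc_fraction, z_score) tuples. The z-score
    cutoff is an assumption made for this sketch.
    """
    starts = list(range(0, len(genome) - window + 1, step))
    values = [gc_content(genome[s:s + window]) for s in starts]
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []  # perfectly uniform composition, nothing to flag
    flagged = []
    for start, gc in zip(starts, values):
        z = (gc - mu) / sigma
        if abs(z) >= z_cutoff:
            flagged.append((start, gc, z))
    return flagged


# Toy genome (invented): 50% G+C background with a 1-kb island of 0% G+C.
toy_genome = "ACGT" * 2500 + "AATA" * 250 + "ACGT" * 2500
for start, gc, z in atypical_windows(toy_genome):
    print(f"window at {start}: G+C = {gc:.2f}, z = {z:.1f}")
```

On real data, a flagged window is only a candidate: atypical composition is suggestive of lateral acquisition but would normally be checked against other evidence, such as flanking transposase or integrase genes.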

Table 2 Selected putative pathogenicity islands from Y. pestis (Y. pseudotuberculosis)

Other apparent acquisitions include a chromosomally encoded type-III secretion system that is similar in gene content and order to the Spi2 type III system of Salmonella enterica serovar Typhimurium21, and several adhesins and iron-scavenging systems (Table 2). The Y. pestis CO92 genome sequence also contains several genes that are predicted to encode novel surface antigens that could have a role in virulence. Ten fimbrial-type surface structures that are often important in bacterial attachment were identified, five of which were flanked by genes encoding transposases or integrases, which also indicates horizontal acquisition10. A large arsenal of independent gene clusters encoding different fimbriae and adhesins could help Y. pestis evade the host immune response, or allow multiple interactions with several different host tissues during the pathogen's complex life cycle.

Stir. A striking feature of the Y. pestis CO92 genome sequence was the large number of insertion sequence (IS) elements. The total of 140 IS elements exceeds that described in most other bacterial genomes and comprises 3.7% of the genome. IS elements are perfectly repeated sequences, and are likely sites for homologous recombination events that can rearrange the genome. All bacterial genomes sequenced so far have a small, but detectable bias towards G on the leading strand of the bi-directional replication fork22. So, the G/C skew in different parts of the genome highlights any irregularities in the composition of the genome (Fig. 3).
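As a rough illustration of the strand bias described above, the sketch below computes the skew (G−C)/(G+C) in consecutive windows and its running sum. The function names, window size and toy sequence are assumptions for illustration only; the toy simply places a G-rich stretch before a C-rich stretch to mimic two replication arms.

```python
from itertools import accumulate


def gc_skew(seq: str) -> float:
    """Return (G - C) / (G + C) for seq, or 0.0 if it has no G or C."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    return 0.0 if g + c == 0 else (g - c) / (g + c)


def windowed_skew(genome: str, window: int = 1000) -> list:
    """Skew of consecutive, non-overlapping windows along the genome."""
    return [gc_skew(genome[i:i + window])
            for i in range(0, len(genome) - window + 1, window)]


def cumulative_skew(genome: str, window: int = 1000) -> list:
    """Running sum of windowed skew; its turning points mark skew switches."""
    return list(accumulate(windowed_skew(genome, window)))


# Toy sequence (invented): G-rich first half, C-rich second half.
toy = "GGCA" * 500 + "CCGT" * 500
print(windowed_skew(toy))
print(cumulative_skew(toy))
```

In this toy, the windowed skew changes sign where the composition switches, and the cumulative curve peaks at the last G-rich window. A segment whose skew has the wrong sign for its position is the kind of irregularity that a skew plot makes visible.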

Figure 3: Circular representation of the Y. pestis CO92 genome.

The outer circle is marked in bases. Circles 1 and 2 (from the outside in), all genes colour-coded by function, forward and reverse strand; circles 3 and 4, pseudogenes; circles 5 and 6, insertion sequence elements; circle 7, G+C content (higher values outward); circle 8, GC bias ((G−C)/(G+C)), turquoise indicates values >0, burgundy indicates values <0. Colour-coding for genes: dark blue, pathogenicity or adaptation; black, energy metabolism; red, information transfer; dark green, surface-associated; cyan, degradation of large molecules; magenta, degradation of small molecules; yellow, central or intermediary metabolism; pale blue, regulators; orange, conserved hypothetical; brown, pseudogenes; pink, phage and insertion sequence elements; pale green, unknown; grey, miscellaneous. The three multiple inversion regions (MIR) are highlighted in dark blue. IS, insertion sequence; ORFs, open reading frames. Reproduced from Ref. 10 © (2001) Nature Macmillan Magazines Ltd.

The G/C skew plot of Y. pestis CO92 shows three anomalies — two inversions and one translocation10 (Fig. 3). Each is bounded by IS elements, which indicates that they could be the result of recent recombination events. Polymerase chain reaction analysis indicates that although the chromosome contains both possible orientations of the two inverted sequences, they are present in unequal proportions10. This has also been shown to be the case for strain GB (biovar Orientalis) and strain A16 (biovar Antiqua)10. It seems that several different chromosomal configurations can exist in the same population, which indicates that genomic rearrangements can occur during the growth of the organism — a feature that has not previously been reported for a bacterium. It is not known how these events affect the biology of the organism, but as the expression of bacterial genes is influenced by their orientation with respect to the direction of DNA replication, it seems reasonable to conclude that such rearrangements could alter virulence.
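Why an inversion shows up as a skew anomaly can be seen in a small sketch: inverting a segment replaces it with its reverse complement, which swaps G for C and therefore flips the sign of the skew in that segment. The sequence and coordinates below are invented for illustration; real inversions would be bounded by IS elements rather than arbitrary positions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def invert_segment(genome: str, start: int, end: int) -> str:
    """Model an inversion of genome[start:end] in place."""
    return genome[:start] + reverse_complement(genome[start:end]) + genome[end:]


def gc_skew(seq: str) -> float:
    """Return (G - C) / (G + C) for seq, or 0.0 if it has no G or C."""
    g, c = seq.count("G"), seq.count("C")
    return 0.0 if g + c == 0 else (g - c) / (g + c)


def windowed_skew(genome: str, window: int = 1000) -> list:
    """Skew of consecutive, non-overlapping windows along the genome."""
    return [gc_skew(genome[i:i + window])
            for i in range(0, len(genome) - window + 1, window)]


# Toy replication arm (invented): uniformly G-rich, so every window is positive.
arm = "GGCA" * 1000
inverted = invert_segment(arm, 1000, 2000)
print(windowed_skew(arm))
print(windowed_skew(inverted))
```

Applying the same inversion a second time restores the original sequence, which is consistent with the observation that both orientations of an inverted region can coexist in a population.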

Reduce. Loss of gene function, or genome decay, occurs as a bacterium adapts to its host. For example, many PSEUDOGENES — which are often ignored as sequencing artefacts — could, in fact, be remnants of functional genes from a pathogen that is in the process of 'downsizing' its genome content as it adapts. This is illustrated by the genome sequence of the obligate intracellular pathogen Rickettsia prowazekii23. The 1.11-Mb R. prowazekii genome has a large number of pseudogenes and the highest proportion of non-coding DNA in any prokaryote — more than 24%. The intergenic DNA might represent the scattered remnants of genes that are no longer required (or are harmful to the existence of the organism), and that have been lost in a step-wise process as the organism acquires an obligate intracellular lifecycle. Analysis of the genome of the leprosy bacillus Mycobacterium leprae indicates a similar situation as the genome contains numerous pseudogenes and extensive genetic downsizing that is not found in other mycobacterial species that have a more free-living existence24. It has been proposed that the genome of M. leprae has evolved to the natural minimal gene-set for mycobacteria, and is now a pathogen that is on the brink of survival24.

Close examination of the CO92 nucleotide sequence has identified at least 149 pseudogenes, which represent 4% of the genome. Several mechanisms account for the accumulation of pseudogenes in Y. pestis, including IS-element expansion, deletion, point mutation and slippage in tracts of single-nucleotide repeats. The total of 149 is likely to be an underestimate, as certain features such as point mutations are difficult to detect through direct sequencing. Although the number is modest compared to those of R. prowazekii and M. leprae, it indicates that reductive evolution has begun, and reflects adaptation to new hosts and a gradual change in life cycle.

A selected list of Y. pestis pseudogenes that might once have been involved in host adaptation or virulence is shown in Table 3. These include putative insect toxin genes, such as tcaB and tcaA — which contain a frameshift mutation and an internal deletion, respectively — and the general toxin cytotoxic necrotizing factor 1. The disruption of these genes might be necessary for the life cycle of Y. pestis, which persists in the flea gut for relatively long periods of time and would not want to kill its insect host.

Table 3 Y. pestis pseudogenes that might be important in the pathogenesis of other Yersinia

As Y. pestis has changed its life cycle from that of the ancestral Y. pseudotuberculosis strain, it would not be expected to use genes that are required for enteropathogenicity as the newly evolved Y. pestis would no longer be transmitted by the faecal–oral route. Enteropathogens specifically adhere to surfaces of the gut and invade the cells lining it. Proteins that are important for this process in Y. pseudotuberculosis include YadA and Inv, both of which are represented by pseudogenes in Y. pestis25,26. Many of the other pseudogenes reported — for example, a putative intimin adhesion protein — might have encoded adhesin molecules that potentially had a role in enteropathogenesis.

Some pseudogenes might be able to regain their function. Several pathogens have been shown to be capable of switching surface-expressed antigens on or off by slipped-strand mispairing of repeat sequences during replication27, and a similar process has been shown in Y. pestis in the ureD gene. This organism is characteristically UREASE-negative; however, activity can be restored in vitro by the spontaneous deletion of a single base pair in runs of nucleotides in the ureD gene28. This type of reversible mutation would free Y. pestis from the metabolic burden of producing proteins that are not required in its new flea/mammal life cycle, yet still give Y. pestis the potential to express them should a subsequent need arise. However, the loss of some enteropathogen virulence traits seems to be irreversible in Y. pestis, as the gene pathways encoding them have been inactivated by multiple mutations. Motility and LPS biosynthesis are examples in which at least five genes in each pathway seem to no longer function in Y. pestis CO92 (Refs 10,29).
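The reversible frameshift described above can be sketched with a toy example. The sequences below are invented and are not the ureD sequence; the helper names are hypothetical. An extra base in a single-nucleotide run shifts the reading frame and exposes a premature stop codon, and deleting one base from the run restores the original frame.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_stop(seq: str):
    """Return the index (in codons) of the first in-frame stop, or None."""
    for i in range(0, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i // 3
    return None


def delete_one_from_run(seq: str, base: str = "G", min_run: int = 6) -> str:
    """Delete one base from the first run of `base` at least min_run long."""
    pos = seq.find(base * min_run)
    if pos == -1:
        return seq  # no qualifying run, sequence unchanged
    return seq[:pos] + seq[pos + 1:]


# Invented toy gene: start codon, a G tract, three codons, then a stop.
intact = "ATG" + "GGGGGG" + "CCTGAGCAT" + "TAA"
# Same gene with one extra G in the tract, as slippage might produce.
disrupted = "ATG" + "GGGGGGG" + "CCTGAGCAT" + "TAA"

print(first_stop(intact), first_stop(disrupted))
restored = delete_one_from_run(disrupted)
print(restored == intact)
```

Because the change is a single-base gain or loss within a repeat tract, the switch between the disrupted and intact states needs only one slippage event in either direction, unlike a pathway that has been inactivated by several independent mutations.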

The Y. pestis CO92 genome also contains several pseudogenes of unknown function. Given that many of the familiar pseudogenes seem to be associated with a redundant enteric life cycle, identifying these sequences in Y. pestis might reveal potential virulence determinants for investigation in the enteropathogenic yersiniae.

However, not all of the lost genes relate to putative virulence determinants — many relate to physiological functions (Table 4). Indeed, it is becoming increasingly evident that some genes increase the virulence of the organism when they are inactivated. When pathogenic Shigella strains developed from a non-pathogenic Escherichia coli ancestor, the loss of ompT and cadA genes (so-called 'black holes') might have contributed to their virulence and evolution30,31. More recently, this phenomenon has been shown in Mycobacterium tuberculosis, in which several experimentally designed knockout mutants are more virulent than the wild-type strain32. This might also be the case for Y. pestis, particularly with respect to the loss of bioenergetic functions, such as dicarboxylic amino-acid metabolism. It has been known for some time that all the Y. pestis strains tested so far lack enzymes that alter the catabolic flow of carbon, such as glucose-6-phosphate dehydrogenase and aspartase3,33,34. The reduction of unnecessary metabolic pathways might allow the organism to conserve energy and the newly evolved, streamlined organism might then contribute to the development of acute disease.

Table 4 Y. pestis pseudogenes that might be important in central/intermediary metabolism

Other Yersinia genome projects

A second Y. pestis strain, KIM10 (biovar Mediaevalis), which was originally isolated from a plague patient in Kurdistan, has recently been sequenced11. This provides the opportunity to compare the genome content of a modern-day plague strain with that of a more ancient strain, even though only a few hundred years separate the pandemic events by which these biovars are defined. Direct comparison showed that the two genomes share more than 98% of their sequence, although the KIM10 genome is about 50 kb smaller than the CO92 genome as it has fewer IS elements and several small deletions. However, despite the genomes being so closely related, a remarkable amount of genome rearrangement has taken place. The differences seem to result from multiple inversions of the sequence at various insertion points. In particular, there are three regions where multiple inversions have taken place (Fig. 3). The most complicated multiple inversion region, MIR1, spans the replication origin and contains at least nine inversions11 (Fig. 3).

The genome sequences of Y. pseudotuberculosis IP 32953 (serotype I) (see Y. pseudotuberculosis genome sequence in the Online links) and Y. enterocolitica 8081 (biogroup 1B, serotype O8) (see Y. enterocolitica genome sequence in the Online links) are also soon to be published. They too seem to have gained significant portions of their genome by lateral gene transfer. However, both genomes have fewer IS elements and pseudogenes than Y. pestis strains KIM10 or CO92 (Refs 35, 36). This indicates that the Y. pseudotuberculosis and Y. enterocolitica genomes are more stable than that of Y. pestis. Genome-sequence data confirm that Y. pestis and Y. pseudotuberculosis are closely related, with gene homology of nearly 97% and largely co-linear gene organization35. By contrast, Y. enterocolitica is more distantly related, and is about the same evolutionary distance from Y. pseudotuberculosis and Y. pestis as E. coli is from Salmonella species36.

Meanwhile, closer inspection of the disease syndromes of Y. enterocolitica and Y. pseudotuberculosis indicates that, although they seem similar, the two species do in fact cause different infections. Although both pathogens invade through M cells, Y. enterocolitica colonizes the Peyer's patches, whereas Y. pseudotuberculosis is more widely disseminated and typically causes acute abdominal pain with mesenteric lymphadenitis of the small intestine. One distinguishing feature of Y. enterocolitica disease compared to Y. pseudotuberculosis is that it causes a more severe diarrhoea — a pronounced watery diarrhoea and occasionally bloody diarrhoea with fever in children. The heat-stable toxin Yst has been identified in all enteropathogenic Y. enterocolitica, but is absent in Y. pseudotuberculosis37. This could be one of the distinguishing genetic features responsible for this difference in symptoms. So, although diarrhoea is a common outcome, the diseases are different. This partly explains the 'Yersinia paradox', although it does not shed light on why Y. pestis causes such a different disease.

The evolution of pathogenic yersiniae

The insights from genome analysis allow us to piece together a picture of how these three species might have evolved. It seems clear that Y. enterocolitica has evolved independently. As discussed, it can be separated into three lineages (Fig. 4) — the mostly avirulent biogroup 1A strains that lack the virulence plasmid pYV, the mouse-virulent Old-World strains (biogroups 2 to 5) and mouse-lethal New-World strains (biogroup 1B).

Figure 4: Simplified model of Yersinia species evolution based on present knowledge of genome data.

The non-pathogenic yersiniae gain the virulence plasmid pYV to form the predecessor of pathogenic yersiniae. Y. enterocolitica diverges from Y. pseudotuberculosis and forms three lineages: 1A, Old World and New World. Y. pseudotuberculosis gains the ability to parasitize insects and form biofilms in hosts before evolving into Y. pestis through the acquisition of the plasmids pPla and pMT1, genome mixing and decay. Hms, haemin storage; HPI and HPI*, high-pathogenicity islands; IS, insertion sequence.

The New-World strains have acquired several elements by lateral gene transfer that contribute to their increased virulence compared with Old-World strains. In particular, the New-World strains contain a 'high-pathogenicity island' (HPI) that encodes the synthesis of the siderophore yersiniabactin, an iron-sequestering low-molecular-weight compound that is invaluable in the iron-limiting environment of the host38. The importance of the HPI region to mouse virulence has been shown by transferring it from a New-World strain into an Old-World strain; the modified strain was lethal in mice39. The HPI region has also been found in other Enterobacteriaceae40, some of which might be candidates for donating the HPI region to the Y. enterocolitica New-World strains and also to Y. pseudotuberculosis and Y. pestis. A HPI region is also present in Y. pseudotuberculosis and Y. pestis, but sequence analysis reveals that it is significantly different to the HPI region that is present in Y. enterocolitica, indicating that the two regions might have been acquired independently40. More recently, another element has been identified that seems to occur exclusively in New-World strains — that is, an additional type-II secretion gene cluster2.

By contrast, Y. pestis is much more closely related to Y. pseudotuberculosis, indicating that it evolved from this species in a short amount of time. There is strong evidence that strains of the O:1b serogroup of Y. pseudotuberculosis are the most closely related to Y. pestis owing to the 98.9% sequence identity of the LPS region of this serogroup compared to that of Y. pestis29. However, O:1b strains are not clonal, and the ancestral O:1b strain for Y. pestis is unknown29. Other studies on the presence and sequence of the fyuA and irp2 genes indicate that Y. pseudotuberculosis O:1 and O:3 strains belong to a distinct lineage compared with other Y. pseudotuberculosis strains and are closer to the Y. pestis lineage38.

Analysis of the Y. pestis sequence reveals a genome that has undergone a much more severe genetic flux than either Y. pseudotuberculosis or Y. enterocolitica. It has gained several loci by lateral gene transfer, but has also translocated large regions of its chromosome and now seems to be in the early stages of genome decay. As discussed, the mode of transmission of Y. pestis can be at least partly attributed to the acquisition of two plasmids (pPla and pMT1) and the hms locus. The first crucial step in Y. pestis evolution might have been the acquisition of pMT1 by an O:1 or O:3 strain of Y. pseudotuberculosis (Fig. 4). A serendipitous discovery during the sequencing of the genome of the recently isolated multidrug-resistant Salmonella typhi strain CT18 from Vietnam might shed light on the possible origin of this plasmid. The 96.2-kb pMT1 plasmid in Y. pestis has more than 50% sequence identity to the cryptic plasmid pHCM2 that is found in the S. typhi strain CT18 (Ref. 42). Such high sequence identity indicates the recent transfer of the plasmid, perhaps between S. typhi and Y. pseudotuberculosis and/or Y. pestis in a dually infected human host, or in the gut of a flea vector that fed on multiple hosts. Given that Salmonella and Yersinia are gut pathogens, one possible evolutionary scenario is that Salmonella was the donor of this replicon (pMT1) in the gut of a rodent. Once the ancestral Y. pseudotuberculosis strain acquired pMT1, its combination with chromosomally encoded Hms proteins might have allowed for more efficient colonization of insects and/or fleas5. Subsequent acquisition of pPla might then have enhanced the ability to disseminate after transmission to a mammalian host. Ensuing microevolution resulted in the three lineages given the biovar designations Antiqua, Mediaevalis and Orientalis. Biovar Orientalis is glycerol negative and nitrate positive, and biovar Mediaevalis is glycerol positive and nitrate negative, which indicates that these biovars arose independently from the glycerol- and nitrate-positive Antiqua progenitor group9. Indeed, further analysis using subtractive hybridization has confirmed that Mediaevalis and Orientalis evolved independently from Antiqua43.
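The biovar phenotypes given above amount to a small lookup on two biochemical tests. The sketch below encodes that lookup; the function name is hypothetical, and the label returned for the glycerol-negative, nitrate-negative combination is an assumption, because the text does not assign that phenotype to any biovar.

```python
def classify_biovar(ferments_glycerol: bool, reduces_nitrate: bool) -> str:
    """Map the two test results to a Y. pestis biovar name.

    The three defined combinations follow the phenotypes stated in the text.
    The fourth combination is not described there, so it is reported as
    'unclassified' (an assumption of this sketch).
    """
    phenotypes = {
        (True, True): "Antiqua",
        (True, False): "Mediaevalis",
        (False, True): "Orientalis",
    }
    return phenotypes.get((ferments_glycerol, reduces_nitrate), "unclassified")


for glycerol in (True, False):
    for nitrate in (True, False):
        print(glycerol, nitrate, classify_biovar(glycerol, nitrate))
```

Seen this way, Mediaevalis and Orientalis each differ from the Antiqua phenotype by the loss of a single, different trait, which is the observation behind the inference that they arose independently from an Antiqua-like progenitor.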

However, the presence of pPla and pMT1 is not sufficient to account for the extraordinary virulence of Y. pestis44,45. This seems to be due to lateral gene transfer into the Y. pestis chromosome. At least 21 regions of apparent lateral gene transfer have been identified by variation in G+C content within the CO92 genome. So when did Y. pestis acquire them? The answer is perhaps unexpected. In virtually all cases, comparative microarray hybridization analysis of dozens of Y. pseudotuberculosis and Y. pestis strains indicates that similar sequences are also present in the Y. pseudotuberculosis genome46, which suggests that these sequences were acquired by Y. pseudotuberculosis before Y. pestis started to diverge. For example, the sequences that might be related to the parasitism of insects (insecticidal toxin and baculovirus enhancin) are found in a wide range of Y. pseudotuberculosis strains. These sequences appear as pseudogenes in Y. pestis, but in Y. pseudotuberculosis they seem active, which suggests that Y. pseudotuberculosis, which infects mammals but is also widely found in soil, might have evolved the ability to kill insects, perhaps to gain access to nutrients. Rather than the adaptation of Y. pestis to the flea gut occurring in a single evolutionary event, Y. pseudotuberculosis might already have been associated with insect hosts for some time.

Recently, it has been found that all Y. pestis strains and about 20% of Y. pseudotuberculosis strains that have been tested so far form biofilms on the mouths of the nematode Caenorhabditis elegans, thereby preventing the worms from feeding47,48; however, Y. enterocolitica strains are unable to do this. It has been postulated that the ability to form biofilms on biotic surfaces might also have evolved originally in Y. pseudotuberculosis as a mechanism to prevent predation by nematodes that feed on bacteria47. The hms locus has been shown to be required for Yersinia biofilm formation in nematodes as well as in fleas47. A distinct phenotype of the hms locus is autoagglutination, that is, the tendency for Y. pestis cells to form clumps in liquid media. In fleas, this 'sticky' attribute is necessary to adhere to the cuticle-covered proventricular spines, allowing the formation of a dense aggregate that is embedded in an extracellular matrix, reminiscent of a biofilm. So, once Y. pseudotuberculosis and/or Y. pestis extended its host range into insects, biofilm formation might have assumed another role: blocking the foregut in fleas. This would have been a key step in allowing the initial transfer of the Y. pestis bacteria into mammals through a flea bite.

Conclusions

Common themes to emerge from the genome analysis of the more than 50 microbial pathogens that have been sequenced so far include extensive lateral gene transfer (particularly among enteric pathogens), genome decay (among obligate intracellular pathogens) and extensive antigenic variation by gene shuffling or slipped-strand mispairing49. Y. pestis has all of these characteristics. It is an organism in an intermediate stage of genetic flux, in which the acquisition of novel sequences by lateral gene transfer seems to be counterbalanced by ongoing genome decay. Perhaps the most striking aspect of the evolution of Yersinia is the extremely rapid emergence of Y. pestis from Y. pseudotuberculosis, and genome analysis shows us how this has happened. In terms of gene expansion, apart from the acquisition of pMT1 and pPla, there seems to be little difference between the two species. In other words, Y. pseudotuberculosis already has all the extra genes that Y. pestis needs for virulence. The key process that turned Y. pseudotuberculosis into Y. pestis seems instead to have been gene loss, for example of the insect toxins that would have killed the insect host, and of certain physiological functions whose loss accentuates Y. pestis virulence in humans. This loss seems to have been triggered by the extensive expansion of IS elements, which caused significant genome rearrangements. Once Y. pseudotuberculosis had acquired certain crucial genes, the instability that was introduced by the IS elements was the main force to release its virulence potential, as Y. pestis.

Several questions remain regarding the pathogenesis of these Yersinia species and their evolution. For example, little is known about the expressed determinants that are important during pneumonic transfer of Y. pestis. Presumably, these are inactive in Y. pseudotuberculosis, which is incapable of causing infection by this route. Similarly, little is known about the relationship of Y. pseudotuberculosis and insects and/or fleas in nature. Why is Y. pestis exceptionally pathogenic compared with its recent relative Y. pseudotuberculosis? Perhaps the answer lies in the mode of transmission. For an enteropathogen, the most efficient method to ensure transmission to a new host is to cause diarrhoea. By contrast, as Y. pestis has to spread to the blood of a new host through a flea vector, the more severe the bacteraemia, the greater the chances of being transmitted through a flea bite. So, there is a strong selective pressure to cause severe disease. The factors influencing the rise and fall of plague pandemics also remain obscure. There will undoubtedly be many factors involved, but the genetic make-up of Y. pestis is likely to be important. It is possible that during the spread of an epidemic, passage through humans could allow Y. pestis to become more transmissible or more pathogenic, particularly during pneumonic transfer in which close human contact could aid transmission. Such a 'hypervirulent' strain of Vibrio cholerae was recently shown to have arisen during passage through humans in a cholera epidemic50. The flexible genome of Y. pestis makes it a likely candidate for such a mechanism.

Finally, what is the likely fate of Y. pestis? Has it finished altering its enteric genome content? Will its flexibility allow the bacterium to cause future pandemics? Or will continued genome decay see the species burn itself out in an evolutionary 'dead-end'? Irrespective of the natural scenario for Y. pestis, it will remain an important organism to investigate both in terms of the development of measures to counteract the threat of bioterrorism, and as a model to study the evolution of pathogens that threaten mankind.