Evolutionary history and higher order classification of AAA+ ATPases

https://doi.org/10.1016/j.jsb.2003.10.010

Abstract

The AAA+ ATPases are enzymes that contain a P-loop NTPase domain and function as molecular chaperones, ATPase subunits of proteases, helicases, or nucleic-acid-stimulated ATPases. All available sequences and structures of AAA+ protein domains were compared with the aim of identifying the definitive sequence and structure features of these domains and inferring the principal events in their evolution. An evolutionary classification of the AAA+ class was developed using standard phylogenetic methods, analysis of shared sequence and structural signatures, and similarity-based clustering. This analysis resulted in the identification of 26 major families within the AAA+ ATPase class. We also describe the position of the AAA+ ATPases with respect to the RecA/F1, helicase superfamilies I/II, PilT, and ABC classes of P-loop NTPases. The AAA+ class appears to have undergone an early radiation into the clamp-loader, DnaA/Orc/Cdc6, classic AAA, and “pre-sensor 1 β-hairpin” (PS1BH) clades. Within the PS1BH clade, chelatases, MoxR, YifB, McrB, Dynein-midasin, NtrC, and MCMs form a monophyletic assembly defined by a distinct insert in helix-2 of the conserved ATPase core and an additional helical segment between the core ATPase domain and the C-terminal α-helical bundle. At least 6 distinct AAA+ proteins, which represent the different major clades, are traceable to the last universal common ancestor (LUCA) of extant cellular life. Additionally, superfamily III helicases, which belong to the PS1BH assemblage, were probably present at this stage in virus-like “selfish” replicons. The next major radiation, at the base of the two prokaryotic kingdoms, bacteria and archaea, gave rise to several distinct chaperones, ATPase subunits of proteases, DNA helicases, and transcription factors. The third major radiation, at the outset of eukaryotic evolution, contributed to the origin of several eukaryote-specific adaptations related to nuclear and cytoskeletal functions. The new relationships and previously undetected domains reported here might provide new leads for investigating the biology of AAA+ ATPases.

Introduction

A large part of the proteome of any organism is devoted to proteins that bind nucleoside triphosphates and, typically, utilize them as substrates in various reactions (reviewed in Vetter and Wittinghofer, 1999). Several distinct NTP-binding protein folds have been structurally characterized to date, but amongst these the P-loop NTPases (Saraste et al., 1990) are by far the most abundant class, which accounts for 10–18% of the predicted gene products in the sequenced prokaryotic and eukaryotic genomes (Koonin et al., 2000a). Proteins with P-loop NTPase domains are also present in the majority of viruses studied to date (Gorbalenya and Koonin, 1989). The P-loop NTPases are thought to be a monophyletic assemblage of protein domains, and several distinct versions of this domain are traceable to the last universal common ancestor (LUCA) of all modern cellular life forms (Kyrpides et al., 1999; Leipe et al., 2002). This suggests that the P-loop domain originated long before the time of the LUCA and had undergone considerable structural and functional diversification prior to this period. Thus, understanding the natural history of P-loop NTPases is critical for understanding the key aspects of life’s evolution, ranging from the early phases to the radiation of major organismal lineages.

Most members of the P-loop NTPase fold hydrolyze the β–γ phosphate bond of a bound nucleoside triphosphate, most often ATP or GTP. The free energy of this hydrolysis reaction is typically utilized to induce conformational changes in other molecules. This constitutes the basis of the biochemical activities and biological functions of most P-loop fold proteins. In contrast, members of one major lineage of P-loop proteins, the kinases, transfer the ATP γ-phosphate to diverse substrates (Leipe et al., 2003). Structurally, P-loop domains adopt a 3-layered α/β sandwich configuration that contains regularly recurring α–β units with the β-strands forming a central, mostly parallel β-sheet surrounded on both sides by α-helices (Milner-White et al., 1991) (see also the SCOP database (Murzin et al., 1995): http://scop.mrc-lmb.cam.ac.uk/scop/). At the sequence level, P-loop NTPases are generally characterized by two conserved sequence motifs, the Walker A and B motifs, which bind, respectively, the β and γ phosphate moieties of the bound NTP, and a Mg²⁺ cation (Saraste et al., 1990; Vetter and Wittinghofer, 1999; Walker et al., 1982).
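As an illustration of the sequence-level signatures described above, the following minimal Python sketch scans a protein sequence for Walker A-like and Walker B-like patterns. The regular expressions are assumptions that are not given in this article: the Walker A pattern follows the commonly quoted consensus [AG]-x(4)-G-K-[ST], and the Walker B pattern is a deliberately simplified "four hydrophobic residues followed by D and D/E" form. The function name `find_motifs` and the toy sequence are hypothetical and serve only to show the idea; real analyses rely on profile-based methods rather than simple patterns.

```python
import re

# Assumed, simplified consensus patterns (not taken from the article):
# Walker A (P-loop): [AG]-x(4)-G-K-[ST]
# Walker B: four hydrophobic residues, then D, then D or E
WALKER_A = re.compile(r"[AG].{4}GK[ST]")
WALKER_B = re.compile(r"[AVILMFYW]{4}D[DE]")

# Made-up toy sequence used only for demonstration; not a real protein.
TOY_SEQ = "MSTLLGPPGTGKTLLAKAVAEAGRPFILVLDEIDAL"


def find_motifs(seq):
    """Return (motif name, 1-based start, matched text) tuples sorted by position."""
    seq = seq.upper()
    hits = []
    for name, pattern in (("Walker A", WALKER_A), ("Walker B", WALKER_B)):
        # finditer reports non-overlapping matches from left to right
        for match in pattern.finditer(seq):
            hits.append((name, match.start() + 1, match.group()))
    return sorted(hits, key=lambda hit: hit[1])


if __name__ == "__main__":
    for name, start, text in find_motifs(TOY_SEQ):
        print(f"{name}: position {start}, {text}")
```

Pattern matching of this kind over-predicts and under-predicts in practice, because the Walker B motif in particular is short and variable; the sketch is meant only to make the notion of a conserved motif concrete.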

Sequence and structure analyses suggest that the primary diversification event in the evolution of the P-loop fold resulted in the two principal classes of P-loop domains. The first of these, the KG (Kinase–GTPase) division, includes the kinases and GTPases, which share a number of structural similarities, such as the adjacent placement of the P-loop and Walker B strands. The other class, the ASCE division (for additional strand, catalytic E), is characterized by an additional strand in the core sheet, which is located between the P-loop strand and the Walker B strand (Leipe et al., 2002, 2003; Fig. 1). Unlike in kinases and GTPases, ATP hydrolysis by proteins of the ASCE group typically depends on a conserved catalytic (proton-abstracting) glutamate that primes a water molecule for the nucleophilic attack on the γ-phosphate group of ATP. The ASCE division includes the AAA+, ABC, PilT/VirD4, superfamily 1/2 (SF1/2) helicase, and RecA/F1/F0 superfamilies of ATPases, along with several additional, less confidently classified families.

Starting over a decade ago, the AAA ATPases (ATPases associated with a variety of cellular activities) were encountered in studies on an astonishing range of biochemical systems (Confalonieri and Duguet, 1995; Lupas and Martin, 2002; Ogura and Wilkinson, 2001). These included, among others, the eukaryotic proteasomal ATPases, CDC48, and FtsH, which are involved in processes related to protein stability and degradation in bacteria and eukaryotes; NSF, which is implicated in vesicular fusion; Pex1p, which is involved in peroxisome biogenesis; and Bcs1p, which participates in the assembly of mitochondrial membrane complexes. At around the same time, a detailed computational analysis of various cellular and viral proteins involved in nucleic acid metabolism, such as DnaA, the MCM proteins, NtrC-type transcription factors, and helicases of various RNA and DNA viruses comprising the helicase “superfamily 3” (SF3), suggested that all these proteins shared a conserved ATPase domain (Koonin, 1993).

Solution of the X-ray structure of the NSF protein and its comparison with the clamp loader subunit structure supported the unification of these ATPases into a monophyletic group (Guenther et al., 1997; Lenzen et al., 1998). Concomitantly, we conducted a systematic analysis of these ATPase domains using advanced sequence profile analysis methods and structural comparisons, which resulted in the unification of the bona fide AAA ATPase and the DnaA/MCM/NtrC/SFIII-related proteins into a single, monophyletic “AAA+” class (Neuwald et al., 1999). Additionally, this analysis showed that various other ATPase domain families, such as ClpAB/Hsp100, ClpX, HslU, and Lon, which are involved in protein folding and degradation, the eukaryotic motor protein dynein, a large, conserved eukaryotic protein with 6 ATPase domains (subsequently termed midasin), magnesium and cobalt chelatases, the bacterial DNA-replication clamp loaders, and eukaryotic replication factor C subunits, also belonged to the AAA+ class. It was also proposed that the AAA+ domain might be a common denominator in the catalytic assembly or disassembly of large cellular complexes of polypeptides and nucleic acids and that the majority of AAA+ ATPases function as oligomeric ring structures, which provide symmetric or quasi-symmetric surfaces for interactions with other molecules or a central pore for threading molecules in an extended conformation (Neuwald, 1999; Neuwald et al., 1999).

Since the publication of the original analysis of the AAA+ class, a wealth of structural and biochemical studies have strongly reinforced the monophyly of AAA+ ATPases and elucidated intricate functional details of how oligomeric rings of AAA+ proteins could be deployed in various biological contexts (Dougan et al., 2002; Lupas and Martin, 2002; Ogura and Wilkinson, 2001). Currently, over 15 structures of distinct types of the AAA+ domain are available (Fig. 1). These data, along with the genome sequences of diverse organisms from many of the principal phylogenetic lineages, provide a “post-genomic” vantage point from which to address several issues that were not tractable previously: (1) A formal, unified definition of the AAA+ class that combines sequence and structural information. (2) The higher order relationships within the AAA+ class. (3) The earliest events in the evolution of AAA+ ATPases and their differentiation from the other ASCE ATPases. (4) The trends in colonization of various functional niches during the evolution of this class of ATPases. Here, we address these problems, particularly in light of the new information that has become available since the previous survey of the AAA+ class of ATPases (Neuwald et al., 1999).

Section snippets

The defining structural and catalytic features of the AAA+ ATPases

We collated all currently available structures of proteins that have been confidently assigned to the AAA+ class and prepared a multiple alignment of their sequences on the basis of their structure superposition. This allowed us to map all the major conserved sequence features to their 3D structural cognates in the core AAA+ domain. Furthermore, this structure-based alignment also enabled correction of the earlier alignments (Koonin, 1993; Neuwald et al., 1999), which were derived principally

General conclusions

Our understanding of the AAA+ ATPases, especially in structural and mechanistic terms, has vastly improved since the publication of the previous survey of this protein class (Neuwald et al., 1999). Using the wealth of structures and genomic information currently at our disposal, we identified the defining structural features of the AAA+ superclass and constructed an evolutionary classification along with a reconstruction of some major aspects of their evolutionary history. In particular, some

Materials and methods

Sequences of AAA+ proteins were extracted from the non-redundant (NR) protein sequence database (National Center for Biotechnology Information, NIH, Bethesda) using the PSI-BLAST program (Altschul et al., 1997), with the sequences of known AAA+ proteins as queries. Sequence similarity-based protein clustering was performed using the BLASTCLUST program (ftp://ftp.ncbi.nih.gov/blast/documents/README.bcl). Multiple alignments were constructed using the Clustal X (Thompson et al., 1997) or T-Coffee

Acknowledgements

On account of constraints of space we have, regrettably, been unable to cite a large number of primary papers on functions of AAA+ ATPases. We have mainly concentrated on works presenting structures and experimental papers that are directly relevant to the higher order classification. We apologize for the omissions of other important works. Supplementary material including all Genbank identifiers of AAA+ proteins identified in this work, a more inclusive sequence alignment, alignments for the

References (110)

  • J Felsenstein. Inferring phylogenies from protein sequences by parsimony, distance, and likelihood methods. Methods Enzymol. (1996)
  • M.N Fodje. Interplay between an AAA module and an integrin I domain may regulate the function of magnesium chelatase. J. Mol. Biol. (2001)
  • P Forterre. The origin of DNA genomes and DNA replication proteins. Curr. Opin. Microbiol. (2002)
  • G.K Fu. Bacterial protease Lon is a site-specific DNA-binding protein. J. Biol. Chem. (1997)
  • R Giraldo. Common domains in the initiators of DNA replication in Bacteria, Archaea and Eukarya: combined structural, functional and phylogenetic perspectives. FEMS Microbiol. Rev. (2003)
  • A.E Gorbalenya. A new superfamily of putative NTP-binding domains encoded by genomes of small DNA and RNA viruses. FEBS Lett. (1990)
  • B Guenther. Crystal structure of the delta’ subunit of the clamp-loader complex of E. coli DNA polymerase III. Cell (1997)
  • F Guo. Crystal structure of ClpA, an Hsp100 chaperone and regulator of ClpAP protease. J. Biol. Chem. (2002)
  • N.R Hayashi. The nirQ gene, which is required for denitrification of Pseudomonas aeruginosa, can activate the RubisCO from Pseudomonas hydrogenothermophila. Biochim. Biophys. Acta (1998)
  • N.R Hayashi. The cbbQ genes, located downstream of the form I and form II RubisCO genes, affect the activity of both RubisCOs. Biochem. Biophys. Res. Commun. (1999)
  • M.M Hingorani et al. ATP binding to the Escherichia coli clamp loader powers opening of the ring-shaped clamp of DNA polymerase III holoenzyme. J. Biol. Chem. (1998)
  • L Holm et al. Dali: a network tool for protein structure comparison. Trends Biochem. Sci. (1995)
  • J.R Hoskins. Clp ATPases and their role in protein unfolding and degradation. Adv. Protein Chem. (2001)
  • M.J Huynen et al. Gene and context: integrative approaches to genome analysis. Adv. Protein Chem. (2000)
  • Y Kawabe. A novel protein interacts with the Werner’s syndrome gene product physically and functionally. J. Biol. Chem. (2001)
  • E.V Koonin. Protein fold recognition using sequence profiles and its application in structural genomics. Adv. Protein Chem. (2000)
  • S Krzywda. The crystal structure of the AAA domain of the ATP-dependent protease FtsH of Escherichia coli at 1.5 Å resolution. Structure (Camb) (2002)
  • D.G Lee et al. ATPase switches controlling DNA replication initiation. Curr. Opin. Cell Biol. (2000)
  • D.D Leipe. Classification and evolution of P-loop GTPases and related ATPases. J. Mol. Biol. (2002)
  • D.D Leipe. Evolution and classification of P-loop kinases and related proteins. J. Mol. Biol. (2003)
  • C.U Lenzen. Crystal structure of the hexamerization domain of N-ethylmaleimide-sensitive fusion protein. Cell (1998)
  • J Liu. Structure and function of Cdc6/Cdc18: implications for origin recognition and checkpoint control. Mol. Cell (2000)
  • A.N Lupas et al. AAA proteins. Curr. Opin. Struct. Biol. (2002)
  • G Mocz et al. Model for the motor component of dynein heavy chain based on homology to the AAA family of oligomeric ATPases. Structure (Camb) (2001)
  • A.G Murzin. SCOP: a structural classification of proteins database for the investigation of sequences and structures. J. Mol. Biol. (1995)
  • A.F Neuwald. The hexamerization domain of N-ethylmaleimide-sensitive factor: structural clues to chaperone function. Struct. Fold Des. (1999)
  • C Notredame. T-Coffee: a novel method for fast and accurate multiple sequence alignment. J. Mol. Biol. (2000)
  • U Pieper. The GTP-binding domain of McrB: more than just a variation on a common theme? J. Mol. Biol. (1999)
  • C.D Putnam. Structure and mechanism of the RuvB Holliday junction branch migration motor. J. Mol. Biol. (2001)
  • W Rottbauer. Reptin and pontin antagonistically regulate heart growth in zebrafish embryos. Cell (2002)
  • M Saraste. The P-loop—a common motif in ATP- and GTP-binding proteins. Trends Biochem. Sci. (1990)
  • M Unno. The structure of the mammalian 20S proteasome at 2.75 Å resolution. Structure (Camb) (2002)
  • R.D Vale. The molecular motor toolbox for intracellular transport. Cell (2003)
  • J Adachi et al. MOLPHY: Programs for Molecular Phylogenetics. (1992)
  • S.F Altschul. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. (1997)
  • L Aravind et al. DNA-binding proteins and evolution of transcription regulation in the archaea. Nucleic Acids Res. (1999)
  • L Aravind. Conserved domains in DNA repair proteins and evolution of repair systems. Nucleic Acids Res. (1999)
  • L Aravind. Apoptotic molecular machinery: vastly increased complexity in vertebrates revealed by genome comparisons. Science (2001)
  • S.E Basham et al. The Caenorhabditis elegans polarity gene ooc-5 encodes a Torsin-related protein of the AAA ATPase superfamily. Development (2001)
  • J Bassler. Identification of a 60S preribosomal particle that is closely linked to nuclear export. Mol. Cell (2001)