Journal of Molecular Biology
Volume 333, Issue 4, 31 October 2003, Pages 781-815
Evolution and Classification of P-loop Kinases and Related Proteins

https://doi.org/10.1016/j.jmb.2003.08.040

Abstract

Sequences and structures of all P-loop-fold proteins were compared with the aim of reconstructing the principal events in the evolution of P-loop-containing kinases. It is shown that kinases and some related proteins comprise a monophyletic assemblage within the P-loop NTPase fold. An evolutionary classification of these proteins was developed using standard phylogenetic methods, analysis of shared sequence and structural signatures, and similarity-based clustering. This analysis resulted in the identification of approximately 40 distinct protein families within the P-loop kinase class. Most of these enzymes phosphorylate nucleosides and nucleotides, as well as sugars, coenzyme precursors, adenosine 5′-phosphosulfate and polynucleotides. In addition, the class includes sulfotransferases, amide bond ligases, pyrimidine and dihydrofolate reductases, and several other families of enzymes that have acquired new catalytic capabilities distinct from the ancestral kinase reaction. Our reconstruction of the early history of the P-loop NTPase fold includes the initial split into the common ancestor of the kinase and the GTPase classes, and the common ancestor of ATPases. This was followed by the divergence of the kinases, which primarily phosphorylated nucleoside monophosphates (NMP), but could have had broader specificity. We provide evidence for the presence of at least two to four distinct P-loop kinases, including distinct forms specific for dNMP and rNMP, and related enzymes in the last universal common ancestor of all extant life forms. Subsequent evolution of kinases seems to have been dominated by the emergence of new bacterial and, to a lesser extent, archaeal families. Some of these enzymes retained their kinase activity but evolved new substrate specificities, whereas others acquired new activities, such as sulfate transfer and reduction. 
Eukaryotes appear to have acquired most of their kinases via horizontal gene transfer from Bacteria, partly from the mitochondrial and chloroplast endosymbionts and partly at later stages of evolution. A distinct superfamily of kinases, which we designated DxTN after its sequence signature, appears to have evolved in selfish replicons, such as bacteriophages, and was subsequently widely recruited by eukaryotes for multiple functions related to nucleic acid processing and general metabolism. In the course of this analysis, several previously undetected groups of predicted kinases were identified, including widespread archaeo-eukaryotic and archaeal families. The results could serve as a framework for systematic experimental characterization of new biochemical and biological functions of kinases.

Introduction

A large part of the proteome of any organism is devoted to proteins that hydrolyze or bind nucleoside triphosphates (NTPs) (reviewed in [1]). Although there are several distinct NTP-binding protein folds, the P-loop NTPases are by far the most abundant and diverse, comprising 10%–18% of the predicted gene products in the sequenced prokaryotic and eukaryotic genomes [2]. The P-loop NTPases are a monophyletic assemblage of protein domains that appear to have emerged at a very early stage of life's evolution; the last universal common ancestor (LUCA) of all modern cellular life forms apparently already encoded multiple P-loop NTPases [3–6]. Typically, P-loop NTPases hydrolyze the β–γ phosphate bond of a bound nucleoside triphosphate. Structurally, they adopt a three-layered α/β sandwich configuration that contains regularly recurring α-β units with the β-strands forming a central, mostly parallel β-sheet surrounded on both sides by α-helices (see the SCOP database). At the sequence level, P-loop NTPases are generally characterized by two strongly conserved sequence motifs, the Walker A and B motifs, which, respectively, bind the β and γ phosphate moieties of the bound NTP, and a Mg²⁺ cation [7]. The Walker A motif forms a loop between strand 1 and helix 1 of the P-loop domain and adopts the sequence pattern GxxxxGK[ST] or a variation thereof. The Walker B motif is composed of a conserved aspartate (or, less commonly, glutamate) at the C terminus of a hydrophobic strand and provides a bond for the octahedral coordination of a Mg²⁺ cation, which, in turn, is coordinated to the β and γ phosphate groups of NTP [7]. Furthermore, a hydrogen bond between the Walker B aspartate and the conserved threonine/serine of the P-loop secures the proper relative positioning of the two phosphate-binding motifs.
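The Walker A pattern quoted above maps directly onto a regular expression. The following is a minimal sketch, assuming only the canonical form of the motif (the text notes that variations exist, and these are not covered here); the function name and the toy sequence are hypothetical and serve for illustration only.

```python
import re

# Canonical Walker A consensus quoted in the text: G-x-x-x-x-G-K-[S or T].
# The lookahead wrapper lets overlapping candidates be reported as well.
WALKER_A = re.compile(r"(?=(G.{4}GK[ST]))")


def find_walker_a(sequence):
    """Return (0-based start, matched residues) for each canonical Walker A hit."""
    return [(m.start(), m.group(1)) for m in WALKER_A.finditer(sequence.upper())]


# Made-up toy sequence for illustration only; not a real protein.
toy_sequence = "MAAAVLGPPGSGKSTLAEEL"
print(find_walker_a(toy_sequence))  # [(6, 'GPPGSGKS')]
```

A match of this kind only flags a candidate P-loop. It says nothing about the Walker B motif or the fold, so a pattern scan of this sort cannot replace the profile and structure comparisons used in the analysis.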

Comparative genomics has suggested that at least seven major lineages of P-loop NTPases were already represented in LUCA: (i) RecA and F1/F0-related ATPases; (ii) nucleic acid-dependent ATPases (helicases, Swi2, and PhoH-like ATPases); (iii) AAA+ ATPases; (iv) apoptotic (AP) NTPases and their relatives; (v) ABC-ATPases and their relatives; (vi) P-loop-containing kinases; and (vii) GTPases and related ATPases [5, 6, 8–11] (L.A. & E.V.K., unpublished results). Since the P-loop NTPases are the most prevalent fold in the protein domain universe, analysis of their genomic distribution and evolutionary history simultaneously throws light on a number of disparate biological processes. In previous studies, we attempted to reconstruct the major aspects of the natural history of AAA+ ATPases [10] and GTPases [5]. Here, we analyze the P-loop-containing kinases and their derivatives.

Kinases are ubiquitous enzymes that transfer the γ phosphate of ATP to a wide range of substrates, from nucleotides and other small molecules to nucleic acids and proteins. Enzymes with kinase function have evolved independently in a number of the major folds and are found in the P-loop, Rossmannoid, RRM-like, ribonuclease H, and TIM β/α-barrel folds, among others [12]. Earlier studies have indicated that kinase activity might have evolved within the P-loop fold on multiple independent occasions. Thus, protein kinases, such as ETK, and small molecule kinases, such as adenosylcobinamide kinase, evolved, respectively, within the GTPase class [5] and the RecA/F1-ATPase class [13]. However, structural studies suggest that the best-characterized P-loop kinases, namely nucleotide kinases, along with some kinases of other small molecules, such as shikimate and 6-phosphofructose, form a monophyletic lineage distinct from all other P-loop NTPases. Furthermore, structural comparisons have indicated that sulfotransferases, which transfer sulfate moieties to a variety of substrates, also adopt a fold nearly identical to that of these kinases [14]. Hereinafter, we refer to this monophyletic assemblage of kinases and their relatives simply as the P-loop kinases.

The extreme sequence divergence of P-loop kinases beyond the core elements (primarily, the Walker A and B motifs) has so far hampered a clear understanding of the evolutionary relationships within this class of P-loop proteins. However, the accumulation of over 30 crystal structures of P-loop kinases, along with the wealth of sequence data (largely from genome sequencing projects), creates the prerequisites for tackling this problem. Using sequence profile analysis combined with structural comparisons, we identify the key sequence and structural features that define the kinase class within the P-loop NTPase fold. We then use these features to detect the entire complements of P-loop kinases encoded in all sequenced genomes. Traditional phylogenetic tree analysis and a cladistic approach using sequence and structural motifs (identification of shared derived characters) are combined to extract evolutionary information at various levels and to develop an evolutionary classification of the P-loop kinases.

General sequence, structural and catalytic features of the P-loop kinases

We first defined the core P-loop kinase class by constructing an alignment of the kinases with known 3D structures on the basis of the structural superposition. This allowed the identification of the major conserved structural features of P-loop kinases. The sequences of the kinases with known structures were then used as seeds in BLAST searches to identify all their sequence neighbors in the NR database. These were then added to the initial, structure-based alignment, and used to identify the

Identification of kinase families and relationships between them

All characterized and predicted P-loop kinases detected in the database searches were clustered using the BLASTCLUST program with varying score density and protein length overlap thresholds. Those groups that remained stable over a range of thresholds were considered likely to define true monophyletic families or at least cores of such families. The alignments of the individual families were analyzed to identify regions of extended conservation between and beyond the principal conserved
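The stability criterion described in this section (groups that persist across a range of clustering thresholds) can be sketched as follows. This is an illustrative sketch only, not the authors' procedure: it assumes the clusterings have already been parsed into lists of sequence-identifier groups, one list per threshold setting, and the function name, the `min_runs` parameter, and the toy identifiers are all hypothetical.

```python
from collections import Counter


def stable_groups(clusterings, min_runs):
    """Return multi-member groups found, with identical membership,
    in at least `min_runs` of the supplied clusterings.

    clusterings: list of clusterings (one per threshold setting),
                 each a list of iterables of sequence identifiers.
    """
    counts = Counter()
    for clustering in clusterings:
        # frozenset makes a group hashable and order-independent;
        # the enclosing set counts each group at most once per run.
        for group in {frozenset(g) for g in clustering}:
            if len(group) > 1:  # singletons carry no grouping information
                counts[group] += 1
    return [set(g) for g, n in counts.items() if n >= min_runs]


# Hypothetical clusterings at three threshold settings.
runs = [
    [{"a", "b", "c"}, {"d", "e"}, {"f"}],
    [{"a", "b", "c"}, {"d", "e", "f"}],
    [{"a", "b", "c"}, {"d"}, {"e", "f"}],
]
print(stable_groups(runs, min_runs=3))  # only the a/b/c group survives
```

Requiring identical membership is a deliberately strict choice. A looser variant could accept groups whose overlap exceeds some fraction, which would be closer to recovering the "cores" of families mentioned in the text.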

Materials and Methods

Sequences of P-loop kinases and related proteins were extracted from the non-redundant (NR) protein sequence database (National Center for Biotechnology Information, NIH, Bethesda) by using the PSI-BLAST program [164, 165], with the sequences of kinases identified previously in the literature employed as queries. Sequence similarity-based protein clustering was performed using the BLASTCLUST program. Multiple alignments were constructed using the

References (180)

  • N. Ostermann et al.

    Insights into the phosphoryltransfer mechanism of human thymidylate kinase gained from crystal structures of enzyme complexes along the reaction coordinate

    Struct. Fold. Des.

    (2000)
  • C. Vonrhein et al.

    The structure of a trimeric archaeal adenylate kinase

    J. Mol. Biol.

    (1998)
  • T. Krell et al.

    The three-dimensional structure of shikimate kinase

    J. Mol. Biol.

    (1998)
  • L. Kraft et al.

    Conformational changes during the catalytic cycle of gluconate kinase as revealed by X-ray crystallography

    J. Mol. Biol.

    (2002)
  • G. Obmolova et al.

    Crystal structure of dephospho-coenzyme A kinase from Haemophilus influenzae

    J. Struct. Biol.

    (2001)
  • C.A. Hasemann et al.

    The crystal structure of the bifunctional enzyme 6-phosphofructo-2-kinase/fructose-2,6-bisphosphatase reveals distinct domain homologies

    Structure

    (1996)
  • Y. Kakuta et al.

    Conserved structural motifs in the sulfotransferase family

    Trends Biochem. Sci.

    (1998)
  • H.J. Muller-Dieckmann et al.

    The structure of uridylate kinase with its substrates, showing the transition state geometry

    J. Mol. Biol.

    (1994)
  • M. Yun et al.

    Structural basis for the feedback regulation of Escherichia coli pantothenate kinase by coenzyme A

    J. Biol. Chem.

    (2000)
  • T. Stehle et al.

    Refined structure of the complex between guanylate kinase and its substrate GMP at 2.0 Å resolution

    J. Mol. Biol.

    (1992)
  • T. Bertrand et al.

    Sugar specificity of bacterial CMP kinases as revealed by crystal structures and mutagenesis of Escherichia coli enzyme

    J. Mol. Biol.

    (2002)
  • M. Negishi et al.

    Structure and function of sulfotransferases

    Arch. Biochem. Biophys.

    (2001)
  • K. Uyeda et al.

    The active sites of fructose 6-phosphate,2-kinase: fructose-2,6-bisphosphatase from rat testis. Roles of Asp-128, Thr-52, Thr-130, Asn-73, and Tyr-197

    J. Biol. Chem.

    (1997)
  • L. Wiesmuller et al.

    cDNA-derived sequence of UMP-CMP kinase from Dictyostelium discoideum and expression of the enzyme in Escherichia coli

    J. Biol. Chem.

    (1990)
  • R. Schricker et al.

    The adenylate kinase family in yeast: identification of URA6 as a multicopy suppressor of deficiency in major AMP kinase

    Gene

    (1992)
  • D.F. Woods et al.

    The discs-large tumor suppressor gene of Drosophila encodes a guanylate kinase homolog localized at septate junctions

    Cell

    (1991)
  • J.M. Anderson

    Cell signalling: MAGUK magic

    Curr. Biol.

    (1996)
  • U. Kistner et al.

    Nucleotide binding by the synapse associated protein SAP90

    FEBS Letters

    (1995)
  • Y. Li et al.

    Structural basis for nucleotide-dependent regulation of membrane-associated guanylate kinase-like domains

    J. Biol. Chem.

    (2002)
  • S.K. Kim

    Tight junctions, membrane-associated guanylate kinases and cell signaling

    Curr. Opin. Cell Biol.

    (1995)
  • O. Bossinger et al.

    Zonula adherens formation in Caenorhabditis elegans requires dlg-1, the homologue of the Drosophila gene discs large

    Dev. Biol.

    (2001)
  • P.A. Ropp et al.

    Cloning and expression of a cDNA encoding uridine kinase from mouse brain

    Arch. Biochem. Biophys.

    (1996)
  • R.B. Calder et al.

    Cloning and characterization of a eukaryotic pantothenate kinase gene (panK) from Aspergillus nidulans

    J. Biol. Chem.

    (1999)
  • A. Lehmacher et al.

    Biosynthesis of cyclic 2,3-diphosphoglycerate. Isolation and characterization of 2-phosphoglycerate kinase and cyclic 2,3-diphosphoglycerate synthetase from Methanothermus fervidus

    FEBS Letters

    (1990)
  • K. Lacher et al.

    Archaebacterial adenylate kinase from the thermoacidophile Sulfolobus acidocaldarius: purification, characterization, and partial sequence

    Arch. Biochem. Biophys.

    (1993)
  • N. Bucurenci et al.

    CMP kinase from Escherichia coli is structurally related to other nucleoside monophosphate kinases

    J. Biol. Chem.

    (1996)
  • P. Briozzo et al.

    Structures of Escherichia coli CMP kinase alone and in complex with CDP: a new fold of the nucleoside monophosphate binding domain and insights into cytosine nucleotide specificity

    Structure

    (1998)
  • G.T. Ma et al.

    Cloning and expression of the heterodimeric deoxyguanosine kinase/deoxyadenosine kinase of Lactobacillus acidophilus R-26

    J. Biol. Chem.

    (1995)
  • J.L. Loeffen et al.

    cDNA of eight nuclear encoded subunits of NADH:ubiquinone oxidoreductase: human complex I cDNA characterization completed

    Biochem. Biophys. Res. Commun.

    (1998)
  • K. Wild et al.

    The three-dimensional structure of thymidine kinase from herpes simplex virus type 1

    FEBS Letters

    (1995)
  • S.K. Singh et al.

    Crystal structure of Haemophilus influenzae NadR protein. A bifunctional enzyme endowed with NMN adenyltransferase and ribosylnicotinimide kinase activities

    J. Biol. Chem.

    (2002)
  • D.H. Duckworth et al.

    The enzymology of virus-infected bacteria. X. A biochemical-genetic study of the deoxynucleotide kinase induced by wild type and amber mutants of phage T4

    J. Biol. Chem.

    (1967)
  • L.M. Olivier et al.

    Characterization of phosphomevalonate kinase: chromosomal localization, regulation, and subcellular targeting

    J. Lipid Res.

    (1999)
  • K.X. Huang et al.

    Overexpression, purification, and characterization of the thermostable mevalonate kinase from Methanococcus jannaschii

    Protein Exptl Purif.

    (1999)
  • S.M. Houten et al.

    Nonorthologous gene displacement of phosphomevalonate kinase

    Mol. Genet. Metab.

    (2001)
  • T. Zhou et al.

    Structure and mechanism of homoserine kinase: prototype for the GHMP kinase superfamily

    Struct. Fold. Des.

    (2000)
  • H. Izu et al.

    Purification and characterization of the Escherichia coli thermoresistant glucokinase encoded by the gntK gene

    FEBS Letters

    (1996)
  • I.R. Vetter et al.

    Nucleoside triphosphate-binding proteins: different scaffolds to achieve phosphoryl transfer

    Quart. Rev. Biophys.

    (1999)
  • R.F. Doolittle et al.

    Determining divergence times of the major kingdoms of living organisms with a protein clock

    Science

    (1996)
  • M.M. Miyamoto et al.

    Constraints on protein evolution and the age of the eubacteria/eukaryote split

    Syst. Biol.

    (1996)

Supplementary data associated with this article can be found at doi: 10.1016/j.jmb.2003.08.040