Journal of Molecular Biology
Evolution and Classification of P-loop Kinases and Related Proteins☆
Introduction
A large part of the proteome of any organism is devoted to proteins that hydrolyze or bind nucleoside triphosphates (NTPs) (reviewed in Ref. 1). Although there are several distinct NTP-binding protein folds, the P-loop NTPases are by far the most abundant and diverse, comprising 10%–18% of the predicted gene products in the sequenced prokaryotic and eukaryotic genomes.2 The P-loop NTPases are a monophyletic assemblage of protein domains that appear to have emerged at a very early stage of life's evolution; the last universal common ancestor (LUCA) of all modern cellular life forms apparently already encoded multiple P-loop NTPases.3,4,5,6 Typically, P-loop NTPases hydrolyze the β–γ phosphate bond of a bound nucleoside triphosphate. Structurally, they adopt a three-layered α/β sandwich configuration that contains regularly recurring α-β units, with the β-strands forming a central, mostly parallel β-sheet surrounded on both sides by α-helices (see SCOP database†). At the sequence level, P-loop NTPases are generally characterized by two strongly conserved sequence motifs, the Walker A and B motifs, which, respectively, bind the β and γ phosphate moieties of the bound NTP, and a Mg²⁺ cation.7 The Walker A motif forms a loop between strand 1 and helix 1 of the P-loop domain and adopts the sequence pattern GxxxxGK[ST] or a variation thereof. The Walker B motif is composed of a conserved aspartate (or, less commonly, glutamate) at the C terminus of a hydrophobic strand and provides a bond for the octahedral coordination of a Mg²⁺ cation, which, in turn, is coordinated to the β and γ phosphate groups of the NTP.7 Furthermore, a hydrogen bond between the Walker B aspartate and the conserved threonine/serine of the P-loop secures the proper relative positioning of the two phosphate-binding motifs.
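The Walker A consensus quoted above can be written directly as a regular expression. The following is a minimal sketch, not a method used in this work: it matches only the strict pattern GxxxxGK[ST] and will miss the variations the text mentions. The function name `find_walker_a` and the toy sequence are hypothetical and serve only as illustration.

```python
import re

# Strict Walker A (P-loop) consensus from the text: G, any four residues,
# G, K, then S or T. The lookahead lets overlapping candidates be reported.
# Assumption: variants of the motif are deliberately not covered here.
WALKER_A = re.compile(r"(?=(G.{4}GK[ST]))")


def find_walker_a(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based start position, matched segment) for each strict hit."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in WALKER_A.finditer(seq)]


# Toy, made-up sequence (not a real protein) containing one strict match.
toy_sequence = "MSLAGPSGSGKSTLLNA"
hits = find_walker_a(toy_sequence)
```

A pattern scan of this kind is only a coarse filter: a match suggests a candidate P-loop, and sequences that carry a variant of the motif would need a profile-based search rather than a fixed pattern.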
Comparative genomics has suggested that at least seven major lineages of P-loop NTPases were already represented in LUCA: (i) RecA and F1/F0-related ATPases; (ii) nucleic acid-dependent ATPases (helicases, Swi2, and PhoH-like ATPases); (iii) AAA+ ATPases; (iv) apoptotic (AP) NTPases and their relatives; (v) ABC-ATPases and their relatives; (vi) P-loop-containing kinases; and (vii) GTPases and related ATPases5,6,8,9,10,11 (L.A. & E.V.K., unpublished results). Since the P-loop NTPases are the most prevalent fold in the protein domain universe, analysis of their genomic distribution and evolutionary history simultaneously throws light on a number of disparate biological processes. In previous studies, we attempted to reconstruct the major aspects of the natural history of AAA+ ATPases10 and GTPases.5 Here, we analyze the P-loop-containing kinases and their derivatives.
Kinases are ubiquitous enzymes that transfer the γ phosphate of ATP to a wide range of substrates, from nucleotides and other small molecules to nucleic acids and proteins. Enzymes with kinase function have evolved independently in a number of the major folds and are found in the P-loop, Rossmannoid, RRM-like, ribonuclease H, and TIM β/α barrel folds, among others.12 Earlier studies have indicated that kinase activity might have evolved within the P-loop fold on multiple independent occasions. Thus, protein kinases, such as ETK, and small molecule kinases, such as adenosylcobinamide kinase, evolved, respectively, within the GTPase class5 and the RecA/F1-ATPase class.13 However, structural studies suggest that the best-characterized P-loop kinases, namely nucleotide kinases, along with some kinases of other small molecules, such as shikimate and 6-phosphofructose, form a monophyletic lineage distinct from all other P-loop NTPases. Furthermore, structural comparisons have indicated that sulfotransferases, which transfer sulfate moieties to a variety of substrates, also adopt a fold nearly identical to that of these kinases.14 Hereinafter, we refer to this monophyletic assemblage of kinases and their relatives simply as the P-loop kinases.
The extreme sequence divergence of P-loop kinases beyond the core elements (primarily, the Walker A and B motifs) has so far hampered a clear understanding of the evolutionary relationships within this class of P-loop proteins. However, the accumulation of over 30 crystal structures of P-loop kinases, along with the wealth of sequence data (largely from genome sequencing projects), creates the prerequisites for tackling this problem. Using sequence profile analysis combined with structural comparisons, we identify the key sequence and structural features that define the kinase class within the P-loop NTPase fold. We then use these features to detect the entire complements of P-loop kinases encoded in all sequenced genomes. Traditional phylogenetic tree analysis and a cladistic approach using sequence and structural motifs (identification of shared derived characters) are combined to extract evolutionary information at various levels and to develop an evolutionary classification of the P-loop kinases.
General sequence, structural and catalytic features of the P-loop kinases
We first defined the core P-loop kinase class by constructing an alignment of the kinases with known 3D structures on the basis of the structural superposition. This allowed the identification of the major conserved structural features of P-loop kinases. The sequences of the kinases with known structures were then used as seeds in BLAST searches to identify all their sequence neighbors in the NR database. These were then added to the initial, structure-based alignment, and used to identify the
Identification of kinase families and relationships between them
All characterized and predicted P-loop kinases detected in the database searches were clustered using the BLASTCLUST program with varying score density and protein length overlap thresholds. Those groups that remained stable over a range of thresholds were considered likely to define true monophyletic families or at least cores of such families. The alignments of the individual families were analyzed to identify regions of extended conservation between and beyond the principal conserved
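The stability criterion described above (groups that persist across a range of clustering thresholds) can be illustrated with a small sketch. This is not BLASTCLUST and does not reproduce its score density or length overlap parameters: as a stated simplification, it assumes a single pairwise similarity score per protein pair and uses plain single-linkage clustering. The names `single_linkage`, `stable_groups`, and the toy scores are hypothetical.

```python
from collections import Counter


def single_linkage(items, score, threshold):
    """Group items so that any pair scoring >= threshold ends up linked."""
    parent = {x: x for x in items}

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), s in score.items():
        if s >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for x in items:
        groups.setdefault(find(x), set()).add(x)
    return {frozenset(g) for g in groups.values()}


def stable_groups(items, score, thresholds, min_size=2):
    """Return groups recovered unchanged at every threshold tried."""
    counts = Counter()
    for t in thresholds:
        counts.update(single_linkage(items, score, t))
    return {g for g, n in counts.items()
            if n == len(thresholds) and len(g) >= min_size}


# Toy similarity scores in [0, 1] (made-up values, for illustration only).
items = ["a1", "a2", "a3", "b1", "b2"]
score = {("a1", "a2"): 0.9, ("a2", "a3"): 0.8,
         ("b1", "b2"): 0.85, ("a3", "b1"): 0.4}
families = stable_groups(items, score, thresholds=[0.5, 0.6, 0.7])
```

In this toy input the two groups survive every threshold between 0.5 and 0.7, whereas a permissive threshold of 0.3 merges them through the weak a3 to b1 link, so nothing is stable across a range that includes it. Groups that persist in this way are the candidates the text treats as likely monophyletic families or family cores.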
Materials and Methods
Sequences of P-loop kinases and related proteins were extracted from the non-redundant (NR) protein sequence database (National Center for Biotechnology Information, NIH, Bethesda) by using the PSI-BLAST program,164,165 with the sequences of kinases identified previously in the literature employed as queries. Sequence similarity-based protein clustering was performed using the BLASTCLUST program†. Multiple alignments were constructed using the
References (180)
- et al. Protein fold recognition using sequence profiles and its application in structural genomics. Advan. Protein Chem. (2000)
- et al. Classification and evolution of P-loop GTPases and Related ATPases. J. Mol. Biol. (2002)
- et al. Helicases: amino acid sequence comparisons and structure–function relationships. Curr. Opin. Struct. Biol. (1993)
- et al. Sequence and structure classification of kinases. J. Mol. Biol. (2002)
- et al. Crystal structure of the IIB subunit of a fructose permease (IIBLev) from Bacillus subtilis. J. Mol. Biol. (1998)
- et al. SCOP: a structural classification of proteins database for the investigation of sequences and structures. J. Mol. Biol. (1995)
- et al. On the in vivo function of the RecA ATPase. J. Mol. Biol. (1999)
- et al. Crystal structure of the helicase domain from the replicative helicase-primase of bacteriophage T7. Cell (1999)
- et al. Induced-fit movements in adenylate kinases. J. Mol. Biol. (1990)
- et al. Structure of the complex between adenylate kinase from Escherichia coli and the inhibitor Ap5A refined at 1.9 Å resolution. A model for a catalytic transition state. J. Mol. Biol. (1992)
- Insights into the phosphoryltransfer mechanism of human thymidylate kinase gained from crystal structures of enzyme complexes along the reaction coordinate. Struct. Fold. Des.
- The structure of a trimeric archaeal adenylate kinase. J. Mol. Biol.
- The three-dimensional structure of shikimate kinase. J. Mol. Biol.
- Conformational changes during the catalytic cycle of gluconate kinase as revealed by X-ray crystallography. J. Mol. Biol.
- Crystal structure of dephospho-coenzyme A kinase from Haemophilus influenzae. J. Struct. Biol.
- The crystal structure of the bifunctional enzyme 6-phosphofructo-2-kinase/fructose-2,6-bisphosphatase reveals distinct domain homologies. Structure
- Conserved structural motifs in the sulfotransferase family. Trends Biochem. Sci.
- The structure of uridylate kinase with its substrates, showing the transition state geometry. J. Mol. Biol.
- Structural basis for the feedback regulation of Escherichia coli pantothenate kinase by coenzyme A. J. Biol. Chem.
- Refined structure of the complex between guanylate kinase and its substrate GMP at 2.0 Å resolution. J. Mol. Biol.
- Sugar specificity of bacterial CMP kinases as revealed by crystal structures and mutagenesis of Escherichia coli enzyme. J. Mol. Biol.
- Structure and function of sulfotransferases. Arch. Biochem. Biophys.
- The active sites of fructose 6-phosphate,2-kinase: fructose-2,6-bisphosphatase from rat testis. Roles of Asp-128, Thr-52, Thr-130, Asn-73, and Tyr-197. J. Biol. Chem.
- cDNA-derived sequence of UMP-CMP kinase from Dictyostelium discoideum and expression of the enzyme in Escherichia coli. J. Biol. Chem.
- The adenylate kinase family in yeast: identification of URA6 as a multicopy suppressor of deficiency in major AMP kinase. Gene
- The discs-large tumor suppressor gene of Drosophila encodes a guanylate kinase homolog localized at septate junctions. Cell
- Cell signalling: MAGUK magic. Curr. Biol.
- Nucleotide binding by the synapse associated protein SAP90. FEBS Letters
- Structural basis for nucleotide-dependent regulation of membrane-associated guanylate kinase-like domains. J. Biol. Chem.
- Tight junctions, membrane-associated guanylate kinases and cell signaling. Curr. Opin. Cell Biol.
- Zonula adherens formation in Caenorhabditis elegans requires dlg-1, the homologue of the Drosophila gene discs large. Dev. Biol.
- Cloning and expression of a cDNA encoding uridine kinase from mouse brain. Arch. Biochem. Biophys.
- Cloning and characterization of a eukaryotic pantothenate kinase gene (panK) from Aspergillus nidulans. J. Biol. Chem.
- Biosynthesis of cyclic 2,3-diphosphoglycerate. Isolation and characterization of 2-phosphoglycerate kinase and cyclic 2,3-diphosphoglycerate synthetase from Methanothermus fervidus. FEBS Letters
- Archaebacterial adenylate kinase from the thermoacidophile Sulfolobus acidocaldarius: purification, characterization, and partial sequence. Arch. Biochem. Biophys.
- CMP kinase from Escherichia coli is structurally related to other nucleoside monophosphate kinases. J. Biol. Chem.
- Structures of Escherichia coli CMP kinase alone and in complex with CDP: a new fold of the nucleoside monophosphate binding domain and insights into cytosine nucleotide specificity. Structure
- Cloning and expression of the heterodimeric deoxyguanosine kinase/deoxyadenosine kinase of Lactobacillus acidophilus R-26. J. Biol. Chem.
- cDNA of eight nuclear encoded subunits of NADH:ubiquinone oxidoreductase: human complex I cDNA characterization completed. Biochem. Biophys. Res. Commun.
- The three-dimensional structure of thymidine kinase from herpes simplex virus type 1. FEBS Letters
- Crystal structure of Haemophilus influenzae NadR protein. A bifunctional enzyme endowed with NMN adenyltransferase and ribosylnicotinimide kinase activities. J. Biol. Chem.
- The enzymology of virus-infected bacteria. X. A biochemical-genetic study of the deoxynucleotide kinase induced by wild type and amber mutants of phage T4. J. Biol. Chem.
- Characterization of phosphomevalonate kinase: chromosomal localization, regulation, and subcellular targeting. J. Lipid Res.
- Overexpression, purification, and characterization of the thermostable mevalonate kinase from Methanococcus jannaschii. Protein Exptl Purif.
- Nonorthologous gene displacement of phosphomevalonate kinase. Mol. Genet. Metab.
- Structure and mechanism of homoserine kinase: prototype for the GHMP kinase superfamily. Struct. Fold. Des.
- Purification and characterization of the Escherichia coli thermoresistant glucokinase encoded by the gntK gene. FEBS Letters
- Nucleoside triphosphate-binding proteins: different scaffolds to achieve phosphoryl transfer. Quart. Rev. Biophys.
- Determining divergence times of the major kingdoms of living organisms with a protein clock. Science
- Constraints on protein evolution and the age of the eubacteria/eukaryote split. Syst. Biol.
☆ Supplementary data associated with this article can be found at doi: 10.1016/j.jmb.2003.08.040