[3] - Identification of Sensory and Signal‐Transducing Domains in Two‐Component Signaling Systems

https://doi.org/10.1016/S0076-6879(06)22003-2

Abstract

The availability of complete genome sequences of diverse bacteria and archaea makes comparative sequence analysis a powerful tool for analyzing signal transduction systems encoded in these genomes. However, most signal transduction proteins consist of two or more individual protein domains, which significantly complicates their functional annotation and makes automated annotation of these proteins in the course of large‐scale genome sequencing projects particularly unreliable. This chapter describes certain common‐sense protocols for sequence analysis of two‐component histidine kinases and response regulators, as well as other components of the prokaryotic signal transduction machinery: Ser/Thr/Tyr protein kinases and protein phosphatases, adenylate and diguanylate cyclases, and c‐di‐GMP phosphodiesterases. These protocols rely on publicly available computational tools and databases and can be utilized by anyone with Internet access.

Introduction

Sequence analysis of regulatory proteins played a key role in the discovery of two‐component signal transduction. Indeed, the sequence alignments of the chemotaxis response regulator CheY and transcriptional regulators OmpR and ArcA from Escherichia coli with Bacillus subtilis sporulation proteins Spo0F and Spo0A by James Hoch and colleagues (Trach et al., 1985) and with the N‐terminal fragment of the chemotaxis methylesterase CheB by Ann and Jeffry Stock and Daniel Koshland (Stock et al., 1985) convinced them that all these protein fragments were homologous. This homology, in turn, suggested “an evolutionary and functional relationship between the chemotaxis system and systems that are thought to regulate gene expression in response to changing environmental conditions” (Stock et al., 1985). This prescient conclusion has been verified in subsequent studies that described phosphorylation of these proteins and identified their common CheY‐like receiver (REC) domain as an evolutionarily stable compact structural unit (Stock et al., 1989, Stock et al., 1993, Volz and Matsumura, 1991) that undergoes a distinctive change upon phosphorylation (Kern et al., 1999, Lee et al., 2001).

Identification of the receiver domain was followed by sequence analysis of histidine kinases, most importantly by Parkinson and Kofoid (1992), who described five conserved sequence motifs (H, N, G1, F, and G2 boxes), and by Grebe and Stock (1999), who classified histidine kinases into 11 families based on sequence similarity in their kinase domains (http://www.uni‐kl.de/FB‐Biologie/AG‐Hakenbeck/TGrebe/HPK/Table4.htm). These papers provided a solid basis for recognition of histidine kinases in genomic sequences and analysis of the diversity in their domain organization (Dutta et al., 1999).

The importance of sequence analysis in studies of bacterial and archaeal signal transduction systems has received an additional boost from genome sequencing projects, which provided virtually unlimited material for comparative studies. However, these studies revealed a stunning complexity and diversity of signal transduction systems in various microorganisms. The total number of sensory histidine kinases encoded in the genomes of E. coli K12 and B. subtilis, 30 and 36, respectively, proved to be quite modest compared to the sets of histidine kinases encoded by such environmental organisms as Pseudomonas aeruginosa (62 proteins), Streptomyces coelicolor (95 proteins), or Myxococcus xanthus (138 proteins); see http://www.ncbi.nlm.nih.gov/Complete_Genomes/SignalCensus.html (Galperin, 2005). Furthermore, the list of microbial environmental receptors has been expanded and now, in addition to histidine kinases and methyl‐accepting chemotaxis proteins, includes Ser/Thr protein kinases and protein phosphatases, as well as adenylate and diguanylate cyclases and c‐di‐GMP phosphodiesterases (Galperin, 2004, Galperin, 2005, Kennelly, 2002, Kennelly and Potts, 1996, Römling et al., 2005). All these environmental receptors share a pool of sensory domains, which can be extracytoplasmic (periplasmic or, in gram‐positive bacteria, extracellular), membrane‐embedded, or cytoplasmic (Galperin et al., 2001, Nikolskaya et al., 2003, Zhulin et al., 2003); see reviews by Taylor and Zhulin, 1999, Galperin, 2004. Another important development was characterization of a complex system of “one‐component” intracellular signaling proteins (Galperin, 2004, Ulrich et al., 2005), such as the anaerobic nitric oxide reductase transcription regulator NorR, which combines a sensor GAF domain with an enhancer‐binding ATPase and a DNA‐binding domain (Gardner et al., 2003, Pohlmann et al., 2000). 
To complicate the picture even further, certain receptors contain more than one sensory domain and/or more than one output domain and participate in the cross‐talk between different signal transduction pathways (Galperin, 2004). However, it is this very complexity that makes case‐by‐case sequence analysis of signal transduction proteins so effective. The following paragraphs discuss the computational tools and databases most commonly used in sequence analysis of sensory and signal transduction proteins and describe analytical methods for recognizing histidine kinases, response regulators, and other bacterial signaling components in genomic sequences and for delineating their constituent domains.

Section snippets

Computational Tools for Domain Identification

Identification of the CheY‐like receiver (REC) domain (Stock et al., 1985, Trach et al., 1985) as a common phosphoacceptor domain in various two‐component systems demonstrated the power of comparative sequence analysis in studies of prokaryotic signal transduction systems. In subsequent studies, many other conserved protein domains involved in signal transduction were identified and included in public domain databases, such as Pfam, SMART, InterPro, and CDD (Table I). Each of these databases

Overview

A typical sensory histidine kinase consists of at least three distinct domains: a sensor (signal input) domain, a His‐containing phosphoacceptor (dimerization) HisKA domain, and an ATP‐binding HATPase domain (Dutta et al., 1999, Grebe and Stock, 1999, Hoch, 2000, Stock et al., 2000). There are numerous variations on this common theme. Sensor domains can be periplasmic, membrane‐embedded, or cytoplasmic, and a single histidine kinase can contain two or more sensory domains. Extracytoplasmic

Overview

All response regulators of the two‐component signal transduction system contain the CheY‐like phosphoacceptor (receiver, REC) domain (Stock et al., 2000, West and Stock, 2001), either in a stand‐alone form (e.g., the chemotaxis response regulator CheY or the sporulation regulator Spo0F) or fused to an effector, or output, domain, which is usually located at the C terminus of the polypeptide chain (Grebe and Stock, 1999, Stock et al., 2000). Two‐domain response regulators are typically thought

Overview

Analysis of the rapidly accumulating genome sequences from diverse bacteria and archaea revealed the great variety of sensory proteins. The characteristic architecture of histidine kinases and MCPs, which include a periplasmic sensory domain, a transmembrane segment with one or more transmembrane helices, and a cytoplasmically located output domain, was predicted for many proteins encoded in the newly sequenced genomes (Galperin, 2004, Galperin et al., 2001). However, while their N‐terminal

Functional Annotation of Multidomain Proteins

The complexity of microbial signal transduction machinery and the paucity of experimentally characterized proteins make annotating signaling proteins even in well‐studied organisms an arduous task. For example, of the 30 histidine kinases encoded by E. coli K12, functions of five (RstB, YehU, YpdA, YfhK, YedV) are unknown and several others have poorly defined substrates. For (predicted) signal transduction proteins encoded in the newly sequenced genomes this task becomes even more daunting.
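The domain-based view in the section snippets above (a histidine kinase built around HisKA and HATPase domains, a response regulator built around a REC domain that is either stand-alone or fused to an output domain) can be illustrated with a small sketch. This is not the chapter's protocol: the function name, the class labels, and the short domain labels are hypothetical, and the "hybrid" class for proteins that carry both a kinase core and a receiver is a commonly used term that the text above does not itself define. The sketch assumes the domains have already been identified by some external tool.

```python
from typing import Iterable

# Hypothetical domain labels for illustration only; identifiers used by
# real domain databases may differ from these short names.
HIS_KINASE_CORE = frozenset({"HisKA", "HATPase"})
RECEIVER = "REC"


def classify_architecture(domains: Iterable[str]) -> str:
    """Assign a coarse class to a protein from its list of domain labels."""
    present = set(domains)
    has_kinase_core = HIS_KINASE_CORE <= present
    has_receiver = RECEIVER in present

    if has_kinase_core and has_receiver:
        # Assumption: kinase core plus receiver is reported as "hybrid".
        return "hybrid histidine kinase"
    if has_kinase_core:
        return "histidine kinase"
    if has_receiver:
        # A receiver with no other domain is the stand-alone form
        # (the text gives CheY and Spo0F as examples).
        if present == {RECEIVER}:
            return "stand-alone receiver"
        return "response regulator with output domain"
    return "other or unclassified"


if __name__ == "__main__":
    examples = {
        "sensor kinase": ["PAS", "HisKA", "HATPase"],
        "CheY-like": ["REC"],
        "OmpR-like": ["REC", "HTH"],
        "NorR-like": ["GAF", "AAA", "HTH"],
    }
    for name, doms in examples.items():
        print(f"{name}: {classify_architecture(doms)}")
```

A rule set this coarse only sorts proteins into broad classes; as the text notes, assigning a function to a multidomain protein still requires case-by-case analysis.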

References (101)

  • T. Hirokawa et al.

    SOSUI: Classification and secondary structure prediction system for membrane proteins

    Bioinformatics

    (1998)
  • N. Hulo et al.

    The PROSITE database

    Nucleic Acids Res.

    (2006)
  • B. Karniol et al.

    The HWE histidine kinases, a new family of bacterial two‐component sensor kinases with potentially diverse roles in environmental signaling

    J. Bacteriol.

    (2004)
  • A. Krupa et al.

    KinG: A database of protein kinases in genomes

    Nucleic Acids Res.

    (2004)
  • I. Letunic et al.

    SMART 5: Domains in the context of genomes and networks

    Nucleic Acids Res.

    (2006)
  • E. Martinez‐Hackert et al.

    Structural relationships in the OmpR family of winged‐helix transcription factors

    J. Mol. Biol.

    (1997)
  • T. Mizuno

    Compilation of all genes encoding two‐component phosphotransfer signal transducers in the genome of Escherichia coli

    DNA Res.

    (1997)
  • N.J. Mulder et al.

    InterPro, progress and status in 2005

    Nucleic Acids Res.

    (2005)
  • A.M. Stock et al.

    Three‐dimensional structure of CheY, the response regulator of bacterial chemotaxis

    Nature

    (1989)
  • A.M. Stock et al.

    Two‐component signal transduction

    Annu. Rev. Biochem.

    (2000)
  • A. Toro‐Roman et al.

    Structural analysis and solution studies of the activated regulatory domain of the response regulator ArcA: A symmetric dimer mediated by the α4‐β5‐α5 face

    J. Mol. Biol.

    (2005)
  • K.A. Trach et al.

    Deduced product of the stage 0 sporulation gene spo0F shares homology with the Spo0A, OmpR, and SfrA proteins

    Proc. Natl. Acad. Sci. USA

    (1985)
  • L.E. Ulrich et al.

    MiST: A microbial signal transduction database

    Nucleic Acids Res.

    (2007)
  • S.B. Williams et al.

    Functional similarities among two‐component sensors and methyl‐accepting chemotaxis proteins suggest a role for linker region amphipathic helices in transmembrane signal transduction

    Mol. Microbiol.

    (1999)
  • C.H. Wu et al.

    PIRSF: Family classification system at the Protein Information Resource

    Nucleic Acids Res.

    (2004)
  • H. Yu et al.

    Identification of the algZ gene upstream of the response regulator algR and its participation in control of alginate production in Pseudomonas aeruginosa

    J. Bacteriol.

    (1997)
  • Z. Yuan et al.

    SVMtm: Support vector machines to predict transmembrane segments

    J. Comput. Chem.

    (2004)
  • G.S. Anand et al.

    Kinetic basis for the stimulatory effect of phosphorylation on the methylesterase activity of CheB

    Biochemistry

    (2002)
  • M. Arai et al.

    ConPred II: A consensus prediction method for obtaining transmembrane topology models with high reliability

    Nucleic Acids Res.

    (2004)
  • L. Aravind et al.

    AT‐hook motifs identified in a wide variety of DNA‐binding proteins

    Nucleic Acids Res.

    (1998)
  • L. Aravind et al.

    The cytoplasmic helical linker domain of receptor histidine kinase and methyl‐accepting proteins is common to many prokaryotic signalling proteins

    FEMS Microbiol. Lett.

    (1999)
  • I. Baikalov et al.

    Structure of the Escherichia coli response regulator NarL

    Biochemistry

    (1996)
  • D.A. Baker et al.

    Structure, function and evolution of microbial adenylyl and guanylyl cyclases

    Mol. Microbiol.

    (2004)
  • C. Ban et al.

    Crystal structure and ATPase activity of MutL: Implications for DNA repair and mutagenesis

    Cell

    (1998)
  • H.M. Berman et al.

    The Protein Data Bank

    Nucleic Acids Res.

    (2000)
  • A.M. Bilwes et al.

    Structure of CheA, a signal‐transducing histidine kinase

    Cell

    (1999)
  • C. Chan et al.

    Structural basis of activity and allosteric control of diguanylate cyclase

    Proc. Natl. Acad. Sci. USA

    (2004)
  • J.K. Cheung et al.

    Glutamate residues in the putative transmembrane region are required for the function of the VirS sensor histidine kinase from Clostridium perfringens

    Microbiology

    (2000)
  • B. Christen et al.

    Allosteric control of cyclic di‐GMP signaling

    J. Biol. Chem.

    (2006)
  • G.E. Crooks et al.

    WebLogo: A sequence logo generator

    Genome Res.

    (2004)
  • M. Cserzo et al.

    TM or not TM: Transmembrane protein prediction with low false positive rate using DAS‐TMfilter

    Bioinformatics

    (2004)
  • M. D'Souza et al.

    Sentra, a database of signal transduction proteins for comparative genome analysis

    Nucleic Acids Res.

    (2007)
  • R. Dutta et al.

    GHKL, an emergent ATPase/kinase superfamily

    Trends Biochem. Sci.

    (2000)
  • R.D. Finn et al.

    Pfam: Clans, web tools and services

    Nucleic Acids Res.

    (2006)
  • M.Y. Galperin

    Bacterial signal transduction network in a genomic perspective

    Environ. Microbiol.

    (2004)
  • M.Y. Galperin

    A census of membrane‐bound and intracellular signal transduction proteins in bacteria: Bacterial IQ, extroverts and introverts

    BMC Microbiol.

    (2005)
  • M.Y. Galperin

    Structural classification of bacterial response regulators: Diversity of output domains and domain combinations

    J. Bacteriol.

    (2006)
  • M.Y. Galperin et al.

    MHYT, a new integral membrane sensor domain

    FEMS Microbiol. Lett.

    (2001)
  • M.Y. Galperin et al.

    A specialized version of the HD hydrolase domain implicated in signal transduction

    J. Mol. Microbiol. Biotechnol.

    (1999)
  • A.M. Gardner et al.

    Regulation of the nitric oxide reduction operon (norRVW) in Escherichia coli: Role of NorR and σ54 in the nitric oxide stress response

    J. Biol. Chem.

    (2003)
  • Cited by (28)

    • The atypical two-component sensor kinase Lpl0330 from Legionella pneumophila controls the bifunctional diguanylate cyclase-phosphodiesterase Lpl0329 to modulate Bis-(3′-5′)-cyclic dimeric GMP synthesis

      2011, Journal of Biological Chemistry
      Citation Excerpt :

      The N termini of HKs are diverse and usually contain sensory or “input” domains that respond to environmental stimuli to activate the transmitter domain. The transmitter domain consists of two distinct subdomains: an ATP-binding HATPase domain involved in the autophosphorylation of the HK at the conserved histidine residue in a His-containing phosphoacceptor (dimerization) His kinase A (HisKA) domain (4). The phosphoryl group is then transferred from this histidine residue to an aspartate residue in the receiver domain of the RR partner.

    • Two-Component Systems in Microbial Communities: Approaches and Resources for Generating and Analyzing Metagenomic Data Sets

      2007, Methods in Enzymology
      Citation Excerpt :

      Characterization of these pathways and their integration in the context of community metabolism inferred from metagenomic sequence data is an important step in understanding community ecophysiology and may reveal biological features of specific organisms (e.g., chemotaxis) that can facilitate their isolation and cultivation. As there have been several other chapters (Galperin, 2005, 2006a), including some in this volume (see Galperin and Nikolskaya, 2007; Wuichet et al., 2007), that detail methods of searching for and analyzing signal transduction genes in microbial genomes, this chapter focuses on current approaches and resources for generating and analyzing environmental sequence data. Since the first use of ribosomal RNA sequences to characterize the diversity of bacteria and archaea in environmental samples, molecular approaches have become ubiquitous in microbial ecology.

    • Oxygen and Redox Sensing by Two-Component Systems That Regulate Behavioral Responses: Behavioral Assays and Structural Studies of Aer Using In Vivo Disulfide Cross-Linking

      2007, Methods in Enzymology
      Citation Excerpt :

      We also describe pitfalls and important controls that are well known in the chemotaxis community but are not readily accessible to new investigators. The two‐component histidine kinase system for E. coli chemotaxis is described elsewhere in this volume (Galperin and Nikolskaya, 2007; Wuichet et al., 2007). Briefly, the Tsr, Tar, Trg, and Tap chemoreceptors modulate the autophosphorylation of the histidine kinase CheA when it is coupled to the chemoreceptor signaling domain via the CheW protein.
