The complete Corynebacterium glutamicum ATCC 13032 genome sequence and its impact on the production of l-aspartate-derived amino acids and vitamins

https://doi.org/10.1016/S0168-1656(03)00154-8

Abstract

The complete genomic sequence of Corynebacterium glutamicum ATCC 13032, well known in industry for the production of amino acids such as l-glutamate and l-lysine, was determined. The C. glutamicum genome was found to consist of a single circular chromosome comprising 3 282 708 base pairs. Several DNA regions of unusual composition were identified that were potentially acquired by horizontal gene transfer, e.g. a segment of DNA from C. diphtheriae and a prophage-containing region. After automated and manual annotation, 3002 protein-coding genes were identified, and functions were assigned to 2489 of these by homology to known proteins. These analyses confirm the taxonomic position of C. glutamicum as related to Mycobacteria and show a broad metabolic diversity, as expected for a bacterium living in the soil. As an example of a biotechnological application, the complete genome sequence was used to reconstruct the metabolic flow of carbon into a number of industrially important products derived from the amino acid l-aspartate.

Introduction

In the mid-1950s, Kinoshita and co-workers in Japan isolated a bacterium that was shown to excrete large quantities of l-glutamic acid into the culture medium (Kinoshita et al., 1957). This bacterium, Corynebacterium glutamicum, was described as a short, aerobic, gram-positive rod capable of growing on a variety of sugars or organic acids. Under optimal conditions, this organism converted glucose into l-glutamic acid at high yield within a few days. Currently, about 1×10⁶ tons of this amino acid are produced with this microorganism annually and used as a flavoring agent (Leuchtenberger, 1996). During the past 40 years, various mutants of C. glutamicum have been isolated with the capacity to produce significant amounts of different l-amino acids. Today, l-lysine is produced on a scale of 4.5×10⁵ tons per year with mutants whose biosynthetic pathway is deregulated. This amino acid is mainly used as a feed additive.

Developing amino acid-overproducing strains by mutagenesis and selection is a well-established technique (Rowlands, 1984). Mutagenic procedures were optimized in terms of the mutagen used and the dose applied. Selection procedures were designed to allow maximum expression and detection of the desirable mutant types. So far, the improvement of amino acid-producing C. glutamicum strains has mainly been carried out by an iterative procedure of mutagenesis and selection. However, the precise genetic and physiological changes resulting in increased overproduction of amino acids in various C. glutamicum strains remained unknown. Future success in attempts to further increase the productivity and yield of already highly productive strains will depend on the availability of detailed information on the metabolic pathways, their regulation, and the mutations affecting them. In recent years, genetic engineering has become a fascinating alternative to mutagenesis and random screening procedures (Sahm et al., 1995). Overexpression or deletion of genes in microorganisms via recombinant DNA techniques is the most powerful method for the construction of strains with the desired genotype. Furthermore, this approach avoids the complication of uncharacterized mutations that are often obtained with classical mutagenesis.

Since the mid-1980s, several genes from the biosynthetic pathways leading to the aspartate-derived amino acids l-lysine, l-threonine, and l-isoleucine, as well as to the vitamin D-pantothenate, have been cloned from C. glutamicum and analyzed (Sahm et al., 2000). These genes were mainly identified by heterologous complementation of Escherichia coli mutants and, occasionally, in the homologous system by conferring resistance to an amino acid analog. These studies already led to a general understanding of the metabolic pathways, but a complete picture of the complex interactions could not be achieved because detailed genetic information was lacking. Genomic sequencing followed by automatic and manual annotation turned out to be the ideal method for obtaining the missing genetic information for the development of industrial C. glutamicum strains. For this reason, we decided in 1998 to sequence the genome of C. glutamicum (Hodgson, 1998), sometimes also referred to as Brevibacterium divaricatum, B. flavum, B. lactofermentum, or C. melassecola (Liebl et al., 1991; Kämpfer and Kroppenstedt, 1996). The sequencing strategy was to use large-insert libraries, e.g. cosmid and BAC clones, for establishing the complete genome sequence (Tauch et al., 2002a). We now report on the completed genomic sequence of the type strain C. glutamicum ATCC 13032. The genome data provide a rich source for metabolic reconstruction of the pathways leading to industrially important products derived from the amino acid l-aspartate.

During our sequencing work, we learned that, owing to its outstanding biotechnological relevance, the genome of C. glutamicum was also sequenced independently by other groups. The Japanese company Kyowa Hakko Kogyo Co., Ltd. established a sequence independently from our project and deposited it in the public databases (GenBank NC_003450). Its market competitor, Ajinomoto Co., sequenced a close relative, C. efficiens, an organism isolated by researchers of this company (Fudou et al., 2002). The sequence of this strain was released recently in the GenBank database (NC_004369).


Assembly and annotation of the C. glutamicum ATCC 13032 genome sequence

The complete genome sequence of C. glutamicum ATCC 13032 was determined from 116 overlapping genomic clones. Of these, 95 were isolated from an ordered SuperCos I cosmid library (Bathe et al., 1996), and 21 were selected from a set of 2304 bacterial artificial chromosomes (BACs) upon mapping to cosmid contig ends by colony hybridization and terminal BAC sequencing (Tauch et al., 2002a). The cosmid library alone covered only 86.6% of the C. glutamicum genome and the ordered BAC library was

The structure of the C. glutamicum ATCC 13032 genome

General features of the C. glutamicum genome sequence are shown in Table 1 and Fig. 1. The C. glutamicum genome is represented by a circular chromosome of 3 282 708 bp, which is smaller than the genome of the taxonomically related bacterium M. tuberculosis (4.2 Mb), but larger than that of its close relative C. diphtheriae (2.5 Mb). The G+C content of the genome is 53.8%, which is close to that of E. coli and rather unusual for the taxonomic class of the Actinobacteria referred to as ‘high G+C
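
The G+C content quoted above is the fraction of G and C bases among all bases of the sequence. The following is a minimal sketch of that calculation, not the procedure used by the authors; the function name `gc_content` and the toy sequence are assumptions introduced for illustration, and the sequence is not taken from the C. glutamicum genome.

```python
def gc_content(seq: str) -> float:
    """Return the fraction of G and C among the unambiguous bases A, C, G, T."""
    seq = seq.upper()
    gc = sum(seq.count(base) for base in "GC")
    total = sum(seq.count(base) for base in "ACGT")
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return gc / total


# Hypothetical toy sequence, for illustration only.
toy = "ATGCGCGTTAACCGG"
print(f"G+C content: {gc_content(toy):.1%}")
```

Applied to a full chromosome sequence read from a sequence file, the same function would return the genome-wide value; ambiguous bases such as N are ignored here by design.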

Annotation of coding regions

Gene finding tools in conjunction with homology searches in databases and an additional expert annotation with the genome annotation tool GenDB (Meyer et al., 2003) revealed 3002 potential protein-coding genes in the C. glutamicum genome sequence (Table 1). To 2489 of these, at least putative functions or localizations could be assigned by similarity analyses. Of the remaining predicted genes, 250 are similar to hypothetical proteins in other organisms (conserved hypothetical proteins) and only
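
The annotation counts given above can be turned into proportions with simple arithmetic. The sketch below uses only figures stated in the text (3002 predicted genes, 2489 with an assigned function, 250 conserved hypothetical proteins, and a chromosome of 3 282 708 bp); the variable names and the `share` helper are assumptions made for illustration.

```python
GENOME_BP = 3_282_708          # chromosome length in base pairs
TOTAL_GENES = 3002             # predicted protein-coding genes
WITH_FUNCTION = 2489           # genes with at least a putative function
CONSERVED_HYPOTHETICAL = 250   # similar to hypothetical proteins elsewhere


def share(part: int, whole: int) -> float:
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


print(f"Function assigned:      {share(WITH_FUNCTION, TOTAL_GENES):.1f}%")
print(f"Conserved hypothetical: {share(CONSERVED_HYPOTHETICAL, TOTAL_GENES):.1f}%")
# Average spacing of predicted genes along the chromosome.
print(f"Average bp per gene:    {GENOME_BP / TOTAL_GENES:.0f}")
```

Roughly 83% of the predicted genes thus carry at least a putative function, and the chromosome holds on average about one predicted gene per 1.1 kb.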

Metabolic reconstruction of the biosyntheses of aspartate-derived amino acids and vitamins from glucose

A number of metabolites of biotechnological importance are derived from the amino acid l-aspartate. These are l-lysine, l-threonine, l-methionine and l-isoleucine. Two other compounds, the amino acid l-valine and the vitamin D-pantothenate, are strongly interconnected with the synthesis of aspartate-derived amino acids and were, therefore, included in this study. For the reconstruction of the formation of all these compounds from glucose, several functional complexes have to be considered.
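
One way to picture such a reconstruction is as a directed graph from precursors to products. The sketch below is a strongly simplified textbook view of the aspartate family, not the reconstruction presented in the paper: intermediates are collapsed, enzymes and genes are omitted, and l-valine and D-pantothenate are left out because the text only states that they are interconnected with these pathways. The names `PATHWAY`, `END_PRODUCTS` and `reachable` are assumptions introduced for illustration.

```python
from collections import deque

# Simplified precursor -> product edges (textbook branch points only).
PATHWAY = {
    "l-aspartate": ["l-aspartate semialdehyde"],
    "l-aspartate semialdehyde": ["l-lysine", "l-homoserine"],
    "l-homoserine": ["l-threonine", "l-methionine"],
    "l-threonine": ["l-isoleucine"],
}

END_PRODUCTS = {"l-lysine", "l-threonine", "l-methionine", "l-isoleucine"}


def reachable(graph: dict, start: str) -> set:
    """Breadth-first search: every compound that can be formed from start."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for product in graph.get(node, []):
            if product not in seen:
                seen.add(product)
                queue.append(product)
    return seen


derived = reachable(PATHWAY, "l-aspartate") & END_PRODUCTS
print(sorted(derived))
```

A genome-based reconstruction would attach the annotated genes to each edge, so that a missing edge points to a gene that still has to be identified.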

Conclusions

The establishment of a completely annotated C. glutamicum genome sequence is a big leap toward understanding the biology of this organism and will boost metabolic engineering to overproduce compounds of biotechnological relevance. It helped to close the respective biosynthetic pathways, either by identifying missing genes directly or by providing a limited number of candidate genes to be tested. The complete genome sequence is the basis for extensive expression analyses by proteome and

Acknowledgements

The C. diphtheriae sequence data were produced by the C. diphtheriae Sequencing Group at the Sanger Institute and obtained from ftp://ftp.sanger.ac.uk/pub/pathogens/cdip/. The work was supported by grant 031U213D of the Bundesministerium für Bildung und Forschung.

References (107)

  • M. Merkamm et al. Ketopantoate reductase activity is only encoded by ilvC in Corynebacterium glutamicum. J. Biotechnol. (2003)
  • M. O'Regan et al. Cloning and nucleotide sequence of the phosphoenolpyruvate carboxylase-coding gene of Corynebacterium glutamicum ATCC 13032. Gene (1989)
  • S.-D. Park et al. Isolation and analysis of metA, a methionine biosynthetic gene encoding homoserine acetyltransferase in Corynebacterium glutamicum. Mol. Cells (1998)
  • S.-Y. Park et al. Characterization of glk, a gene coding for glucose kinase of Corynebacterium glutamicum. FEMS Microbiol. Lett. (2000)
  • R.T. Rowlands. Industrial strain improvement: mutagenesis and random screening procedures. Enzymes Microb. Technol. (1984)
  • C. Rückert et al. Genome-wide analysis of the l-methionine biosynthetic pathway in Corynebacterium glutamicum by targeted gene deletion and homologous complementation. J. Biotechnol. (2003)
  • H. Sahm et al. Metabolic design in amino acid producing bacterium Corynebacterium glutamicum. FEMS Microbiol. Lett. (1995)
  • A. Schäfer et al. The Corynebacterium glutamicum cglIM gene encoding a 5-cytosine methyltransferase enzyme confers a specific DNA methylation pattern in an McrBC-deficient Escherichia coli strain. Gene (1997)
  • A. Tauch et al. TetZ, a new tetracycline resistance determinant discovered in gram-positive bacteria, shows high homology to gram-negative regulated efflux systems. Plasmid (2000)
  • A. Tauch et al. Strategy to sequence the genome of Corynebacterium glutamicum ATCC 13032: use of a cosmid and a bacterial artificial chromosome library. J. Biotechnol. (2002)
  • A. Tauch et al. The 27.8-kb R-plasmid pTET3 from Corynebacterium glutamicum encodes the aminoglycoside adenyltransferase gene cassette aadA9 and the regulated tetracycline efflux system Tet 33 flanked by active copies of the widespread insertion sequence IS6100. Plasmid (2002)
  • J.H. Badger et al. CRITICA: coding region identification tool invoking comparative analysis. Mol. Biol. Evol. (1999)
  • L. Barksdale. The genus Corynebacterium
  • B. Bathe et al. A physical and genetic map of the Corynebacterium glutamicum ATCC 13032 chromosome. Mol. Gen. Genet. (1996)
  • A. Bellmann et al. Expression control and specificity of the basic amino acid exporter LysE of Corynebacterium glutamicum. Microbiology (2001)
  • S. Bröer et al. Lysine uptake and exchange in Corynebacterium glutamicum. J. Bacteriol. (1990)
  • W.A. Claes et al. Identification of two prpDBC gene clusters in Corynebacterium glutamicum and their involvement in propionate degradation via the 2-methylcitrate cycle. J. Bacteriol. (2002)
  • G.E. Colón et al. Production of isoleucine by overexpression of ilvA in a Corynebacterium lactofermentum threonine producer. Appl. Microbiol. Biotechnol. (1995)
  • J. Cremer et al. Cloning of the dapA dapB cluster of the lysine-secreting bacterium Corynebacterium glutamicum. Mol. Gen. Genet. (1990)
  • A.L. Delcher et al. Improved microbial gene identification with GLIMMER. Nucleic Acids Res. (1999)
  • H. Dominguez et al. Complete sucrose metabolism requires fructose phosphotransferase activity in Corynebacterium glutamicum to ensure phosphorylation in liberated fructose. Appl. Environ. Microbiol. (1996)
  • N. Dusch et al. Expression of the Corynebacterium glutamicum panD gene encoding l-aspartate-α-decarboxylase leads to pantothenate overproduction in Escherichia coli. Appl. Environ. Microbiol. (1999)
  • H. Ebbighausen et al. Transport of branched-chain amino acids in Corynebacterium glutamicum. Arch. Microbiol. (1989)
  • H. Ebbighausen et al. Isoleucine excretion in Corynebacterium glutamicum: evidence for a specific efflux carrier system. Appl. Microbiol. Biotechnol. (1989)
  • B.J. Eikmanns. Identification, sequence analysis, and expression of a Corynebacterium glutamicum gene cluster encoding the three glycolytic enzyme glyceraldehyde-3-phosphate dehydrogenase, 3-phosphoglycerate kinase, and triosephosphate isomerase. J. Bacteriol. (1992)
  • B.J. Eikmanns et al. The phosphoenolpyruvate carboxylase gene of Corynebacterium glutamicum: molecular cloning, nucleotide sequence, and expression. Mol. Gen. Genet. (1989)
  • B.J. Eikmanns et al. Amplification of three biosynthesis genes in Corynebacterium glutamicum and its influence on carbon flux in different strains. Appl. Microbiol. Biotechnol. (1991)
  • B.J. Eikmanns et al. Nucleotide sequence, expression and transcriptional analysis of the Corynebacterium glutamicum gltA gene encoding citrate synthase. Microbiology (1994)
  • B.J. Eikmanns et al. Cloning, sequence analysis, expression, and inactivation of the Corynebacterium glutamicum icd gene encoding isocitrate dehydrogenase and biochemical characterization of the enzyme. J. Bacteriol. (1995)
  • A. Erdmann et al. Regulation of lysine excretion in the lysine producer strain Corynebacterium glutamicum MH20-22B. Biotechnol. Lett. (1993)
  • T.M. Fuchs et al. Characterization of a Bordetella pertussis diaminopimelate (DAP) biosynthesis locus identifies dapC, a novel gene coding for an N-succinyl-l,l-DAP aminotransferase. J. Bacteriol. (2000)
  • R. Fudou et al. Corynebacterium efficiens sp. nov., a glutamic-acid-producing species from soil and vegetables. Int. J. Syst. Evol. Microbiol. (2002)
  • M. Fujita et al. A deletion in the sapA homologue cluster is responsible for the loss of the S-layer in Campylobacter fetus strain TK. Arch. Microbiol. (1997)
  • K. Gabriel et al. The actinophage RP3 DNA integrates site-specifically into the putative tRNAArg(AGG) gene of Streptomyces rimosus. Nucleic Acids Res. (1995)
  • P. Gourdon et al. Cloning of the malic enzyme gene from Corynebacterium glutamicum and role of the enzyme in lactate metabolism. Appl. Environ. Microbiol. (2000)
  • A. Grigoriev. Analyzing genomes with cumulative skew diagrams. Bioinformatics (1998)
  • K.S. Han et al. The molecular structure of the Corynebacterium glutamicum threonine synthase gene. Mol. Microbiol. (1990)
  • K. Hashiguchi et al. Effects of an Escherichia coli ilvA mutant gene encoding feedback-resistant threonine deaminase on l-isoleucine production by Brevibacterium flavum. Biosci. Biotechnol. Biochem. (1997)
  • K. Hatakeyama et al. Analysis of the biotin biosynthesis pathway in coryneform bacteria: cloning and sequencing of the bioB gene from Brevibacterium flavum. DNA Seq. (1993)
  • K. Hatakeyama et al. Genomic organization of the biotin biosynthetic genes of coryneform bacteria: cloning and sequencing of the bioA-bioD genes from Brevibacterium flavum. DNA Seq. (1993)

1 Present address: Qiagen AG, Max-Volmer-Straße 4, D-40724 Hilden, Germany.
