Trends in Genetics
Volume 18, Issue 4, 1 April 2002, Pages 176-179

Research update
Identifying functional links between genes using conserved chromosomal proximity

https://doi.org/10.1016/S0168-9525(01)02621-X

Abstract

Conservation of proximity of a pair of genes across multiple genomes generally indicates that their functions could be linked. Here, we present a systematic evaluation using 42 complete microbial genomes from 25 phylogenetic groups to test the reliability of this observation in predicting function for genes. We find a relationship between the number of phylogenetic groups in which a gene pair is proximate and the probability that the pair belongs to a common pathway. Our method produces 1586 links between ortholog families substantiated by observed proximity in genomes representing at least three phylogenetic groups. Of the pairs annotated in the KEGG database, 80% are in the same biological pathway in KEGG.

Section snippets

Finding proximate genes and constructing profiles of conservation

We define genes as proximate if they are on the same strand and within 300 base pairs [13], or if their respective paralogs are within 300 bp (Fig. 1), with paralogs and orthologs defined according to the Clusters of Orthologous Groups database (COG; Box 1) [15,16]. Because the complete genome sequences available are not evenly distributed phylogenetically (e.g. two strains of Helicobacter pylori), we adopt the COG categorization of the 42 microbial genomes into 25 major phylogenetic groups
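The proximity rule and the per-pair conservation profile described above can be sketched in Python. This is a minimal illustration, not the authors' implementation. The `Gene` record, the genome identifiers, the family labels and the `group_of` mapping are all hypothetical. Two further assumptions are made: the distance is taken as the gap between the end of one gene and the start of the other, and working at the ortholog-family level is taken to cover the paralog clause, since any member of a family counts.

```python
from collections import defaultdict
from dataclasses import dataclass
from itertools import combinations

MAX_GAP_BP = 300  # distance threshold quoted in the text


@dataclass(frozen=True)
class Gene:
    genome: str   # genome identifier (hypothetical naming)
    strand: str   # "+" or "-"
    start: int    # first base of the gene
    end: int      # last base of the gene
    family: str   # ortholog family label (hypothetical, stands in for a COG)


def are_proximate(a, b, max_gap=MAX_GAP_BP):
    """Same genome, same strand, and at most max_gap bp between the two genes."""
    if a.genome != b.genome or a.strand != b.strand:
        return False
    # Assumption: distance is the gap between the genes; negative if they overlap.
    gap = max(a.start, b.start) - min(a.end, b.end)
    return gap <= max_gap


def conservation_profile(genes, group_of):
    """Map each family pair to the set of phylogenetic groups where it is proximate."""
    by_genome = defaultdict(list)
    for gene in genes:
        by_genome[gene.genome].append(gene)
    profile = defaultdict(set)
    for genome, members in by_genome.items():
        for a, b in combinations(members, 2):
            if a.family != b.family and are_proximate(a, b):
                pair = tuple(sorted((a.family, b.family)))
                profile[pair].add(group_of[genome])
    return dict(profile)


# Toy data: two strains of one species fall into a single phylogenetic group.
genes = [
    Gene("hp_strain1", "+", 100, 400, "FAM_A"),
    Gene("hp_strain1", "+", 450, 900, "FAM_B"),
    Gene("hp_strain2", "+", 100, 400, "FAM_A"),
    Gene("hp_strain2", "+", 450, 900, "FAM_B"),
    Gene("other_genome", "-", 1000, 1500, "FAM_A"),
    Gene("other_genome", "-", 1600, 2000, "FAM_B"),
    Gene("other_genome", "+", 2100, 2500, "FAM_C"),  # opposite strand, so no link
]
group_of = {
    "hp_strain1": "group_1",
    "hp_strain2": "group_1",
    "other_genome": "group_2",
}

profile = conservation_profile(genes, group_of)
print({pair: len(groups) for pair, groups in profile.items()})
```

In the toy data the two strains of the same species contribute only one phylogenetic group, which is the reason for counting groups rather than genomes. The all-pairs comparison within each genome is chosen for clarity; sorting genes by position would scale better on real genomes.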

How reliable is this method?

To estimate the ability of our method to detect links between functionally related genes, we identify the subset of links where both ortholog families are annotated and assigned to a pathway in the KEGG database [20], or to a functional category in the COG database [15] (Box 1). We then calculate the fractions of these that are in the same KEGG pathway and of the same COG functional category. As Fig. 3a shows, the functional correspondence is positively correlated to the minimal number of
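The evaluation step described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code. The input `profile` maps each ortholog-family pair to the set of phylogenetic groups in which it is proximate. The input `pathways_of` maps a family to its set of pathway labels. Both inputs, and every label in the toy data, are hypothetical. A link counts as "same pathway" here if the two families share at least one pathway label.

```python
def same_pathway_fraction(profile, pathways_of, min_groups):
    """Fraction of annotated links, seen in >= min_groups groups, sharing a pathway.

    Returns None when no annotated link meets the threshold.
    """
    annotated = 0
    shared = 0
    for (fam_a, fam_b), groups in profile.items():
        if len(groups) < min_groups:
            continue
        paths_a = pathways_of.get(fam_a)
        paths_b = pathways_of.get(fam_b)
        if not paths_a or not paths_b:
            continue  # only links with both families annotated are evaluated
        annotated += 1
        if paths_a & paths_b:
            shared += 1
    return shared / annotated if annotated else None


# Toy data (hypothetical family, group and pathway labels).
toy_profile = {
    ("F1", "F2"): {"g1", "g2", "g3"},
    ("F1", "F3"): {"g1"},
    ("F2", "F4"): {"g1", "g2", "g3", "g4"},
    ("F3", "F5"): {"g1", "g2", "g3"},  # F5 has no annotation and is skipped
}
toy_pathways = {
    "F1": {"pathway_x"},
    "F2": {"pathway_x", "pathway_y"},
    "F3": {"pathway_z"},
    "F4": {"pathway_z"},
}

for k in (1, 3, 4, 5):
    print(k, same_pathway_fraction(toy_profile, toy_pathways, k))
```

Sweeping `min_groups` gives one fraction per threshold, the kind of curve a plot of functional correspondence against the minimal number of groups would show. The toy numbers carry no biological meaning.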

Acknowledgements

We would like to thank Adnan Derti for a critical reading of this manuscript. This work was supported in part by a National Science Foundation Integrative Graduate Education and Research Traineeship Program grant to JCM. IY is supported by a Whitaker Fellowship.

References (22)

  • I. Kobayashi. Behavior of restriction-modification systems as selfish mobile elements and their impact on genome evolution. Nucleic Acids Res. (2001)

Cited by (60)

    • The power of operon rearrangements for predicting functional associations

      2015, Computational and Structural Biotechnology Journal
      Citation Excerpt :

      The idea for expanding predictions beyond those produced within a single genome works as follows (Fig. 4A): genes separated in a genome of interest (or target genome), could be inferred to functionally interact if their orthologs were found to be in the same operon in some other genomes (the informative genomes). This idea has been implemented on the basis of operons predicted by conservation of gene order [37–39], and was later expanded to include operons predicted by intergenic distances [40]. It is to be expected that operon rearrangements increase the number of available predicted functional associations.

    • Clustering of gene ontology terms in genomes

      2014, Gene
      Citation Excerpt :

      The analysis of expression data for yeast, fruit fly, worm, rat, mouse and human indicated that neighboring genes are likely co-expressed (Fukuoka et al., 2004). The proximity of a pair of genes has been used to predict gene functions (Raghupathy and Durand, 2009; Yanai et al., 2002). In bacteria, gene essentiality determines chromosome organization (Rocha and Danchin, 2003), and essential genes occur more frequently and are conserved in the leading replicating strand as compared with the expected average frequency for all genes.

    • Comparative genomics approaches to understanding and manipulating plant metabolism

      2013, Current Opinion in Biotechnology
      Citation Excerpt :

      Of all the types of genomic evidence, gene clustering — proximity in the genome — is the most widely useful. In prokaryotes, functionally related genes often occur in operons or are transcribed from the same promoter, or may simply be neighbors or near-neighbors [9,10]. Clustering of functionally related genes is not common in plants, although examples exist among genes of secondary metabolism [11].

    • Analysis of the Evolution of the MoxR ATPases

      2022, Journal of Physical Chemistry A