Selection for the miniaturization of highly expressed genes

https://doi.org/10.1016/j.bbrc.2007.06.085

Abstract

Most widely expressed genes are also highly expressed. Different models, based on either high or wide expression, have been proposed to explain the small sizes of highly or widely expressed genes. We found that housekeeping (HK) genes are not more compact than narrowly expressed genes with similar expression levels, but that compactness and expression level are correlated within housekeeping genes (except that highly expressed Arabidopsis HK genes have longer introns). We also found evidence that genes with high functional or regulatory complexity do not have longer introns or longer proteins; the genome design hypothesis is thus not supported. Furthermore, housekeeping genes are not more compact than narrowly expressed somatic genes with similar average expression levels. Because housekeeping genes are expected to have much higher germline expression levels than narrowly expressed somatic genes, the transcription-associated deletion bias hypothesis is not supported either. Selection for compactness of highly expressed genes for economy is supported.

Materials and methods

The gene characters were parsed from the annotated genomes downloaded from NCBI (ftp://ftp.ncbi.nih.gov/genomes/): Homo sapiens (build 36 version 1), Mus musculus (build 35 version 1), and Arabidopsis thaliana (updated Nov. 04, 2005). The protein characters were estimated using SwissPfam version 20 (ftp://ftp.genetics.wustl.edu/pub/Pfam/). Where a gene had alternative splicing variants, we retained only the longest mRNA for analysis.
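The rule of retaining the longest mRNA per gene can be sketched as follows. This is only an illustration under stated assumptions, not the authors' parsing code: the `Transcript` record, its field names, and the tie-breaking rule (keep the first variant encountered) are hypothetical, and the example records are invented.

```python
from typing import Dict, Iterable, NamedTuple


class Transcript(NamedTuple):
    """Hypothetical record for one splicing variant (field names are assumptions)."""
    gene_id: str
    transcript_id: str
    mrna_length: int  # mature mRNA length in nucleotides


def longest_transcript_per_gene(
    transcripts: Iterable[Transcript],
) -> Dict[str, Transcript]:
    """Keep, for each gene, the variant with the longest mRNA.

    Ties are resolved in favour of the variant seen first; the source
    does not say how ties were handled, so this is an assumption.
    """
    best: Dict[str, Transcript] = {}
    for t in transcripts:
        current = best.get(t.gene_id)
        if current is None or t.mrna_length > current.mrna_length:
            best[t.gene_id] = t
    return best


# Invented example data: gene "g1" has two variants, gene "g2" has one.
records = [
    Transcript("g1", "t1", 1200),
    Transcript("g1", "t2", 1500),
    Transcript("g2", "t3", 900),
]
selected = longest_transcript_per_gene(records)
for gene_id, t in sorted(selected.items()):
    print(gene_id, t.transcript_id, t.mrna_length)
```

With real annotation files, the records would first have to be built from the genome annotation; that parsing step is not described in the text above and is left out here.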

We determined the gene expression breadth, level and tissue specificity

Highly expressed genes are compact, but widely expressed genes are not

Housekeeping (HK) genes are expressed in all tissues, so their size evolution is not related to expression breadth. We found that the compactness of HK genes is correlated with their expression levels, except that highly expressed Arabidopsis HK genes have longer introns (Table 1).

Besides ubiquitous expression, some researchers have used additional criteria (e.g. basic cellular functions) to define HK genes more stringently [13]. In well-compiled sets of human housekeeping genes, we got stronger

Discussion

In A. thaliana, highly expressed genes tend to have longer introns (Table 1, [35]), while genes with complex expression patterns do not have longer introns (Table 2). Thus, the majority of introns in plants may not play important roles in expression complexity other than increasing expression level [36].

We successively find evidence against the genome design hypothesis [14] as well as the transcription-associated mutational bias hypothesis [5], [6], and support the energetic cost hypothesis

Acknowledgments

We thank the anonymous referee for useful comments. This research was supported by NSFC (30270695) and BNU.

References (39)

  • A. Coghlan et al. Relationship of codon bias to mRNA concentration and protein length in Saccharomyces cerevisiae. Yeast (2000)
  • R. Jansen et al. Analysis of the yeast transcriptome with structural and functional categories: characterizing highly expressed proteins. Nucleic Acids Res. (2000)
  • C.I. Castillo-Davis et al. Selection for short introns in highly expressed genes. Nat. Genet. (2002)
  • H. Akashi. Translational selection and yeast proteome evolution. Genetics (2003)
  • A.O. Urrutia et al. The signature of selection mediated by expression on human genes. Genome Res. (2003)
  • J.M. Comeron. Selective and mutational patterns associated with gene expression in humans: influences on synonymous composition and intron presence. Genetics (2004)
  • J. Warringer et al. Evolutionary constraints on yeast protein size. BMC Evol. Biol. (2006)
  • L.D. Hurst et al. Imprinted genes have few and small introns. Nat. Genet. (1996)
  • G. Sarkar et al. Access to a messenger RNA sequence or its protein product is not limited by tissue or species specificity. Science (1989)