Trends in Genetics
Genome Analysis
Origin of a substantial fraction of human regulatory sequences from transposable elements
5′ promoter regions
Promoters can be defined as the sequence regions that are located directly 5′ of transcription initiation sites and that regulate their 3′ adjacent genes. The Human Promoter Database (HPD; http://zlab.bu.edu/~mfrith/HPD.html) is a repository of >2000 such human promoter sequences, each of ∼500 bp, that were identified by their location 5′ of experimentally characterized transcription initiation sites [11]. We analyzed these promoters to assess the extent to which they are derived from TEs. Of
cis-regulatory elements
To demonstrate unequivocally an effect of TEs on the regulation of host genes, it is necessary to show that experimentally characterized cis-regulatory elements that bind nuclear transcription factors have been derived from TE sequences. We searched systematically for such cases using the Transcription Factor Database (TRANSFAC; http://transfac.gbf.de/TRANSFAC/) [12]. A total of 846 experimentally characterized human cis-regulatory sites from 288 genes, along with their coordinates in GenBank
Untranslated regions of mRNA
Both 5′ and 3′ untranslated regions (UTRs) of mRNA sequences often encode important cis-elements that function to regulate either transcription or translation. Human mRNA sequences taken from the Mammalian Gene Collection (http://mgc.nci.nih.gov/) [16], a database of experimentally characterized full-length mRNA sequences, were surveyed for the presence of TE-derived sequences. mRNA sequences were partitioned into 5′UTRs, protein-coding sequences (CDSs) and 3′UTRs for comparison. Not
Scaffold/matrix attachment regions (S/MARs)
Another mode of transcription regulation in eukaryotes involves the formation of distinct chromatin loops mediated by attachment of specific DNA regions to the nuclear scaffold or matrix [17]. The S/MAR transaction database (S/MARt DB, http://transfac.gbf.de/SMARtDB/) includes a collection of experimentally characterized S/MAR sequences compiled from original publications [18]. We surveyed these sequences for the presence of TEs, and found they are enriched in TE-derived sequences (Table 1).
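A survey of this kind reduces to measuring how many bases in each sequence are covered by TE annotations. The following is a minimal sketch of that calculation, not the authors' pipeline: it assumes TE hits are already available as half-open `(start, end)` intervals per sequence, and the function names `covered_bases`, `te_fraction` and `pooled_te_fraction` are hypothetical. The text above does not state which annotation tool or thresholds were used.

```python
from typing import List, Tuple

# Assumption: TE annotations are half-open [start, end) intervals,
# in 0-based coordinates relative to the start of each sequence.
Interval = Tuple[int, int]


def merge_intervals(intervals: List[Interval]) -> List[Interval]:
    """Merge overlapping or touching intervals so no base is counted twice."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def covered_bases(seq_length: int, te_hits: List[Interval]) -> int:
    """Number of bases in a sequence covered by at least one TE hit."""
    covered = 0
    for start, end in merge_intervals(te_hits):
        # Clip each merged interval to the sequence bounds.
        start, end = max(start, 0), min(end, seq_length)
        covered += max(0, end - start)
    return covered


def te_fraction(seq_length: int, te_hits: List[Interval]) -> float:
    """Fraction of one sequence that is TE-derived."""
    if seq_length <= 0:
        return 0.0
    return covered_bases(seq_length, te_hits) / seq_length


def pooled_te_fraction(records: List[Tuple[int, List[Interval]]]) -> float:
    """TE-derived fraction pooled over many (length, hits) records."""
    total_length = sum(length for length, _ in records)
    total_covered = sum(covered_bases(length, hits) for length, hits in records)
    return total_covered / total_length if total_length else 0.0
```

Comparing the pooled fraction for a sequence class (S/MARs, promoters, UTRs) against the same statistic for a background set is one way that enrichment could be judged. The choice of background is left open here.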
Conclusion
In addition to their well-documented parasitic properties, TE insertions could result in evolutionary changes that are beneficial to the host, particularly by the donation of regulatory sequences. Here, we demonstrate the potential of TEs to affect substantially the regulation of thousands of human genes by donating cis-regulatory sites. In addition to these gene-specific regulatory effects, TEs appear to affect regulation of the human genome in a more global manner by creating S/MARs that form
Acknowledgements
Galina V. Glazko was supported by research grants from NIH (GM-20293) and NASA (NCC2-1057) awarded to Masatoshi Nei. We thank Nathan J. Bowen and Wolfgang J. Miller for discussions on the relationship between TEs and S/MARs.
References (22)
Genomic scrap yard: how genomes utilize all that junk. Gene (2000)
et al. Transposable elements are found in a large number of human protein-coding genes. Trends Genet. (2001)
Locus control regions: coming of age at a decade plus. Trends Genet. (1999)
Locus control regions of mammalian beta-globin gene clusters: combining phylogenetic analyses and experimental results to gain functional insights. Gene (1997)
A nuclear matrix/scaffold attachment region co-localizes with the gypsy retrotransposon insulator sequence. J. Biol. Chem. (1998)
Initial sequencing and analysis of the human genome. Nature (2001)
et al. Perspective: transposable elements, parasitic DNA, and genome evolution. Evolution Int. J. Org. Evolution (2001)
Selfish DNA: a sexually-transmitted nuclear parasite. Genetics (1982)
et al. Selfish genes, the phenotype paradigm and genome evolution. Nature (1980)
et al. Selfish DNA: the ultimate parasite. Nature (1980)
Molecular domestication – more than a sporadic episode in evolution. Genetica