Mulan: Multiple-sequence local alignment and visualization for studying function and evolution

  1. Ivan Ovcharenko (1, 6)
  2. Gabriela G. Loots (2)
  3. Belinda M. Giardine (3)
  4. Minmei Hou (4)
  5. Jian Ma (4)
  6. Ross C. Hardison (3)
  7. Lisa Stubbs (2)
  8. Webb Miller (4, 5)

Affiliations

  1. Energy, Environment, Biology and Institutional Computing, Lawrence Livermore National Laboratory, Livermore, California 94550, USA
  2. Genome Biology Division, Lawrence Livermore National Laboratory, Livermore, California 94550, USA
  3. Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
  4. Department of Computer Science and Engineering, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
  5. Department of Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA

Abstract

Multiple-sequence alignment analysis is a powerful approach for understanding phylogenetic relationships, annotating genes, and detecting functional regulatory elements. With a growing number of partly or fully sequenced vertebrate genomes, effective tools for performing multiple comparisons are required to accurately and efficiently assist biological discoveries. Here we introduce Mulan (http://mulan.dcode.org/), a novel method and a network server for comparing multiple draft and finished-quality sequences to identify functional elements conserved over evolutionary time. Mulan brings together several novel algorithms: the TBA multi-aligner program for rapid identification of local sequence conservation, and the multiTF program for detecting evolutionarily conserved transcription factor binding sites in multiple alignments. In addition, Mulan supports two-way communication with the GALA database; alignments of multiple species dynamically generated in GALA can be viewed in Mulan, and conserved transcription factor binding sites identified with Mulan/multiTF can be integrated and overlaid with extensive genome annotation data using GALA. Local multiple alignments computed by Mulan ensure reliable representation of short- and large-scale genomic rearrangements in distant organisms. Mulan allows for interactive modification of critical conservation parameters to differentially predict conserved regions in comparisons of both closely and distantly related species. We illustrate the uses and applications of the Mulan tool through multispecies comparisons of the GATA3 gene locus and the identification of elements whose pattern of conservation in birds differs from that in other genomes, allowing speculation on the evolution of birds. Source code for the aligners and the aligner-evaluation software can be freely downloaded from http://www.bx.psu.edu/miller_lab/.
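
To make the idea of adjustable conservation parameters concrete, the following is a minimal sketch, not Mulan's implementation. It assumes two already-aligned sequences of equal length (gaps written as `-`) and reports column ranges where a sliding window meets a percent-identity threshold. The function name `conserved_regions` and the parameters `window` and `min_identity` are hypothetical, chosen only for illustration.

```python
def conserved_regions(ref, other, window=10, min_identity=0.7):
    """Return merged (start, end) column ranges, end exclusive, in which
    every contributing window has identity >= min_identity.

    Illustrative sketch only: the thresholds and the scoring rule are
    assumptions, not parameters taken from Mulan.
    """
    if len(ref) != len(other):
        raise ValueError("aligned sequences must have equal length")
    n = len(ref)
    if n < window:
        return []

    # 1 where the column is an identical, non-gap pair; 0 otherwise.
    matches = [
        1 if a == b and a != "-" else 0
        for a, b in zip(ref.upper(), other.upper())
    ]

    regions = []
    count = sum(matches[:window])  # matches in the first window
    for start in range(n - window + 1):
        if start > 0:
            # Slide the window one column to the right.
            count += matches[start + window - 1] - matches[start - 1]
        if count / window >= min_identity:
            end = start + window
            if regions and start <= regions[-1][1]:
                regions[-1][1] = end  # extend the previous region
            else:
                regions.append([start, end])
    return [tuple(r) for r in regions]
```

Under these assumptions, lowering `min_identity` or shortening `window` tends to report longer or more numerous regions, which is the kind of trade-off one would tune differently for closely and distantly related species.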

Footnotes

  • Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.3007205. Article published online before print in December 2004.

  • 6 Corresponding author. E-mail ovcharenko1{at}llnl.gov; fax (925) 422-2099.

    • Accepted August 31, 2004.
    • Received July 14, 2004.