Ancient expansion of the ribonuclease A superfamily revealed by genomic analysis of placental and marsupial mammals
Introduction
Represented by its famous prototype, bovine pancreatic ribonuclease (RNase A), the RNase A superfamily is one of the best-studied protein families. Numerous fundamental discoveries in biochemistry, molecular biology, structural biology, enzymology, and molecular evolution have been made by studying various members of the superfamily (Anfinsen, 1973, Blackburn and Moore, 1982, Jermann et al., 1995, Rosenberg et al., 1995, D'Alessio and Riordan, 1997, Beintema and Kleineidam, 1998, Zhang et al., 1998, Zhang et al., 2002b, Cho et al., 2005). Note that some unrelated proteins, such as RNases H, L, P, and T, as well as ribozymes, also have ribonucleolytic activity, because this activity originated multiple times in evolution. For simplicity, however, we use “RNase” to refer to the RNase A superfamily or its members in this article. RNases are known to have a diverse array of physiological functions. For example, pancreatic ribonuclease (RNase 1) degrades dietary RNA molecules in the gut (Barnard, 1969), angiogenin (RNase 5) stimulates blood vessel formation (Fett et al., 1985, Strydom et al., 1985), and RNases 2, 3, and 7 have antibacterial or antiviral activities that are believed to be used in innate immunity (Young et al., 1986, Domachowske et al., 1998a, Domachowske et al., 1998b, Harder and Schroder, 2002, Zhang and Rosenberg, 2002a, Zhang et al., 2003). RNases share a number of common sequence features. The entire open reading frame (ORF) of ∼ 130 codons is encoded in a single exon. RNases have a signal peptide at the N-terminus and six to eight conserved cysteines that form disulfide bridges. They have three distinct catalytic residues, called the catalytic triad, and several other conserved motifs, although the level of conservation varies among proteins (Beintema and Kleineidam, 1998, Cho et al., 2005). To date, 13 human RNase genes have been reported (Cho et al., 2005).
Eight of them have been functionally characterized and are known to have ribonucleolytic activity. These RNases are pancreatic ribonuclease (RNase 1), eosinophil-derived neurotoxin (EDN or RNase 2), eosinophil cationic protein (ECP or RNase 3), RNase 4, angiogenin (RNase 5), RNase 6 (or k6), RNase 7, and RNase 8. RNases 9–13 were identified only recently; RNases 9 and 10 have no detectable RNase activity, while RNase activity has not been examined for RNases 11–13 (Penttinen et al., 2003, Castella et al., 2004a, Castella et al., 2004b, Devor et al., 2004, Cho et al., 2005). Except for RNases 2 and 3 and RNases 7 and 8, which emerged by gene duplication within primates (Rosenberg et al., 1995, Zhang et al., 2002a, Zhang et al., 2003), all human RNase genes are represented (in one or multiple copies) in the mouse and rat (Cho et al., 2005). In the chicken and zebrafish genomes, however, only a limited number of closely related RNase genes are found (Cho et al., 2005). In bullfrogs, although several divergent RNases have been reported, they are more closely related to each other than to any mammalian RNase, thus representing an amphibian-specific cluster (Rosenberg et al., 2001). RNases have never been found in any invertebrate, including in the completely sequenced genomes of two urochordates, Ciona intestinalis and Ciona savignyi (Cho et al., 2005). Evolutionary analysis and structural features indicate that angiogenins are the earliest-diverging group among mammalian RNases (Cho et al., 2005), with all non-mammalian vertebrate RNases having angiogenin-like structures, although angiogenic activity has yet to be examined or detected in these RNases. The phylogenetic patterns suggest that the superfamily arose in early vertebrate evolution as an angiogenin-like molecule and underwent independent expansions during amphibian and mammalian evolution.
Several interesting questions remain regarding the mammalian expansion of the RNase superfamily. First, a previous study (Cho et al., 2005) showed that the expansion occurred after the bird–mammal divergence [∼ 310 My ago; (Hedges et al., 1996)] but before the primate–rodent separation [∼ 85 My ago; (Murphy et al., 2004)], leaving the precise date of the expansion uncertain. Second, previous studies were limited to primates and rodents, so the size and diversity of the RNase superfamily are unknown in other orders of placental mammals and in non-placental mammals. The draft genome sequences of the dog (Canis familiaris), cow (Bos taurus), and opossum (Monodelphis domestica) became available recently, providing an opportunity to address these questions. As a marsupial, the opossum is particularly valuable for unraveling the timing of the mammalian RNase superfamily expansion. In this work, we identify all RNase genes from these three genome sequences and report that the mammalian RNase superfamily expanded before the split between placental and marsupial mammals [∼ 180 My ago; (Murphy et al., 2004)]. We further show that this expansion was followed by differential gene retention and duplication among different orders of placental mammals, generating great variation in the RNase gene repertoire among species.
Materials and methods
In this paper, we use “dog” for C. familiaris (Cf for short), “cow” for B. taurus (Bt), and “opossum” for the gray short-tailed opossum M. domestica (Md). A “functional gene” refers to an RNase gene that contains an uninterrupted open reading frame (ORF), whereas a “pseudogene” refers to a gene whose ORF is interrupted by a premature stop codon anywhere in the ORF or by frame-shifting insertions/deletions. Pseudogenes are distinguished from functional genes by having “ps” after the gene name
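The functional-gene versus pseudogene rule above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names (`has_premature_stop`, `classify_orf`, `label_gene`) are hypothetical, the input is assumed to be the candidate coding region from its first codon through its terminal stop codon, and a length that is not a multiple of three is used only as a crude proxy for a frame-shifting insertion/deletion (a real analysis would compare the sequence against an aligned reference ORF).

```python
# Standard stop codons on the coding (sense) DNA strand.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def has_premature_stop(cds: str) -> bool:
    """Return True if an in-frame stop codon occurs before the final codon."""
    cds = cds.upper()
    codons = [cds[i:i + 3] for i in range(0, len(cds) - 2, 3)]
    # The last codon is allowed to be the normal terminal stop.
    return any(codon in STOP_CODONS for codon in codons[:-1])


def classify_orf(cds: str) -> str:
    """Classify a candidate coding region as 'functional' or 'pseudogene'.

    Assumption: a length that is not a multiple of 3 stands in for a
    frame-shifting insertion/deletion; this is a simplification.
    """
    if len(cds) % 3 != 0:
        return "pseudogene"
    if has_premature_stop(cds):
        return "pseudogene"
    return "functional"


def label_gene(name: str, cds: str) -> str:
    """Append 'ps' to the gene name when the ORF is interrupted."""
    return name + "ps" if classify_orf(cds) == "pseudogene" else name
```

For instance, under these assumptions `label_gene("RNase1", cds)` would return the name unchanged for an intact ORF and the name followed by "ps" for an interrupted one; the gene name here is only a placeholder.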
Identification of RNase A genes in dog, cow, and opossum
From whole-genome searches based on TBLASTN, we identified RNase genes and pseudogenes from the genome sequences of the dog, cow, and opossum. We also conducted BLASTN-based nucleotide sequence searches to identify pseudogenes that had not been found by TBLASTN. The entire catalogs of the RNase genes identified in this study are listed in Table 1 and Supplementary Tables S1–S3, and the DNA sequences of the RNase genes are provided in Supplementary Data Sets 1–3. The chromosomal locations of the
Acknowledgments
We thank Jaap Beintema for valuable comments and Eric Devor for sharing unpublished data. This work was supported by National Institutes of Health grant GM67030 to J.Z.
References (45)
- et al. The ribonuclease A superfamily of mammals and birds: identifying new members and tracing evolutionary histories. Genomics (2005)
- et al. RNase 7, a novel innate immune defense antimicrobial protein of healthy human skin. J. Biol. Chem. (2002)
- et al. The complete amino-acid sequence of bovine-milk angiogenin. FEBS Lett. (1988)
- et al. Mammalian phylogenomics comes of age. Trends Genet. (2004)
- Evolution by gene duplication: an update. Trends Ecol. Evol. (2003)
- et al. Pseudogenization of the tumor-growth promoter angiogenin in a leaf-eating monkey. Gene (2003)
- Principles that govern folding of protein chains. Science (1973)
- Biological function of pancreatic ribonuclease. Nature (1969)
- et al. The ribonuclease A superfamily: general discussion. Cell. Mol. Life Sci. (1998)
- et al. Pancreatic Ribonuclease (1982)