Elsevier

Gene, Volume 312, 17 July 2003, Pages 61-72

The human genome has 49 cytochrome c pseudogenes, including a relic of a primordial gene that still functions in mouse

https://doi.org/10.1016/S0378-1119(03)00579-1

Abstract

Using a computational approach, we have identified 49 cytochrome c (cyc) pseudogenes in the human genome. Analysis of these provides a detailed description of the molecular evolution of the cyc gene. Almost all of the pseudogenes are full-length, and we conclude that most of them originated from independent retrotransposition events (i.e. they are processed). Based on phylogenetic analysis and detailed sequence comparison, we have further divided these pseudogenes into two groups. The first, consisting of four young pseudogenes dated to between 27 and 34 Myr old, originated from a gene almost identical to the modern human cyc gene. The second group of pseudogenes is much older and appears to have descended from ancient genes similar to modern rodent cyc genes. Thus, our results support the observation that accelerated evolution of the cyc sequence occurred in the primate lineage. The oldest pseudogene in the second group, dated to over 80 Myr old, resembles the testis-specific cyc gene in modern rodents. It is likely that the mammalian ancestor had both the somatic and the testis-specific cyc genes. While the testis-specific gene is still functional in modern rodents, the human lineage has lost it, retaining only a pseudogene in its place. Thus, our study may have identified a pseudogene that is a dead relic of a gene that has completely died off in the human lineage.
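To make the dating statements concrete, the following is a minimal sketch of how a pseudogene age can be derived from sequence divergence. It is not the procedure used in the paper, which is not described in this excerpt. It assumes the Jukes-Cantor correction (a deliberately simple substitution model), a pairwise alignment of the pseudogene against its inferred parent sequence, and an illustrative neutral substitution rate of 1.5e-9 per site per year. The function names and the default rate are hypothetical.

```python
import math


def jukes_cantor_distance(seq_a: str, seq_b: str) -> float:
    """Jukes-Cantor corrected substitutions per site for two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    # Compare only columns where both sequences have an unambiguous base,
    # which skips alignment gaps ('-') and ambiguity codes such as 'N'.
    pairs = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in "ACGT" and b in "ACGT"
    ]
    if not pairs:
        raise ValueError("no comparable sites")
    p = sum(a != b for a, b in pairs) / len(pairs)  # observed mismatch fraction
    if p >= 0.75:
        raise ValueError("divergence too high for the Jukes-Cantor correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def estimate_age_myr(distance: float, rate_per_site_per_year: float = 1.5e-9) -> float:
    """Convert a per-site distance into an age in Myr.

    Assumption: the pseudogene accumulates neutral substitutions at a constant
    rate after insertion, while the parent sequence is treated as fixed.
    """
    return distance / rate_per_site_per_year / 1e6
```

Under these assumptions, a corrected divergence of 0.045 substitutions per site corresponds to roughly 30 Myr, which falls within the range reported for the young group. A different rate or substitution model would shift the estimate proportionally.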

Introduction

Cytochrome c (cyc) is a central component of the electron transfer chain in the cell, and is involved in both aerobic and anaerobic respiration. It is also involved in other cellular processes such as apoptosis (Kluck et al., 1997) and heme biosynthesis (Biel and Biel, 1990). It is a ubiquitous protein, found in all eukaryotes and prokaryotes. Because of its importance, relatively small size (104 amino acids in mammals) and ease of isolation, cyc has been very intensively studied. Cyc has also been used as a paradigm in the study of the evolution of protein sequence and structure (Chothia and Lesk, 1985, Wu et al., 1986, Mills, 1991). The amino acid sequences of cyc from many species are now available (Banci et al., 1999); the sequences among vertebrates are especially conserved except among primates, where acceleration in non-synonymous mutation has been observed (Evans and Scarpulla, 1988, Grossman et al., 2001).

By screening genomic DNA libraries, multiple copies of cytochrome c processed pseudogenes were discovered in mammalian genomes (Scarpulla et al., 1982, Scarpulla, 1984), including 11 copies in human (Evans and Scarpulla, 1988). Processed pseudogenes are disabled copies of functional genes that do not produce a functional, full-length protein (Vanin, 1985, Mighell et al., 2000, Harrison et al., 2002a). It is believed that they arose from LINE1-mediated retrotransposition, i.e. reverse-transcription of mRNA transcripts followed by integration into genomic DNA, presumably in the germ line (Kazazian and Moran, 1998, Esnault et al., 2000). They are characterized by a complete lack of introns, the presence of small flanking direct repeats and a polyadenine tract near the 3′ end (provided that they have not decayed). Existence of pseudogenes in the genome can obscure the identification and cloning of functional genes; however, pseudogenes can also provide a fossil record of gene sequences existing at various times during evolution.
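Two of the hallmarks listed above, the polyadenine tract near the 3′ end and the short flanking direct repeats, can be checked with simple string operations. The sketch below is illustrative only and is not the pipeline used in this study. The function names, the 10-base minimum run, and the 50-base search window are assumptions chosen for the example.

```python
import re


def has_polya_tract(downstream: str, min_run: int = 10, window: int = 50) -> bool:
    """Return True if at least `min_run` consecutive A's occur within the first
    `window` bases of the sequence immediately 3' of the candidate pseudogene."""
    region = downstream[:window].upper()
    return re.search("A{%d,}" % min_run, region) is not None


def longest_shared_repeat(upstream: str, downstream: str) -> str:
    """Return the longest substring shared by the 5' and 3' flanks, as a
    candidate direct repeat. Uses quadratic-time dynamic programming, which
    is adequate for flanks of a few hundred bases."""
    up, down = upstream.upper(), downstream.upper()
    best = ""
    # prev[j] holds the length of the common suffix of up[:i-1] and down[:j].
    prev = [0] * (len(down) + 1)
    for i in range(1, len(up) + 1):
        cur = [0] * (len(down) + 1)
        for j in range(1, len(down) + 1):
            if up[i - 1] == down[j - 1]:
                cur[j] = prev[j - 1] + 1
                if cur[j] > len(best):
                    best = up[i - cur[j]:i]
        prev = cur
    return best


# Toy flanks (invented for illustration, not real genomic sequence).
upstream_flank = "GGCTTAAGTCTGACC"
downstream_flank = "AAAAAAAAAAAATTTAAGTCTGACGG"
```

With the toy flanks, `has_polya_tract(downstream_flank)` is true and `longest_shared_repeat(upstream_flank, downstream_flank)` returns the 11-base repeat `TTAAGTCTGAC`. On real data, short shared substrings arise by chance, so a minimum repeat length would be needed, and decayed pseudogenes may fail both checks.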

Previously, we identified over 2000 ribosomal protein (RP) pseudogenes in the human genome (Harrison et al., 2002b, Zhang et al., 2002), most of which were previously overlooked by DNA hybridization experiments. Motivated by this discovery of an unexpectedly large number of additional pseudogenes, we carried out a similar comprehensive survey on human cytochrome c pseudogenes. Our study provides a complete molecular record of the recent evolution of this gene and demonstrates the importance of examining pseudogenic sequences. It also demonstrates a specific instance of a gene disappearing and leaving only a fossil pseudogene in its place.

Materials and methods

The basic procedures of our pseudogene discovery pipeline have been previously described (Zhang et al., 2002). A brief overview is given below.

The human cyc pseudogene population

A total of 50 cyc homology loci were identified in the human genome, including 49 pseudogenes (denoted as HCP) and one intron-containing functional gene (denoted as HCS). The HCS gene was located on chromosome 7 (cytogenetic band 7p15.3, see Fig. 1); the annotation was confirmed by the perfect alignment of the exons, intron, and the 5′ and 3′ regions with the previously reported nucleotide sequence (Evans and Scarpulla, 1988; GenBank ID: 181241). It is known that the HCS gene contains two

Discussion

The 49 cyc pseudogenes we describe here present an evolutionary record of the human cytochrome c gene; our findings strongly support the hypothesis that this gene has evolved at a very rapid rate in the recent human lineage. The sequence information we report here will not only help researchers design better HCS-specific probes that avoid pseudogene complications, but will also be very useful in calibrating and estimating various evolutionary and phylogenetic models. The discovery of the common

Acknowledgements

M.G. acknowledges NIH grant 2P01GM54160-04. Z.Z. thanks Dr. Paul Harrison for comments on the manuscript and Dr. Duncan Milburn and Nat Echols for computational help.

References (39)

  • S.F. Altschul et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. (1997)
  • J.A. Bailey et al. Recent segmental duplications in the human genome. Science (2002)
  • A. Bairoch et al. The SWISS-PROT protein sequence database and its supplement TrEMBL in 2000. Nucleic Acids Res. (2000)
  • L. Banci et al. Mitochondrial cytochromes c: a comparative analysis. J. Biol. Inorg. Chem. (1999)
  • S.W. Biel et al. Isolation of a Rhodobacter capsulatus mutant that lacks c-type cytochromes and excretes porphyrins. J. Bacteriol. (1990)
  • A. Cordonnier et al. Isolation of novel human endogenous retrovirus-like elements with foamy virus-related pol sequence. J. Virol. (1995)
  • C. Esnault et al. retrotransposons generate processed pseudogenes. Nat. Genet. (2000)
  • M.J. Evans et al. The human somatic cytochrome c gene: two classes of processed pseudogenes demarcate a period of rapid molecular evolution. Proc. Natl. Acad. Sci. USA (1988)
  • P.M. Harrison et al. Molecular fossils in the human genome: identification and analysis of the pseudogenes in chromosomes 21 and 22. Genome Res. (2002)