Abstract
Schistosoma mansoni is the primary causative agent of schistosomiasis, which affects 200 million individuals in 74 countries. We generated 163,000 expressed-sequence tags (ESTs) from normalized cDNA libraries from six selected developmental stages of the parasite, resulting in 31,000 assembled sequences and 92% sampling of an estimated 14,000 gene complement. By analyzing automated Gene Ontology assignments, we provide a detailed view of important S. mansoni biological systems, including characterization of metazoa-specific and eukarya-conserved genes. Phylogenetic analysis suggests an early divergence from other metazoa. The data set provides insights into the molecular mechanisms of tissue organization, development, signaling, sexual dimorphism, host interactions and immune evasion and identifies novel proteins to be investigated as vaccine candidates and potential drug targets.
Main
Schistosomiasis is a public health problem in many developing countries, and Schistosoma mansoni is the most widespread species of the causative trematode parasite1. Parasite eggs laid in the hepatic portal vasculature are the principal cause of morbidity, and the ensuing pathology may prove fatal2. Control of the disease by chemotherapy has relied heavily on praziquantel, potentially allowing drug-resistant parasites to emerge3. Protective immune mechanisms in humans that might form the basis for a vaccine have proven difficult to characterize4 owing to effective immune evasion by parasites. Nevertheless, the successful vaccination of both rodents and primates with attenuated larvae5 indicates that the goal is feasible.
As representatives of the platyhelminths, schistosomes are the lowest group of bilateria that diverged early from the metazoan lineage6. With a blind-ending gut and no body cavity, their body plan seems simple, but tissues corresponding to the main organ systems of higher animals are present. Schistosomes have a complex life cycle, and they are among the first animals to develop sexual dimorphism and heteromorphic sex chromosomes. They are intimately associated with the gastropod mollusk intermediate and the mammalian final host, perhaps relying on host signals for development. Active transmission between hosts and internal migrations show their capacity for sophisticated neuromuscular coordination.
The large size (270 Mb; ref. 7) and complexity of the S. mansoni genome have previously deterred full-scale sequencing (see The Institute for Genomic Research and The Sanger Institute websites). Current knowledge of expressed genes is limited to a set of 163 full-length cDNAs and approximately 16,000 ESTs, 75% derived from adult worms8,9. We report here a multicenter effort to obtain and annotate extensive transcriptome data for S. mansoni, using both a normalized cDNA library10 from adults and ORESTES minilibraries from six life-cycle stages (Supplementary Fig. 1 online). This approach, based on arbitrary primers and low-stringency RT–PCR11, preferentially amplifies the central, function-defining coding regions of messages12. This first large-scale database for a bilaterian acoelomate should enhance our understanding of the evolution, biology and adaptation to parasitism of these animals and identify novel proteins to be exploited as drug targets and vaccine candidates.
Results
Transcriptome features and gene complement
We obtained 163,586 EST reads from the S. mansoni transcriptome: 151,684 using ORESTES minilibraries and 11,902 from a normalized adult worm library. All our results are from a filtered data set of 124,681 analyzed reads, which resulted in 30,988 assembled EST sequences (Table 1), called Schistosoma mansoni assembled EST sequences (SmAEs). Newly identified S. mansoni genes are listed by product in Supplementary Table 1 online. The SmAE data set is estimated to sample 92% of the S. mansoni transcriptome. Comparison of SmAEs with publicly available sequences shows that 77% represent new S. mansoni gene fragments, either novel paralogs (1%), new orthologs (20%) or fragments with unknown function (no match in GenBank; 55%; Table 1). An average SmAE sequence provides around 32% coverage of a matching gene in GenBank (Supplementary Fig. 2 online); nevertheless, 359 novel orthologs have their entire coding region fully sequenced (Supplementary Table 2 online).
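The read counts quoted above can be cross-checked with a few lines of arithmetic. This is only a sketch; the variable names are ours, and the derived ratios (fraction of reads retained, mean reads per SmAE) are not figures reported in the text.

```python
# Read counts as stated in the text.
orestes_reads = 151_684      # ORESTES minilibraries
normalized_reads = 11_902    # normalized adult worm library
filtered_reads = 124_681     # reads retained after filtering
smae_count = 30_988          # assembled EST sequences (SmAEs)

# The two library totals should add up to the reported 163,586 reads.
total_reads = orestes_reads + normalized_reads

# Derived ratios (our own illustration, not reported values).
retained_fraction = filtered_reads / total_reads
reads_per_smae = filtered_reads / smae_count

print(f"total reads: {total_reads}")
print(f"retained after filtering: {retained_fraction:.1%}")
print(f"mean reads per SmAE: {reads_per_smae:.2f}")
```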
The total number of genes in the parasite was predicted by two different methods to be around 14,000 (Table 1), comparable to the 14,000–19,000 predicted genes of other fully sequenced invertebrates13,14,15. Extrapolation from nonredundant bases acquired from adult worm ESTs indicates that 7,200 genes are expressed in this stage (Supplementary Fig. 3 online). We obtained 58,846 tags from serial analysis of gene expression (SAGE), and the number of unique tags reached a clear plateau at 6,263 (Supplementary Fig. 3 online), suggesting that almost all adult transcripts were sampled. Thus, about 50% of all S. mansoni genes are expressed in adult worms.
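The "about 50%" figure follows from dividing the adult-stage estimates by the predicted gene complement. The sketch below uses only numbers from the text; treating each unique SAGE tag as one gene is a simplifying assumption of ours.

```python
# Figures as stated in the text.
predicted_genes = 14_000        # estimated total gene complement
adult_genes_from_ests = 7_200   # extrapolated from adult worm ESTs
unique_sage_tags = 6_263        # plateau of unique SAGE tags in adults

# Fraction of the gene complement expressed in adults, by each estimate.
# Assumption: one unique SAGE tag corresponds to roughly one gene.
fraction_from_ests = adult_genes_from_ests / predicted_genes
fraction_from_sage = unique_sage_tags / predicted_genes

print(f"EST-based estimate:  {fraction_from_ests:.1%}")
print(f"SAGE-based estimate: {fraction_from_sage:.1%}")
```

Both estimates bracket the rounded value of about 50% given above.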
Functional classification of transcripts
We assigned Gene Ontology classifications to 8,001 SmAEs (Gene Ontology browser is available at the project website). The distribution of SmAEs among the main categories is shown in Supplementary Table 3 online. Protein metabolism was the most frequently identified of the biological process categories (Fig. 1a). Searching for conserved domains (in the Pfam database) showed that protein kinases were the most abundant (Fig. 1b) proteins, with 180 identified, suggesting that S. mansoni has a more compact set of protein kinases than any of the fully sequenced metazoa16. Most of the top 15 Pfam domains were from proteins involved in either intercellular communication or transcriptional regulation, which is expected for a parasite with multiple tissues and organs.
Being a metazoan
It has been proposed that the platyhelminth acoelomates, represented by S. mansoni, diverged from other eubilaterian metazoa more than a billion years ago6. As such, they lie somewhere between the unicellular eukaryotes Saccharomyces cerevisiae and Plasmodium falciparum and the more advanced invertebrates Caenorhabditis elegans, Drosophila melanogaster and Ciona intestinalis. Phylogenetic analyses (ref. 6 and Supplementary Fig. 4 online) support the ancient and independent divergence of acoelomates from other metazoa, which may explain the high fraction (55%) of SmAEs with no significant matches to sequences in GenBank. Thus, S. mansoni sequences should make an important contribution to understanding early metazoan evolution.
Metazoa-specific and eukarya-conserved sequences
We selected SmAEs that encode proteins that have been conserved among either the eukarya or the metazoa by comparison with known proteomes of organisms whose genomes have been completely sequenced. We built a metazoa-specific base set with the SmAEs that had orthologs only in each of the multicellular eukaryotes, Homo sapiens, D. melanogaster, C. elegans and C. intestinalis, but no matches with the unicellular eukaryotes, S. cerevisiae and P. falciparum, or with prokaryotes. The base set contains 1,598 sequences (∼645 genes) that may be essential to the more complex metazoan cell functions. The eukarya-conserved sequences had at least one ortholog in all of the eukaryotes listed above. This data set contains 3,194 SmAEs (∼1,443 genes), representing S. mansoni genes that would be important for eukaryotic cell functions.
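The two selection rules can be expressed as set operations over the species in which an SmAE has an ortholog. The sketch below is illustrative only: the function name, the "prokaryote" label and the example input are hypothetical, and we assume that prokaryotic matches do not disqualify a sequence from the eukarya-conserved set, which the text does not state.

```python
from typing import Set

METAZOA = {"H. sapiens", "D. melanogaster", "C. elegans", "C. intestinalis"}
UNICELLULAR = {"S. cerevisiae", "P. falciparum"}
PROKARYOTE = "prokaryote"  # hypothetical label for any prokaryotic match


def classify_smae(hit_species: Set[str]) -> str:
    """Classify one SmAE from the set of species with a matching ortholog."""
    in_all_metazoa = METAZOA <= hit_species
    in_all_unicellular = UNICELLULAR <= hit_species
    in_any_unicellular = bool(UNICELLULAR & hit_species)
    in_prokaryotes = PROKARYOTE in hit_species

    # Eukarya-conserved: an ortholog in every eukaryote listed.
    if in_all_metazoa and in_all_unicellular:
        return "eukarya-conserved"
    # Metazoa-specific: every multicellular species, and nothing else.
    if in_all_metazoa and not in_any_unicellular and not in_prokaryotes:
        return "metazoa-specific"
    return "other"


print(classify_smae(set(METAZOA)))                 # metazoa-specific
print(classify_smae(METAZOA | UNICELLULAR))        # eukarya-conserved
print(classify_smae(METAZOA | {"S. cerevisiae"}))  # other
```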
The relative distribution of SmAEs in Gene Ontology categories for the eukarya-conserved and metazoa-specific data sets (Fig. 2) shows that the latter set contains higher proportions of sequences in a few categories (cell-to-cell interactions, developmental processes, response to external stimulus and signal transduction). In general, the metazoa-specific sequences that have diverse roles in the tissues of a complex organism are overrepresented relative to the eukarya-conserved sequences.
Cell adhesion and tissue structure
As triploblastic acoelomates, schistosomes have three germ layers, bilateral symmetry, dorso-ventral patterning and rudimentary organs, for which intercellular adhesion mechanisms were an evolutionary prerequisite. The occurrence of homotypic cell adhesion is indicated by transcripts for protocadherins and the proteins that link them to the actin cytoskeleton in adherens junctions (Table 2). The small G proteins involved in actin polymerization are all present. The existence of organized tight junctions, important in maintaining the integrity of epithelia, can also be inferred, and evidence for gap junctions is provided by two pannexins/innexins. The extracellular matrix is represented by collagens, laminins and tenascins to which cells may attach by a potential integrin heterodimer; the intracellular links between integrins and the actin cytoskeleton are also evident.
The ability to undergo remodeling is a feature of organized tissues, but evidence for apoptosis is fragmentary. Some orthologs of this pathway were found (Table 2) whereas others (Bax, Bcl-2 family, endonuclease G) were not. In contrast, numerous components of autophagy were identified, apart from Apg13p and initiator Apg12p. This situation probably reflects the absence of wandering phagocytes to eliminate redundant cells.
Antero-posterior axis differentiation
S. mansoni has several axis-determining components in common with other metazoa. The presence of nanos, pumilio and the knirps gap-gene strongly suggests parallels with the mechanism used by D. melanogaster, in which maternal factors segregate to one pole of the egg and determine the antero-posterior axis. We detected the polycomb group transcripts, enhancer of zeste, polyhomeotic distal and extra sex combs, responsible for the maintenance of pattern, but none of the archetypal Hox cluster sequences. Orthologs of putative S. mansoni homeotic transcription factors included LIM-homeodomain, double homeobox protein 4 and homeotic protein Msx1.
Dorso-ventral patterning
Dorso-ventral patterning may be dictated by an analog of the TGF-β pathway. We identified activin/TGF-β receptor orthologs, Smad4, Smad8 and Medea as well as the known Smad1 and Smad2 (ref. 17). The R-Smads (Smad1, Smad2 and Smad8) are anchored to the plasma membrane by SARA, also newly identified. Specification of the dorso-ventral axis may also involve the Wnt pathway; we identified two Wnts and their transmembrane receptor frizzled as well as the cytosolic components of the intracellular signaling cascade dishevelled, axin, Gsk3 and β-catenin.
Epithelia
Adult schistosomes have three epithelia, surface tegument, gastrodermis and protonephridial canals, which control the transport of material into and out of their bodies. We found transcripts of villin family members supervillin and archvillin, which may cap and bundle actin filaments to provide an internal scaffold for cellular extensions cross-braced at their base by spectrin, also present. Functional studies have identified mediated transport of sugars, amino acids and nucleotides18. At least nine SmAEs for sugar transporters (some ATP-driven) can be added to the already cloned Sgtp1, Sgtp2 and Sgtp4 (ref. 19). We identified several transporters for lipids, amino acids, nucleotides and ions (Table 3).
Endocytosis is prominent in the gastrodermis but caveolin-type lipid rafts have also been postulated in the tegument surface20. We did not identify caveolin transcripts but did find the raft-associated flotillin. Transcripts for components of clathrin-mediated endocytosis included the clathrin heavy chain, assembly protein Ap180 and adaptor complex Ap2, which together encode all the functions to select cargo and form a vesicle. Dynamin, the master regulator of endocytosis, was present, along with phospholipid-interacting endophilin, Eps15 and epsin. In addition to low density lipoprotein–binding proteins21, transcripts for serotransferrin, low density lipoprotein and very low density lipoprotein receptors attest to the importance of receptor-mediated endocytosis.
Motility and the nervous system
All life-cycle stages have an extensive and intricately organized musculature composed of smooth fibers22, and only the cercarial tail has a form of striated muscle. We identified transcripts for several myosins, two actins, tropomyosin, paramyosin and troponins C, I and T, involved in the regulation of contraction, and for the filament attachment proteins α-actinin, vinculin and titin, many of which are novel paralogs. We found no transcripts encoding specific striated muscle proteins.
Platyhelminths are the first metazoan group to possess a central nervous system23 and have a variety of sensory structures24 that transduce a wide range of stimuli. Notch receptor, its transcription factor partner (suppressor of hairless) and membrane-bound ligand (delta) suggest a role for Notch signaling in S. mansoni neurogenesis. Transcripts for axon guidance molecules to direct nerves to their synaptic partners (netrin and its membrane receptor Unc5, two semaphorin-like and two plexin-like molecules) document the presence of a molecular repertoire for sophisticated neural circuitry. Regarding sensory structures, we identified components of the light detection system (a rhodopsin paralog of that previously described8,25, rhodopsin kinase, arrestin and transducin), the first two in eggs and germ balls, respectively, consistent with the responsiveness of miracidia and cercariae to light.
Signaling
Transcriptome analysis identifies the molecular basis for some elements of schistosome neurotransmitter/receptor systems. We found ligand-gated channels, including three versions of the nicotinic acetylcholine receptor, choline o-acetyltransferase for synthesis and acetylcholine esterase for breakdown of this inhibitory neurotransmitter. We also found a glutamate receptor and transcripts for the γ-amino butyric acid (GABA) transporter and GABA receptor–associated protein but not the inhibitory GABA receptor itself.
We found G-protein-coupled receptors for glutamate and the excitatory transmitter serotonin along with its transporter, as well as a putative muscarinic acetylcholine receptor. Although S. mansoni has been reported to respond to catecholamine26, we found no transcripts for the relevant receptors. Primitive neuroendocrine processes are known to be mediated by FaRP-type peptides27, but we found a transcript only for allatostatin precursor protein. Nevertheless, orthologs of hormone proprotein convertase 2, which processes the precursors of bioactive peptides, and its regulatory neuroendocrine protein 7B2 were present, as was glycine peptidyl α-amide monooxygenase, required for the C-terminal amidation of the resulting peptides. Proprotein convertase 2 generates the opioid peptides and enkephalin in higher animals and might have the same function in schistosomes, as these peptides have previously been reported28.
It is difficult to envisage how hormone signaling might operate in acoelomates, except over a short distance or through the neuroendocrine route. Nevertheless, two members of the nuclear receptor superfamily (retinoid-X and fushi tarazu factor 1) have been characterized29, and SmAEs for a retinoic acid receptor (RAR-γ), a thyroid hormone receptor family member, a nuclear receptor 1 and a nuclear orphan receptor Tr2/4 can be added. But detection of transcripts for thyroid hormone interactor proteins 4, 12, 13 and 15 and thyroid hormone receptor–associated proteins Trap240 and Trap80, together with the reported effect of thyroid hormone on schistosome development30, suggests that at least one nuclear orphan receptor may have a functional ligand. An ortholog of thyroid peroxidase, required to synthesize thyroid hormone, is present, but thyroglobulin, its vertebrate substrate, is not. If there is endogenous thyroid hormone, perhaps S. mansoni uses an alternative tyrosine-rich protein as a precursor.
The presence of transcripts for a series of cytochrome P450 enzymes, testosterone 6-β-hydroxylase and 17β-hydroxysteroid dehydrogenase suggests that schistosomes synthesize steroid hormones from cholesterol. They also seem to have some receptor elements (progesterone receptor membrane component 2 and estrogen-related receptor), which could bind endogenous steroids or mediate the supposed action of exogenous steroids on their maturation. Identification of other receptors for insulin and FGF, but not their ligands, reinforces the concept that host molecules act on parasite receptors. The presence of SmAEs encoding neurotensin and natriuretic peptide receptors is notable but more difficult to place in context.
Sex determination and sexual maturation
Most platyhelminths are hermaphrodites, but sexual dimorphism seems to have evolved separately on at least eight occasions, arguing for a relatively simple underlying mechanism31. Determination of sex is inherent whereas envelopment by the male is a prerequisite for female maturation32, showing the need for cross-talk. We detected orthologs of fox-1, mog-1, mog-4, tra-2 and fem-1, involved in the determination of sex in C. elegans. We also found the ortholog of mago-nashi, which in C. elegans (mag-1) specifies female development by inhibiting the hermaphrodite phenotype. The presence of the above transcripts in S. mansoni confirms their evolutionarily ancient role in sex determination, but it is unclear how they contribute to the dioecious state.
Being a parasite
Schistosomes have a prolonged association with their hosts and should therefore possess specific adaptations to the parasitic way of life. Adult worms are bathed in, and feed on, host blood, and we found transcripts for echicetin-like molecules that affect hemostasis and prevent thrombosis. Adult worms also expressed apyrase (CD39/ATP-diphosphohydrolase), an enzyme involved in platelet aggregation and thromboregulation that has been localized to the tegument33, possibly indicating the capacity to inhibit platelet activation.
Longevity
In contrast to the short lifespan of C. elegans or D. melanogaster, schistosomes have a predicted lifespan of 6–10 years34. In yeast and C. elegans, an extra copy of Sir2 or sir-2.1, implicated in chromatin silencing, can increase lifespan, and we identified orthologs to sir-2.1, sir-2.2, sir-2.5, sir-2.6 and sir-2.7 in S. mansoni. We identified SmAEs from the insulin-signaling pathway, associated with longevity in C. elegans, including Daf2, an insulin-like receptor, Age1, a phosphatidylinositol-3-OH kinase and Daf16. Daf16 is a transcription factor that regulates many genes that affect lifespan, including enzymes that protect against or repair oxidative damage35. We also identified Pdk1 and PTEN, proteins that regulate the Daf2 pathway.
Stress responses
S. mansoni undergoes rapid transitions between environments that are accompanied by temperature and osmotic stresses. We extended the list of previously described heat shock genes (23 SmAEs, 12 possibly new), which includes an HtrA ortholog, a stress-regulated serine protease. Uroplakin is believed to limit the permeability of membranes to water and small non-electrolytes36; we found an ortholog in egg, miracidia and cercaria stages. Parasites also encounter oxidative stress during host immune attack, which is dealt with by antioxidant enzymes, both previously characterized (superoxide dismutases, thioredoxin and glutathione reductases and peroxidases) and novel, including mitochondrial thioredoxin 2, a PKC-interacting thioredoxin, thioredoxin-like 2, an ortholog of Plasmodium yoelii thioredoxin, and glutaredoxin 3.
The innate immune response comprises primitive mechanisms used by metazoa in defense against infection14,15. The Toll pathway has an important role in this, and we identified several components including Tollip, pellino and NF-κB kinase (NEMO), implying that S. mansoni can respond to extracellular pathogens. The presence of transcripts for adenosine deaminase, Dicer and Piwi/argonaute indicates that S. mansoni can also deal with intracellular attack mediated by viral dsRNA. By extension, the last two genes indicate that post-transcriptional gene silencing could occur, and the use of RNA interference to suppress schistosome gene function was recently reported37,38.
Evasion of host immune responses
S. mansoni has been proposed to use several strategies to evade host immune responses, including protection of the tegument surface by a secreted membranocalyx39, molecular mimicry, antigenic variation and immunomodulation. As an example of molecular mimicry, the convergent evolution of S. mansoni and Biomphalaria glabrata (snail intermediate host) tropomyosins 1 and 2 has been suggested40 on the basis of immunological cross-reactivity and amino acid sequence identity (∼63%). We detected a new isoform, tropomyosin 3, in adults, eggs and germ balls with only 35% amino acid identity to B. glabrata, suggesting a different tissue location not subjected to the same selective pressure.
In the context of antigenic variation, we found no evidence of highly variable gene families (compared with Plasmodium), but our database identified 449 putative novel paralogs to known S. mansoni genes (Table 1); 33 of these had high identity and >30% coverage (Supplementary Table 4 online). This multiplicity of isoforms would allow the parasite to use paralogs of an essential enzyme targeted by the immune system to avoid loss of function, thus making vaccine development more difficult. Indeed, we identified several paralogs of previously investigated vaccine candidates (Supplementary Table 5 online).
Non-synonymous single-nucleotide polymorphisms (SNPs) are another source of variation. Analysis of redundant EST coverage of genes encoding vaccine candidates identified eight putative polymorphisms, two of which could be validated (see Supplementary Methods online) in isolates from different regions of the world. We detected alternative splicing in several genes, including a recently identified exon skipping in Sm14 (ref. 41) present in germ balls, schistosomula and adults.
Modulation of mammalian host immune responses by a schistosome infection is well documented, but the agents and mechanisms are not yet fully defined. The presence of transcripts for pro-inflammatory phospholipase A2-activating protein supports the documented effect of lyso-phosphatidylserine as an inducer of T-regulatory cells and Th2 polarization42. S. mansoni eggs and adults induce a characteristic allergic response43,44. The identification of a family of orthologs to wasp venom allergen 5 raises the question of how the parasite benefits from amplifying such a response.
Stage-associated frequency of sequences
The frequency of reads in a SmAE cluster obtained from different life cycle stages can reflect differential gene expression when the same set of primers is used for generating ORESTES minilibraries. We validated this approach experimentally by semi-quantitative RT–PCR (Supplementary Fig. 5 online). We analyzed 5,172 sequences obtained with the same set of primers, generating 2,058 SmAEs. We found that 82 of these had conspicuously different patterns of distribution among stages (with 99.8% confidence), several being predominant in one stage only (Fig. 3 and Supplementary Table 6 online). In particular, germ balls overexpressed elastase 2a (secretion for host invasion45), troponin I and tropomyosin 2 (muscle development), and centrin3 and S-rex/Nsp (differentiation).
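A rough calculation helps put the 82 differentially distributed SmAEs in context. The sketch below assumes that "99.8% confidence" corresponds to a per-SmAE significance level of 0.002 and that the tests are independent; both are our assumptions, and the expected number of chance findings is not a figure reported in the text.

```python
# Figures as stated in the text.
reads_analyzed = 5_172
smaes_tested = 2_058
smaes_differential = 82

# Assumption: 99.8% confidence is read as a per-test alpha of 0.002.
alpha = 0.002

fraction_differential = smaes_differential / smaes_tested
mean_reads_per_smae = reads_analyzed / smaes_tested
# Expected number of SmAEs passing the threshold by chance alone,
# assuming independent tests (our assumption).
expected_by_chance = smaes_tested * alpha

print(f"differential SmAEs: {fraction_differential:.1%} of those tested")
print(f"mean reads per SmAE: {mean_reads_per_smae:.2f}")
print(f"expected by chance at alpha={alpha}: {expected_by_chance:.1f}")
```

Under these assumptions, the 82 observed SmAEs are well above the handful expected by chance.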
Potential drug targets and multidrug-resistance genes
One main benefit from our project should be the identification of novel proteins amenable to rational drug design. Selected examples of potential molecular targets are detailed in Table 4. Existing anthelminthics46 that disrupt neurotransmission provide the rationale for one group. Paralogs of calcium channel subunits, the targets of praziquantel, and cyclophilins, which mediate the antischistosomal effect of cyclosporin, are also listed. Molecules proposed as targets in other systems include innexins (connexins of vertebrates) and DNA polymerase. We identified transcripts for several multidrug resistance transporters, however, which could complicate the development of new drugs.
Potential vaccine candidates
Potential vaccine candidates should include proteins that are preferentially surface-exposed or exported and that are expressed in intramammalian stages. These properties can be searched for using Gene Ontology categorization. Thus, orthologs of secreted toxins and surface proteins involved in cell adhesion both warrant investigation (Table 5). Three orthologs of Plasmodium circumsporozoite protein, expressed in schistosomula and adults, and an ortholog of the S. cerevisiae threonine-rich cell-wall protein may be surface-exposed. Likewise, receptors that potentially bind host hormones should be accessible to the immune system. Targeting glycosyl phosphatidyl inositol–anchored proteins or receptors for nutrients could impair vital functions in the parasite and thus provide another avenue for vaccine development.
Discussion
Our study of the S. mansoni transcriptome increases tenfold the number of ESTs available to define the gene complement of this blood fluke and will be an essential resource for annotation of its genome. Our overall impression of this member of one of the simplest extant bilaterian groups is that most, if not all, of the cellular and physiological systems of higher animals were established before the divergence of the platyhelminths. Thus, components required for tissue organization and smooth muscle function were present at an early stage of metazoan evolution. An extensive range of neurotransmitter systems and enzymes for the generation of neuropeptides and opioid peptides indicates substantial capacity for neurosecretory control of physiology. Potential components of thyroid and steroid hormone systems were identified; it will be pertinent to establish the source of ligands for the relevant receptors. Apoptosis seems to be a later evolutionary development, however, with autophagy the predominant means of removing unwanted cells.
Features of the transcriptome that can be associated with the parasitic way of life are more difficult to define. One probable reason for this is that we found no similarity for 55% of SmAEs. A singular advantage of parasitism is the ready access to a supply of nutrients, uptake of which is facilitated by a wide variety of transporters and receptors for lipids and cholesterol. With respect to immune evasion, the paucity of mechanisms for antigenic variation, compared with Plasmodium or Trypanosoma, is notable. Immune evasion by secretion of an inert bilayer masking the parasite-host interface can now be investigated by combining the transcriptome database with proteomics techniques to elucidate the architecture of the tegument surface. A similar approach should allow identification of protein immunomodulators known to be released by cercariae, adult worms and eggs.
We should not forget that S. mansoni is an important human pathogen with no vaccine and a single drug for treatment. Mining the SmAE database for drug targets and vaccine candidates should therefore be a priority. By analogy with other systems, we have singled out a number of chemotherapeutic possibilities from a potentially long list. The prediction of vaccine candidates from sequence information alone is highly speculative, but key antigens should now be identifiable by immunological studies in experimental animals and humans.
Methods
Parasites.
We maintained the BH and PR isolates of S. mansoni in the laboratory by routine passage through mice and snails and recovered parasite life cycle stages as described in Supplementary Methods online. We concentrated cercaria, schistosomula and adults by centrifugation and stored them at −20 °C in RNAlater (Ambion) according to the manufacturer's recommendations before extracting mRNA. We used freshly isolated parasites from the other stages (eggs, miracidia and germ balls) for immediate extraction of mRNA.
Construction of cDNA libraries and sequencing.
We obtained DNase-treated mRNA with MACs mRNA isolation kits (Miltenyi Biotec) and used it to construct cDNA and SAGE libraries. We carried out cDNA synthesis and amplification using the ORESTES protocol with modifications12,47 (see Supplementary Methods online). We prepared normalized poly-dT-primed cDNA libraries as previously described10 using the abundantly available mRNA from adult worms. We sequenced cDNA using standard fluorescence-labeling dye-terminator protocols. To analyze differential gene expression, we used a set of six primers to construct ORESTES cDNA minilibraries from all stages. Sequencing of at least two 96-well plates per library resulted in at least 140 sequences per stage per primer (see Supplementary Methods online).
EST processing pipeline and annotation.
We stored, processed and trimmed EST sequence chromatograms through a web-based service48 and accepted sequences with at least 100 bp with phred-15 or higher for further evaluation. We filtered sequences using BLASTN analysis with a local copy of GenBank NT database and the BlastMachine (Paracel) to eliminate those that matched non-S. mansoni sequences with E ≤ 10⁻¹⁵ and had at least 98% identity along at least 75 nucleotides. We also excluded reads that matched S. mansoni ribosomal or mitochondrial sequences and transposon sequences with E ≤ 10⁻¹⁵ and at least 85% identity along at least 75 nucleotides or that matched bacterial sequences with E ≤ 10⁻²⁰ and at least 95% identity along at least 75 nucleotides. We filtered further transposon and bacterial sequences by comparing with BLASTX against the set of transposon and bacterial sequences from GenBank NR and eliminating those with matching E ≤ 10⁻⁴ and at least 30% identity along at least 75 amino acids with transposons or matching E ≤ 10⁻⁶ and at least 95% identity along at least 75 amino acids with bacteria. We clustered and assembled ESTs using CAP3 (ref. 49). We assigned putative protein products to SmAEs based on BLASTX hits to National Center for Biotechnology Information's NR database. We assigned Gene Ontology terms to SmAEs based on BLASTX hits against a database locally built from public sequences associated with Gene Ontology terms. The public Gene Ontology annotated data sets used were from H. sapiens, D. melanogaster, Arabidopsis thaliana, Oryza sativa, C. elegans, S. cerevisiae, Schizosaccharomyces pombe and Vibrio cholerae plus a curated sequence database (Gene Ontology Annotation at EBI) available at the Gene Ontology Consortium website. In both cases, we used E ≤ 10⁻⁶ as the BLASTX cut-off. We used ESTscan to deduce amino acid sequences and used them as queries against the Pfam database 7.8.
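The contaminant-filtering thresholds described above amount to a small rule table. The sketch below is an illustration, not the pipeline that was actually run: the `Hit` record, the category labels and the function name are hypothetical, and a real pipeline would populate such records from parsed BLAST output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One BLAST hit for a read (hypothetical record layout)."""
    program: str     # "blastn" or "blastx"
    category: str    # hypothetical label for the matched sequence class
    evalue: float
    identity: float  # percent identity
    length: int      # aligned nucleotides (BLASTN) or amino acids (BLASTX)


# (program, category) -> (max E-value, min % identity, min aligned length),
# transcribed from the thresholds stated in the text.
RULES = {
    ("blastn", "non_smansoni"): (1e-15, 98.0, 75),
    ("blastn", "rrna_mito_transposon"): (1e-15, 85.0, 75),
    ("blastn", "bacterial"): (1e-20, 95.0, 75),
    ("blastx", "transposon"): (1e-4, 30.0, 75),
    ("blastx", "bacterial"): (1e-6, 95.0, 75),
}


def should_discard(hit: Hit) -> bool:
    """Return True if the hit meets every threshold of its exclusion rule."""
    rule = RULES.get((hit.program, hit.category))
    if rule is None:
        return False  # no exclusion rule applies to this kind of hit
    max_evalue, min_identity, min_length = rule
    return (
        hit.evalue <= max_evalue
        and hit.identity >= min_identity
        and hit.length >= min_length
    )


print(should_discard(Hit("blastn", "non_smansoni", 1e-20, 99.0, 80)))  # True
print(should_discard(Hit("blastn", "non_smansoni", 1e-20, 97.0, 80)))  # False
```

Keeping the thresholds in one table makes it easy to see that all five rules share the 75-residue minimum and differ only in E-value and identity.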
SAGE.
We constructed a SAGE library with mRNA derived from adult worms (males and females) using the I-SAGE Kit (Invitrogen). We treated poly(A)+ mRNA with DNase before extraction with oligo-dT. We cloned and sequenced concatamers and derived tags from high-quality sequence segments. To determine the relative abundance of transcripts in adult worms, we compared the SAGE tag list with the complete SmAE data set and with all full-length cDNA sequences from S. mansoni.
Phylogeny inferences.
We aligned protein sequences using the ClustalX multiple sequence alignment program. Only unambiguous positions were used in the phylogenetic analysis. We generated phylogenetic trees using the Phylip program as described in Supplementary Methods online.
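One simple way to restrict an alignment to unambiguous positions is to drop every column that contains a gap or an ambiguity code. The sketch below takes that reading, which is our assumption; the criteria actually applied are described in Supplementary Methods and may differ. The function name and the example alignment are hypothetical.

```python
from typing import Dict


def keep_unambiguous_columns(
    alignment: Dict[str, str], ambiguous: str = "-X?"
) -> Dict[str, str]:
    """Drop alignment columns in which any sequence has a gap or ambiguity code."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    n_columns = lengths.pop()

    # Keep a column only if every sequence has a definite residue there.
    kept = [
        i
        for i in range(n_columns)
        if all(seq[i] not in ambiguous for seq in alignment.values())
    ]
    return {name: "".join(seq[i] for i in kept) for name, seq in alignment.items()}


example = {"seqA": "MK-LV", "seqB": "MKTLX", "seqC": "MRTLV"}
print(keep_unambiguous_columns(example))
```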
Differential expression analysis.
To evaluate differential expression, we assembled the ORESTES sequences derived from six primers across all six life cycle stages and considered the number of reads per stage for each cluster as an indirect inference of the expression level in the stage. Sequences with a differential frequency of reads by stage (99.8% confidence) when analyzed by a randomization test50 are discussed. Hierarchical clustering of these data was done using correlation distance UPGMA as provided in the Spotfire for Functional Genomics software (Spotfire). We carried out semi-quantitative RT–PCR to confirm differential expression of three selected genes (see Supplementary Methods online).
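The idea behind a randomization test on read counts can be sketched as follows. This is a generic Monte Carlo version under our own assumptions (a chi-square statistic, and a null model in which a cluster's reads fall in each stage in proportion to that stage's library size); it is not necessarily the procedure of ref. 50, and all names and example counts are hypothetical.

```python
import random
from typing import List, Sequence


def chi_square(observed: Sequence[float], expected: Sequence[float]) -> float:
    """Pearson chi-square statistic for observed vs expected counts."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def randomization_p_value(
    reads_per_stage: Sequence[int],
    library_sizes: Sequence[int],
    n_sim: int = 2000,
    seed: int = 0,
) -> float:
    """Monte Carlo p-value for uneven distribution of a cluster's reads by stage."""
    rng = random.Random(seed)
    n_stages = len(reads_per_stage)
    total_reads = sum(reads_per_stage)
    total_size = sum(library_sizes)
    # Null expectation: reads proportional to library size per stage.
    expected = [total_reads * size / total_size for size in library_sizes]
    observed_stat = chi_square(reads_per_stage, expected)

    at_least_as_extreme = 0
    for _ in range(n_sim):
        # Reassign every read to a stage at random under the null model.
        draws = rng.choices(range(n_stages), weights=library_sizes, k=total_reads)
        counts: List[int] = [0] * n_stages
        for stage in draws:
            counts[stage] += 1
        if chi_square(counts, expected) >= observed_stat:
            at_least_as_extreme += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return (at_least_as_extreme + 1) / (n_sim + 1)


sizes = [1000] * 6  # hypothetical equal-sized stage libraries
p_skewed = randomization_p_value([50, 0, 0, 0, 0, 0], sizes)
p_even = randomization_p_value([10, 10, 10, 10, 10, 10], sizes)
print(p_skewed < 0.002, p_even)
```

With 2,000 simulations the smallest attainable p-value is 1/2001, which is below the 0.002 level implied by 99.8% confidence.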
SNP analysis.
We identified putative SNPs in S. mansoni genes using Polybayes as described in Supplementary Methods online. We selected a fraction of the putative SNPs in vaccine candidates for experimental validation using DNA derived from pooled adult worms (see Supplementary Methods online).
URLs.
Project website including Schistosoma Gene Ontology browser, BLAST server and SmAEs search tools, http://bioinfo.iq.usp.br/schisto/; The Institute for Genomic Research S. mansoni genome project, http://www.tigr.org/tdb/e2k1/sma1/; The Sanger Institute S. mansoni genome project, http://www.sanger.ac.uk/Projects/S_mansoni/; The Phred/Phrap/Consed System Home Page, http://www.phrap.org/; National Center for Biotechnology Information, http://www.ncbi.nlm.nih.gov/BLAST/; Gene Ontology Consortium, http://www.geneontology.org/; ESTScan2 server, http://www.ch.embnet.org/software/ESTScan2.html; Pfam server, http://www.sanger.ac.uk/Software/Pfam/.
Accession numbers.
Sequences were deposited in GenBank under accession numbers CD059164–CD088507, CD088510–CD120734, CD120740–CD150744 and CD151578–CD202980. SNPs identified in this study were deposited in dbSNP at National Center for Biotechnology Information under the accession numbers ss8486502–ss8486509.
Note: Supplementary information is available on the Nature Genetics website.
References
World Health Organization. TDR Strategic Direction for Research: Schistosomiasis (World Health Organization, Geneva, 2002).
King, C.L. Initiation and regulation of disease in schistosomiasis. in Schistosomiasis (ed. Mahmoud, A.A.F.) 213–264 (Imperial College Press, London, 2001).
Doenhoff, M.J., Kusel, J.R., Coles, G.C. & Cioli, D. Resistance of Schistosoma mansoni to praziquantel: is there a problem? Trans. R. Soc. Trop. Med. Hyg. 96, 465–469 (2002).
Dunne, D. & Mountford, A. Resistance to infection in humans and animal models. in Schistosomiasis (ed. Mahmoud, A.A.F.) 133–211 (Imperial College Press, London, 2001).
Coulson, P.S. The radiation-attenuated vaccine against schistosomes in animal models: paradigm for a human vaccine? Adv. Parasitol. 39, 271–336 (1997).
Hausdorf, B. Early evolution of the bilateria. Syst. Biol. 49, 130–142 (2000).
Simpson, A.J., Sher, A. & McCutchan, T.F. The genome of Schistosoma mansoni: isolation of DNA, its size, bases and repetitive sequences. Mol. Biochem. Parasitol. 6, 125–137 (1982).
Santos, T.M. et al. Analysis of the gene expression profile of Schistosoma mansoni cercariae using the expressed sequence tag approach. Mol. Biochem. Parasitol. 103, 79–97 (1999).
Williams, S.A. & Johnston, D.A. Helminth genome analysis: the current status of the filarial and schistosome genome projects. Filarial Genome Project. Schistosome Genome Project. Parasitology 118 Suppl, S19–S38 (1999).
Soares, M.B. et al. Construction and characterization of a normalized cDNA library. Proc. Natl. Acad. Sci. USA 91, 9228–9232 (1994).
Dias-Neto, E. et al. Minilibraries constructed from cDNA generated by arbitrarily primed RT–PCR: an alternative to normalized libraries for the generation of ESTs from nanogram quantities of mRNA. Gene 186, 135–142 (1997).
Dias-Neto, E. et al. Shotgun sequencing of the human transcriptome with ORF expressed sequence tags. Proc. Natl. Acad. Sci. USA 97, 3491–3496 (2000).
Adams, M.D. et al. The genome sequence of Drosophila melanogaster. Science 287, 2185–2195 (2000).
Dehal, P. et al. The draft genome of Ciona intestinalis: insights into chordate and vertebrate origins. Science 298, 2157–2167 (2002).
The C. elegans Sequencing Consortium. Genome sequence of the nematode C. elegans: a platform for investigating biology. Science 282, 2012–2018 (1998).
Manning, G., Whyte, D.B., Martinez, R., Hunter, T. & Sudarsanam, S. The protein kinase complement of the human genome. Science 298, 1912–1934 (2002).
Osman, A., Niles, E.G. & LoVerde, P.T. Identification and characterization of a Smad2 homologue from Schistosoma mansoni, a transforming growth factor-β signal transducer. J. Biol. Chem. 276, 10072–10082 (2001).
Pappas, P.W. Membrane transport in helminth parasites: a review. Exp. Parasitol. 37, 469–530 (1975).
Skelly, P.J., Kim, J.W., Cunningham, J. & Shoemaker, C.B. Cloning, characterization, and functional expression of cDNAs encoding glucose transporter proteins from the human parasite Schistosoma mansoni. J. Biol. Chem. 269, 4247–4253 (1994).
Racoosin, E.L., Davies, S.J. & Pearce, E.J. Caveolae-like structures in the surface membrane of Schistosoma mansoni. Mol. Biochem. Parasitol. 104, 285–297 (1999).
Xu, X. & Caulfield, J.P. Characterization of human low density lipoprotein binding proteins on the surface of schistosomula of Schistosoma mansoni. Eur. J. Cell Biol. 57, 229–235 (1992).
Mair, G.R., Maule, A.G., Day, T.A. & Halton, D.W. A confocal microscopical study of the musculature of adult Schistosoma mansoni. Parasitology 121, 163–170 (2000).
Halton, D.W. & Gustafsson, M.K.S. Functional morphology of the platyhelminth nervous system. Parasitology 113, S47–S72 (1996).
Dorsey, C.H., Cousin, C.E., Lewis, F.A. & Stirewalt, M.A. Ultrastructure of the Schistosoma mansoni cercaria. Micron 33, 279–323 (2002).
Hoffmann, K.F., Davis, E.M., Fischer, E.R. & Wynn, T.A. The guanine protein coupled receptor rhodopsin is developmentally regulated in the free-living stages of Schistosoma mansoni. Mol. Biochem. Parasitol. 112, 113–123 (2001).
Pax, R.A. & Bennett, J.L. Neurobiology of parasitic platyhelminths: possible solutions to the problems of correlating structure with function. Parasitology 102 Suppl, S31–S39 (1991).
Smart, D. et al. Peptides related to the Diploptera punctata allatostatins in nonarthropod invertebrates: an immunocytochemical survey. J. Comp. Neurol. 347, 426–432 (1994).
Pryor, S.C. & Elizee, R. Evidence of opiates and opioid neuropeptides and their immune effects in parasitic invertebrates representing three different phyla: Schistosoma mansoni, Theromyzon tessulatum, Trichinella spiralis. Acta Biol. Hung. 51, 331–341 (2000).
de Mendonca, R.L., Escriva, H., Bouton, D., Laudet, V. & Pierce, R.J. Hormones and nuclear receptors in schistosome development. Parasitol. Today 16, 233–240 (2000).
Saule, P. et al. Early variations of host thyroxine and interleukin-7 favor Schistosoma mansoni development. J. Parasitol. 88, 849–855 (2002).
Snyder, S.D., Loker, E.S., Johnston, D.A. & Rollinson, D. The Schistosomatidae: Advances in Phylogenetics and Genomics. in The Interrelationships of Platyhelminthes (eds. Littlewood, D.T.J. & Bray, R.A.) 194–199 (Taylor and Francis, London, 2000).
Basch, P.F. Schistosoma mansoni: nucleic acid synthesis in immature females from single-sex infections, paired in vitro with intact males and male segments. Comp. Biochem. Physiol. B 90, 389–392 (1988).
DeMarco, R., Kowaltowski, A.T., Mortara, R.A. & Verjovski-Almeida, S. Molecular characterization and immunolocalization of Schistosoma mansoni ATP-diphosphohydrolase. Biochem. Biophys. Res. Commun. 307, 831–838 (2003).
Fulford, A.J., Butterworth, A.E., Ouma, J.H. & Sturrock, R.F. A statistical approach to schistosome population dynamics and estimation of the life-span of Schistosoma mansoni in man. Parasitology 110 (Pt 3), 307–316 (1995).
Murphy, C.T. et al. Genes that act downstream of DAF-16 to influence the lifespan of Caenorhabditis elegans. Nature 424, 277–283 (2003).
Hu, P. et al. Role of membrane proteins in permeability barrier function: uroplakin ablation elevates urothelial permeability. Am. J. Physiol. Renal Physiol. 283, F1200–F1207 (2002).
Skelly, P.J., Da'dara, A. & Harn, D.A. Suppression of cathepsin B expression in Schistosoma mansoni by RNA interference. Int. J. Parasitol. 33, 363–369 (2003).
Boyle, J.P., Wu, X.J., Shoemaker, C.B. & Yoshino, T.P. Using RNA interference to manipulate endogenous gene expression in Schistosoma mansoni sporocysts. Mol. Biochem. Parasitol. 128, 205–215 (2003).
Wilson, R.A. & Barnes, P.E. The formation and turnover of the membranocalyx on the tegument of Schistosoma mansoni. Parasitology 74, 61–71 (1977).
Dissous, C. & Capron, A. Convergent evolution of tropomyosin epitopes. Parasitol. Today 11, 45–46 (1995).
Ramos, C.R. et al. Gene structure and M20T polymorphism of the Schistosoma mansoni Sm14 fatty acid-binding protein. Molecular, functional, and immunoprotection analysis. J. Biol. Chem. 278, 12745–12751 (2003).
van der Kleij, D. et al. A novel host-parasite lipid cross-talk. Schistosomal lyso-phosphatidylserine activates toll-like receptor 2 and affects immune polarization. J. Biol. Chem. 277, 48122–48129 (2002).
Cutts, L. & Wilson, R.A. Elimination of a primary schistosome infection from rats coincides with elevated IgE titres and mast cell degranulation. Parasite Immunol. 19, 91–102 (1997).
Damonneville, M., Pierce, R.J., Verwaerde, C. & Capron, A. Allergens of Schistosoma mansoni. II. Fractionation and characterization of S. mansoni egg allergens. Int. Arch. Allergy Appl. Immunol. 73, 248–255 (1984).
Salter, J.P. et al. Cercarial elastase is encoded by a functionally conserved gene family across multiple species of schistosomes. J. Biol. Chem. 277, 24618–24624 (2002).
Mansour, T.E. Chemotherapeutic Targets in Parasites (Cambridge University Press, Cambridge, 2002).
Fietto, J.L., DeMarco, R. & Verjovski-Almeida, S. Use of degenerate primers and touchdown PCR for construction of cDNA libraries. Biotechniques 32, 1404–1411 (2002).
Paquola, A., Nishiyama, M. Jr., Reis, E.M., daSilva, A.M. & Verjovski-Almeida, S. ESTWeb: bioinformatics services for EST sequencing projects. Bioinformatics 19, 1587–1588 (2003).
Huang, X. & Madan, A. CAP3: A DNA sequence assembly program. Genome Res. 9, 868–877 (1999).
Stekel, D.J., Git, Y. & Falciani, F. The comparison of gene expression from multiple cDNA libraries. Genome Res. 10, 2055–2061 (2000).
Acknowledgements
E.D.N. thanks Associação Beneficente Alzira Denise Hertzog da Silva for financial support, D. Rollinson for providing schistosome isolates from Africa and Lebanon and M.G. dos Reis and N. Lucena for providing isolates from northeast Brazil. This project was financed by Fundação de Amparo à Pesquisa do Estado de São Paulo and by the Brazilian Ministry of Science and Technology, Conselho Nacional de Desenvolvimento Científico e Tecnológico. The York schistosomiasis group received support from the Biology and Biotechnology Science Research Council, Wellcome Trust and the European Commission Research for Development Programme, Sector Health.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Cite this article
Verjovski-Almeida, S., DeMarco, R., Martins, E. et al. Transcriptome analysis of the acoelomate human parasite Schistosoma mansoni. Nat Genet 35, 148–157 (2003). https://doi.org/10.1038/ng1237
DOI: https://doi.org/10.1038/ng1237