Introduction

The genus Coffea (family Rubiaceae) encompasses approximately 100 species, all of which are native to the African continent, Madagascar, and the Mascarene Islands (Davis et al. 2006). Two of these species, Coffea canephora and Coffea arabica, are widely cultivated for the production of the coffee beverage. The former is diploid and allogamous; the latter is allotetraploid and preferentially autogamous. Approximately 70% of world coffee production comes from C. arabica versus 30% from C. canephora. However, C. canephora is the main source of raw material for soluble coffee.

While it is not widely known, coffee is one of the most valuable internationally traded agricultural commodities. This is reflected in the fact that the exchange for raw coffee ranks fourth in value on the international market, behind only wheat, sugar, and soya (Pendergrast 2009). Additionally, more than 25 million people worldwide are involved in coffee cultivation and processing. Despite this economic importance, coffee research suffers from a lack of investment in terms of both scientific and financial resources. Moreover, coffee is a perennial plant with a seed-to-seed generation time of about 5 years, which makes genetic studies more difficult and time consuming. While some genomic information is publicly available for coffee (e.g., an expressed sequence tag (EST) database; Lin et al. 2005; Poncet et al. 2006), it lags far behind what is available for many other agricultural species. As a result, coffee researchers have only limited access to the plethora of genomic resources available for most major crop species.

Comparative genomics provides the opportunity to leverage genetic/genomic information from one species to another via comparative genetic maps (e.g., Tanksley et al. 1992; Livingstone et al. 1999; Mysore et al. 2001; Dirlewanger et al. 2004; Srinivasachary et al. 2007; Wu et al. 2009a, b). For example, this comparative approach has been used to leverage genomic information from the model species Arabidopsis thaliana to related crop species (Van Dodeweerd et al. 1999; Grant et al. 2000; Ku et al. 2000; Bevan and Walsh 2005). However, the value of comparative genomic information is inversely proportional to the evolutionary distance between the species being compared. Hence, genomic information from rice has been of great benefit to other crop species in the grass family, but has been used minimally for species from other families (Srinivasachary et al. 2007; Wilson et al. 1999). Attempts have been made to apply genomic information from Arabidopsis to coffee for synteny studies (Mahé et al. 2007). However, because coffee is only distantly related to Arabidopsis, this genome comparison is difficult to perform and is therefore of limited utility. Phylogenetically, the model species most closely related to coffee for which significant genetic and genomic resources exist is tomato. Both coffee and tomato belong to the Euasterid I clade of flowering plants and are likely derived from an ancestral species with a haploid number of x = 11 or 12 (Wu et al. 2006).

The tomato genome has already been linked to the genomes of other Solanaceous crops (e.g., potato, eggplant, pepper) via comparative genetic maps (Tanksley et al. 1992; Livingstone et al. 1999). Moreover, comparative bacterial artificial chromosome (BAC) sequencing has shown a high degree of microsynteny conservation among these species (Wang et al. 2008). This high level of conservation can be attributed, at least in part, to the absence of polyploidization in the ancestral genome that gave rise to the family Solanaceae (Wang et al. 2008). Furthermore, recent studies suggest that the Solanaceae and Rubiaceae (which contains coffee) families descended from a diploid ancestor without any recent whole genome duplication (Wu et al. 2006). Hence, one might expect a reasonable level of both macrosynteny and microsynteny (Guyot et al. 2009) between the tomato and coffee genomes. If comparative genetic maps could be generated for coffee and tomato, they might allow coffee researchers to access the large set of genetic and genomic data currently available for tomato, including the genome sequence of tomato, which is currently being generated by a consortium of international scientists (http://www.sgn.cornell.edu/; Mueller et al. 2005a, b, 2009). The availability of a comparative map may open the way for coffee researchers to use the tomato genetic map and genome sequence to predict the chromosomal positions of the large number of coffee ESTs currently available (Lin et al. 2005, http://sgn.cornell.edu/search/library_search.pl?w5c4_library_parameters=canephora).

Coffee and tomato diverged from their last common ancestor approximately 86 million years ago, a long period of evolutionary time that has allowed many chromosomal rearrangements to accumulate (Wikstrom et al. 2001). Therefore, a detailed understanding of the comparative relationships between the coffee and tomato genomes will likely have to await the sequencing of both genomes. However, the generation of comparative genetic maps for tomato and coffee, as described in this paper, may still allow cross use of genetic information between the two species. The predictive value of these comparative maps will be determined in part by the extent to which the two genomes were rearranged subsequent to divergence from their last common ancestor.

Recently, a set of polymerase chain reaction (PCR)-based ortholog markers (Wu et al. 2006) has been used to compare the genomes of solanaceous species (tomato, eggplant (Wu et al. 2009a), pepper (Wu et al. 2009b), and tobacco (Wu et al. 2009c)) and of a related taxon, the Rubiaceae species coffee. These markers are referred to as Conserved Orthologous Set (COSII) markers. The primary objective of the current study is to construct a comparative map of the coffee and tomato genomes based on these COSII markers. This represents the first detailed comparative map between species belonging to different families and might significantly enhance the potential for genetic/genomic research in coffee.

Materials and methods

Tomato mapping population

The tomato map is based on 80 F2 individuals from the cross Solanum lycopersicum LA925 × Solanum pennellii LA716 (Wu et al. 2009a, b). The complete tomato map, including all COSII markers, is available at http://sgn.cornell.edu/cview/map.pl?map_version_id=52

Coffee mapping population

The genetic diversity of C. canephora is subdivided into two main groups, Guinean and Congolese, which come from two different geographic origins (Dussert et al. 1999; Gomez et al. 2009). A consensus reference genetic map of C. canephora has been developed at the Nestle Tours Centre in collaboration with the Indonesian ICCRI institute (Crouzillat et al. unpublished data). This map derives from a cross between two highly heterozygous genotypes, a Congolese group genotype (BP409) and a Congolese-Guinean hybrid parent (Q121). This same population was used to map 396 COSII loci. The segregating population is composed of 93 F1 individuals.

Identification of coffee COSII orthologs and primer design for mapping

It was first necessary to identify the coffee ortholog corresponding to each tomato COSII gene. This was accomplished by using the publicly available coffee EST database (Lin et al. 2005; Poncet et al. 2006) and employing the search/verification methods described in Wu et al. (2006). Coffee unigene sequences were downloaded for each COSII marker from the Coffee Genomic Network Website (http://coffee.pgn.cornell.edu/index.pl) or the SOL website (http://www.sgn.cornell.edu/). The corresponding Arabidopsis cDNA and genomic sequences were downloaded for the same COSII from the TAIR website (http://www.arabidopsis.org/). Based on three-way alignments (coffee unigene: Arabidopsis cDNA: Arabidopsis genomic sequence) produced with ClustalW (http://bioinfo.hku.hk/services/analyseq/cgi-bin/clustalw_in.pl), it was possible to identify the predicted positions of introns for each of the coffee COSII genes. Coffee primers were then designed in exons flanking one or more introns. Amplified introns were expected to carry more polymorphic sites than exons and thus to be more useful in the subsequent genetic mapping experiments. Two different software programs were used for primer design: Primer Express and Primer Select. Primer design conditions were as follows: predicted amplicon length ≤ 600 nt, primer length 17-24 nt, Tm ≥ 55°C, hairpins (between and within primers) ≤ 4. PCR amplifications were performed at 55°C using total DNA from BP409 and Q121. For each PCR product, 10 µl were loaded on a 3% agarose gel in order to determine PCR product lengths prior to DNA sequencing. PCR products were sequenced on both strands for the two parents of the mapping population. As an identity control, the sequences of the PCR products, derived from genomic DNA, were compared with the coffee EST-based unigene sequences. In order to identify polymorphisms, BP409 and Q121 sequences were compared using SeqScape software or the SeqDoc website (http://research.imb.uq.edu.au/seqdoc/). With this approach, it was possible to identify simple sequence repeat (SSR), single-nucleotide polymorphism (SNP), and insertion/deletion polymorphisms in coffee COSII sequences that were useful for genetic mapping.
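The primer design conditions listed above can be expressed as a simple filter. The following is a minimal sketch only: the `PrimerPair` structure and its field names are hypothetical, and the Tm and hairpin values are assumed to be supplied by the primer design software rather than computed here.

```python
from dataclasses import dataclass


@dataclass
class PrimerPair:
    """Hypothetical record for one candidate primer pair."""
    forward: str            # forward primer sequence
    reverse: str            # reverse primer sequence
    amplicon_length: int    # predicted amplicon length (nt)
    tm_forward: float       # Tm reported by the design software (deg C)
    tm_reverse: float
    hairpin_score: float    # hairpin value reported by the design software


def meets_design_criteria(pair: PrimerPair) -> bool:
    """Apply the thresholds stated in the text to one primer pair."""
    lengths_ok = all(17 <= len(seq) <= 24 for seq in (pair.forward, pair.reverse))
    return (
        lengths_ok
        and pair.amplicon_length <= 600
        and min(pair.tm_forward, pair.tm_reverse) >= 55.0
        and pair.hairpin_score <= 4
    )
```

Keeping the thresholds in one predicate makes it straightforward to screen the output of either design program against the same conditions.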

Coffee genetic mapping

COSII markers were mapped into the coffee mapping population (BP409 × Q121) using three types of polymorphism:

Restriction fragment length polymorphism (RFLP)

Several COSII EST amplicons derived from C. canephora (Wu et al. 2006) were used as a source of DNA for 32P-labeled probes. The screening and mapping of the polymorphic COSII probes were performed using eight restriction enzymes (DraI, EcoRV, HindIII, HaeIII, RsaI, ScaI, HincII, and PvuII).

SSR

Primer pairs were designed in order to obtain PCR amplicon lengths ranging from 100 to 300 bp. Forward primers were tagged with one of the fluorescent dyes FAM™, VIC®, NED™, or PET™. Reagent volumes were 2.5 µl of genomic DNA, 10.5 µl of AmpliTaq Gold Master Mix (Applied Biosystems™), 1 µl of each primer (initial concentration of 20 µM), and 10.5 µl of water. PCR conditions were 94°C for 10 min, 30 cycles of 94°C/55°C/72°C with respective durations of 30/30/60 s, and finally 72°C for 7 min before cooling to 4°C. The mix for electrophoresis was 2 µl of PCR product, 20 µl of formamide, and 0.5 µl of Genscan 500Liz size standard (Applied Biosystems™), and alleles were detected using an ABI Prism 3100 analyzer (Applied Biosystems™). Allele detection and identification were then computed with “GeneScan Analysis Software”, and segregation was determined using “Genotyper” software.

SNP

SNPs identified in sequenced amplicons from the two mapping parents were mapped using the MGB® (Minor Groove Binder) TaqMan® technology with hot start DNA polymerase; one primer pair and two TaqMan® probes were designed by Applied Biosystems for each SNP. The probes were tagged with FAM™ or VIC® fluorescent dyes to discriminate the two segregating alleles from the heterozygous parent(s). Reagent volumes for one genotyping assay were 2.5 µl of genomic DNA, 0.6 µl of 40X MGB® probes Master Mix, 9.4 µl of water, and 12.5 µl of TaqMan® 2X Universal PCR Master Mix (Applied Biosystems™). PCR reactions were performed with an ABI Prism 7500 thermocycler (Applied Biosystems™) under the following conditions: hot start DNA polymerase activation at 95°C for 10 min, followed by an amplification step of 40 cycles of 95°C for 15 s and 60°C for 1 min. The end point reading for genotype assessment was made at the last step. Progeny genotyping was performed using an allelic discrimination assay. Results were analyzed with “7500 System Software”.

Additional SNPs were detected using high resolution melting technology. After detection of a SNP in one or both parents of the population, primer pairs were designed to amplify a fragment between 150 and 500 bp in length. Reagent components for one genotyping assay were 2 µl of genomic DNA, 10 µl of Master Mix (2× containing fluorescent dye), 1 µl of primer mix (final concentration of 0.2 µM for each primer), 2.4 µl of MgCl2, and 3.6 µl of water. PCR reactions were performed with a LightCycler® 480 thermocycler (Roche Life Sciences). The PCR conditions were as follows: activation of DNA polymerase and DNA denaturation by one cycle at 95°C for 10 min, followed by amplification of target DNA by 45 cycles of 95°C for 10 s, 60°C for 15 s, and 72°C for 25 s. The last step was the high resolution melting of PCR products, carried out in three stages: denaturation of PCR products at 95°C for 1 min, renaturation at 40°C for 1 min, and finally melting over an interval from 60°C to 95°C with a ramp rate of 0.02°C/s. During this last stage, fluorescence was read 25 times per degree. Analysis of melting curves and segregation of alleles were performed using “Genescanning” software.

The linkage analysis and map calculations were performed using JoinMap® software version 4 (Stam 1993; Van Ooijen 2006). The population type was set to cross-pollinated, as the population under study resulted from a cross between two heterozygous diploid parents with no information available about linkage phases. The genetic maps of the two parents, BP409 and Q121, were constructed separately, since COSII markers could be mapped on one (backcross-type segregation) or both (F2-type segregation) of these parental genetic maps owing to the allogamous status of C. canephora. Linkage groups were defined at a logarithm of odds (LOD) score of 3 or more. Kosambi’s function was used to calculate genetic distances between loci. The coffee COSII genetic map was then built using F2-type segregating loci as anchor markers in order to merge homologous parental linkage groups. The order of loci on the consensus genetic map was compared with that on both parental genetic maps. If the order of markers on the consensus genetic map was clearly different from the order on the parental maps, the order of the parental linkage group was taken as a fixed reference order. Genetic maps were drawn using MapChart software version 2.1 (Voorrips 2002).
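For reference, Kosambi's mapping function converts a recombination fraction r into a map distance d = 25 ln((1 + 2r)/(1 − 2r)) cM. The sketch below illustrates only this standard formula and its inverse; it is not a description of JoinMap's internal estimation procedure, and the function names are hypothetical.

```python
import math


def kosambi_cm(r: float) -> float:
    """Map distance in cM for a recombination fraction r (0 <= r < 0.5)."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must satisfy 0 <= r < 0.5")
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


def kosambi_r(d_cm: float) -> float:
    """Inverse of kosambi_cm: recombination fraction for a distance in cM."""
    return 0.5 * math.tanh(d_cm / 50.0)
```

For small r the function is nearly linear (1% recombination is close to 1 cM), while larger recombination fractions are stretched to account for partial crossover interference.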

Tomato chromosome 7 genomic sequence

The sequence of tomato chromosome 7 is currently being generated as part of the international initiative aimed at sequencing the entire tomato genome (M. Bouzayen personal communication, Mueller et al. 2009). In the current study, we utilized 15.25 Mb of non-redundant sequence generated for chromosome 7. Given that the sequencing effort is targeted to the gene-rich region of the genome, most of the sequence corresponds to the euchromatic part of tomato chromosome 7. Based on the chromosomal location of the sequenced BACs determined by BAC-fluorescence in situ hybridization (FISH) mapping, it has been estimated that 8.96 Mb of the sequence corresponds to euchromatin, whereas 6.29 Mb corresponds to heterochromatin (M. Bouzayen personal communication). Among the 189 BACs and fosmids sequenced on chromosome 7, 165 are distributed into 39 contigs; the remainder are singletons. The sequences of 160 of the 189 chromosome 7 BACs and fosmids used in this study are available at GenBank.

In order to study the synteny between tomato chromosome 7 and coffee linkage groups E and F, for each COSII marker mapped on either of these two linkage groups, the corresponding tomato unigene (available at ftp://ftp.sgn.cornell.edu/COSII/tomato_mapping/) or the Arabidopsis ortholog (when no tomato unigene was identified for the marker) was searched by BLAST against the tomato chromosome 7 BAC sequences (Altschul et al. 1997). The marker was assigned to tomato chromosome 7 if the sequence identity between the tomato unigene and the exonic part of the BAC sequences exceeded 99%, or if the Arabidopsis ortholog had a single match on the BAC sequences. In addition, we screened all available (up to 2,800) COSII sequences by a BLAST search against chromosome 7 to identify new COSII markers on chromosome 7. The COSII sequences selected by this means were then mapped on the coffee genome.
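The assignment rule described above can be sketched as a small decision function. This is an illustration under stated assumptions: the `Hit` record, the query-type labels, and the interpretation of "a single match" as exactly one BLAST hit are hypothetical choices made for the example, not details taken from the study.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Hit:
    """Hypothetical summary of one BLAST hit on a chromosome 7 BAC sequence."""
    bac_id: str               # identifier of the matched BAC or fosmid
    percent_identity: float   # identity over the aligned exonic region


def assign_to_chr7(query_type: str, hits: Sequence[Hit]) -> bool:
    """Return True if the marker is assigned to tomato chromosome 7."""
    if query_type == "tomato_unigene":
        # Tomato unigene: identity with the exonic part must exceed 99%.
        return any(hit.percent_identity > 99.0 for hit in hits)
    if query_type == "arabidopsis_ortholog":
        # Arabidopsis ortholog: assumed here to mean exactly one match.
        return len(hits) == 1
    raise ValueError(f"unknown query type: {query_type}")
```

The two branches reflect the different evidence available: a tomato query can be required to match almost perfectly, whereas a diverged Arabidopsis query is accepted only when its match is unambiguous.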

Results

Coffee COSII mapping

A total of 1,541 coffee COSII genes were identified in silico from a C. canephora EST public database (http://www.sgn.cornell.edu/). Among them, 990 were analyzed in order to detect DNA polymorphism on the two parents BP409 and Q121. In total, 396 loci (40%) were subsequently mapped. These were mapped as: SNPs 195 loci (49%), RFLPs 157 loci (40%), and SSRs 44 loci (11%).
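The proportions quoted above follow directly from the locus counts, as the short check below shows. The variable names are illustrative only; the counts are those given in the text.

```python
# Locus counts by marker type, as reported in the text.
analyzed = 990
mapped_by_type = {"SNP": 195, "RFLP": 157, "SSR": 44}

# Total mapped loci and the fraction of analyzed genes that were mapped.
total_mapped = sum(mapped_by_type.values())
mapping_rate = round(100 * total_mapped / analyzed)

# Share of each marker type among the mapped loci, in whole percent.
shares = {name: round(100 * count / total_mapped)
          for name, count in mapped_by_type.items()}
```

The three marker classes account for all mapped loci, and the rounded shares agree with the percentages stated in the paragraph.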

Coffee COSII genetic map

The coffee COSII genetic map comprised 11 linkage groups covering 1,331 cM (Fig. 1 and Table 1). As the haploid number of C. canephora is x = n = 11, we assume that these 11 linkage groups correspond to the 11 coffee chromosomes. All the linkage groups were defined at LOD >3. The highest number of COSII markers mapped in a group is 72 for linkage group B (Table 1), and the lowest is 21 for groups H and I. The average spacing between markers is 3.6 cM and the maximum is 32 cM. Only six COSII markers (2%) show a duplicated locus, demonstrating that the majority of the COSII selected in silico do indeed correspond to single copy genes. A set of 25 COSII genes showed significant segregation distortion (p ≤ 0.01): seven were mapped on linkage group H, six on linkage group C, four on group F, three on group K, two on linkage group E, and one each on groups A, G, and J. These biased loci are not randomly distributed among linkage groups; they are mainly located in centromeric areas or at distal ends.

Fig. 1

Coffee COSII synteny map. Shared syntenic blocks with tomato are deduced from the COSII loci shared with tomato. Each tomato chromosome is assigned a different color. The COSII loci on the coffee linkage groups and the syntenic blocks are then colored according to the corresponding tomato chromosome (see Wu et al. 2009a for correspondence). Areas expected to be syntenic are extended 10 cM beyond the last orthologous marker, since syntenic tracts often do not extend beyond such distances (see text for discussion). Loci showing distorted segregation are indicated with *P < 0.01 and **P < 0.001

Table 1 Number and distribution of the COSII markers on the coffee genetic map. The COSII number and the average distance between two loci are indicated for each of the 11 coffee-linkage groups

Comparison of the coffee and tomato COSII maps

Of the 396 COSII loci mapped on coffee, 257 (65%) are in common with the tomato map (Table 2). On the basis of these common markers, it was possible to present a synteny map of the coffee genome (Fig. 1). An examination of this map reveals that the average length of syntenic tracts (comprising at least two adjacent markers) is 12 cM (Fig. 1). In an effort to further examine the extent to which the coffee and tomato maps are conserved, we conducted the following analysis. The coffee genetic map was parsed into pairs of linked orthologous markers. The percentage of cases in which the two adjacent markers are syntenic (correspond to the same tomato chromosome) was recorded and plotted against the map distance (cM) between markers. As the map distance between markers increases, one would expect a higher chance that chromosomal rearrangement(s) would have led to a loss of synteny between the coffee and tomato genomes. Further, if there is no predictive value with regard to synteny between the two markers, one would expect that the two markers would (by chance) still correspond to the same tomato chromosome approximately 8% of the time (since there are 12 chromosomes in the tomato genome, the probability that two markers correspond by chance to the same chromosome is, on average, 1/12 ≈ 0.08).
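The pairwise analysis described above can be sketched as follows. This is an illustrative outline rather than the procedure actually used: the input layout, the function name, and the 4 cM bin width (inferred from the 8-12 and 12-16 cM intervals mentioned in the text) are assumptions.

```python
from collections import defaultdict

# Expected fraction of syntenic pairs if marker location had no predictive value.
CHANCE_LEVEL = 1 / 12


def synteny_by_interval(linkage_group, bin_width=4.0):
    """Count syntenic adjacent marker pairs per map-distance bin.

    linkage_group: list of (position_cM, tomato_chromosome) tuples for the
    orthologous markers on one coffee linkage group.
    Returns {bin_start_cM: (n_syntenic_pairs, n_pairs)}.
    """
    markers = sorted(linkage_group)
    counts = defaultdict(lambda: [0, 0])
    for (pos_a, chr_a), (pos_b, chr_b) in zip(markers, markers[1:]):
        bin_start = int((pos_b - pos_a) // bin_width) * bin_width
        counts[bin_start][1] += 1
        if chr_a == chr_b:  # both markers map to the same tomato chromosome
            counts[bin_start][0] += 1
    return {start: tuple(pair) for start, pair in sorted(counts.items())}
```

Dividing the first count by the second in each bin gives the percentage of syntenic pairs, which can then be compared with `CHANCE_LEVEL`.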

Table 2 Distribution of the shared 257 COSII markers on tomato chromosomes and the 11 coffee linkage groups. COSII number shared between each tomato chromosome and coffee-linkage group is indicated. The red color shows the highest COSII number shared for each coffee-linkage group

The results from this exercise are shown in Fig. 2. For intervals shorter than 8 cM, the predictive value ranged from 16% to 75%. However, for larger intervals (8-12 and 12-16 cM), the predictive value dropped to 3%, which is lower than chance (8%). Thus, we propose that the predictive value of orthologous markers between the coffee and tomato genomes should not generally be extended beyond approximately 10 cM. We applied this guideline to the coffee comparative map in Fig. 1: the putative syntenic correspondence (designated by the colors of the tomato chromosomes) extends 10 cM on either side of each orthologous marker on the coffee map unless additional orthologous markers have been mapped in the region in question. The purpose of this exercise was not to deduce or characterize all genome rearrangements that distinguish the coffee and tomato genomes (see subsequent sections for discussion of this topic), but rather to provide users with a semi-quantitative tool for determining the probable counterparts in the tomato genome for as much of the coffee genome as possible (Table 2, electronic supplementary material).
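A minimal sketch of the 10 cM guideline is given below. It only extends each orthologous marker by a fixed reach and merges overlapping intervals assigned to the same tomato chromosome; how overlaps between neighbouring markers assigned to different tomato chromosomes were resolved in Fig. 1 is not stated, so such overlaps are deliberately left unresolved here. The function name and input layout are hypothetical.

```python
def syntenic_intervals(markers, group_length, reach=10.0):
    """Putative syntenic intervals along one coffee linkage group.

    markers: list of (position_cM, tomato_chromosome) tuples.
    group_length: total length of the linkage group in cM.
    Returns a list of (start_cM, end_cM, tomato_chromosome) tuples.
    """
    intervals = []
    for pos, chrom in sorted(markers):
        # Extend by `reach` on either side, clipped to the linkage group.
        start = max(0.0, pos - reach)
        end = min(group_length, pos + reach)
        previous = intervals[-1] if intervals else None
        if previous and previous[2] == chrom and start <= previous[1]:
            # Same tomato chromosome and overlapping: merge into one block.
            intervals[-1] = (previous[0], max(previous[1], end), chrom)
        else:
            intervals.append((start, end, chrom))
    return intervals
```

For example, two chromosome 7 markers at 5 and 12 cM would yield a single block from 0 to 22 cM, whereas an isolated marker contributes a block of at most 20 cM.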

Fig. 2

Map distance between adjacent pairs of syntenic markers. Histograms showing the percentage of pair marker intervals (cM) from the coffee maps for which both markers correspond to same tomato chromosome. Number above bars indicates total number of paired marker intervals in each category

Tomato chromosome 7 displays high syntenic relationship with coffee linkage groups E and F

Out of the 34 COSII markers mapped on tomato chromosome 7, up to 32 correspond to coffee linkage groups E and F (Table 2), indicating that the syntenic relationship between the two species is highest in these particular regions of the tomato and coffee genomes. These genomic regions therefore offer favorable material for a more detailed analysis of the syntenic relationship between the tomato and coffee genomes. Since tomato chromosome 7 is currently being sequenced by a consortium of French scientists (M. Bouzayen et al. personal communication, Mueller et al. 2009), we took advantage of this sequence information to further investigate the fine scale synteny between tomato chromosome 7 and its counterparts in the coffee genome.

Screening for additional COSII markers by Blast searches against the genomic sequence of tomato chromosome 7

The tomato BAC sequences obtained for chromosome 7 were screened by BLAST for the presence of COSII markers. We screened the sequences for the COSII markers mapped on chromosome 7 in the tomato map (67 COSII) and on coffee linkage groups E and F (90 COSII). The results presented in Table 3 show that, out of the 67 COSII markers mapped on tomato chromosome 7, only 54 were found in the BAC sequences. Failure to retrieve the whole set of COSII markers is expected, since the chromosome 7 sequence is currently incomplete. Among those 54 markers, 21 could be mapped and localized to linkage groups E and F in coffee. In addition, five COSII markers that were mapped on coffee linkage groups E and F but were not present on the tomato genetic map were retrieved in the chromosome 7 BAC sequences, which allowed them to be located on the tomato chromosome 7 genetic map. We also looked for other COSII markers that had been mapped neither on coffee nor on tomato. A BLAST search of the COSII set previously described by Wu et al. (2006) against the tomato chromosome 7 BAC sequences identified 18 additional COSII sequences, of which only three could be mapped in coffee; the others lacked polymorphism between the parents. Marker C2_At4g24820 maps on coffee linkage group F, while markers C2_At2g42810 and C2_At3g20630 are located on linkage group E (Table 3, Fig. 3). These data illustrate the predictive value of coffee-tomato synteny. Overall, the present study allowed the selection of 35 COSII markers that are common between tomato chromosome 7 and coffee linkage groups E and F, among which 20 are found on coffee linkage group E and 15 on linkage group F (Table 3, Fig. 3). Coffee group F matches the majority of the short arm of tomato chromosome 7 and coffee group E matches the majority of the long arm of tomato chromosome 7 (Fig. 3).

Table 3 List of the COSII markers in tomato chromosome 7 and/or coffee linkage groups E and F
Fig. 3

Detailed synteny between coffee linkage groups E & F and tomato chromosome 7. COSII markers and unigenes localized on tomato chromosome 7 and on coffee linkage groups E & F. Common markers are connected by lines. The order of loci on tomato chromosome 7 has been modified according to the physical mapping information (see text for discussion). The disruption between the syntenic areas of linkage groups E and F is located in or near the border between heterochromatin and euchromatin of the long arm

It is also important to mention that, for COSII markers C2_At5g48300 and C2_At1g78600, previously mapped on chromosome 12, our BLAST search revealed a perfect match (100% identity) with two sequences located on chromosome 7 but no homologous sequence on chromosome 12 or on any other chromosome. Moreover, COSII marker C2_At5g48300 has been mapped on chromosome 7 in two other Solanaceae species closely related to tomato, namely pepper and eggplant (Wu et al. 2009a, b). These two markers were therefore added to the set of markers identified as common between tomato chromosome 7 and coffee linkage groups E and F. Moreover, these two COSII genes are positioned in a similar order on tomato chromosome 7 and coffee linkage group E, further supporting the high syntenic relationship between these chromosomal regions of the two species.

Reordering the tomato chromosome 7 COSII markers by integrating the physical and genetic maps

Establishing contigs of BAC sequences on tomato chromosome 7 revealed several inversions between the genetic map positions of markers and their physical positions. Considering that the synteny study between tomato and coffee is based on marker location and order, the COSII map of tomato chromosome 7 was updated in light of the integrated physical and genetic positions of the markers used in the present study (Fig. 3). However, as the tomato genome sequence is still incomplete, we cannot exclude the possibility that other inversions might be revealed as the sequencing efforts progress further. The tomato chromosome 7 markers whose order discrepancies are of particular importance for the present study are located at the extremities of the short and long arms, close to the telomeres. Comparison of the data presented in Fig. 3 and Table 3 reveals that four COSII markers are mis-ordered in the genetic map, as shown by their physical positions determined by sequence analysis. Markers C2_At5g58200 and C2_At2g24270 were genetically mapped at positions cM 0.4 and cM 0.3, respectively, while the analysis of the contig sequence shows that they are in the reverse order. Likewise, marker C2_At5g57655 maps at position cM 0.1, whereas it is found in a BAC contig containing marker T0256, located at cM 1.2. The position of the contig containing this marker was confirmed by FISH analysis to be lower than that of C2_At5g58200 and C2_At2g24270 (data not shown). Marker C2_At5g34850, located at cM 0.4 according to the tomato genetic map, is contained in a BAC that also harbors marker T1940, mapped at position cM 2.3. The order retained for these four markers on the basis of the position of the BAC contigs from top to bottom of the short arm of chromosome 7 is the following: C2_At5g58200, C2_At2g24270, C2_At5g57655, and C2_At5g34850. These discrepancies in locus order are attributed to the low number of progeny used for tomato and coffee genetic mapping.

Relationship of chromosome structure and genome evolution

Location of heterochromatin and euchromatin areas on tomato chromosome 7 (Fig. 3) was deduced from both sequence data analysis and FISH experiments. Interestingly, syntenic blocks shared with E and F are localized mainly in the euchromatin regions of tomato chromosome 7. Moreover, the boundary of synteny between coffee linkage groups E and F on tomato chromosome 7 is located in or near the border between heterochromatin and euchromatin of the long arm.

Discussion

While the genomes of coffee and tomato have undergone extensive rearrangements (e.g., inversions and translocations) since divergence from their last common ancestor, the data presented in this study indicate that it is still possible to decipher synteny across most of their genomes. The majority (75%) of syntenic blocks are short (<4 cM); however, some conserved tracts extend >20 cM (Figs. 1 and 2). For example, the lower 50 cM of coffee-linkage group E matches, in near-perfect order, the loci on the long arm of tomato chromosome 7 (Fig. 3). Coffee-linkage group F matches the majority of the short arm of tomato 7 while linkage group E matches the majority of the long arm of tomato 7. However, there is no direct relationship between genetic and physical distances as illustrated by Xu et al. (2008) in cotton for example.

Our data show that the boundary of synteny between coffee linkage groups E and F and tomato chromosome 7 lies near the heterochromatin/euchromatin border of the long arm of this tomato chromosome, strongly supporting the hypothesis that a major rearrangement breakpoint occurred in this region.

Studies such as the one described here would not be possible without some foundation of genomic information in the species being studied. In this case, we relied heavily on the many genetic and genomic resources now available for tomato (Mueller et al. 2009). We also took advantage of a publicly available EST database for coffee (Lin et al. 2005). The information on the tomato chromosome 7 genomic sequence and the physical mapping of the BACs was essential for screening a large number of COSII markers and selecting those useful for the present study. Also essential for this study was the availability of a set of single copy, orthologous genes that could be used for mapping in both tomato and coffee. We therefore utilized the set of previously published COSII genes that meet all of these criteria (Wu et al. 2006). These COSII genes have already been used for comparative mapping studies in Solanaceous species (Wu et al. 2009a, b, c). The current study demonstrates that these COSII genes can also be used for comparative mapping across plant families, at least for the closely related plant families Solanaceae (tomato) and Rubiaceae (coffee). Understanding the modes and consequences of genome evolution across higher level taxa (such as families) may provide new insights into angiosperm diversification (Timms et al. 2006). The current study, comparing the genomes of coffee and tomato, represents one of the few cases in which synteny could be well deciphered across plant families.

In addition to providing insights about genome evolution across plant families, the current study also has applied implications. Sufficiently detailed syntenic maps provide a means by which to access genomic information across species. For example, relative to coffee, many more genetic and genomic resources exist for tomato, the genome of which is currently being sequenced by an international consortium of scientists (Mueller et al. 2009). The comparative maps provided here may allow coffee researchers to identify the orthologous counterparts of coffee loci in the tomato genome sequence. This could potentially expedite the identification of QTLs and the cloning of genes in coffee. The value of this approach has already been demonstrated in other taxa (Kilian et al. 1997). Even though a large public EST database for coffee is now available (Lin et al. 2005; Poncet et al. 2006), very few of these ESTs have been genetically mapped in coffee, limiting their use as candidate genes for cloning studies. The tomato-coffee syntenic map reported herein, combined with the tomato genome sequence, may allow the coffee map position of many of these ESTs to be predicted by in silico mapping. This could both greatly increase the number of mapped genetic markers in coffee and provide a rich source of candidate genes for QTL cloning experiments.

Conclusions

All COSII genes mapped on the coffee genome were derived from publicly available information and databases: the coffee EST database (http://sgn.cornell.edu/search/library_search.pl?w5c4_library_parameters=coffea), COSII sequences (Wu et al. 2006), the tomato COSII map (Wu et al. 2006, http://sgn.cornell.edu/cview/), and the Arabidopsis genome database (http://www.arabidopsis.org/). This highlights how important and beneficial it is for scientists working on orphan plants to have access to such data. Based on these data, we were able to produce the first comparative map for coffee and to link it to the extensive genetic and genomic resources available from research in tomato and other Solanaceae species. This also represents the first detailed synteny study dedicated to a woody plant species. Because the Rubiaceae (coffee) and Solanaceae families are relatively closely related and share a common, diploid ancestor (with most probably x = n = 11 or 12), we are able to report significant genetic conservation between these two species. While the level of conservation between coffee and tomato is not as high as that observed among Solanaceae species, it is nevertheless sufficiently high to allow the prediction of orthologous positions between coffee and tomato across a significant part of the genome.