  • Research article
  • Open access

Identification of new polymorphic regions and differentiation of cultivated olives (Olea europaea L.) through plastome sequence comparison

Abstract

Background

The cultivated olive (Olea europaea L.) is the most agriculturally important species of the Oleaceae family. Although many studies have been performed on plastid polymorphisms to evaluate taxonomy, phylogeny and phylogeography of Olea subspecies, only a few polymorphic regions discriminating among the agronomically and economically important olive cultivars have been identified. The objective of this study was to sequence the entire plastome of olive and analyze many potential polymorphic regions to develop new inter-cultivar genetic markers.

Results

The complete plastid genome of the olive cultivar Frantoio was determined by direct sequence analysis using universal and novel PCR primers designed to amplify all overlapping regions. The chloroplast genome of the olive has an organisation and gene order that is conserved among numerous Angiosperm species and does not contain any of the inversions, gene duplications, insertions, inverted repeat expansions and gene/intron losses that have been found in the chloroplast genomes of the genera Jasminum and Menodora, from the same family as Olea.

The annotated sequence was used to evaluate the content of coding genes, the extent and distribution of repeated and long dispersed sequences, and the nucleotide composition pattern. These analyses provided essential information for structural, functional and comparative genomic studies in olive plastids. Furthermore, the alignment of the olive plastome sequence to those of other varieties and species identified 30 new organellar polymorphisms within the cultivated olive.

Conclusions

In addition to identifying mutations that may play a functional role in modifying the metabolism and adaptation of olive cultivars, the new chloroplast markers represent a valuable tool to assess the level of olive intercultivar plastome variation for use in population genetic analysis, phylogenesis, cultivar characterisation and DNA food tracking.

Background

Olive is the main cultivated species belonging to the monophyletic Oleaceae family, within the clade of Asterids, in which the majority of nuclear and organellar genomic sequences are unknown. The Olea genus includes two sections, Olea and Ligustroides. The former comprises the six recognised subspecies of the olive complex, which can be found throughout the Mediterranean area as well as the temperate and subtropical regions of Africa and Asia. The Mediterranean form (Olea europaea, subspecies europaea) includes the wild (var. sylvestris) and cultivated (var. europaea) olives [1].

Recently, chloroplast genome sequencing of species belonging to this family from the tribe of Jasmineae revealed that two genera, Jasminum and Menodora, carry several distinctive rearrangements, including inversions, gene duplications, insertions, inverted repeat expansions and gene/intron losses [2]. One of these genomic features involves the duplication of the rpl23 protein-coding gene in Jasminum. A similar duplication has also been detected in the Poaceae, and in both Oleaceae and Poaceae, the duplicated copy has been inserted into the intergenic region between rbcL and psaI [3]. By comparative gene mapping and sequencing, Lee and co-workers also demonstrated that all other Oleaceae genera, including Olea, have the same gene content and order as Nicotiana tabacum. A phylogenetic reconstruction of the entire family, based upon the sequences of the ndhF and rbcL genes, partially confirmed previous results obtained by the analysis of the trnL-F and rps16 chloroplast regions [4].

Intraspecies variation within other Oleaceae genera, such as Syringa [5], Forsythia [6], Ligustrum [7] and Fraxinus [8, 9] has also been examined.

Different chlorotypes have been identified among the six subspecies of O. europaea. Lumaret et al. [10] identified 12 distinct chlorotypes by RFLP analysis of DNA isolated from the purified chloroplasts of a wide set of O. europaea taxa. In other O. europaea subspecies Baldoni et al. [11] identified nine nucleotide substitutions, one insertion-deletion (indel) and a polymorphic poly-T SSR in the trnT-L region. Besnard et al. [12] in the O. europaea complex identified fourteen polymorphisms in three chloroplast regions (trnT-L, trnQ-R and matK), including five microsatellite motifs, two indels and eight nucleotide substitution sites. Recently, the analysis of four regions (trnL-F, trnT-L, trnS-G and matK) was used to demonstrate the polyphyletic origin of the Olea genus and estimate the divergence times for the major groups of Olea species and subspecies during the Tertiary period [13].

In cultivated olives chloroplasts are maternally inherited [14] and, in contrast to that seen at the subspecies level, a low plastidial variability was detected. A strong linkage disequilibrium between the chloroplast and mitochondrial genomes has been demonstrated, particularly for the Mediterranean cultivated and wild olives (subspecies europaea), suggesting that a low level of recurrent mutations occurs in both organellar genomes of the olive [15].

In particular, RFLP analysis of chloroplast DNA isolated from 72 cultivars revealed that most cultivars have a common chlorotype [16]. Besnard et al. [17], using two microsatellites and 13 RFLPs on more than 140 olive cultivars, were able to distinguish only four chlorotypes. The majority of cultivars was characterised by the chlorotype CE1, which likely originated from the wild olive populations of the Eastern Mediterranean and was spread to the Western part through cultivar dispersal by humans. Polymorphisms at the varietal level have been detected in the trnD-T locus [18], but only one polymorphism in this locus was found within a set of 12 cultivars [19].

Chloroplast DNA represents an ideal system for plant species DNA barcoding, and some chloroplast regions have been indicated as ideal for use in tests that discriminate between different land plants. Based on assessments of recoverability, sequence quality and discriminatory abilities at the species level, the two-locus combination of rbcL-matK has been recommended as a universal framework for plant barcoding [20]. The combination of trnH-psbA coupled with rbcL has been recommended for DNA barcoding to discriminate between lower taxonomic ranks such as genera or related species [21]. In highly valuable crop species, such as the olive, that have a variety of cultivars available in the market, however, typing at the species level is not sufficient. Thus, the development of reliable methods to rapidly and efficiently discriminate between cultivars has become a pressing need. In addition, DNA barcoding may have useful applications to tracking food products [22] and the analysis of archaeological remains [23].

In this respect, the availability of complete chloroplast genome sequences from a growing number of species offers the opportunity to evaluate many potentially polymorphic sites and identify new regions that could be used to define cultivar DNA barcodes.

There are numerous approaches to sequencing chloroplast genomes: traditional sequence analysis of highly purified chloroplast DNA, as applied for Solanum lycopersicum [24], Lolium perenne [25], Trachelium caeruleum [26], Jasminum nudiflorum [2] and Parthenium argentatum [27]; Rolling Circle Amplification (RCA) of high-purity chloroplast DNA, as demonstrated in Cicer arietinum [28], Platanus occidentalis [29] and Welwitschia mirabilis [30]; shotgun sequence analysis of BAC clones containing chloroplast genomic inserts, as demonstrated in Vitis vinifera [31], Hordeum vulgare [32] and Brachypodium distachyon [33]; and the use of universal primers based on chloroplast sequences highly conserved among most Angiosperm species to amplify overlapping fragments [34-36], as demonstrated in Cycas taitungensis [37] and two Bambusa species [38]. For this study, the last approach was used to sequence the entire chloroplast genome of the O. europaea subsp. europaea cv. Frantoio. The resulting availability of the entire plastome made it possible to evaluate the sequence arrangement of the plastid genome in O. europaea and to identify new organellar polymorphisms that could discriminate between cultivated olive varieties.

Results and Discussion

Size, gene content and gene order of the olive chloroplast genome

The complete plastome of olive, cv. Frantoio, has a total length of 155,889 bp (GenBank Accession Number GU931818), with the typical structure found in the unrearranged chloroplast genomes of Angiosperms. It includes an 86,614-bp Large Single Copy (LSC) and a 17,791-bp Small Single Copy (SSC) region separated by a pair of Inverted Repeats (IR), each 25,742 bp long (Figure 1). Coding DNA (92,095 bp) accounts for 59.08% of the genome and includes protein coding genes (80,252 bp), tRNAs (2,793 bp) and rRNAs (9,050 bp), while noncoding DNA (63,794 bp) accounts for the remaining 40.92% and includes introns (20,130 bp) and intergenic spacers (43,664 bp). The olive plastome contains 114 unique genes (80 CDS, 30 tRNA and 4 rRNA), with 19 of these genes (8 CDS, 7 tRNA and all 4 rRNA) duplicated in the IR for a total of 133 genes. In addition, the duplicated region includes a partial CDS for ycf1, as in other species like Typha [39]. There are 18 intron-containing genes, 15 of which contain one intron and 3 of which (ycf3, clpP and rps12) contain two introns. The rps12 gene is trans-spliced, with the 5' end located in the LSC and the 3' end duplicated in the IR regions. The nucleotide composition of the olive chloroplast genome comprises 37.81% GC and 62.19% AT.
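The region lengths, coding fractions and gene counts reported above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch that uses only the values stated in the text; the variable names are illustrative and are not part of the original analysis.

```python
# Region lengths (bp) as reported for the cv. Frantoio plastome.
lsc, ssc, ir = 86_614, 17_791, 25_742
total = lsc + ssc + 2 * ir          # the IR is present in two copies

# Coding and noncoding components (bp) as reported.
coding = 80_252 + 2_793 + 9_050     # protein-coding genes + tRNAs + rRNAs
noncoding = 20_130 + 43_664         # introns + intergenic spacers

coding_pct = round(100 * coding / total, 2)
noncoding_pct = round(100 * noncoding / total, 2)

# Gene counts: unique genes plus the copies duplicated in the IR.
unique_genes = 80 + 30 + 4          # CDS + tRNA + rRNA
duplicated = 8 + 7 + 4              # CDS + tRNA + rRNA duplicated in the IR
total_genes = unique_genes + duplicated

print(total, coding, noncoding, coding_pct, noncoding_pct, total_genes)
```

The sums reproduce the reported genome length, the coding and noncoding totals, and the total of 133 genes.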

Figure 1

Olea europaea chloroplast genome based on direct sequencing of the LSC, SSC and IR regions. Genes drawn inside the circle are transcribed clockwise; those outside are transcribed counterclockwise.

The in silico search for repetitive elements identified 633 mono-nucleotide SSRs with 5 or more repeat units (Table 1), with 276 poly-A, 303 poly-T, 31 poly-C and 23 poly-G repeats. In addition, six di-nucleotide SSRs with five or six repeat units, no tri-nucleotide SSRs, three tetra- and two penta-nucleotide SSRs were identified, for a total of 644 repetitive sequences. The distribution of SSRs across the chloroplast genome was as follows: 400 in the LSC (density = 0.0046), 126 in the SSC (density = 0.0071) and 59 (x2) in the IR region (density = 0.0022).
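As a worked check of the SSR counts and densities given above, the sketch below recomputes the totals and divides the number of SSRs in each region by the region length. It assumes the region lengths reported for the Frantoio plastome; the dictionary names are illustrative only.

```python
# SSR counts as reported in the text.
mono = {"A": 276, "T": 303, "C": 31, "G": 23}
other = {"di": 6, "tri": 0, "tetra": 3, "penta": 2}
total_ssr = sum(mono.values()) + sum(other.values())

# (number of SSRs, region length in bp); the IR values are per copy.
regions = {
    "LSC": (400, 86_614),
    "SSC": (126, 17_791),
    "IR": (59, 25_742),
}

# Density = SSR count divided by region length.
density = {name: n / length for name, (n, length) in regions.items()}

# The per-region counts should add up to the overall total (IR counted twice).
by_region_total = regions["LSC"][0] + regions["SSC"][0] + 2 * regions["IR"][0]

for name, value in density.items():
    print(f"{name}: {value:.5f}")
```

The computed densities agree with the reported values to within about 0.0001; the IR value computes to roughly 0.00229.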

Table 1 Abundance and length of SSR motifs identified on the olive chloroplast genome.

The repeat analysis also identified 14 interspersed repetitive sequences longer than 30 bp, each having 2-6 repetitions and a sequence identity higher than 85% (Table 2, Figure 2). These long interspersed repetitive sequences included two tandem repeats in the ycf2 gene and five palindromic sequences (two in the LSC, one in the SSC and two in the IR regions). Three of the four repeats found within the ycf2 exon were tandem repeats, as previously observed in V. vinifera [31]. There were only two inverted repeats; all the others were direct repeats. Five repeats were located within CDS, two repeats were found in the introns of the ycf3 and ndhA genes and all others were in the intergenic spacers (Table 2). Interspersed repeats did not cause any uncertainty during the sequencing process because they were quite short (< 61 bp) and had a low number of repetitions, and because primers were never designed on the repeats.

Table 2
Figure 2

Polymorphic regions identified in the olive chloroplast genome. Different colours indicate the four mono-nucleotide microsatellites (poly-T and poly-G are reported in the external circle, poly-A and poly-C in the internal circle); bar lengths correspond to the number of repetitions. Arrows indicate polymorphisms (base mutations, microsatellites and indels). The circle reports the interspersed repeats: copies of the same repeat share the same number, and the external or internal position of the number indicates the sense or anti-sense direction of the sequence.

The actual size of the olive plastome is larger than the size estimated on the basis of RFLP analysis, which predicted a range from 132 to 134 kb [16].

Olive chloroplast genome organisation

The sequence of the olive chloroplast genome represents one of the first contributions to deciphering the genetic background of this important tree crop species and was used to verify that rearrangements observed in the plastomes of other genera of Oleaceae, such as Jasminum and Menodora, were not represented in O. europaea. In fact, in contrast to what was observed in the Jasminum and Menodora plastomes [2], the olive chloroplast maintains a size range, organisation and gene order typical of most land plants, such as members of the Vitis, Populus, Citrus, Eucalyptus, Coffea and Arabidopsis genera. Based on the phylogeny of Oleaceae inferred from the ndhF and rbcL genes [2], Jasminum and Menodora were already known to be unusual genera within the family, and all other tribes, including Oleae, to which the Olea genus belongs, do not share their combination of multiple mutational events. The highly conserved plastome organization of the olive allowed universal primers and genome walking with consensus primers to be used to amplify most of the LSC region.

Identification of new plastid markers to discriminate between olive cultivars

To detect intervarietal polymorphisms, a preliminary screening of the intergenic spacer trnS-GCU - trnG-UCC, previously demonstrated to be polymorphic among olive varieties [40], was performed on a set of 30 cultivars having different geographical distributions and representing a wide range of morphological and agronomical phenotypes (data not shown). A sub-set of eight highly variable cultivars (Table 4) was further examined for 100 potentially polymorphic regions.

The tested potential variant domains showed different levels of variability. Fifteen of the analyzed intergenic spacers contained mutations within the sequence of the eight cultivars, ranging in number from one to six per region. These mutations were microsatellites, indels or single nucleotide polymorphisms (Table 3). One SNP was located within the intron of the rpoC1 gene, and three others were located in the coding regions (CDS) of the rpl14, ndhF and ycf1 genes. The CDS-SNPs resulted in substitutions at amino acid position 109 in rpl14 (leucine to phenylalanine), at position 32 in ndhF (valine to alanine), and at positions 995 and 1,161 in ycf1 (leucine to isoleucine and isoleucine to arginine, respectively). Blast analyses revealed that the ndhF alanine and the ycf1 leucine, widely represented in other species, are present in the Farga and Frantoio cultivars, respectively. The rpl14 polymorphism can also be found in other species: the phenylalanine residue is present in V. vinifera cv. Pinot Noir in the mitochondrial copy of this gene, due to the incorporation of more than 42% of the Vitis chloroplast genome into its mitochondrial genome [41]. In this respect, the risk that our olive chloroplast markers may reside on mitochondrial or nuclear genes was avoided by amplifying coding regions anchored on the intergenic spacers, and this was confirmed by the absence of sequence ambiguities.

Table 3 Chloroplast polymorphisms within olive (Olea europaea subsp. europaea var. europaea) cultivars.
Table 4 Chlorotypes detected on eight cultivars.

The comparison of the Frantoio chloroplast sequence with ESTs deriving from fruits of cvs. Coratina and Tendellone showed some sequence mismatches, but they were not confirmed by resequencing the corresponding genomic regions in Coratina and Tendellone cultivars.

Overall, the analysis of cpDNA sequences from the eight cultivars resulted in the identification of 40 polymorphic sites, 30 of which represent new and never-described plastid variants (Table 3, Table 4, Figure 2). Sixteen polymorphic sites were mono-nucleotide SSRs: eight poly-A, including one with an irregular motif; seven poly-T and one poly-C. The remaining polymorphisms included 20 SNPs and 4 indels. Thirty-three polymorphic sites (P1-P33) were located within the LSC region, one (P34) within the IR and six (P35-P40) within the SSC (Figure 2). The indel P32 was identified within the repeat of the rps8 - rpl14 spacer, but none of the other repetitive regions was polymorphic between cultivars.

The chloroplast sequence of cv. Frantoio was also compared with all previously sequenced regions of the olive chloroplast, particularly with the plastome sequence of cv. Bianchera, which has been recently deposited in the GenBank database (NC_013707.1). More than 200 mismatches were detected between the Bianchera and Frantoio sequences. Surprisingly, not one of these polymorphisms fell within the previously identified cultivar-specific polymorphic regions. To verify whether these mismatches might represent real sequence differences between the two varieties, most of the ambiguous regions were reamplified and resequenced in both cultivars (the Bianchera sample was provided by the CRA-OLI of Spoleto, Perugia, Italy). These analyses confirmed the sequence of cv. Frantoio and showed an absolute sequence identity with that obtained from the cv. Bianchera in all of these regions, including the exons of the rpoC1 and ndhF genes, carrying 27- and 9-bp indels, respectively. The differences detected between the two olive plastome sequences cannot derive from an incorrect identification of the Bianchera genotype because, in that case, mutations should have been found in the polymorphic sites and not randomly along the chloroplast genome. More likely, the divergences may be attributed to sequence uncertainties in the Bianchera plastome sequence deposited in GenBank.

The new markers identified in this study can distinguish six haplotypes among eight cultivars. Therefore, these new markers hold great promise for the identification of new cultivar haplotypes and for use in DNA barcoding systems to distinguish between different cultivars.
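Counting haplotypes from marker data amounts to grouping the samples that carry identical alleles at every scored site. The sketch below illustrates this with hypothetical toy data (four made-up samples and three made-up markers); it does not reproduce the genotypes of Table 4, and the function name `group_haplotypes` is an assumption introduced here for illustration.

```python
from collections import defaultdict

def group_haplotypes(genotypes):
    """Group samples whose alleles are identical at every marker."""
    groups = defaultdict(list)
    for sample, alleles in genotypes.items():
        # The tuple of alleles across all markers acts as the haplotype key.
        groups[tuple(alleles)].append(sample)
    return list(groups.values())

# Hypothetical toy data: an SSR length class, a SNP and an indel per sample.
toy = {
    "cv_A": ["T10", "C", "-"],
    "cv_B": ["T10", "C", "-"],
    "cv_C": ["T11", "C", "-"],
    "cv_D": ["T10", "T", "A"],
}

haplotypes = group_haplotypes(toy)
print(len(haplotypes), haplotypes)
```

In this toy case cv_A and cv_B share a haplotype, so four samples resolve into three groups; adding informative markers can only keep or increase the number of groups.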

Comparison of plastome variation between cultivars and with other Olea taxa

Based on previous chloroplast sequence analyses, olive cultivars belong to the cp-II lineage and have been classified into three sublineages (E1, E2 and E3) and four chlorotypes (1, 2, 9 and 13) [19, 40]. These chlorotypes were defined by evaluating length variations in the psbK-psbI, trnS-trnG, rps2-rpoC2, trnE-trnT and atpB-rbcL regions among more than 140 cultivars [17, 19, 40].

Several polymorphisms had been previously identified in the partial sequence of the trnK intron (AF359497-AF359504) by analysing the subspecies cuspidata, laperrinei, maroccana, cerasiformis, guanchica, europaea var. sylvestris (wild olive) and the Cornicabra cultivar, but none of these polymorphisms were found among the cultivars we have analysed. The psbK-psbI and trnS-GCU-trnG-UCC regions, spanning the polymorphic sites P8, P9, P10, P11, P12, P13 and P14, were analyzed by Besnard et al. [12] as fragment length variation on a set of different O. europaea taxa including cultivars. That analysis revealed intercultivar variability only at P11, P12 and P13 but was unable to capture the C/T and G/T SNPs in the P9 and P14 sites, respectively. We treated the A/T/- polymorphism, closely linked to P11, as a different polymorphism (P10) because the A/- indel is present in most varieties while the T is a rare mutation carried by few cultivars.

The spacer rps2-rpoC2, spanning the polymorphic sites P16, P17 and P18, generated five different chlorotypes among the eight varieties analysed, demonstrating a high level of rearrangement within cultivars. This region corresponds to the ccmp5 microsatellite [42, 43], but previous studies that analysed only length polymorphisms were unable to capture the complexity of this region. P28 includes ccmp7 [40, 42] and an additional SNP polymorphism (P27) captured in the flanking region.

Intrieri et al. [18] reported the identification of 5 SNPs and 4 indels in the trnD-trnT region of 13 cultivars. Analyzing a different set of cultivars, Besnard [19] did not detect these polymorphisms. Similarly, only two polymorphisms were confirmed in our cultivar set: the poly-A SSR (P20) and the C/T SNP (P21).

Other regions previously analysed in different Olea taxa, such as trnL-F and rps16 [4], trnL-trnF [13], and trnT-trnL [11] were not polymorphic among our cultivars.

No differences between the eight cultivars were found within the matK and psbA exons or the rps16 intron, regions used for species barcoding. In contrast, the psbK-psbI and trnH-psbA barcoding regions, both representing markers for plant species identification [44, 45], correspond to our P8, P2, P3 and P4 polymorphisms. This observation indicates that these markers may not accurately discriminate between some species, given their potential intra-specific genetic variations [46].

Conclusions

The low level of cpDNA variation detected up to now within olive cultivars represented a serious obstacle to the widespread use of cpDNA markers for cultivar characterization, parentage analysis and population genetics. The most probable causes of the high level of sequence conservation may be related to the domestication process, by which most cultivars were likely derived from only a few different wild plants, and the low generation turnover resulting from the long life span of the trees, which reduces the rate of emergence of new mutations.

In this study, using eight cultivars, 30 new cpDNA markers were identified from the olive plastome sequence and 10 markers previously reported were confirmed. In fact, the availability of the entire chloroplast genome and systematic sequencing of candidate regions from selected cultivars resulted in the identification of many new polymorphisms, mostly represented by nucleotide substitutions and by rearrangements of different microsatellites. They were not discovered in previous analyses likely because these focused mostly on fragment length variations.

The 40 markers applied to eight cultivars were able to split them into six different chlorotypes. The ten known markers are able to establish to which lineages the olive varieties may correspond and to reconstruct their phylogeny with potential ancestors, while the new markers should make it possible to divide cultivated olives into new chlorotypes and to finely assign them to different lineages within the Mediterranean O. europaea complex. These markers could provide a valuable contribution to understanding the evolutionary and ecological processes involved in olive domestication as well as to increasing the knowledge about the function of plastid genes in plant metabolism.

They could be used to screen olive genotypes, to assess the chlorotype distribution among cultivars and to better determine their phylogenetic relationships with the wild populations as well as with other O. europaea subspecies. This could help reconstruct the origin of the cultivated olive and determine the timeline involved in the distribution of chlorotypes from traditional varieties throughout the Mediterranean region.

Most of these polymorphisms showed a high level of reorganization among cultivars, particularly in the intergenic regions such as psaA-ycf3, rps2-rpoC2 and trnS-GCU-trnG-UCC. This observation suggests that after rearrangements occurred within the plastid genome, these changes were fixed and maintained within cultivars by vegetative propagation. The putative functional role that these mutations may play in modifying the metabolism of olive cultivars and in developing adaptations to the environment will also represent a further contribution to understanding the genetic background of the olive, providing insights into the evolution of plant phenotypes. The application of these polymorphisms as functional markers will also be considered.

Finally, these polymorphisms represent a new source of markers for olive DNA barcoding to distinguish between cultivars, for practical applications related to DNA-based tracking of olive oil and the identification of archaeological remains. One particular focus involves their potential use in DNA tracking of food products derived from the olive (e.g., olive oil and table olives), based on the assumptions that: i) the high number of chloroplasts per cell increases the probability that trace amounts of DNA can be amplified from these food products; ii) their maternal origin excludes the risk that DNA from pollinators would be amplified instead; iii) the haploid chloroplast genome can produce cultivar-specific single signals.

The identification of 30 new polymorphic sites, most of which are located in chloroplast regions previously unexplored in cultivated O. europaea, demonstrates that chloroplast variation in olive cultivars is higher than expected and that new chlorotypes could be discovered through the analysis of a larger number of cultivars.

Methods

Plant material and DNA extraction

For the plastome sequence analysis, leaves of cv. Frantoio were collected from the accession present at the CRA-OLI olive cultivars collection (Collececco, Spoleto).

For the detection of intervarietal polymorphisms, a subset of eight cultivars was used, chosen among 30 cultivars pre-selected on the basis of their haplotypes for the intergenic trnS-GCU - trnG-UCC spacer (Table 4).

Total DNA was extracted by the DNeasy Plant Mini Kit (Qiagen, Hilden, Germany), following the manufacturer's instructions.

Sequencing strategy: primer design and PCR amplification

Sequencing of the olive plastome was performed by designing a series of PCR primer pairs that produced partially overlapping amplicons and spanned the entire chloroplast genome.
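Whether a set of overlapping amplicons leaves any part of the genome uncovered can be checked by sorting the amplicon coordinates and scanning for gaps. The sketch below is illustrative only: the function name `uncovered_gaps` and the example coordinates are hypothetical and are not taken from the primer tables in Additional File 1.

```python
def uncovered_gaps(amplicons, genome_length):
    """Return (start, end) gaps, 1-based and inclusive, not covered by any amplicon.

    Amplicons are (start, end) pairs with 1 <= start <= end <= genome_length.
    An amplicon that crosses the origin of the circular map must be split in two.
    """
    gaps = []
    covered_to = 0  # last position known to be covered so far
    for start, end in sorted(amplicons):
        if start > covered_to + 1:
            gaps.append((covered_to + 1, start - 1))
        covered_to = max(covered_to, end)
    if covered_to < genome_length:
        gaps.append((covered_to + 1, genome_length))
    return gaps

# Hypothetical example: three overlapping amplicons tiling a 6,000-bp stretch.
tiled = uncovered_gaps([(1, 2500), (2300, 4800), (4700, 6000)], 6000)

# Hypothetical example: a failed amplification leaves a 99-bp gap.
gapped = uncovered_gaps([(1, 2500), (2600, 6000)], 6000)

print(tiled, gapped)
```

An empty list means the amplicons tile the whole sequence; any reported gap marks a region for which a new primer pair would be needed.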

For the Large Single Copy (LSC) region, 38 primer pairs located within conserved regions and designed by Grivet et al. [34] were used, avoiding gaps between successive fragments along the cpDNA molecule. Five primer pairs (5, 14, 22, 27 and 38) produced double bands, and two (16 and 28) did not produce any amplification. Thus, new primers for those regions were constructed, following the strategy used for the amplification of the IR and SSC regions. For primer sequences see Additional File 1, Table S2.

For the SSC and the IR regions, primers were constructed from conserved sequences derived by the alignment of the plant chloroplast genomes of Jasminum nudiflorum (DQ673255), Populus trichocarpa (EF489041), Vitis vinifera (DQ424856), Eucalyptus globulus (AY780259), Arabidopsis thaliana (AP000423), Gossypium hirsutum (DQ345959), Citrus sinensis (DQ864733), Cucumis sativus (AJ970307), Morus indica (DQ226511), Panax ginseng (AY582139), Solanum lycopersicum (AM087200) and Nicotiana tabacum (Z00044). These sequences were retrieved from GenBank and aligned using Muscle V. 3.7 [47], and the primers were designed using PerlPrimer v1.1.6 [48]. Because the average size of the amplified fragments was approximately 2,500 bp, internal primers to sequence the entire amplicons were also designed. The primer sequences and positions, along with their respective amplicon lengths, are given in Additional File 1, Table S1.

PCR amplifications were performed in a final volume of 50 μL containing 1-20 ng of template DNA, 10× PCR buffer, 200 μM of each dNTP, 10 pmol of each primer and 2 U of EuroTaq polymerase (EuroClone). For those fragments that were longer than 5,000 bp, 1 unit of LA Taq polymerase (TaKaRa) was used instead. The amplifications were performed with the PCR System 9600 (Applied Biosystems, Foster City, CA) using the following cycling conditions: an initial denaturation step of 95°C for 5 min, followed by 35 cycles of 95°C for 30 sec, 60°C for 30 sec and 72°C for 25 sec and a final elongation step of 72°C for 30 min. For those amplifications including LA Taq polymerase in the PCR mix, the following cycling conditions were used instead: an initial denaturation step of 94°C for 1 min, followed by 30 cycles of 98°C for 60 s and 68°C for 10 min and a final extension step of 72°C for 10 min. Negative controls (no template DNA) were included in all experiments.
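The two cycling programs can be written down as plain data and their programmed hold times compared. This is a minimal sketch using the step durations stated above; the helper `program_seconds` is a name introduced here for illustration, and the calculation ignores ramping between temperatures, so real run times will be longer.

```python
def program_seconds(initial, cycles, steps, final):
    """Total programmed hold time in seconds, ignoring ramping between steps."""
    return initial + cycles * sum(steps) + final

# Standard program: 95°C 5 min; 35 x (95°C 30 s, 60°C 30 s, 72°C 25 s); 72°C 30 min.
standard = program_seconds(5 * 60, 35, [30, 30, 25], 30 * 60)

# LA Taq program: 94°C 1 min; 30 x (98°C 60 s, 68°C 10 min); 72°C 10 min.
long_range = program_seconds(60, 30, [60, 10 * 60], 10 * 60)

print(f"standard:   {standard} s ({standard / 60:.1f} min)")
print(f"long range: {long_range} s ({long_range / 60:.1f} min)")
```

Under these assumptions the standard program holds for 5,075 s (about 85 min) and the long-range program for 20,460 s (341 min), with the difference driven by the 10-min annealing/extension step.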

The PCR products were checked by electrophoresis on 2% agarose gels, then purified with the JetQuick PCR purification kit (Genomed) and directly sequenced in both directions using the ABI Prism BigDye Terminator V.3.1 Ready Reaction Cycle Sequencing Kit (Applied Biosystems) on an ABI 3130 Genetic Analyzer (Applied Biosystems-Hitachi). The sequences were assembled using BioEdit v7.0.9 software (Ibis Biosciences, Carlsbad, CA).

The DOGMA program [49] was used for the initial genome annotation, which was then manually refined using Artemis version 11 [50] and NCBI Blast searches. The annotation of tRNA genes was checked using tRNAscan version 1.21 [51]. The genome map was generated using OGDRAW software V. 1.0 [52].

Evaluation of repeat structures

Msatfinder v. 2.0.9 [53] was used to identify simple sequence repeats (SSR), with the following settings: a six-repeat threshold for mono-nucleotide SSRs, a five-repeat threshold for di- and tri-nucleotide SSRs and a three-repeat threshold for tetra-, penta- and hexa-nucleotide SSRs. The SSR density in the different regions of the chloroplast genome was calculated by dividing the number of SSRs by the length of the given region. Interspersed repeats were identified with REPuter [54] by setting the minimum repeat size to 30 bp and the Hamming distance to 3. The presence and distribution of the repetitive elements were verified manually using Artemis and computationally by performing an intragenomic Blast search. For this purpose, the sequence was interrogated using a local installation of NCBI Blast and a Blast database created with formatDB software (http://0-www-ncbi-nlm-nih-gov.brum.beds.ac.uk/staff/tao/URLAPI/wwwblast/).
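The repeat thresholds and the Hamming distance used above can be illustrated with a short, self-contained sketch. It is not the Msatfinder or REPuter implementation: it is a simple regular-expression scan for perfect tandem repeats plus a plain mismatch count, and the names `MIN_REPEATS`, `find_ssrs` and `hamming` are assumptions introduced here.

```python
import re

# Minimum repeat counts per motif length, following the settings described above.
MIN_REPEATS = {1: 6, 2: 5, 3: 5, 4: 3, 5: 3, 6: 3}

def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect tandem repeats."""
    seq = seq.upper()
    hits = []
    for size, min_rep in MIN_REPEATS.items():
        # One motif copy followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA" or "ATAT").
            if any(motif == motif[:k] * (size // k)
                   for k in range(1, size) if size % k == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // size))
    return sorted(hits)

def hamming(a, b):
    """Number of mismatching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))

# Hypothetical test sequence: an A8 run and an (AT)5 run, 0-based start positions.
example = "GC" + "A" * 8 + "CGC" + "AT" * 5 + "GG"
print(find_ssrs(example))
print(hamming("ACGTACGT", "ACGAACGA"))
```

With these thresholds a run of five identical bases is not reported, while a run of six or more is; two 30-bp copies would count as a repeat pair under the stated setting if `hamming` returned 3 or less.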

Identification of polymorphic regions among olive cultivars

To identify sequence polymorphisms, the following potentially variant domains were tested: i) regions containing mono-, di-, tetra- and penta-nucleotide microsatellites; ii) regions previously reported as polymorphic among Olea subspecies; iii) regions containing high sequence variation among the 12 species aligned for primer design (see the sequencing strategy above); iv) barcoding regions previously identified for species discrimination that had never been tested in olive cultivars; and v) plastid ESTs derived from massive sequence analyses of fruit cDNAs [55].

Candidate SSRs were selected among those having the highest number of repeats (Table 1 and Figure 2). Although no mono-nucleotide SSRs with repeats shorter than 10 bp were considered, some were indirectly included in the analyses of other regions.

PCR amplifications were performed in a final volume of 25 μl containing 25 ng of template DNA, 2.5 μl of 10× PCR buffer, 0.5 mM of each dNTP, 1 μM of each primer and 1.5 U/μl of PerfectTaq DNA Polymerase (5-PRIME). The amplifications were run on a Mastercycler Gradient thermal cycler (Eppendorf) using the same conditions as previously indicated for plastid sequencing.

After an initial evaluation by electrophoresis on a 2% agarose gel, amplicons were sequenced in both directions using the ABI Prism BigDye Terminator V.3.1 Ready Reaction Cycle Sequencing Kit and run on an ABI 3130 Genetic Analyzer (Applied Biosystems-Hitachi).

The sequences of each region were aligned to evaluate the presence of SNPs, indels or polymorphic microsatellites among the eight cultivars. To use these polymorphisms as chloroplast markers able to distinguish olive cultivars from each other, specific primers located within conserved flanking regions were constructed (Additional File 1, Table S1). The resulting fragments ranged in size from 145 to 688 bp and could be amplified at an annealing temperature of 60°C. Some amplicons included from two to five polymorphisms. All 40 polymorphisms can be amplified by a set of 21 primer pairs.

Author details

1 CNR - Institute of Plant Genetics, 06128 Perugia, Italy

2 University of Cordoba - Dep. of Agronomy, 14071 Cordoba, Spain

References

  1. Green PS: A revision of Olea L. Kew Bulletin. 2002, 57: 91-140. 10.2307/4110824.

  2. Lee HL, Jansen RK, Chumley TW, Kim KJ: Gene relocations within chloroplast genomes of Jasminum and Menodora (Oleaceae) are due to multiple, overlapping inversions. Mol Biol Evol. 2007, 24 (5): 1161-1180. 10.1093/molbev/msm036.

  3. Xiong AS, Peng RH, Zhuang J, Gao F, Zhu B, Fu XY, Xue Y, Jin XF, Tian YS, Zhao W, et al: Gene duplication, transfer, and evolution in the chloroplast genome. Biotechnol Adv. 2009, 27 (4): 340-347. 10.1016/j.biotechadv.2009.01.012.

  4. Wallander E, Albert VA: Phylogeny and classification of Oleaceae based on rps16 and trnL-F sequence data. Am J Bot. 2000, 87 (12): 1827-1841. 10.2307/2656836.

  5. Kim KJ, Jansen RK: A chloroplast DNA phylogeny of lilacs (Syringa, Oleaceae): plastome groups show a strong correlation with crossing groups. American Journal of Botany. 1998, 85 (9): 1338-1351. 10.2307/2446643.

  6. Kim KJ: Molecular phylogeny of Forsythia (Oleaceae) based on chloroplast DNA variation. Plant Systematics and Evolution. 1999, 218 (1-2): 113-123. 10.1007/BF01087039.

  7. Milne RI, Abbott RJ: Geographic origin and taxonomic status of the invasive Privet, Ligustrum robustum (Oleaceae), in the Mascarene Islands, determined by chloroplast DNA and RAPDs. Heredity. 2004, 92 (2): 78-87. 10.1038/sj.hdy.6800385.

  8. Harbourne ME, Douglas GC, Waldren S, Hodkinson TR: Characterization and primer development for amplification of chloroplast microsatellite regions of Fraxinus excelsior. J Plant Res. 2005, 118 (5): 339-341. 10.1007/s10265-005-0223-5.

  9. Heuertz M, Carnevale S, Fineschi S, Sebastiani F, Hausman JF, Paule L, Vendramin GG: Chloroplast DNA phylogeography of European ashes, Fraxinus sp. (Oleaceae): roles of hybridization and life history traits. Mol Ecol. 2006, 15 (8): 2131-2140. 10.1111/j.1365-294X.2006.02897.x.

  10. Lumaret R, Amane M, Ouazzani N, Baldoni L: Chloroplast DNA variation in the cultivated and wild olive taxa of the genus Olea L. Theoretical and Applied Genetics. 2000, 101: 547-553. 10.1007/s001220051514.

  11. Baldoni L, Guerrero C, Sossey-Aloui K, Abbott AG, Angiolillo A, Lumaret R: Phylogenetic relationships among Olea species based on nucleotide variation at a non-coding chloroplast DNA region. Plant Biology (Stuttg). 2002, 4: 346-351. 10.1055/s-2002-32338.

  12. Besnard G, Rubio de Casas R, Vargas P: A set of primers for length and nucleotide-substitution polymorphism in chloroplastic DNA of Olea europaea L. (Oleaceae). Molecular Ecology Notes. 2003, 3: 651-653. 10.1046/j.1471-8286.2003.00547.x.

  13. Besnard G, Rubio de Casas R, Christin PA, Vargas P: Phylogenetics of Olea (Oleaceae) based on plastid and nuclear ribosomal DNA sequences: tertiary climatic shifts and lineage differentiation times. Ann Bot. 2009, 104 (1): 143-160. 10.1093/aob/mcp105.

  14. Besnard G, Khadari B, Villemur P, Berville A: Cytoplasmic male sterility in the olive (Olea europaea L.). Theor Appl Genet. 2000, 100 (7): 1018-1024. 10.1007/s001220051383.

  15. Besnard G, Khadari B, Baradat P, Berville A: Combination of chloroplast and mitochondrial DNA polymorphisms to study cytoplasm genetic differentiation in the olive complex (Olea europaea L.). Theor Appl Genet. 2002, 105 (1): 139-144. 10.1007/s00122-002-0868-6.

  16. Amane M, Lumaret R, Hany V, Ouazzani N, Debain C, Vivier G, Deguilloux MF: Chloroplast-DNA variation in cultivated and wild olive (Olea europaea L.). Theoretical and Applied Genetics. 1999, 99 (1-2): 133-139. 10.1007/s001220051217.

  17. Besnard G, Khadari B, Baradat P, Berville A: Olea europaea (Oleaceae) phylogeography based on chloroplast DNA polymorphism. Theor Appl Genet. 2002, 104 (8): 1353-1361. 10.1007/s00122-001-0832-x.

  18. Intrieri MC, Muleo R, Buiatti M: Chloroplast DNA polymorphisms as molecular markers to identify cultivars of Olea europaea L. Journal of Horticultural Science & Biotechnology. 2007, 82 (1): 109-113.

  19. Besnard G: Chloroplast DNA variations in Mediterranean olive. Journal of Horticultural Science & Biotechnology. 2008, 83 (1): 51-54.

  20. CBOL PWG: A DNA barcode for land plants. Proc Natl Acad Sci USA. 2009, 106 (31): 12794-12797. 10.1073/pnas.0905845106.

  21. Kress WJ, Erickson DL: A two-locus global DNA barcode for land plants: the coding rbcL gene complements the non-coding trnH-psbA spacer region. PLoS One. 2007, 2 (6): e508-10.1371/journal.pone.0000508.

  22. Consolandi C, Palmieri L, Severgnini M, Maestri E, Marmiroli N, Agrimonti C, Baldoni L, Donini P, De Bellis G, Castiglioni B: A procedure for olive oil traceability and authenticity: DNA extraction, multiplex PCR and LDR-universal array analysis. European Food Research and Technology. 2008, 227: 1429-1438. 10.1007/s00217-008-0863-5.

  23. Hansson MC, Foley BP: Ancient DNA fragments inside Classical Greek amphoras reveal cargo of 2400-year-old shipwreck. Journal of Archaeological Science. 2008, 35: 1169-1176. 10.1016/j.jas.2007.08.009.

  24. Kahlau S, Aspinall S, Gray JC, Bock R: Sequence of the tomato chloroplast DNA and evolutionary comparison of solanaceous plastid genomes. J Mol Evol. 2006, 63 (2): 194-207. 10.1007/s00239-005-0254-5.

  25. Diekmann K, Hodkinson TR, Fricke E, Barth S: An optimized chloroplast DNA extraction protocol for grasses (Poaceae) proves suitable for whole plastid genome sequencing and SNP detection. PLoS One. 2008, 3 (7): e2813-10.1371/journal.pone.0002813.

  26. Haberle RC, Fourcade HM, Boore JL, Jansen RK: Extensive rearrangements in the chloroplast genome of Trachelium caeruleum are associated with repeats and tRNA genes. J Mol Evol. 2008, 66 (4): 350-361. 10.1007/s00239-008-9086-4.

  27. Kumar S, Hahn FM, McMahan CM, Cornish K, Whalen MC: Comparative analysis of the complete sequence of the plastid genome of Parthenium argentatum and identification of DNA barcodes to differentiate Parthenium species and lines. BMC Plant Biol. 2009, 9: 131-10.1186/1471-2229-9-131.

  28. Jansen RK, Wojciechowski MF, Sanniyasi E, Lee SB, Daniell H: Complete plastid genome sequence of the chickpea (Cicer arietinum) and the phylogenetic distribution of rps12 and clpP intron losses among legumes (Leguminosae). Mol Phylogenet Evol. 2008, 48 (3): 1204-1217. 10.1016/j.ympev.2008.06.013.

  29. Moore MJ, Dhingra A, Soltis PS, Shaw R, Farmerie WG, Folta KM, Soltis DE: Rapid and accurate pyrosequencing of angiosperm plastid genomes. BMC Plant Biol. 2006, 6: 17-10.1186/1471-2229-6-17.

  30. McCoy SR, Kuehl JV, Boore JL, Raubeson LA: The complete plastid genome sequence of Welwitschia mirabilis: an unusually compact plastome with accelerated divergence rates. BMC Evol Biol. 2008, 8: 130-10.1186/1471-2148-8-130.

  31. Jansen RK, Kaittanis C, Saski C, Lee SB, Tomkins J, Alverson AJ, Daniell H: Phylogenetic analyses of Vitis (Vitaceae) based on complete chloroplast genome sequences: effects of taxon sampling and phylogenetic methods on resolving relationships among rosids. BMC Evol Biol. 2006, 6: 32-10.1186/1471-2148-6-32.

  32. Saski C, Lee SB, Fjellheim S, Guda C, Jansen RK, Luo H, Tomkins J, Rognli OA, Daniell H, Clarke JL: Complete chloroplast genome sequences of Hordeum vulgare, Sorghum bicolor and Agrostis stolonifera, and comparative analyses with other grass genomes. Theor Appl Genet. 2007, 115 (4): 571-590. 10.1007/s00122-007-0567-4.

  33. Bortiri E, Coleman-Derr D, Lazo GR, Anderson OD, Gu YQ: The complete chloroplast genome sequence of Brachypodium distachyon: sequence comparison and phylogenetic analysis of eight grass plastomes. BMC Res Notes. 2008, 1: 61-10.1186/1756-0500-1-61.

  34. Grivet D, Heinze B, Vendramin GG, Petit RJ: Genome walking with consensus primers: application to the large single copy region of chloroplast DNA. Molecular Ecology Notes. 2001, 1: 345-349. 10.1046/j.1471-8278.2001.00107.x.

  35. Dhingra A, Folta KM: ASAP: amplification, sequencing & annotation of plastomes. BMC Genomics. 2005, 6: 176-10.1186/1471-2164-6-176.

  36. Heinze B: A database of PCR primers for the chloroplast genomes of higher plants. Plant Methods. 2007, 3: 4-10.1186/1746-4811-3-4.

  37. Wu CS, Wang YN, Liu SM, Chaw SM: Chloroplast genome (cpDNA) of Cycas taitungensis and 56 cp protein-coding genes of Gnetum parvifolium: insights into cpDNA evolution and phylogeny of extant seed plants. Mol Biol Evol. 2007, 24 (6): 1366-1379. 10.1093/molbev/msm059.

  38. Wu FH, Kan DP, Lee SB, Daniell H, Lee YW, Lin CC, Lin NS, Lin CS: Complete nucleotide sequence of Dendrocalamus latiflorus and Bambusa oldhamii chloroplast genomes. Tree Physiol. 2009, 29 (6): 847-856. 10.1093/treephys/tpp015.

  39. Guisinger MM, Chumley TW, Kuehl JV, Boore JL, Jansen RK: Implications of the Plastid Genome Sequence of Typha (Typhaceae, Poales) for Understanding Genome Evolution in Poaceae. J Mol Evol. 2010,

  40. Besnard G, Rubio de Casas R, Vargas P: Plastid and nuclear DNA polymorphism reveals historical processes of isolation and reticulation in the olive tree complex (Olea europaea). Journal of Biogeography. 2007, 34: 736-752. 10.1111/j.1365-2699.2006.01653.x.

  41. Goremykin VV, Salamini F, Velasco R, Viola R: Mitochondrial DNA of Vitis vinifera and the issue of rampant horizontal gene transfer. Mol Biol Evol. 2009, 26 (1): 99-110. 10.1093/molbev/msn226.

  42. Weising K, Gardner RC: A set of conserved PCR primers for the analysis of simple sequence repeat polymorphisms in chloroplast genomes of dicotyledonous angiosperms. Genome. 1999, 42 (1): 9-19. 10.1139/gen-42-1-9.

  43. Besnard G, Berville A: On chloroplast DNA variations in the olive (Olea europaea L.) complex: comparison of RFLP and PCR polymorphisms. Theor Appl Genet. 2002, 104 (6-7): 1157-1163.

  44. Fazekas AJ, Burgess KS, Kesanakurti PR, Graham SW, Newmaster SG, Husband BC, Percy DM, Hajibabaei M, Barrett SC: Multiple multilocus DNA barcodes from the plastid genome discriminate plant species equally well. PLoS One. 2008, 3 (7): e2802-10.1371/journal.pone.0002802.

  45. Kress WJ, Wurdack KJ, Zimmer EA, Weigt LA, Janzen DH: Use of DNA barcodes to identify flowering plants. Proc Natl Acad Sci USA. 2005, 102 (23): 8369-8374. 10.1073/pnas.0503123102.

  46. Lahaye R, van der Bank M, Bogarin D, Warner J, Pupulin F, Gigot G, Maurin O, Duthoit S, Barraclough TG, Savolainen V: DNA barcoding the floras of biodiversity hotspots. Proc Natl Acad Sci USA. 2008, 105 (8): 2923-2928. 10.1073/pnas.0709936105.

  47. Edgar RC: MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 2004, 32 (5): 1792-1797. 10.1093/nar/gkh340.

  48. Marshall OJ: PerlPrimer: cross-platform, graphical primer design for standard, bisulphite and real-time PCR. Bioinformatics. 2004, 20 (15): 2471-2472. 10.1093/bioinformatics/bth254.

  49. Wyman SK, Jansen RK, Boore JL: Automatic annotation of organellar genomes with DOGMA. Bioinformatics. 2004, 20 (17): 3252-3255. 10.1093/bioinformatics/bth352.

  50. Rutherford K, Parkhill J, Crook J, Horsnell T, Rice P, Rajandream MA, Barrell B: Artemis: sequence visualization and annotation. Bioinformatics. 2000, 16 (10): 944-945. 10.1093/bioinformatics/16.10.944.

  51. Lowe TM, Eddy SR: tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 1997, 25 (5): 955-964. 10.1093/nar/25.5.955.

  52. Lohse M, Drechsel O, Bock R: OrganellarGenomeDRAW (OGDRAW): a tool for the easy generation of high-quality custom graphical maps of plastid and mitochondrial genomes. Curr Genet. 2007, 52 (5-6): 267-274. 10.1007/s00294-007-0161-y.

  53. Thurston MI, Field D: Msatfinder: detection and characterisation of microsatellites. Distributed by the authors at CEH Oxford, Mansfield Road, Oxford OX1 3SR, 2005, [http://www.genomics.ceh.ac.uk/msatfinder/]

  54. Kurtz S, Choudhuri JV, Ohlebusch E, Schleiermacher C, Stoye J, Giegerich R: REPuter: the manifold applications of repeat analysis on a genomic scale. Nucleic Acids Res. 2001, 29 (22): 4633-4642. 10.1093/nar/29.22.4633.

  55. Alagna F, D'Agostino N, Torchia L, Servili M, Rao R, Pietrella M, Giuliano G, Chiusano ML, Baldoni L, Perrotta G: Comparative 454 pyrosequencing of transcripts from two olive genotypes during fruit development. BMC Genomics. 2009, 10: 399-10.1186/1471-2164-10-399.

Acknowledgements

The work was supported by the FISR project "Improving flavour and nutritional properties of plant food after first and second transformation" of the Italian Ministry of Research and by the OIGA project "DNA tracking of olive oil and new models of oil labelling" of the Ministry of Agriculture.

Author information

Corresponding author

Correspondence to Luciana Baldoni.

Additional information

Authors' contributions

CGNM1 and MDC2: contributed to the DNA sequencing of the entire plastome.

MR1: conducted all the experiments to establish chloroplast variation at varietal level.

AR1: conducted bioinformatic analyses, contributed to the DNA sequencing of the IR and SSC of plastome and revised the manuscript.

LB1: conceived the study and wrote the manuscript.

All authors read and approved the final manuscript.

Rights and permissions

Open Access This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License ( https://creativecommons.org/licenses/by/2.0 ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Mariotti, R., Cultrera, N.G., Díez, C.M. et al. Identification of new polymorphic regions and differentiation of cultivated olives (Olea europaea L.) through plastome sequence comparison. BMC Plant Biol 10, 211 (2010). https://0-doi-org.brum.beds.ac.uk/10.1186/1471-2229-10-211

  • DOI: https://0-doi-org.brum.beds.ac.uk/10.1186/1471-2229-10-211

Keywords