Article

Comparative Analysis of the Chloroplast Genomes of the Chinese Endemic Genus Urophysa and Their Contribution to Chloroplast Phylogeny and Adaptive Evolution

Key Laboratory of Bio-Resources and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu 610065, Sichuan, China
*
Author to whom correspondence should be addressed.
Int. J. Mol. Sci. 2018, 19(7), 1847; https://0-doi-org.brum.beds.ac.uk/10.3390/ijms19071847
Submission received: 13 April 2018 / Revised: 19 June 2018 / Accepted: 19 June 2018 / Published: 22 June 2018
(This article belongs to the Special Issue Chloroplast)

Abstract

Urophysa is a Chinese endemic genus comprising two species, Urophysa rockii and Urophysa henryi. In this study, we sequenced the complete chloroplast (cp) genomes of these two species and of their relative Semiaquilegia adoxoides using Illumina sequencing technology, compared the sequences to elucidate intra- and interspecific variation, and inferred their phylogenetic relationships with other species of the family Ranunculaceae. A typical quadripartite structure was detected, with a genome size from 158,473 to 158,512 bp, consisting of a pair of inverted repeats separated by a small single-copy region and a large single-copy region. We analyzed the nucleotide diversity and the repeat sequence components and conducted a positive selection analysis using codon-based substitution models on single-copy coding sequences (CDS). Seven regions were found to possess relatively high nucleotide diversity, and numerous variable repeats and simple sequence repeat (SSR) markers were detected. Six single-copy genes (atpA, rpl20, psaA, atpB, ndhI, and rbcL) were found to have codon sites with high posterior probabilities in the positive selection analysis, which suggests that these six genes may be under strong selection pressure. Visualization of the six genes showed that the amino acid properties across each alignment column vary among genera. The regions with high nucleotide diversity, abundant repeats, or signs of positive selection will provide potential plastid markers for further taxonomic, phylogenetic, and population genetics studies in Urophysa and its relatives. Phylogenetic analyses based on the 79 single-copy genes, the complete genome sequences, and all CDS sequences showed the same topologies with high support, and U. rockii was closely clustered with U. henryi within the genus Urophysa, with S. adoxoides as their closest relative.
Therefore, the complete cp genomes in Urophysa species provide interesting insights and valuable information that can be used to identify related species and reconstruct their phylogeny.

1. Introduction

The genus Urophysa (Ranunculaceae) is a Chinese endemic genus with only two species, Urophysa rockii Ulbr. and Urophysa henryi (Oliv.) Ulbr. U. rockii is an extremely rare species with fewer than 2000 individuals living in Jiangyou, Sichuan Province, China, and U. henryi is distributed in Guizhou, south Chongqing, north Hunan, and west Hubei [1]. The natural populations of the two species are restricted to small and isolated areas separated by high mountains and deep valleys; they grow on steep karst cliffs, and their natural distributions are shrinking and fragmenting dramatically [2]. In addition, the plants are collected for traditional Chinese medicine for the treatment of contusions and bruises, which has contributed to the decline of their populations [3]. Previous studies on the genus Urophysa are scarce and mainly focused on the endangered U. rockii, its growing environment and conservation strategies [4], its biological and ecological characteristics, and its reproductive biology [5,6]. A recent study suggested that the uplift of the Yungui Plateau played an important role in the species divergence of Urophysa [2]. However, the chloroplast DNA (cpDNA) phylogeny was inconsistent with the nuclear ribosomal DNA (nrDNA) phylogeny. Hence, to gain better insight into the relationship between these two species and to understand their genome structure, and thereby to facilitate the study of their speciation process and the conservation of U. rockii, we assembled and characterized the complete chloroplast genome sequences of U. rockii and U. henryi using Illumina paired-end sequencing reads.
The angiosperm cp genome is one of the three DNA genomes (the other two are the nuclear and mitochondrial genomes), is uniparentally inherited, and has a highly conserved circular DNA arrangement [7]. It is widely considered an informative and valuable resource for investigating evolutionary biology because of its relatively stable genome structure, gene content, and gene order [8,9,10,11,12,13]. The cp genome of plants typically ranges from 115 to 210 kb and has a quadripartite structure composed of two copies of inverted repeat (IR) regions, which are separated by a large single-copy (LSC) region and a small single-copy (SSC) region [14,15,16]. Because of its compact size, low recombination, and maternal inheritance, the cp genome has been used to generate genetic markers for phylogenetic analysis [17,18], molecular identification [19], and divergence dating [20]. In particular, the low evolutionary rate of the cp genome in taxa that are not very young makes it an ideal system for assessing plant phylogeny [21].
In the present study, we report the complete chloroplast genome sequences of the two Urophysa species and their relative Semiaquilegia adoxoides for the first time. Combining these with previously reported cp genome sequences, we performed phylogenetic analyses based on the whole cp genome and on shared single-copy genes. Our findings will contribute to our understanding of the evolutionary history of the genus Urophysa. Additionally, highly variable regions and genes that were detected to be under positive selection could be employed to develop potential markers for phylogenetic analyses or candidates for DNA barcoding in future studies.

2. Results and Discussion

2.1. Complete Chloroplast Genomes of Three Species

The complete chloroplast genomes of U. rockii, U. henryi, and S. adoxoides each showed a single circular molecule with a typical quadripartite structure (Figure 1). The sizes of the U. rockii, U. henryi, and S. adoxoides cp genomes were found to be 158,512 bp, 158,303 bp, and 158,340 bp, respectively, which are in the range of most angiosperm plastid genomes [22]. Each cp genome consists of a pair of IRs (IRa and IRb, with length 26,473–26,584 bp), separated by an LSC (87,031–87,202 bp) region and an SSC (18,192–18,220 bp) region (Table 1). The GC content was very similar among the three species, both across the whole cp genome and within each region (LSC, SSC, and IR); it was clearly higher in the IR regions than in the other regions, possibly because of the high GC content of the rRNA genes (55.8%) located in the IR regions (Table 2). These results are similar to the previously reported high GC percentage in IR regions [23,24,25].
The genomes contain 87 coding genes, 36 transfer RNA (tRNA) genes, and 8 ribosomal RNA (rRNA) genes (Table 3). Most of the genes occur as a single copy in the LSC or SSC regions, while 18 genes are duplicated in the IR regions, including seven protein-coding genes (ndhB, rpl2, rpl23, rps7, rps12, rps19, ycf2), seven tRNA genes (trnA-UGC, trnI-CAU, trnI-GAU, trnL-CAA, trnN-GUU, trnR-ACG, and trnV-GAC), and four rRNA genes (rrn4.5, rrn5, rrn16, and rrn23). The gene ycf1 straddles the SSC and IRs, while rps12 has its first exon in the LSC region and its other two exons in the IRs. The LSC region comprises 63 protein-coding genes and 21 tRNA genes, whereas the SSC and IR regions include 12 and 7 protein-coding genes, with one and seven tRNA genes, respectively. The protein-coding genes present in the U. rockii cp genome include 9 genes encoding large ribosomal proteins (rpl2, rpl14, rpl16, rpl20, rpl22, rpl23, rpl32, rpl33, rpl36) and 12 genes encoding small ribosomal proteins (rps2, rps3, rps4, rps7, rps8, rps11, rps12, rps14, rps15, rps16, rps18, rps19). There are 5 genes encoding photosystem I subunits (psaA, psaB, psaC, psaI, psaJ), along with 15 genes related to photosystem II subunits (psbA, psbB, psbC, psbD, psbE, psbF, psbH, psbI, psbJ, psbK, psbL, psbM, psbN, psbT, psbZ) (Table 3). Six genes (atpA, atpB, atpE, atpF, atpH, atpI) encode ATP synthase and electron transport chain components (Table 3). A similar pattern of protein-coding genes is also present in U. henryi and S. adoxoides. There are eight intron-containing genes, six of which contain one intron; only the genes clpP and ycf3 have two introns (Table S1). All eight of these genes possess at least two exons, and ycf3 has three exons. The rps16 gene has the longest intron (866 bp), and rpoC1 has the longest exon (1613 bp).
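The regional GC comparison described above can be illustrated with a minimal Python sketch. It is not the procedure used in the study (GC content was analyzed with MEGA 6); the function names, the toy sequence, and the region coordinates are all hypothetical and serve only to show the calculation. Real LSC, IR, and SSC boundaries would come from the genome annotation.

```python
def gc_content(seq: str) -> float:
    """Return the GC percentage of a DNA sequence, ignoring non-ACGT symbols."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    if acgt == 0:
        raise ValueError("sequence contains no A, C, G, or T")
    return 100.0 * gc / acgt


def regional_gc(genome: str, regions: dict) -> dict:
    """GC percentage per named region; coordinates are 0-based, end-exclusive."""
    return {name: gc_content(genome[start:end])
            for name, (start, end) in regions.items()}


# Toy data (invented, not taken from the sequenced genomes).
toy_genome = "ATATATATAT" + "GCGCATGCGC" + "ATTTA"
toy_regions = {"LSC": (0, 10), "IR": (10, 20), "SSC": (20, 25)}
toy_result = regional_gc(toy_genome, toy_regions)
```

In this toy input the middle block plays the role of a GC-rich IR region, so `toy_result["IR"]` is much higher than the values for the flanking single-copy blocks.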

2.2. Repeat Analysis

Chloroplast repeats are potentially useful genetic resources for investigating the population genetics and biogeography of allied taxa [26]. Analyses of various cp genomes have revealed that repeat sequences are important in inducing indels and substitutions [27]. Repeat analysis of the U. rockii cp genome revealed 22 palindromic repeats, 23 forward repeats, 5 reverse repeats, and 1 complement repeat. Among them, 16 palindromic, 18 forward, and 5 reverse repeats are 20–40 bp in length. Six palindromic and five forward repeats are 41–60 bp in length (Figure 2). Similarly, 23 and 25 palindromic repeats, 21 and 22 forward repeats, 5 and 2 reverse repeats, and 1 complement repeat each were detected in U. henryi and S. adoxoides, respectively; the detailed repeat length distributions are shown in Figure 2. The number and length of the repeats indicate that U. rockii is more similar to U. henryi than to S. adoxoides. Previous studies suggested that slipped-strand mispairing and improper recombination of repeat sequences can result in sequence variation and genome rearrangement [28,29,30]. These repeats are informative sources for developing genetic markers for phylogenetic and population studies [31].
Simple sequence repeats (SSRs) in the cp genome can be highly variable at the intra-specific level and are therefore often used as genetic markers in population genetic and evolutionary studies [12,32,33,34]. Because of their high polymorphism rate at the species level, SSRs have been recognized as one of the main sources of molecular markers and have been extensively researched in phylogenetic and biogeographic studies of populations [35,36,37]. In this study, we analyzed the SSRs in the cp genomes. Five categories of perfect SSRs (mono-, di-, tri-, tetra-, and penta-nucleotide repeats) were detected in the cp genomes of the three species, with an overall length ranging from 10 to 26 bp (Figure 3, Table S2). Minimum length thresholds were set because SSRs of 10 bp or longer are prone to slipped-strand mispairing, which is believed to be the main mutational mechanism for polymorphism [38,39,40].
A total of 169 microsatellites were detected in the U. rockii cp genome on the basis of the SSR analysis. Similarly, 171 and 174 SSRs were detected in U. henryi and S. adoxoides, respectively (Figure 3A). The most abundant were tri-nucleotide repeats, which accounted for about 33.85% of the total SSRs and whose number varies from 56 in U. rockii to 60 in S. adoxoides, followed by mono-nucleotide repeats (27.63%), di-nucleotide repeats (26.46%), and tetra-nucleotide repeats (11.28%). Penta-nucleotide repeats were the least abundant (0.78%; Figure 3, Table S2). Previous studies have revealed that the richness of SSR types varies between species. In Quercus species, mono-nucleotide repeats are the most abundant, accounting for about 80% of the total SSRs [34]. In the cp genome of Forsythia, di-nucleotide repeats are the most numerous [41]. Tri-nucleotide SSRs are the most abundant in Nicotiana species, accounting for approximately 43.03% [42]. These results suggest that different repeats may contribute differently to genetic variation among species. Thus, the SSR information will be important for understanding the genetic diversity status of Urophysa and its relatives.
In U. rockii, more than 96.2% of mono-nucleotide repeats and a majority of di-nucleotide repeats (84.9%) are composed of A/T (Figure 3B, Table S2), which is consistent with U. henryi (97.8% of mono-nucleotides and 83.0% of di-nucleotides) and S. adoxoides (97.9% of mono-nucleotides and 85.6% of di-nucleotides). Our findings are comparable to previously reported observations that SSRs found in the chloroplast genome are generally composed of poly-thymine (polyT) or poly-adenine (polyA) repeats and infrequently contain tandem cytosine (C) and guanine (G) repeats [43]. Therefore, these SSRs contribute to the AT richness of the cp genomes of the three species, as previously reported for different species [43,44]. SSRs were also detected in CDS regions of the U. rockii cp genome. The CDS regions account for approximately 49% of the total length. About 68.6% of SSRs (68.4% for U. henryi and 67.2% for S. adoxoides) were detected in non-coding regions, whereas only 28.9% of SSRs (29.2% for U. henryi and 30.5% for S. adoxoides) are present in the protein-coding regions of U. rockii. Furthermore, about 62.1% of SSRs are present in the LSC region of U. rockii (66.1% for U. henryi and 68.9% for S. adoxoides), and a minority of SSRs exist in the IR regions (17.8% in IRa and IRb combined). In U. rockii, 49 SSRs (28.9%) were located in the CDS regions of 19 genes (atpF, rpoC1, rpoC2, rps14, rps15, rps19, psaB, psaA, rbcL, rpl33, rpl22, ndhB, ndhD, ndhF, ndhH, ccsA, ycf1, ycf2, ycf3). The detailed SSR location information is listed in Table S2. These results suggest an uneven distribution of SSRs in the U. rockii, U. henryi, and S. adoxoides cp genomes, as was also reported in different angiosperm cp genomes [44]. Moreover, the cp SSRs of the three species presented abundant variation and are useful for detecting genetic polymorphisms at the population, intraspecific, and cultivar levels, as well as for comparing more distant phylogenetic relationships among species.

2.3. Genome Sequence Divergence among the Three Species

To assess the level of sequence divergence, nucleotide diversity values were calculated for the LSC, SSC, and IR regions of the chloroplast genomes (Figure 4, Table S3). In the LSC regions, these values varied from 0 to 0.05496, with a mean of 0.00705; in the IR regions, they varied from 0 to 0.01265, with a mean of 0.00363. Only the SSC region had an average nucleotide diversity >0.010, with values ranging from 0 to 0.02369 and a mean of 0.01048. These results indicate that the differences among these genome regions were small. However, some highly variable loci, including trnK-UUU, trnG-UCC, trnD-GUC, atpF, rps4, trnL-UAA, accD, cemA, rpl36, rpl22, rps19, ndhF, trnL-UAG, ccsA, ndhA, and ycf3, were identified (Figure 4, Table S3). All these regions displayed higher nucleotide diversity values than other regions (value > 0.015). Twelve of these loci were found to be located in the LSC region and four in the SSC region, whereas the nucleotide diversity in the IR regions was low, less than 0.015. Among these loci, atpF, accD, ndhF, rpl22, ccsA, and ycf3 have been detected as highly variable regions in different plants [19,23,45,46]. On the basis of these results, we believe that accD, rps4, ccsA, rpl36, and ndhF, which have comparatively high sequence divergence, are good sources for interspecies phylogenetic analysis, as shown in previous studies [42,44].
Expansion and contraction at the borders of IR regions is the main reason for size variations in the cp genome and plays a vital role in its evolution [39,47,48]. The IR/LSC and IR/SSC junction regions were compared to identify IR expansion or contraction. The rps19, ndhF, ycf1, and psbA genes were located in the junctions of the LSC/IRa, IRa/SSC, SSC/IRb, and IRb/LSC regions, respectively (Figure 5). Despite the similar lengths of the IR regions in the three species, from 26,473 to 26,584 bp, some IR expansion and contraction were observed. The rps19 gene traverses the LSC and IRb regions (LR line), with 104 bp located in the IR region. The RS line (the junction line between IRb and SSC) is located between ycf1 and ndhF, and the distance between the RS line and ndhF ranges from 33 to 36 bp across the three species. The SR line (the junction line between SSC and IRa) intersects the ycf1 gene; the lengths of the ycf1 portions on either side of this junction are the same in U. rockii and U. henryi (4259 bp in SSC and 1081 bp in IRb) but differ in S. adoxoides (4229 bp in SSC and 1084 bp in IRb) (Figure 5). The distance between psbA and the RL line varies from 386 to 403 bp. Compared to species of other genera, the IRb/SSC and SSC/IRa regions of Urophysa showed an expansion in ycf1 but a contraction in rps19 (Figure 5). The expansion and contraction detected in the IR regions may act as a primary mechanism in creating the length variation of the cp genomes in U. rockii, U. henryi, and S. adoxoides, as previous studies suggested [32,34,42,49].
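As an illustration of how windowed nucleotide diversity values of this kind are obtained, the following Python sketch computes π (the mean proportion of differing sites over all sequence pairs) in sliding windows along an alignment. It is a simplified stand-in, not the DnaSP implementation used in the study: the function names, the window and step sizes, the toy alignment, and the choice to skip gapped or ambiguous sites pair by pair are all assumptions made for the example.

```python
from itertools import combinations

VALID = set("ACGT")


def pairwise_diff(a: str, b: str):
    """Count mismatches and comparable sites, skipping gaps and ambiguity codes."""
    diffs = sites = 0
    for x, y in zip(a, b):
        if x in VALID and y in VALID:
            sites += 1
            diffs += x != y
    return diffs, sites


def nucleotide_diversity(seqs) -> float:
    """pi: mean proportion of differing sites over all pairs of sequences."""
    props = []
    for a, b in combinations(seqs, 2):
        diffs, sites = pairwise_diff(a, b)
        if sites:
            props.append(diffs / sites)
    return sum(props) / len(props) if props else 0.0


def sliding_pi(seqs, window: int, step: int):
    """Return (window midpoint, pi) tuples along an alignment of equal-length rows."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("aligned sequences must have equal length")
    out = []
    for start in range(0, length - window + 1, step):
        chunk = [s[start:start + window].upper() for s in seqs]
        out.append((start + window // 2, nucleotide_diversity(chunk)))
    return out


# Toy alignment of three sequences (invented data).
toy_alignment = ["ACGTACGTAC", "ACGTACGTAC", "ACGAACGTAC"]
toy_windows = sliding_pi(toy_alignment, window=5, step=5)
```

Windows whose π exceeds a chosen cutoff (0.015 in the text above) would then be flagged as candidate variable loci.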

2.4. Phylogenetic Analysis

To study the phylogenetic position of U. rockii and U. henryi within the family Ranunculaceae, we used 79 single-copy genes shared by the cp genomes of 12 Ranunculaceae members, representing seven genera (Figure 6). For Bayesian inference (BI) and maximum parsimony (MP), the posterior probabilities and bootstrap values were very high for each lineage, with all values ≥98%. The maximum likelihood (ML), BI, and MP phylogenetic results all strongly supported that U. rockii is closely clustered with U. henryi within the genus Urophysa, with S. adoxoides as their closest relative with 100% bootstrap value (Figure 6), which is consistent with the results of previous molecular studies [50,51,52]. Furthermore, the species in each genus formed a single clade. The first clade is formed by species of the genera Urophysa, Semiaquilegia, and Trollius. The second clade is divided into two subclades: one includes the Ranunculus and Clematis species, and the other consists of just the Aconitum species. Additionally, the topologies obtained from the complete chloroplast genome sequences and from the CDS sequences are similar to that obtained from the single-copy genes (Figure S1), and all lineages possess high bootstrap values. These results suggest that there is no conflict among the entire genome data set, the CDS sequences, and the 79 shared single-copy genes of these cp genomes. Furthermore, these results are in accord with previous phylogenetic research [53]. Together, these phylogenetic analyses substantially increase our understanding of the evolutionary relationships among species in Ranunculaceae.

2.5. Positive Selection Analysis

Of 57 single-copy CDS genes initially considered for the positive selection analysis (Table S4), 47 were eventually selected (Table 4). No significant positive selection was detected for any gene (p-value > 0.05), but six genes that possess codon sites with high posterior probabilities were found in the Bayesian Empirical Bayes (BEB) test (atpA, rpl20, psaA, atpB, ndhI, and rbcL) (Figure 7, Figure S2 and Table 4). Previous studies suggested that codon sites with a high posterior probability should be regarded as positively selected sites [54], which means that these six genes may be under positive selection pressure [55]. Visualization in Jalview of the amino acid properties across each alignment column revealed that many amino acids vary between genera, such as the 88th amino acid of the rpl20 gene (G in U. rockii and U. henryi, R in other species) (Figure 7A) and other amino acids (marked with red blocks in Figure 7A). In the ndhI gene, two amino acids (the A at position 168 and the P at position 174) were specific to U. rockii and U. henryi, and three amino acids (the 9th, 148th, and 165th, marked with red blocks in Figure 7B) were possessed only by U. rockii, U. henryi, and S. adoxoides. The amino acid properties of the other four genes (atpA, atpB, rbcL, and psaA) are shown in Figure S2. Most amino acids may be under strong structural and functional constraints and are not free to change [55]. We detected six genes with codon sites of high posterior probability and many amino acid differences among species, which may play an important role in the evolution and environmental adaptation of Urophysa species. Populations of U. rockii and U. henryi are distributed only in karst regions of southern China, and karst environments are characterized by low soil water content, insufficient light, and poor nutrient availability, which might have exerted strong selective forces on plant evolution [56].
Notably, five of the abovementioned six genes are involved in photosynthesis (atpA, psaA, atpB, ndhI, and rbcL) (Table 3). The gene rpl20 is involved in translation, which is an important part of protein synthesis [57]. The genes atpA and atpB participate in ATP synthesis, which is the main source of energy for the functioning of living cells and all multicellular organisms [58]. Additionally, rbcL is the gene for the Rubisco large subunit protein, which is an important component of photosynthetic carbon fixation [59,60]. Previous research has revealed that positive selection of the rbcL gene in land plants may be a common phenomenon [61]. All these genes might play important roles when founder effects occur in populations; both changes in selection pressures and genetic drift result in the rapid shift of these genes to a new, coadapted combination. Therefore, the signs of positive selection in these genes give an indication of why U. rockii and U. henryi could adapt to the harsh karst environment (characterized by low soil water content, periodic water deficiency, and poor nutrient availability). Moreover, the results of the gene effectiveness test (rbcL and rpl20) (Figure S3) suggested that these genes can distinguish the species of Urophysa and its relatives and can be used for future phylogenetic analyses. The six genes will not only provide insights into the chloroplast genome evolution of Urophysa species but also offer valuable genetic markers for population phylogenomic studies of Urophysa and its close lineages.

3. Materials and Methods

3.1. Plant Materials and DNA Extraction

Fresh leaves of U. rockii, U. henryi, and S. adoxoides were collected from Jiangyou (Sichuan, China; coordinates: 31°59′ N, 104°51′ E), Yichang (Hubei, China; coordinates: 30°42′ N, 111°17′ E), and Nanchuan (Chongqing, China; coordinates: 30°04′ N, 90°33′ E), respectively. The fresh leaves from each site were immediately dried with silica gel for subsequent DNA extraction. The total genomic DNA was extracted from leaf tissues with a modified cetyltrimethylammonium bromide (CTAB) method [62].

3.2. Chloroplast Genome Sequencing and Assembling

All cp genomes were sequenced using an Illumina HiSeq 2500 platform by Biomarker Technologies, Inc. (Beijing, China). To eliminate interference from mitochondrial or nuclear DNA, all cp genome reads were extracted by mapping the raw reads to the reference cp genome of Trollius chinensis (KX752098) with the Burrows-Wheeler Aligner (BWA) [63]. High-quality reads were obtained using the CLC Genomics Workbench v7.5 (CLC Bio, Aarhus, Denmark) with the default parameters. A few gaps in the assembled cp genomes were filled by Sanger sequencing. The primers were designed using Lasergene 7.1 (DNASTAR, Madison, WI, USA). Primer synthesis and the sequencing of the polymerase chain reaction products were conducted by Sangon Biotech (Shanghai, China). The primers and amplifications are shown in Supplementary Table S5.

3.3. Genome Annotation and Analysis

The complete cp genomes were annotated using the online program DOGMA [64]. The annotation results were checked manually, and the codon positions were adjusted by comparison with homologous genes from previously published chloroplast genomes in the database using Geneious R11 (Biomatters, Ltd., Auckland, New Zealand). Furthermore, the OGDRAW1 program [65] was used to draw the circular plastid genome maps. GC content and codon usage were analyzed with the MEGA 6 software [66]. The complete cp genomes of U. rockii, U. henryi, and S. adoxoides are deposited in GenBank under the accession numbers MH006686, MH142266, and MH142265, respectively.

3.4. Repeat Sequence Characterization and SSRs

The Perl script MISA [67] was used to search for microsatellite loci (mono-, di-, tri-, tetra-, penta-, and hexa-nucleotide repeats) in the cp genomes. The minimum numbers of repeat units (thresholds) for the SSRs were 10, 5, 4, 3, 3, and 3 for mono-, di-, tri-, tetra-, penta-, and hexa-nucleotides, respectively. All the repeats were manually verified, and redundant results were removed. REPuter was employed to identify repeat sequences, including palindromic, forward, reverse, and complement repeats, within the cp genome [68]. The following conditions for repeat identification were used: (1) Hamming distance of 3; (2) 90% or greater sequence identity; (3) a minimum repeat size of 30 bp.
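The repeat-unit thresholds listed above can be illustrated with a short Python sketch that scans a sequence for perfect SSRs using regular expressions. This is a simplified stand-in for illustration only and is not the MISA script itself: the function name, the output format, the rule for skipping motifs that are repeats of a shorter unit, and the toy sequence are all assumptions.

```python
import re

# Minimum number of repeat units per motif length (mono- to hexa-nucleotide),
# following the thresholds stated in the text: 10, 5, 4, 3, 3, 3.
MIN_REPEATS = {1: 10, 2: 5, 3: 4, 4: 3, 5: 3, 6: 3}


def find_ssrs(seq: str, min_repeats=MIN_REPEATS):
    """Return sorted (1-based start, motif, repeat count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # One motif copy followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit ("AA", "ATAT"),
            # so a poly-A run is reported once as a mono-nucleotide SSR.
            if re.fullmatch(r"(.+)\1+", motif):
                continue
            hits.append((match.start() + 1, motif, len(match.group(0)) // unit_len))
    return sorted(hits)


# Toy sequence (invented) with one mono-, one di-, and one tri-nucleotide SSR.
toy_seq = "GGC" + "A" * 12 + "CGC" + "AT" * 6 + "GCC" + "AAG" * 4 + "TT"
toy_ssrs = find_ssrs(toy_seq)
```

Because the scan is leftmost-first, the reported motif of an SSR depends on where the match begins (for example "TA" rather than "AT" if the run is preceded by a T), so motifs may need to be normalized before they are counted by type.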

3.5. Phylogenetic Analysis

Phylogenetic analysis was conducted using the single-copy genes of the three taxa, together with nine species downloaded from NCBI GenBank (Tables S6 and S7). The sequences were aligned using MAFFT v5 [69] in Geneious R11 (Biomatters, Ltd.) with the default parameters and were manually adjusted in MEGA 6.0 [66]. Maximum parsimony (MP) analyses were conducted using PAUP [70]. All characters were equally weighted, gaps were treated as missing, and character states were treated as unordered. A heuristic search was performed with the MULPARS option, tree bisection-reconnection (TBR) branch swapping, and random stepwise addition with 1000 replications. The maximum likelihood (ML) analyses were performed using RAxML 8.0 [71]. For the ML analyses, the best-fit model, general time reversible (GTR) + G, was used with 1000 bootstrap replicates. Bayesian inference (BI) was performed with MrBayes v3.2 [72]. The Markov chain Monte Carlo (MCMC) analysis was run for 1 × 10^8 generations. The trees were sampled every 1000 generations, with the first 20% discarded as burn-in. The remaining trees were used to build a 50% majority-rule consensus tree. Stationarity was considered to be reached when the average standard deviation of split frequencies remained below 0.001. Additionally, to test the utility of different cp regions, phylogenetic analyses were also performed separately for the complete chloroplast genome sequences and for the CDS sequences.

3.6. Chloroplast Genome Nucleotide Diversity and Positive Selection Analysis

The cp genome sequences were aligned using MAFFT v5 [69] and adjusted manually. A sliding window analysis was then conducted for nucleotide diversity in the LSC, SSC, and IR regions of the cp genomes using DnaSP version 5.1 [73]. In addition, to identify genes under positive selection in U. rockii and U. henryi, which are endemic to a special karst environment, an optimized branch-site model [74] combined with the Bayesian Empirical Bayes (BEB) method [55] was used to compare them with their relatives. We first extracted all CDS sequences from U. rockii, U. henryi, S. adoxoides, and nine closely related species downloaded from GenBank (Table S6). The single-copy CDS sequences shared among these twelve species were obtained (see Table S4). Each single-copy CDS of these twelve species was aligned according to its amino acid sequence alignment generated by MUSCLE [75], and the number of gaps in the alignments was further checked. Then, the alignments of the corresponding DNA codon sequences were trimmed with TRIMAL [76], and the resulting alignments were used for the subsequent positive selection analysis. The optimized branch-site model in the CODEML program implemented in the PAML 4 package [77] was used to assess potential positive selection affecting individual codons along a specifically designated lineage, which was set as U. rockii and U. henryi. Selective pressure is measured by the ratio (ω) of the nonsynonymous substitution rate (dN) to the synonymous substitution rate (dS). A ratio ω > 1 indicates positive selection, ω = 1 implies neutral evolution, and ω < 1 suggests negative selection [78].
Log-likelihood values were calculated under an alternative branch-site model (Model = 2; NSsites = 2; Fix = 0), which allows ω to vary among codons along particular lineages, and under a neutral branch-site model (Model = 2; NSsites = 2; Fix = 1; Fix ω = 1), which constrains the codon sites to neutral evolution (ω = 1); the two models were compared with likelihood ratio tests (LRT). A right-tailed chi-square test with one degree of freedom was performed to calculate the p values from the difference in log-likelihood values between the alternative model and the neutral model. Then, the p values were adjusted for multiple testing [79]. A gene with an adjusted p value smaller than 0.05 and with positively selected sites was considered a positively selected gene (PSG). Moreover, to identify specific amino acid sites that are potentially under positive selection, a BEB method was implemented to calculate the posterior probabilities for site classes. Codon sites with a high posterior probability were regarded as positively selected sites [54]. Jalview [80] was used to view the amino acid sequences of positively selected genes. Finally, to test the effectiveness of the genes under positive selection, we randomly chose two genes to conduct phylogenetic analyses.
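The likelihood ratio test described above can be written out as a small Python sketch. The test statistic is twice the difference in log-likelihood between the alternative and neutral models, and for one degree of freedom the right-tail chi-square probability has the closed form erfc(sqrt(x / 2)). The function names and log-likelihood values below are hypothetical. The multiple-testing step uses the Benjamini-Hochberg procedure purely as an assumption for illustration, since the correction method is identified here only by its citation.

```python
import math


def lrt_pvalue(lnl_null: float, lnl_alt: float):
    """Return (statistic, p value) for a 1-df right-tailed chi-square LRT."""
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    # For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    return stat, math.erfc(math.sqrt(stat / 2.0))


def benjamini_hochberg(pvals):
    """Adjusted p values (assumed correction method; not confirmed by the text)."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical log-likelihoods for one gene: neutral vs alternative model.
toy_stat, toy_p = lrt_pvalue(lnl_null=-1000.0, lnl_alt=-998.0)
toy_adjusted = benjamini_hochberg([0.01, 0.04, 0.03])
```

With these made-up values the statistic is 4.0 and the unadjusted p value is just under 0.05; whether a gene remains significant after adjustment depends on how many genes are tested together.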

Supplementary Materials

Supplementary Materials are available online at https://0-www-mdpi-com.brum.beds.ac.uk/1422-0067/19/7/1847/s1.

Author Contributions

D.-F.X., Y.Y., S.-D.Z., and X.-J.H. conceived and designed the experiment; D.-F.X., J.L., and S.-D.Z. collected the materials; D.-F.X., Y.-Q.D., Y.Y., and H.-Y.L. participated in data analysis and manuscript drafting; D.-F.X., Y.-Q.D., X.-J.H., and S.-D.Z. revised the manuscript; all authors read and approved the final manuscript.

Funding

This work was supported by the National Natural Science Foundation of China (Grant Nos. 31470009, 31570198, 31500188), the Specimen Platform of China, Teaching Specimen’s sub-platform (Available website: http://mnh.scu.edu.cn/), and the Science and Technology Basic Work (Grant No. 2013FY112100).

Acknowledgments

We acknowledge Fang-Yu Jin, Hao Li, Fu-Min Xie, and Xin Yang for their help in materials collection.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Fu, D.Z.; Orbelia, R.R. Flora of China; Science Press: Beijing, China, 2001; Volume 6, pp. 277–278.
  2. Xie, D.F.; Li, M.J.; Tan, J.B.; Price, M.; Xiao, Q.Y.; Zhou, S.D.; He, X.J. Phylogeography and genetic effects of habitat fragmentation on endemic Urophysa (Ranunculaceae) in Yungui Plateau and adjacent regions. PLoS ONE 2017, 12, e0186378.
  3. Du, B.G.; Zhu, D.Y.; Yang, Y.J.; Shen, J.; Yang, F.L.; Su, Z.Y. Living situation and protection strategies of endangered Urophysa rockii. Jiangsu J. Agri. Sci. 2010, 1, 324–325.
  4. Wang, J.X.; He, X.J.; Xu, W.; Meng, W.K.; Su, Z.Y. Preliminary study on Urophysa rockii. II. Biological characteristics, ecological characteristics and community analysis. J. Sichuan For. Sci. Technol. 2011, 32, 28–39.
  5. Zhang, Y.X.; Hu, H.Y.; He, X.J. Genetic diversity of Urophysa rockii Ulbrich, an endangered and rare species, detected by ISSR. Acta Bot. Boreal.-Occident. Sin. 2013, 33, 1098–1105.
  6. Zhang, Y.X.; Hu, H.Y.; Yang, L.J.; Wang, C.B.; He, X.J. Seed dispersal and germination of an endangered and rare species Urophysa rockii (Ranunculaceae). Acta Bot. Boreal.-Occident. Sin. 2013, 35, 303–309.
  7. Park, M.; Park, H.; Lee, H.; Lee, B.H.; Lee, J. The complete plastome sequence of an antarctic bryophyte Sanionia uncinata (hedw.) loeske. Int. J. Mol. Sci. 2018, 19, 709.
  8. Dong, W.P.; Liu, H.; Xu, C.; Zuo, Y.J.; Chen, Z.J.; Zhou, S.L. A chloroplast genomic strategy for designing taxon specific DNA mini-barcodes: A case study on ginsengs. BMC Genet. 2014, 15, 138.
  9. Curci, P.L.; de Paola, D.; Danzi, D.; Vendramin, G.G.; Sonnante, G. Complete chloroplast genome of the multifunctional crop Globe artichoke and comparison with other Asteraceae. PLoS ONE 2015, 10, e0120589.
  10. Downie, S.R.; Jansen, R.K. A comparative analysis of whole plastid genomes from the Apiales: Expansion and contraction of the inverted repeat, mitochondrial to plastid transfer of DNA, and identification of highly divergent noncoding regions. Syst. Bot. 2015, 40, 336–351.
  11. Nadachowska-Brzyska, K.; Li, C.; Smeds, L.; Zhang, G.J.; Ellegren, H. Temporal dynamics of avian populations during pleistocene revealed by whole-genome sequences. Curr. Biol. 2015, 25, 1375–1380.
  12. Suo, Z.L.; Li, W.Y.; Jin, X.B.; Zhang, H.J. A new nuclear DNA marker revealing both microsatellite variations and single nucleotide polymorphic loci: A case study on classification of cultivars in Lagerstroemia indica L. J. Microb. Biochem. Technol. 2016, 8, 266–271.
  13. Saina, J.K.; Li, Z.Z.; Gichira, A.W.; Liao, Y.Y. The complete chloroplast genome sequence of tree of heaven (Ailanthus altissima (mill.) (Sapindales: Simaroubaceae), an important pantropical tree. Int. J. Mol. Sci. 2018, 19, 929.
  14. Yurina, N.P.; Odintsova, M.S. Comparative structural organization of plant chloroplast and mitochondrial genomes. Genetika 1998, 34, 5–22.
  15. Jansen, R.K.; Raubeson, L.A.; Boore, J.L.; DePamphilis, C.W.; Chumley, T.W.; Haberle, R.C.; Wyman, S.K.; Alverson, A.; Peery, R.; Herman, S.J.; et al. Methods for obtaining and analyzing whole chloroplast genome sequences. Method Enzymol. 2005, 395, 348–384.
  16. Jansen, R.K.; Ruhlman, T.A. Plastid Genomes of Seed Plants. In Genomics of Chloroplasts and Mitochondria; Bock, R., Knoop, V., Eds.; Springer: Dordrecht, The Netherlands, 2012; pp. 103–126.
  17. Choi, K.S.; Chung, M.G.; Park, S. The complete chloroplast genome sequences of three Veroniceae species (Plantaginaceae): Comparative analysis and highly divergent regions. Front. Plant Sci. 2016, 7, 355.
  18. Dong, W.L.; Wang, R.N.; Zhang, N.Y.; Fan, W.B.; Fang, M.F.; Li, Z.H. Molecular evolution of chloroplast genomes of orchid species: Insights into phylogenetic relationship and adaptive evolution. Int. J. Mol. Sci. 2018, 19, 716.
  19. Dong, W.; Liu, J.; Yu, J.; Wang, L.; Zhou, S. Highly variable chloroplast markers for evaluating plant phylogeny at low taxonomic levels and for DNA barcoding. PLoS ONE 2012, 7, e35071.
  20. Krak, K.; Vít, P.; Belyayev, A.; Douda, J.; Hreusová, L.; Mandák, B. Allopolyploid origin of Chenopodium album s. str. (Chenopodiaceae): A molecular and cytogenetic insight. PLoS ONE 2016, 11, e0161063.
  21. Smith, D.R. Mutation rates in plastid genomes: They are lower than you might think. Genome Biol. Evol. 2015, 7, 1227–1234.
  22. Jansen, R.K.; Cai, Z.; Raubeson, L.A.; Daniell, H.; Depamphilis, C.W.; Leebensmack, J.; Müller, K.F.; Guisinger-Bellian, M.; Haberle, R.C.; Chumley, T.W.; et al. Analysis of 81 genes from 64 plastid genomes resolves relationships in angiosperms and identifies genome-scale evolutionary patterns. Proc. Natl. Acad. Sci. USA 2007, 104, 19369–19374.
  23. Qian, J.; Song, J.; Gao, H.; Zhu, Y.; Xu, J.; Pang, X. The complete chloroplast genome sequence of the medicinal plant Salvia miltiorrhiza. PLoS ONE 2013, 8, e57607.
  24. Asaf, S.; Waqas, M.; Khan, A.L.; Khan, M.A.; Kang, S.M.; Imran, Q.M.; Shahzad, R.; Bilal, S.; Yun, B.W.; Lee, I.J.; et al. The complete chloroplast genome of wild rice (Oryza minuta) and its comparison to related species. Front. Plant Sci. 2017, 8, 304.
  25. Gu, C.; Tembrock, L.R.; Zheng, S.; Wu, Z. The complete chloroplast genome of Catha edulis: A comparative analysis of genome features with related species. Int. J. Mol. Sci. 2018, 19, 525.
  26. Huang, J.; Chen, R.; Li, X. Comparative analysis of the complete chloroplast genome of four known Ziziphus species. Genes 2017, 8, 340.
  27. Yi, X.; Gao, L.; Wang, B.; Su, Y.J.; Wang, T. The complete chloroplast genome sequence of Cephalotaxus oliveri (Cephalotaxaceae): Evolutionary comparison of Cephalotaxus chloroplast DNAs and insights into the loss of inverted repeat copies in gymnosperms. Genome Biol. Evol. 2013, 5, 688–698.
  28. Cavalier-Smith, T. Chloroplast evolution: Secondary symbiogenesis and multiple losses. Curr. Biol. 2002, 12, 62–64.
  29. Asano, T.; Tsudzuki, T.; Takahashi, S.; Shimada, H.; Kadowaki, K. Complete nucleotide sequence of the sugarcane (Saccharum officinarum) chloroplast genome: A comparative analysis of four monocot chloroplast genomes. DNA Res. 2004, 11, 93–99.
  30. Timme, R.E.; Kuehl, J.V.; Boore, J.L.; Jansen, R.K. A comparative analysis of the Lactuca and Helianthus (Asteraceae) plastid genomes: Identification of divergent regions and categorization of shared repeats. Am. J. Bot. 2007, 94, 302–312.
  31. Nie, X.J.; Lv, S.Z.; Zhang, Y.X.; Du, X.H.; Wang, L.; Biradar, S.S.; Tan, X.F.; Wan, F.H.; Weining, S. Complete chloroplast genome sequence of a major invasive species, crofton weed (Ageratina adenophora). PLoS ONE 2012, 7, e36869.
  32. Dong, W.P.; Xu, C.; Li, D.L.; Jin, X.B.; Lu, Q.; Suo, Z.L. Comparative analysis of the complete chloroplast genome sequences in psammophytic Haloxylon species (Amaranthaceae). PeerJ 2016, 4, e2699.
  33. Kaur, S.; Panesar, P.S.; Bera, M.B.; Kaur, V. Simple sequence repeat markers in genetic divergence and marker-assisted selection of rice cultivars: A review. Crit. Rev. Food Sci. Nutr. 2015, 55, 41–49.
  34. Yang, Y.; Zhou, T.; Duan, D.; Yang, J.; Feng, L.; Zhao, G. Comparative analysis of the complete chloroplast genomes of five Quercus species. Front. Plant Sci. 2016, 7, 959.
  35. Powell, W.; Morgante, M.; McDevitt, R.; Vendramin, G.G.; Rafalski, J.A. Polymorphic simple sequence repeat regions in chloroplast genomes-applications to the population genetics of pines. Proc. Natl. Acad. Sci. USA 1995, 92, 7759–7763.
  36. Provan, J.; Corbett, G.; McNicol, J.W.; Powell, W. Chloroplast DNA variability in wild and cultivated rice (Oryza spp.) revealed by polymorphic chloroplast simple sequence repeats. Genome 1997, 40, 104–110.
  37. Pauwels, M.; Vekemans, X.; Gode, C.; Frerot, H.; Castric, V.; Saumitou-Laprade, P. Nuclear and chloroplast DNA phylogeography reveals vicariance among European populations of the model species for the study of metal tolerance, Arabidopsis halleri (Brassicaceae). New Phytol. 2012, 193, 916–928.
  38. Rose, O.; Falush, D. A threshold size for microsatellite expansion. Mol. Biol. Evol. 1998, 15, 613–615.
  39. Raubeson, L.A.; Peery, R.; Chumley, T.W.; Dziubek, C.; Fourcade, H.M.; Boore, J.L.; Jansen, R.K. Comparative chloroplast genomics: Analyses including new sequences from the angiosperms Nuphar advena and Ranunculus macranthus. BMC Genom. 2007, 8, 174.
  40. Huotari, T.; Korpelainen, H. Complete chloroplast genome sequence of Elodea canadensis and comparative analyses with other monocot plastid genomes. Gene 2012, 508, 96–105.
  41. Wang, W.B.; Yu, H.; Wang, J.H.; Lei, W.J.; Gao, J.H.; Qiu, X.P.; Wang, J.S. The complete chloroplast genome sequences of the medicinal plant Forsythia suspensa (Oleaceae). Int. J. Mol. Sci. 2017, 18, 2288.
  42. Asaf, S.; Khan, A.L.; Khan, A.R.; Waqas, M.; Kang, S.M.; Khan, M.A.; Lee, S.M.; Lee, I.J. Complete chloroplast genome of Nicotiana otophora and its comparison with related species. Front. Plant Sci. 2016, 7, 447.
  43. Kuang, D.Y.; Wu, H.; Wang, Y.L.; Gao, L.M.; Zhang, S.Z.; Lu, L. Complete chloroplast genome sequence of Magnolia kwangsiensis (Magnoliaceae): Implication for DNA barcoding and population genetics. Genome 2011, 54, 663–673.
  44. Chen, J.; Hao, Z.; Xu, H.; Yang, L.; Liu, G.; Sheng, Y. The complete chloroplast genome sequence of the relict woody plant Metasequoia glyptostroboides Hu et Cheng. Front. Plant Sci. 2015, 6, 447.
  45. Kim, K.J.; Lee, H.L. Complete chloroplast genome sequences from Korean ginseng (Panax schinseng Nees) and comparative analysis of sequence evolution among 17 vascular plants. DNA Res. 2004, 11, 247–261.
  46. Hu, Y.; Woeste, K.E.; Zhao, P. Completion of the chloroplast genomes of five Chinese Juglans and their contribution to chloroplast phylogeny. Front. Plant Sci. 2017, 7, 1955.
  47. Wang, R.J.; Cheng, C.L.; Chang, C.C.; Wu, C.L.; Su, T.M.; Chaw, S.M. Dynamics and evolution of the inverted repeat-large single copy junctions in the chloroplast genomes of monocots. BMC Evol. Biol. 2008, 8, 36.
  48. Yang, M.; Zhang, X.; Liu, G.; Yin, Y.; Chen, K.; Yun, Q. The complete chloroplast genome sequence of date palm (Phoenix dactylifera L.). PLoS ONE 2010, 5, e12762.
  49. Li, Z.Z.; Saina, J.K.; Gichira, A.W.; Kyalo, C.M.; Wang, Q.F.; Chen, J.M. Comparative genomics of the balsaminaceae sister genera Hydrocera triflora and Impatiens pinfanensis. Int. J. Mol. Sci. 2018, 19, 319.
  50. Li, C.Y. Classification and Systematics of the Aquilegiinae Tamura; The Chinese Academy of Science: Beijing, China, 2006.
  51. Bastida, J.M.; Alcántara, J.M.; Rey, P.J.; Vargas, P.; Herrera, C.M. Extended phylogeny of Aquilegia: The biogeographical and ecological patterns of two simultaneous but contrasting radiations. Plant Syst. Evol. 2010, 284, 171–185.
  52. Fior, S.; Li, M.; Oxelman, B.; Viola, R.; Hodges, S.A.; Ometto, L.; Varotto, C. Spatiotemporal reconstruction of the Aquilegia rapid radiation through next-generation sequencing of rapidly evolving cpDNA regions. New Phytol. 2013, 198, 579–592.
  53. Wei, W.; Lu, A.M.; Yi, R.; Endress, M.E.; Chen, Z.D. Phylogeny and classification of Ranunculales: Evidence from four molecular loci and morphological data. Perspect. Plant Ecol. Evol. Syst. 2009, 11, 81–110.
  54. Lan, Y.; Sun, J.; Tian, R.M.; Bartlett, D.H.; Li, R.S.; Wong, Y.H.; Zhang, W.P.; Qiu, J.W.; Xu, T.; He, L.S.; et al. Molecular adaptation in the world’s deepest-living animal: Insights from transcriptome sequencing of the hadal amphipod Hirondellea gigas. Mol. Ecol. 2017, 26, 3732–3743.
  55. Yang, Z.; Wong, W.S.; Nielsen, R. Bayes empirical Bayes inference of amino acid sites under positive selection. Mol. Biol. Evol. 2005, 22, 1107–1118.
  56. Ai, B.; Gao, Y.; Zhang, X.; Tao, J.; Kang, M.; Huang, H. Comparative transcriptome resources of eleven Primulina species, a group of ‘stone plants’ from a biodiversity hot spot. Mol. Ecol. Resour. 2015, 15, 619–632.
  57. Muto, A.; Ushida, C. Transcription and translation. Methods Cell Biol. 1995, 48, 483.
  58. Romanovsky, Y.M.; Tikhonov, A.N. Molecular energy transducers of the living cell. Proton ATP synthase: A rotating molecular motor. Physics-Uspekhi 2010, 53, 931–956.
  59. Allahverdiyeva, Y.; Mamedov, F.; Mäenpää, P.; Vass, I.; Aro, E.M. Modulation of photosynthetic electron transport in the absence of terminal electron acceptors: Characterization of the rbcL deletion mutant of tobacco. Biochim. Biophys. Acta Bioenerg. 2005, 1709, 69–83.
  60. Piot, A.; Hackel, J.; Christin, P.A.; Besnard, G. One-third of the plastid genes evolved under positive selection in PACMAD grasses. Planta 2018, 247, 255–266.
  61. Kapralov, M.V.; Filatov, D.A. Widespread positive selection in the photosynthetic Rubisco enzyme. BMC Evol. Biol. 2007, 7, 73–82.
  62. Doyle, J.J.; Doyle, J.L. A rapid DNA isolation procedure for small quantities of fresh leaf tissue. Phytochem Bull. 1987, 19, 11–15.
  63. Li, H.; Durbin, R. Fast and accurate short read alignment with Burrows-Wheeler Transform. Bioinformatics 2009, 25, 1754–1760.
  64. Wyman, S.K.; Jansen, R.K.; Boore, J.L. Automatic annotation of organellar genomes with DOGMA. Bioinformatics 2004, 20, 3252–3255.
  65. Lohse, M.; Drechsel, O.; Kahlau, S.; Bock, R. Organellar genome draw: A suite of tools for generating physical maps of plastid and mitochondrial genomes and visualizing expression data sets. Nucleic Acids Res. 2013, 41, 575.
  66. Kumar, S.; Nei, M.; Dudley, J.; Tamura, K. MEGA: A biologist centric software for evolutionary analysis of DNA and protein sequences. Brief. Bioinform. 2008, 9, 299–306.
  67. Thiel, T.; Michalek, W.; Varshney, R.; Graner, A. Exploiting EST databases for the development and characterization of gene derived SSR-markers in barley (Hordeum vulgare L.). Theor. Appl. Genet. 2003, 106, 411–422.
  68. Kurtz, S.; Choudhuri, J.V.; Ohlebusch, E.; Schleiermacher, C.; Stoye, J.; Giegerich, R. REPuter: The manifold applications of repeat analysis on a genomic scale. Nucleic Acids Res. 2001, 29, 4633–4642.
  69. Katoh, K.; Standley, D.M. MAFFT multiple sequence alignment software version 7: Improvements in performance and usability. Mol. Biol. Evol. 2013, 30, 772–780.
  70. Swofford, D.L. PAUP*. Phylogenetic Analysis Using Parsimony (*and Other Methods); Version 4b10; Sinauer: Sunderland, MA, USA, 2003.
  71. Stamatakis, A. RAxML-VI-HPC: Maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics 2006, 22, 2688–2690.
  72. Ronquist, F.; Teslenko, M.; van der Mark, P.; Ayres, D.L.; Darling, A.; Hohna, S.; Larget, B.; Liu, L.; Suchard, M.A.; Huelsenbeck, J. MrBayes 3.2: Efficient Bayesian phylogenetic inference and model choice across a large model space. Syst. Biol. 2012, 61, 539–542.
  73. Librado, P.; Rozas, J. DnaSP v5: A software for comprehensive analysis of DNA polymorphism data. Bioinformatics 2009, 25, 1451–1452.
  74. Yang, Z.; dos Reis, M. Statistical properties of the branch-site test of positive selection. Mol. Biol. Evol. 2011, 28, 1217–1228.
  75. Edgar, R.C. MUSCLE: Multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 2004, 32, 1792–1797.
  76. Capella-Gutierrez, S.; Silla-Martínez, J.M.; Gabaldon, T. TrimAl: A tool for automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics 2009, 25, 1972–1973.
  77. Yang, Z. PAML 4: Phylogenetic analysis by maximum likelihood. Mol. Biol. Evol. 2007, 24, 1586–1591.
  78. Yang, Z.; Nielsen, R. Codon-substitution models for detecting molecular adaptation at individual sites along specific lineages. Mol. Biol. Evol. 2002, 19, 908–917.
  79. Benjamini, Y.; Hochberg, Y. Controlling the false discovery rate: A practical and powerful approach to multiple testing. J. R. Stat. Soc. B 1995, 57, 289–300.
  80. Clamp, M.; Cuff, J.; Searle, S.M.; Barton, G.J. The Jalview java alignment editor. Bioinformatics 2004, 20, 426–427.
Figure 1. Gene maps of the Urophysa rockii, Urophysa henryi and Semiquilegia adoxoides chloroplast (cp) genomes. Genes shown inside the circle are transcribed clockwise, and those outside are transcribed counterclockwise. Genes belonging to different functional groups are color-coded. The darker gray color in the inner circle corresponds to the GC content, and the lighter gray color corresponds to the AT content. SSU: small subunit; LSU: large subunit; ORF: open reading frame.
Figure 2. Analysis of repeated sequences in U. rockii, U. henryi, and S. adoxoides chloroplast genomes. (A) Total of four repeat types; (B) Frequency of the palindromic repeat by length; (C) Frequency of the forward repeat by length; (D) Frequency of the reverse repeat by length.
Figure 3. Analysis of simple sequence repeats (SSRs) in chloroplast genomes of the three species. (A) Number of different SSR types detected in each species; (B) type and frequency of each identified SSR.
Figure 4. The nucleotide diversity of the whole chloroplast genomes of the three species. LSC: large single-copy region; IRs: inverted repeats region; SSC: small single-copy region.
Figure 5. Comparison of the borders of the LSC, SSC, and IR regions of the chloroplast genomes of the three species. LR: junction line between LSC and IRb; RS: junction line between IRb and SSC; SR: junction line between SSC and IRa; RL: junction line between IRa and LSC.
Figure 6. Phylogenetic relationship of Urophysa with related species based on 79 single-copy genes shared by all cp genomes. Tree constructed by (A) maximum likelihood (ML) with the bootstrap values of ML above the branches; (B) maximum parsimony (MP) and Bayesian inference (BI) with bootstrap values of MP and posterior probabilities of BI above the branches, respectively.
Figure 7. Amino acid sequences of two genes that showed positive selection in the branch-site model test. (A) Amino acid sequences of the rpl20 gene; (B) amino acid sequences of the ndhI gene. The red blocks represent the different amino acids.
Table 1. Summary of complete chloroplast genomes. LSC, large single-copy; SSC, small single-copy; IR, inverted repeat.
Species | LSC length (bp) | LSC GC% | LSC length (%) | SSC length (bp) | SSC GC% | SSC length (%) | IR length (bp) | IR GC% | IR length (%) | Total length (bp) | Total GC%
U. rockii | 87,128 | 37.2 | 55.0 | 18,216 | 32.5 | 11.5 | 26,584 | 43.7 | 16.8 | 158,512 | 38.8
U. henryi | 87,031 | 37.2 | 55.0 | 18,260 | 32.6 | 11.5 | 26,506 | 43.6 | 16.7 | 158,303 | 38.8
S. adoxoides | 87,202 | 37.2 | 55.1 | 18,192 | 32.5 | 11.5 | 26,473 | 43.7 | 16.7 | 158,340 | 38.9
Tsuga chinensis | 88,522 | 36.3 | 55.3 | 18,405 | 32.0 | 11.5 | 26,632 | 43.1 | 16.6 | 160,191 | 38.1
Aconitum austrokoreense | 86,362 | 36.2 | 55.4 | 16,948 | 32.7 | 10.9 | 26,291 | 43.0 | 16.9 | 155,892 | 38.1
A. kusnezoffii | 86,335 | 36.2 | 55.4 | 16,945 | 32.7 | 10.9 | 26,291 | 43.0 | 16.9 | 155,862 | 38.1
A. volubile | 86,348 | 36.2 | 55.4 | 16,944 | 32.6 | 10.9 | 26,290 | 43.0 | 16.9 | 155,872 | 38.1
Ranunculus macranthus | 84,637 | 36.0 | 54.6 | 18,909 | 31.0 | 12.2 | 25,791 | 43.5 | 16.6 | 155,129 | 37.9
R. occidentalis | 83,532 | 35.9 | 54.1 | 21,269 | 31.6 | 13.8 | 24,831 | 43.6 | 16.1 | 154,474 | 37.8
R. austro-oreganus | 83,582 | 35.9 | 54.1 | 21,249 | 31.6 | 13.8 | 24,831 | 43.6 | 16.1 | 154,493 | 37.8
Clematis terniflora | 79,328 | 36.3 | 49.7 | 18,110 | 31.4 | 11.4 | 31,045 | 42.0 | 19.5 | 159,528 | 38.0
Coptis chinensis | 84,567 | 36.4 | 54.4 | 17,376 | 32.1 | 11.2 | 26,762 | 43.0 | 17.2 | 155,484 | 38.2
Table 2. Comparison of the sizes of coding and non-coding regions among species.
Species | Protein-coding length (bp) | Protein-coding GC% | Protein-coding length (%) | tRNA length (bp) | tRNA GC% | tRNA length (%) | rRNA length (bp) | rRNA GC% | rRNA length (%)
U. rockii | 78,867 | 39.2 | 49.8 | 2687 | 53.2 | 1.7 | 8602 | 55.8 | 5.4
U. henryi | 78,769 | 39.2 | 49.8 | 2695 | 53.3 | 1.7 | 8602 | 55.8 | 5.4
S. adoxoides | 78,498 | 39.3 | 49.6 | 2706 | 53.6 | 1.7 | 8602 | 55.8 | 5.4
T. chinensis | 78,903 | 38.4 | 49.3 | 2716 | 53.1 | 1.7 | 9050 | 55.4 | 5.6
A. austrokoreense | 79,575 | 38.3 | 51.0 | 2810 | 53.0 | 1.8 | 9050 | 55.4 | 5.8
A. kusnezoffii | 78,294 | 38.4 | 50.2 | 2813 | 52.9 | 1.8 | 9046 | 55.3 | 5.8
A. volubile | 79,560 | 38.3 | 51.0 | 2810 | 53.0 | 1.8 | 9050 | 55.5 | 5.8
R. macranthus | 78,615 | 38.2 | 50.7 | 2738 | 53.1 | 1.8 | 7559 | 55.2 | 4.9
R. occidentalis | 69,294 | 38.6 | 44.9 | 2717 | 53.1 | 1.8 | 9050 | 55.4 | 5.9
R. austro-oreganus | 74,355 | 38.1 | 48.1 | 2796 | 52.9 | 1.8 | 9050 | 55.4 | 5.9
C. terniflora | 81,819 | 38.3 | 51.3 | 2718 | 53.4 | 1.7 | 9050 | 55.4 | 5.7
C. chinensis | 71,637 | 39.0 | 46.1 | 2716 | 53.2 | 1.7 | 9050 | 55.5 | 5.8
Table 3. List of genes encoded in two Urophysa species and S. adoxoides.
Category for Genes | Group of Genes | Name of Genes
Self-replication | Transfer RNAs | trnA-UGC *, trnC-GCA, trnD-GUC, trnE-UUC, trnF-GAA, trnfM-CAU, trnG-GCC, trnG-UCC, trnI-CAU *, trnI-GAU *, trnK-UUU, trnL-CAA *, trnL-UAA, trnL-UAG, trnM-CAU, trnN-GUU *, trnP-UGG, trnQ-UUG, trnR-ACG *, trnR-UCU, trnS-GCU, trnS-GGA, trnS-UGA, trnT-GGU, trnT-UGU, trnV-GAC *, trnV-UAC, trnW-CCA, trnY-GUA
Self-replication | Ribosomal RNAs | rrn4.5 *, rrn5 *, rrn16 *, rrn23 *
Self-replication | RNA polymerase | rpoA, rpoB, rpoC1, rpoC2
Self-replication | Small subunit of ribosomal proteins (SSU) | rps2, rps3, rps4, rps7 *, rps8, rps11, rps12 *, rps14, rps15, rps16, rps18, rps19 *
Self-replication | Large subunit of ribosomal proteins (LSU) | rpl2 *, rpl14, rpl16, rpl20, rpl22, rpl23 *, rpl32, rpl33, rpl36
Genes for photosynthesis | Subunits of NADH-dehydrogenase | ndhA, ndhB *, ndhC, ndhD, ndhE, ndhF, ndhG, ndhH, ndhI, ndhJ, ndhK
Genes for photosynthesis | Subunits of photosystem I | psaA, psaB, psaC, psaI, psaJ
Genes for photosynthesis | Subunits of photosystem II | psbA, psbB, psbC, psbD, psbE, psbF, psbH, psbI, psbJ, psbK, psbL, psbM, psbN, psbT, psbZ
Genes for photosynthesis | Subunits of cytochrome b/f complex | petA, petB, petD, petG, petL, petN
Genes for photosynthesis | Subunits of ATP synthase | atpA, atpB, atpE, atpF, atpH, atpI
Genes for photosynthesis | Large subunit of rubisco | rbcL
Other genes | Translational initiation factor | infA
Other genes | Protease | clpP
Other genes | Maturase | matK
Other genes | Subunit of Acetyl-CoA-carboxylase | accD
Other genes | Envelope membrane protein | cemA
Other genes | C-type cytochrome synthesis gene | ccsA
Genes of unknown function | Hypothetical chloroplast reading frames (ycf) | ycf1 *, ycf2 *, ycf3, ycf4
* Gene with two copies.
Table 4. The potential positive selection test based on the branch-site model.
Gene Name | Null lnL | Null df | Null Omega (ω = 1) | Alternative lnL | Alternative df | Alternative Omega (ω > 1) | BEB | NEB | p-Value
psbI | −188.6475 | 26 | 1 | −188.6475 | 27 | 3.40383 | NA | NA | 1
psbL | −164.11693 | 26 | 1 | −164.1169 | 27 | 3.40719 | NA | NA | 1
rps14 | −621.64162 | 26 | 1 | −621.6416 | 27 | 3.40833 | NA | NA | 1
psaI | −214.67663 | 26 | 1 | −214.6766 | 27 | 3.38764 | NA | NA | 1
atpH | −434.45059 | 26 | 1 | −434.4506 | 27 | 3.35869 | NA | NA | 1
psaJ | −318.52192 | 26 | 1 | −318.5219 | 27 | 3.4089 | NA | NA | 1
atpE | −868.20243 | 26 | 1 | −868.2024 | 27 | 3.40891 | NA | NA | 1
atpA | −3297.629 | 26 | 1 | −3297.41 | 27 | 69.43581 | 220, E, 0.794 | NA | 5.04 × 10⁻¹
petN | −126.25816 | 26 | 1 | −126.2582 | 27 | 3.40693 | NA | NA | 1
rps11 | −920.92455 | 26 | 1 | −920.9246 | 27 | 1 | NA | NA | 1
psbT | −216.52331 | 26 | 1 | −216.5233 | 27 | 1 | NA | NA | 1
ndhG | −1238.1161 | 26 | 1 | −1238.116 | 27 | 3.33667 | NA | NA | 9.99 × 10⁻¹
ycf4 | −1275.4093 | 26 | 1 | −1275.409 | 27 | 3.40886 | NA | NA | 1
rps18 | −567.98294 | 26 | 1 | −567.9829 | 27 | 3.39414 | NA | NA | 1
petB | −1274.0507 | 26 | 1 | −1274.051 | 27 | 3.403 | NA | NA | 1
rpl20 | −1000.285 | 26 | 1 | −999.941 | 27 | 112.30316 | 88, R, 0.683 | NA | 4.07 × 10⁻¹
psbN | −223.7602 | 26 | 1 | −223.7602 | 27 | 3.40292 | NA | NA | 1
psbF | −198.46733 | 26 | 1 | −198.4673 | 27 | 3.38407 | NA | NA | 1
petG | −206.74878 | 26 | 1 | −206.7488 | 27 | 3.42095 | NA | NA | 1
psbK | −375.13705 | 26 | 1 | −375.1371 | 27 | 3.4063 | NA | NA | 1
rpl36 | −267.8099 | 26 | 1 | −267.8099 | 27 | 1 | NA | NA | 1
rps2 | −1620.734 | 26 | 1 | −1620.734 | 27 | 3.40891 | NA | NA | 1
psbM | −179.71897 | 26 | 1 | −179.719 | 27 | 3.4064 | NA | NA | 1
rpoB | −6830.0894 | 26 | 1 | −6830.089 | 27 | 3.40847 | NA | NA | 9.99 × 10⁻¹
psaA | −4245.754 | 26 | 1 | −4245.49 | 27 | 63.47379 | 28, R, 0.778 | NA | 4.66 × 10⁻¹
psbH | −540.92362 | 26 | 1 | −540.9236 | 27 | 3.40123 | NA | NA | 1
ndhE | −616.75534 | 26 | 1 | −616.7553 | 27 | 3.40218 | NA | NA | 1
atpB | −3133.747 | 26 | 1 | −3133.75 | 27 | 1 | 115, N, 0.828 | NA | 1
ndhI | −1307.986 | 26 | 1 | −1307.68 | 27 | 575.22179 | 174, S, 0.696 | NA | 4.35 × 10⁻¹
cemA | −1787.561 | 26 | 1 | −1787.561 | 27 | 3.40891 | NA | NA | 1
ndhJ | −1001.4075 | 26 | 1 | −1001.407 | 27 | 1 | NA | NA | 1
psbJ | −209.10513 | 26 | 1 | −209.1051 | 27 | 3.38566 | NA | NA | 1
petA | −1331.3789 | 26 | 1 | −1331.379 | 27 | 3.4089 | NA | NA | 1
psbC | −2760.6743 | 26 | 1 | −2760.674 | 27 | 1 | NA | NA | 1
ndhH | −2643.2896 | 26 | 1 | −2643.29 | 27 | 1 | NA | NA | 9.98 × 10⁻¹
rbcL | −2937.477 | 26 | 1 | −2937.41 | 27 | 5.22178 | 440, E, 0.736 | NA | 7.20 × 10⁻¹
clpP | −1301.1173 | 26 | 1 | −1301.117 | 27 | 3.40876 | NA | NA | 1
ndhC | −731.03212 | 26 | 1 | −731.0321 | 27 | 3.33544 | NA | NA | 1
ycf3 | −935.76375 | 26 | 1 | −935.7638 | 27 | 3.40891 | NA | NA | 1
psbD | −1922.7755 | 26 | 1 | −1922.775 | 27 | 3.38592 | NA | NA | 1
psbA | −1960.3785 | 26 | 1 | −1960.379 | 27 | 3.39639 | NA | NA | 1
petL | −172.24809 | 26 | 1 | −172.2481 | 27 | 3.40087 | NA | NA | 1
rpl33 | −413.59385 | 26 | 1 | −413.5939 | 27 | 3.4089 | NA | NA | 1
psbE | −435.90511 | 26 | 1 | −435.9051 | 27 | 3.40785 | NA | NA | 1
psaC | −498.98549 | 26 | 1 | −498.9855 | 27 | 3.408 | NA | NA | 1
atpI | −1445.5558 | 26 | 1 | −1445.556 | 27 | 3.39588 | NA | NA | 1
psaB | −4069.2947 | 26 | 1 | −4069.295 | 27 | 3.41513 | NA | NA | 1
Bold types are positively selected sites. BEB: Bayesian Empirical Bayes; NEB: Naïve Empirical Bayes; Amino acid: (E: Glu; R: Arg; N: Asn; S: Ser).

Share and Cite

MDPI and ACS Style

Xie, D.-F.; Yu, Y.; Deng, Y.-Q.; Li, J.; Liu, H.-Y.; Zhou, S.-D.; He, X.-J. Comparative Analysis of the Chloroplast Genomes of the Chinese Endemic Genus Urophysa and Their Contribution to Chloroplast Phylogeny and Adaptive Evolution. Int. J. Mol. Sci. 2018, 19, 1847. https://0-doi-org.brum.beds.ac.uk/10.3390/ijms19071847
