Genome-wide association study for grain yield and related traits in elite wheat varieties and advanced lines using SNP markers

  • Sheng-Xing Wang,

    Contributed equally to this work with: Sheng-Xing Wang, Yu-Lei Zhu

    Roles Data curation, Writing – original draft

    Affiliation College of Agronomy, Anhui Agricultural University; Key Laboratory of Wheat Biology and Genetic Improvement in the Southern Yellow & Huai River Valley, Ministry of Agriculture, Hefei, Anhui, China

  • Yu-Lei Zhu,

    Contributed equally to this work with: Sheng-Xing Wang, Yu-Lei Zhu

    Roles Data curation, Formal analysis, Validation, Writing – review & editing

    Affiliation College of Agronomy, Anhui Agricultural University; Key Laboratory of Wheat Biology and Genetic Improvement in the Southern Yellow & Huai River Valley, Ministry of Agriculture, Hefei, Anhui, China

  • De-Xin Zhang,

    Roles Data curation

    Affiliation College of Agronomy, Anhui Agricultural University; Key Laboratory of Wheat Biology and Genetic Improvement in the Southern Yellow & Huai River Valley, Ministry of Agriculture, Hefei, Anhui, China

  • Hui Shao,

    Roles Data curation, Formal analysis

    Affiliation College of Agronomy, Anhui Agricultural University; Key Laboratory of Wheat Biology and Genetic Improvement in the Southern Yellow & Huai River Valley, Ministry of Agriculture, Hefei, Anhui, China

  • Peng Liu,

    Roles Data curation, Investigation

    Affiliation College of Agronomy, Anhui Agricultural University; Key Laboratory of Wheat Biology and Genetic Improvement in the Southern Yellow & Huai River Valley, Ministry of Agriculture, Hefei, Anhui, China

  • Jian-Bang Hu,

    Roles Data curation, Investigation, Methodology

    Affiliation College of Agronomy, Anhui Agricultural University; Key Laboratory of Wheat Biology and Genetic Improvement in the Southern Yellow & Huai River Valley, Ministry of Agriculture, Hefei, Anhui, China

  • Heng Zhang,

    Roles Data curation, Methodology

    Affiliation College of Agronomy, Anhui Agricultural University; Key Laboratory of Wheat Biology and Genetic Improvement in the Southern Yellow & Huai River Valley, Ministry of Agriculture, Hefei, Anhui, China

  • Hai-Ping Zhang,

    Roles Funding acquisition, Supervision, Writing – review & editing

    zhhp20@163.com (HPZ); changtgw@126.com (CC)

    Affiliation College of Agronomy, Anhui Agricultural University; Key Laboratory of Wheat Biology and Genetic Improvement in the Southern Yellow & Huai River Valley, Ministry of Agriculture, Hefei, Anhui, China

  • Cheng Chang,

    Roles Funding acquisition, Resources, Supervision, Writing – review & editing

    zhhp20@163.com (HPZ); changtgw@126.com (CC)

    Affiliation College of Agronomy, Anhui Agricultural University; Key Laboratory of Wheat Biology and Genetic Improvement in the Southern Yellow & Huai River Valley, Ministry of Agriculture, Hefei, Anhui, China

  • Jie Lu,

    Roles Data curation, Investigation

    Affiliation College of Agronomy, Anhui Agricultural University; Key Laboratory of Wheat Biology and Genetic Improvement in the Southern Yellow & Huai River Valley, Ministry of Agriculture, Hefei, Anhui, China

  • Xian-Chun Xia,

    Roles Methodology, Resources, Software, Writing – review & editing

    Affiliation Institute of Crop Science, National Wheat Improvement Centre/The National Key Facility for Crop Gene Resources and Genetic Improvement, Chinese Academy of Agricultural Sciences (CAAS), Beijing, China

  • Gen-Lou Sun,

    Roles Writing – review & editing

    Affiliations College of Agronomy, Anhui Agricultural University; Key Laboratory of Wheat Biology and Genetic Improvement in the Southern Yellow & Huai River Valley, Ministry of Agriculture, Hefei, Anhui, China, Department of Biology, Saint Mary’s University, Halifax, Nova Scotia, Canada

  • Chuan-Xi Ma

    Roles Funding acquisition, Supervision, Writing – review & editing

    Affiliation College of Agronomy, Anhui Agricultural University; Key Laboratory of Wheat Biology and Genetic Improvement in the Southern Yellow & Huai River Valley, Ministry of Agriculture, Hefei, Anhui, China

Abstract

Genetic improvement of grain yield is always an important objective in wheat breeding. Here, a genome-wide association study was conducted to dissect the complex genetic basis of yield-related traits in 105 elite wheat varieties (lines) using the Wheat 90K Illumina iSelect SNP array. Nine yield-related traits, including maximum number of shoots per square meter (MSN), effective number of spikes per square meter (ESN), percentage of effective spikes (PES), number of kernels per spike (KPS), thousand-kernel weight (TKW), the ratio of kernel length/kernel width (RLW), leaf-area index (LAI), plant height (PH), and grain yield (GY), were evaluated across four environments. Twenty-four highly significant marker-trait associations (MTAs) (P < 0.001) were identified for the nine yield-related traits on chromosomes 1A, 1D, 2A (2), 3B, 4A (2), 4B, 5A (4), 5B (4), 5D, 6B (2), 7A (2), and 7B (3), explaining 10.86–20.27% of the phenotypic variation. Of these, four major loci were identified in more than three environments, including one locus for RLW (6B), one locus for TKW (7A), and two loci for PH (7B). A cleaved amplified polymorphic sequence (CAPS) marker, Td99211, for TKW on chromosome 5A was developed and validated in both a natural population composed of 372 wheat varieties (lines) and a RIL population derived from the cross Yangxiaomai × Zhongyou 9507. The CAPS marker can be used directly for marker-assisted selection in wheat breeding, and the major MTAs identified provide useful information for fine-mapping of the target genes in future studies.

Introduction

Wheat (Triticum aestivum L.) is one of the most important and widely grown staple crops. The continuous decrease in farmland and the rapid increase in population make it difficult to produce sufficient food to meet global demand. A previous study examining food security suggested that food production would need to increase by 70–100% by 2050 [1]. Wheat producers and breeders have achieved a considerable increase in yields over the last 50 years, with production reaching more than 700 Mt in 2016. Nevertheless, increasing grain yield is still a primary objective in wheat breeding [2]. The complex genetic relationships between yield and related traits (e.g., plant height, number of spikes per hectare, number of kernels per spike, and thousand-kernel weight) need to be clarified to achieve further breakthroughs in developing high-yielding wheat varieties.

Wheat yield and related traits are controlled by multiple quantitative trait loci (QTLs) and are vulnerable to environmental factors. To date, a considerable number of QTLs associated with grain yield and related traits have been detected on almost all 21 wheat chromosomes [3–10]. In addition, many marker-trait associations (MTAs) for grain yield and related traits have been identified under various genetic backgrounds [11–20]. Compared with bi-parental population mapping, which is largely limited by the genetic background of the two parents, a genome-wide association study (GWAS) based on linkage disequilibrium is an effective approach to identify genetic loci for complex traits in diverse natural populations because of their broad genetic backgrounds [20, 21].

The genotyping tools available for QTL mapping and GWAS have developed rapidly, from wheat simple sequence repeat (SSR) markers to diversity array technology, and then to 9K, 90K, 660K and 820K SNP arrays. The high-density SNP arrays have been widely used to identify MTAs in GWAS for grain yield and related traits, and most MTAs for these traits have been identified on chromosomes 1A, 1B, 2B, 3A, 3B, 4A, 5A, 5B, 6A, 7A, and 7B [11–20]. These MTAs provide useful information for the identification of grain yield genes in wheat.

In our previous research, the genetic diversity of 190 wheat varieties and advanced lines was characterized using SSR markers; these varieties were selected from three major wheat-growing regions in China, i.e., Yellow & Huai Rivers Valley, Middle and Lower reaches of the Yangtze River, and southwestern China [22]. Among these, we further selected 105 representatives with large genetic diversity and wide use in breeding programs to identify major MTAs associated with grain yield and related traits via GWAS using the wheat 90K Illumina iSelect SNP array. A cleaved amplified polymorphic sequence (CAPS) marker for thousand-kernel weight (TKW) was further validated in 372 wheat varieties (lines) and 188 lines from the Yangxiaomai × Zhongyou 9507 RIL population. The results will be useful for the improvement of grain yield in wheat breeding.

Materials and methods

Plant materials and field trials

The association mapping panel comprised 105 elite wheat varieties and advanced lines with abundant phenotypic variation (S1 Fig), selected on the basis of genetic diversity [22] from three major winter wheat-growing regions in China, i.e., Yellow & Huai Rivers Valley, Middle and Lower reaches of the Yangtze River, and southwestern China. A natural population composed of 372 wheat varieties (lines) and a RIL population (188 lines) derived from the cross Yangxiaomai × Zhongyou 9507 were used to validate the association of a CAPS marker (Td99211) with TKW on chromosome 5A.

The association mapping panel was planted at the Dayangdian experimental farm of Anhui Agricultural University in Hefei (31°93′N, 117°21′E) and the Guohe experimental farm of Anhui Agricultural University in Lujiang (31°47′N, 117°25′E) during the 2014–2015 (designated as E1 and E2, respectively) and 2015–2016 (designated as E3 and E4, respectively) cropping seasons. Each plot comprised five 4.0-m rows spaced 25 cm apart. The natural population and the RIL population were planted at the Dayangdian experimental farm of Anhui Agricultural University in Hefei (31°93′N, 117°21′E) during the 2015–2016 and 2016–2017 cropping seasons, with two 2.0-m rows spaced 25 cm apart per line. Field trials were conducted in randomized complete blocks with two replications. Test plots were managed according to local practices. All fields were kept free of diseases and weeds.

Phenotypic trait evaluation and statistical analysis

A total of 1,300 seeds were sown in each plot. At the two- or three-leaf stage, seedlings were counted and thinned to about 1,000 evenly distributed plants per plot. Three 1.0-m sections were chosen and marked in each plot. The maximum number of shoots per section was scored during the elongation stage, and then converted to the maximum number of shoots per square meter (MSN). The leaf-area index (LAI) was measured with five replications per plot at the heading stage using the SunScan Canopy Analysis System (Delta-T Devices Ltd., Burwell, Cambridge, UK). The effective number of spikes per square meter (ESN) was counted at the ripening stage, and then the percentage of effective spikes (PES) was calculated. Plant height (PH) was measured with five replications per plot at the yellow maturity stage, and the mean of the five measurements was used for subsequent analysis.
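The conversion from per-section counts to per-square-meter values is not spelled out above. The following is a minimal sketch under two assumptions that do not come from the article: each marked section is 1.0 m of a single row (so that, with 25-cm row spacing, it represents 0.25 m² of ground), and PES is computed as ESN divided by MSN. The function names and the counts are hypothetical.

```python
ROW_SPACING_M = 0.25      # rows spaced 25 cm apart (from the text)
SECTION_LENGTH_M = 1.0    # each marked section is 1.0 m long (from the text)


def per_square_meter(counts_per_section):
    """Average the section counts and scale them to one square meter.

    Assumption: a section is 1.0 m of one row, so it stands for
    SECTION_LENGTH_M * ROW_SPACING_M square meters of ground.
    """
    area = SECTION_LENGTH_M * ROW_SPACING_M
    mean_count = sum(counts_per_section) / len(counts_per_section)
    return mean_count / area


def percent_effective_spikes(esn, msn):
    """PES as ESN / MSN * 100 (assumed definition, not stated in the text)."""
    return 100.0 * esn / msn


# Made-up counts for the three marked sections of one plot.
msn = per_square_meter([300, 320, 310])   # shoots at the elongation stage
esn = per_square_meter([150, 160, 155])   # spikes at the ripening stage
pes = percent_effective_spikes(esn, msn)
```

If the sections were instead laid out across several rows, only the `area` term would need to change.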

When plants reached physiological maturity, 20 spikes per plot were harvested and manually threshed. The total number of kernels was determined using the WSeen SC-G Seed Test System (WSeen Testing Technology Co., Ltd., Hangzhou, China), and then converted to number of kernels per spike (KPS). The remaining spikes in each plot were harvested and threshed using Wintersteiger Plot Combines (Wintersteiger AG, Ried i.I., Austria). Grain yield (GY) was calculated as the weight of wheat grain harvested from whole plots. TKW was measured using 1,000 randomly selected kernels with two replicates, while the ratio of kernel length/kernel width (RLW) was determined using 100 correctly placed kernels in the WSeen SC-G Seed Test System in duplicate.

Statistical analysis of phenotypic data

Best linear unbiased predictions (BLUPs) can remove environmental deviation and estimate the true breeding value of each individual, so they are increasingly used by plant breeders who wish to obtain more precise estimates of genotypic values [23–25]. Therefore, the broad sense heritability (HB2) and BLUPs were determined using the ‘lme4’ package in R (version 3.1.3; www.r-project.org), with year and location as random effects in the model Y = lmer(X ~ (1|LINE) + (1|LOC) + (1|YEAR) + (1|LINE:LOC) + (1|LINE:YEAR)) [26, 27]. The descriptive statistics of different traits, correlations between BLUPs and the measured values among different environments, and correlations among BLUPs of different phenotypes were analyzed using SPSS Statistics 20 (http://www.ibm.com/analytics/).
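The article does not state which formula was used to turn the variance components of this model into a heritability estimate. One commonly used entry-mean formulation is sketched below as an assumption, not as the authors' calculation: the genotypic variance is divided by itself plus each interaction variance scaled by the number of locations or years, plus the residual variance scaled by the total number of observations per line. The variance components would come from the fitted mixed model; the numbers here are made up.

```python
def broad_sense_heritability(var_line, var_line_loc, var_line_year,
                             var_residual, n_loc, n_year, n_rep):
    """Entry-mean broad sense heritability from variance components.

    Assumed formulation:
        H2 = Vg / (Vg + Vgl / l + Vgy / y + Ve / (l * y * r))
    where l, y and r are the numbers of locations, years and replications.
    """
    phenotypic = (var_line
                  + var_line_loc / n_loc
                  + var_line_year / n_year
                  + var_residual / (n_loc * n_year * n_rep))
    return var_line / phenotypic


# Made-up variance components; two locations, two seasons and two
# replications match the trial layout described in the text.
h2 = broad_sense_heritability(var_line=8.0, var_line_loc=2.0,
                              var_line_year=2.0, var_residual=8.0,
                              n_loc=2, n_year=2, n_rep=2)
```

With these toy inputs the phenotypic variance of a line mean is 8 + 1 + 1 + 1 = 11, so the estimate is 8/11.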

DNA extraction and genotyping

Dry seeds were ground to powder using the FastPrep-96 Homogenizer (MP Biomedicals, USA) for genomic DNA extraction following the MPure Nucleic Acid Purification system (MP Biomedicals). The DNA quality was checked by 1.0% agarose gel electrophoresis, and the concentration was determined with the NanoDrop ND-200 Nanophotometer (Thermo Fisher Scientific Inc., USA).

Samples were genotyped using the wheat 90K Illumina iSelect array containing 81,587 SNP markers at Beijing Compass Biotechnology Co., Ltd., following the manufacturer’s protocol [28]; of these markers, 46,977 SNPs had been genetically mapped using eight mapping populations [29]. The SNP allele clustering and genotype calling were completed with the Genome Studio program (version 2011.1) (Illumina; https://www.illumina.com/). The accuracy of the SNP clustering was visually validated, and incorrectly clustered SNPs were manually adjusted. SNP markers with a missing rate exceeding 0.1 or a minor allele frequency less than 0.05 were removed, and the remaining 31,250 effective SNPs were used for subsequent population structure, principal component, and kinship analyses. Of these, 15,430 SNP markers mapped to different chromosome regions based on the genetic and physical maps [29] (S1 Table) were used for further genome-wide association analysis.
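The two filtering thresholds above can be illustrated with a short sketch. It assumes genotype calls are encoded as 'AA', 'AB' or 'BB', with `None` for a missing call; this encoding and the function name are illustrative choices, not the format produced by the genotyping software.

```python
def snp_passes_qc(calls, max_missing=0.1, min_maf=0.05):
    """Return True if a SNP survives the missing-rate and MAF filters.

    calls: one genotype call per line, 'AA', 'AB', 'BB' or None (missing).
    A SNP is removed when its missing rate exceeds max_missing or its
    minor allele frequency is below min_maf.
    """
    observed = [c for c in calls if c is not None]
    if not calls or not observed:
        return False
    missing_rate = (len(calls) - len(observed)) / len(calls)
    if missing_rate > max_missing:
        return False
    # Frequency of allele B among the called alleles (two per line).
    freq_b = sum(c.count("B") for c in observed) / (2 * len(observed))
    maf = min(freq_b, 1.0 - freq_b)
    return maf >= min_maf


# Toy examples with 20 lines each.
common = ["AA"] * 18 + ["BB"] * 2            # MAF 0.10, nothing missing
rare = ["AA"] * 19 + ["AB"]                  # MAF 0.025
sparse = ["AA"] * 10 + ["BB"] * 7 + [None] * 3   # 15% missing calls
kept = [snp_passes_qc(s) for s in (common, rare, sparse)]
```

Only the first toy SNP is kept; the second fails the allele-frequency filter and the third fails the missing-rate filter.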

Population structure, principal component, and kinship analyses

The population structure of the association mapping panel was assessed with all 31,250 effective SNP markers on the 21 wheat chromosomes, using the fastStructure algorithm implemented in Python (http://rajanil.github.io/fastStructure/). The algorithm was run for K values ranging from 1 to 10 using the simple prior, to obtain a reasonable range of values for the model complexity required to explain the population structure [30]. A heuristic based on the tendency of mean-field variational schemes was used to select K [30]. The estimated Q matrix was obtained from the variational inference run for the chosen K, and the ancestry contribution of each model component was computed as the mean admixture proportion over all samples [30].

The principal component analysis (PCA) was conducted with numerical values for genotypes (31,250 SNP markers) using the genome association and prediction integrated tool (GAPIT) in R [31, 32]. The turning point of the eigenvalue change was chosen as the optimal number of principal components (PCs).

The marker-based kinship matrix (K*) was calculated with the same genotypes using the VanRaden method, and then used to create a clustering heat map of the association mapping panel in the GAPIT [32].
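For readers unfamiliar with the VanRaden method, the sketch below shows its usual first formulation on a toy genotype matrix: marker scores are centered by twice the allele frequency, and the cross-product is scaled by 2Σp(1−p). It assumes genotypes coded as 0, 1 or 2 copies of one allele, no missing calls, and at least one polymorphic marker; it is an illustration in plain Python, not the GAPIT implementation.

```python
def vanraden_kinship(genotypes):
    """Marker-based kinship matrix, VanRaden's first formulation.

    genotypes: list of individuals, each a list of allele counts (0, 1, 2).
    Returns K = Z Z' / (2 * sum(p * (1 - p))), with Z = M - 2p.
    """
    n_ind = len(genotypes)
    n_snp = len(genotypes[0])
    # Allele frequency of the counted allele at each marker.
    freqs = [sum(ind[j] for ind in genotypes) / (2 * n_ind)
             for j in range(n_snp)]
    scale = 2 * sum(p * (1 - p) for p in freqs)
    centered = [[ind[j] - 2 * freqs[j] for j in range(n_snp)]
                for ind in genotypes]
    return [[sum(a * b for a, b in zip(zi, zk)) / scale for zk in centered]
            for zi in centered]


# Toy panel: two lines with opposite homozygous genotypes and one
# heterozygous line, scored at two markers.
K = vanraden_kinship([[0, 2], [2, 0], [1, 1]])
```

In this toy panel both allele frequencies are 0.5, so the scaling term equals 1 and the matrix is simply the cross-product of the centered scores.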

Genome-wide association analysis

A GWAS was performed using the mixed linear models in TASSEL (version 5.2) [33, 34]. Significant MTAs between the 15,430 SNPs and the traits were identified using a model with a Q matrix as the fixed effect and a kinship matrix as the random effect (Q + K*) [35], as well as a model with a PC matrix as the fixed effect and a kinship matrix as the random effect (PCA + K*), at a threshold of P < 0.001. Because the fit of the different models varied from trait to trait, a Bayesian information criterion (BIC)-based model comparison was used for each phenotype [36]. The criterion value for a model was calculated as BIC = −2·maximized log-likelihood + log(n)·number of estimated parameters (n = sample size). The model with Q + K* or PCA + K* as covariates was selected for each trait according to the maximum BIC value [37, 38]. SNPs separated by a genetic distance of less than 5 cM were treated as one MTA, and an MTA identified in more than three environments (P < 0.001) was regarded as a major locus.
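The BIC formula above is simple enough to check by hand, and the sketch below does so for two hypothetical models. With the formula exactly as written, a smaller value marks the preferred model; since the text selects on the maximum value, the software output was presumably reported under a convention in which larger is better, which is an assumption here. The `larger_is_better` flag covers both cases. Log-likelihoods and parameter counts are made up.

```python
import math


def bic(log_likelihood, n_params, n_samples):
    """BIC = -2 * maximized log-likelihood + log(n) * number of parameters."""
    return -2.0 * log_likelihood + math.log(n_samples) * n_params


def pick_model(bic_by_model, larger_is_better=False):
    """Return the model name with the preferred BIC under the chosen convention."""
    chooser = max if larger_is_better else min
    return chooser(bic_by_model, key=bic_by_model.get)


# Made-up fits for one trait in a panel of n = 105 lines.
scores = {
    "Q+K": bic(log_likelihood=-250.0, n_params=6, n_samples=105),
    "PC8+K": bic(log_likelihood=-248.0, n_params=11, n_samples=105),
}
preferred = pick_model(scores)  # smaller-is-better convention
```

Here the model with more covariates fits slightly better but pays a larger penalty, so the smaller-is-better convention prefers the Q+K entry.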

Based on a SNP (Tdurum_contig71499_211) on chromosome 5A significantly associated with TKW (P < 0.001), a CAPS marker designated Td99211 (F: GCTGGAGCAAAGTTGTATT, R: GGTTATGTCGCTTGAGTTAT) was developed using Primer Premier 5.0. PCR was performed in a total volume of 10 μL, including 1.0 μL of 10 × PCR buffer, 200 μM of dNTPs, 4 pmol of each primer, 0.5 U Taq DNA polymerase and 100 ng of template DNA. The PCR procedure included an initial denaturation at 94°C for 5 min, followed by 38 cycles of 94°C for 30 s, 60°C for 30 s, and 72°C for 30 s, and a final extension at 72°C for 8 min. The PCR products were digested with AluI at 37°C for 3 h (restriction site: AG/CT, http://www.neb-china.com) according to the manufacturer’s directions, and separated on a 1.5% agarose gel. SPSS Statistics 20 was used for data analysis, and group means were compared with independent-samples t-tests.
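A CAPS marker works because the SNP creates or destroys a restriction site, so the two alleles give different fragment patterns after digestion. The sketch below illustrates this for the AluI site AG/CT. The two amplicon sequences are short toy placeholders, not the Td99211 amplicon, and which of the two real alleles carries the site is not assumed here.

```python
ALUI_SITE = "AGCT"
CUT_OFFSET = 2  # AluI cuts between G and C (AG/CT), leaving blunt ends


def digest_fragment_lengths(seq, site=ALUI_SITE, cut_offset=CUT_OFFSET):
    """Return the fragment lengths after complete digestion of a linear amplicon."""
    seq = seq.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


# Hypothetical 18-bp amplicons that differ at a single base inside the site.
allele_with_site = "TTGACC" + "AGCT" + "GGATCCAA"
allele_without_site = "TTGACC" + "AACT" + "GGATCCAA"

cut_pattern = digest_fragment_lengths(allele_with_site)
uncut_pattern = digest_fragment_lengths(allele_without_site)
```

The allele carrying the site yields two fragments and the other stays intact, which is the band difference scored on the agarose gel. Because AGCT is palindromic, scanning one strand is sufficient.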

Results

Grain yield and related traits

The phenotypic values of GY and related traits (i.e., MSN, ESN, PES, KPS, TKW, RLW, LAI, and PH) of the association panel in different environments are shown in Table 1. The HB2 and BLUPs were used to evaluate the genetic variance components of the target traits (S2 Table and S2 Fig). The HB2 of GY and related traits ranged widely, from 0.43 (LAI) to 0.92 (PH). Among the three major yield components, TKW (0.88) had the highest HB2, followed by KPS (0.75) and ESN (0.64).

Table 1. Phenotypic variation and broad sense heritability of the grain yield and related traits.

https://doi.org/10.1371/journal.pone.0188662.t001

The BLUPs for GY and related traits were positively correlated with the measured data in different environments (0.61–0.97) (S3 Table). There was no significant correlation between GY and MSN, ESN, or PES. However, KPS and TKW were positively correlated with GY (0.20 and 0.34, respectively) (S4 Table). A significant positive correlation (0.62) was identified between MSN and ESN, but no correlation was found between KPS and TKW. Additionally, we observed a positive correlation (0.35) between ESN and LAI. These results illustrate the complexity and instability of wheat yield formation.

Model-based population structure, principal component, and kinship analyses

To effectively evaluate population compositions, a Q matrix (K = 4) and kinship matrix (K*) as well as a PC matrix (PC8) and kinship matrix (K*) were used as the covariates for a subsequent association study. The panel of 105 wheat varieties and advanced lines was divided into four subpopulations based on model complexity. The 105 elite breeding varieties were assigned to each subpopulation according to ancestry contributions (Fig 1a). A significant change in the variances was detected in the eighth PC (Fig 1b), indicating the cumulative variance contribution (> 40%) was relatively high for the first eight principal components. These varieties were assigned to three genetic clusters in a three-dimensional plot of the first three principal components (i.e., PC1, PC2, and PC3) (Fig 1c). Genetic clustering with the kinship matrix indicated that the association mapping panel was mainly divided into three groups, with considerable genetic differences among the varieties (i.e., red to yellow in the clustering heat map of Fig 1d).

Fig 1. Population structure, principal component, and kinship analyses, shown as the distruct plot (a), the scree plot (b), the three-dimensional principal component plot (c), and the genetic clustering heat map (d).

The distruct plot (a) was generated using the mean of the variational posterior distribution over inferred admixture proportions. The scree plot (b) was generated from the change in variance across the principal components. The three-dimensional plot of the first three principal components (c) and the genetic clustering heat map from the kinship analysis (d), created with a kinship matrix, were used to evaluate the genetic differences among the 105 wheat varieties.

https://doi.org/10.1371/journal.pone.0188662.g001

Marker–trait association analysis

According to the BIC values of different traits, a Q + K* model was selected for association analysis of MSN, ESN, KPS, TKW, LAI, PH, and GY. A PC8 + K* model was chosen for PES and RLW (S5 Table). In total, 24 highly significant MTAs (P < 0.001) were detected on chromosomes 1A, 1D, 2A (2), 3B, 4A (2), 4B, 5A (4), 5B (4), 5D, 6B (2), 7A (2), and 7B (3) using the Q + K* or PC8 + K* models for these traits. These MTAs could explain 10.86–20.27% of the phenotypic variance (Table 2).

Table 2. Details regarding the significant marker–trait associations (P < 0.001) for grain yield and related traits.

https://doi.org/10.1371/journal.pone.0188662.t002

Four MTAs for ESN were identified on chromosomes 1D (Ku_c16809_845, 78.36 cM), 3B (wsnp_Ex_c15944_24350833, 62.57 cM; Excalibur_c15944_70, 62.67 cM), 4A (Kukri_c12563_52, 66.28 cM), and 4B (Kukri_rep_c104277_1326 and Excalibur_c55463_232, 26.00 cM), explaining 12.18–16.01% of the phenotypic variance. However, all SNPs related to ESN were detected in no more than one environment (P < 0.001), suggesting that these MTAs are not genetically stable across environments.

For KPS, we detected only one MTA (Kukri_c14516_224, Tdurum_contig10002_533, and BS00108184_51) on chromosome 7A (130.27 cM), explaining 11.78–12.45% of the phenotypic variance.

For TKW, one MTA comprising three SNP markers (BS00073670_51, wsnp_Ex_c1138_2185522, and Tdurum_contig71499_211) on chromosome 5A (84.13–86.36 cM) accounted for an average of 12.62% of the phenotypic variation. Another MTA with two SNPs (Excalibur_c14451_1313 and Kukri_c19251_579) on chromosome 7A (156.23 cM) was significantly associated with TKW in three environments (E1, E2, and E4) (P < 0.001) and explained 13.91% of the TKW variation, implying that it is a major locus.

Four MTAs were identified for RLW on chromosomes 5A (wsnp_Ex_c2526_4715978, 99.56 cM), 5B (Ex_c24031_300, 212.43 cM), 6B (Tdurum_contig14046_364, 67.24 cM), and 7B (wsnp_Ex_c24376_33618864 and wsnp_Ex_c24376_33619527, 52.18 cM). Of them, the MTA on 6B was more stable and significant (P < 0.001) in four environments (E1, E2, E3, and E4), explaining a higher phenotypic variance (17.68%), which implied this region covered a credible QTL.

Five MTAs for PH on chromosomes 5B (Excalibur_c1925_2569, 131.79 cM), 5D (Kukri_c9285_762, 200.74 cM), 6B (Kukri_rep_c106092_300, 113.67 cM), and 7B (wsnp_Ex_c11003_17857272, 77.13 cM; wsnp_Ex_rep_c68762_67626384, Excalibur_c50612_409, and Tdurum_contig77073_193, 129.77 cM) explained 13.74–20.27% of the phenotypic variance. Both MTAs on 7B (77.13 cM and 129.77 cM) were identified in three environments, indicating two independent major loci on 7B. Among the associated SNPs, wsnp_Ex_rep_c68762_67626384 on 7B was more significantly associated with PH than the others, suggesting the importance of this region.

Four MTAs associated with GY were identified on chromosomes 2A, 5A, 5B, and 6B, respectively, with phenotypic contributions ranging from 11.50% (Excalibur_c92223_97) to 15.91% (wsnp_BQ166999B_Ta_2_1), and an average of 13.86%. Similar to ESN, all MTAs for GY showed a poor genetic stability in different environments.

CAPS marker development and validation of a SNP for TKW on chromosome 5A

A SNP (Tdurum_contig71499_211) on chromosome 5A for TKW was developed into the CAPS marker (Td99211), and genotyped in 372 wheat varieties (lines) and 188 lines from the RIL population of Yangxiaomai × Zhongyou 9507 cross. Two allelic variations were detected, designated Td99211-A and Td99211-G (Fig 2). There was a significant difference in TKW between the two alleles in the two populations (P < 0.01), and the varieties (lines) harboring Td99211-G had a higher TKW compared with those carrying Td99211-A (Table 3).
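The allele comparison reported here rests on an independent-samples t-test. The sketch below computes the pooled-variance (Student) version of the statistic with the standard library only; whether the equal-variance form was the one used is an assumption, and the TKW values are made up for illustration. A p-value would additionally require the t distribution, which the standard library does not provide.

```python
import math
from statistics import mean, variance


def pooled_t_statistic(group_a, group_b):
    """Student's two-sample t statistic and its degrees of freedom.

    Assumes equal variances in the two groups (pooled-variance form).
    """
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / df
    standard_error = math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error, df


# Made-up TKW values (g) for lines carrying each allele.
tkw_allele_g = [42.0, 44.0, 46.0]
tkw_allele_a = [38.0, 40.0, 42.0]
t_value, dof = pooled_t_statistic(tkw_allele_g, tkw_allele_a)
```

A positive statistic here means the first group has the higher mean, matching the direction reported for Td99211-G in the text.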

Fig 2. Two allelic variations (Td99211-A and Td99211-G) of the CAPS marker (Td99211) digested by AluI in a subset of the wheat materials.

https://doi.org/10.1371/journal.pone.0188662.g002

Table 3. Validation of a SNP (Tdurum_contig71499_211) for TKW on chromosome 5A in the natural population (NP) composed of 372 wheat varieties (lines) and the RIL population derived from the cross of Yangxiaomai × Zhongyou 9507 across environments.

https://doi.org/10.1371/journal.pone.0188662.t003

Discussion

Analysis of phenotype heritability

In the current study, the 105 elite wheat varieties and advanced lines had extremely diverse genetic backgrounds and highly variable phenotypes. Because of their excellent agronomic traits, most of them have been widely used as parents in breeding [22]. However, the complex genetic relationships between GY and related traits have greatly hindered both research on GY formation and the application of research findings in breeding. Therefore, investigations of MTAs for GY and related traits in these varieties will provide useful information for wheat breeding programs.

Yield-related traits are typical quantitative traits controlled by multiple QTLs and are highly vulnerable to environmental factors. The use of BLUPs can remove environmental deviation and estimate the true breeding value of each individual [23–25]. In the present study, we investigated nine yield-related traits, including MSN, ESN, PES, KPS, TKW, LAI, RLW, PH, and GY, and analyzed their BLUP values. There was a highly significant correlation between BLUP values and the measured values in different environments, indicating that the BLUP values are suitable for GWAS.

In addition, we analyzed the HB2 values of the above nine traits. PH had the highest HB2 value (0.92), followed by TKW (0.88) and RLW (0.85), while LAI had the lowest (0.43). The differences in heritability are consistent with the GWAS results: only one SNP was detected for LAI, while four major MTAs for PH (2), TKW (1), and RLW (1) were stably identified across environments.

Evaluation of population structure

Because of the limitations of the Structure [39] and Admixture [40] programs regarding the number of markers that can be used for population structure analysis, only a small proportion of SNP markers were utilized in previous studies [13–18]. This may produce false-positive results. In contrast, we used 31,250 effective SNP markers to analyze the population structure (Fig 1a) with the fastStructure algorithm, which estimates approximate posterior distributions on ancestry proportions two orders of magnitude faster than Structure, with ancestry estimates and prediction accuracies comparable to those of Admixture [30]. The considerable improvement in runtime and the comparable accuracy of fastStructure enable the application of this algorithm to large genotype data sets, generating results clearly different from those of previous studies.

The use of different models is suitable for studying different traits, but blindly using the Q + K* model for all traits probably results in an over-correction of the population structure and some false-negative results [17, 32, 36]. Therefore, the principal component (Fig 1b and 1c) and kinship (Fig 1d) of the association mapping panel were also accurately calculated with all 31,250 effective SNP markers to build the Q + K* and PC8 + K* models.

Comparison of the present study with previous studies

Based on the 90K-derived genetic map described by Wang et al. [29] (S6 Table), we compared some of the MTAs identified in the present study with those from previous studies. In this study, the MTA (Ex_c24031_300) significantly associated with RLW was detected on chromosome 5B (212.43 cM). Chen et al. [20] also identified an MTA (IACX2594) for RLW at the same genetic position (212.43 cM) using a high-density Illumina iSelect 90K single nucleotide polymorphism assay in a Chinese winter wheat population. For GY, we identified an MTA (RAC875_c2926_371 and wsnp_Ku_c7890_13513783) on chromosome 5A, which was only 3.62 cM from the QTL (wsnp_Ex_c31830_40573624 and wsnp_Ex_rep_c69526_68472665) for GY reported by Li et al. [8]. Therefore, we suggest that these two QTLs belong to the same locus controlling GY. In addition, we identified a major MTA (Excalibur_c14451_1313 and Kukri_c19251_579) for TKW on chromosome 7A (156.23 cM), which was adjacent to the QTLs (wsnp_Ex_c11047_17915103, wsnp_Ku_c8437_14341371, BS00021657_51, wsnp_JD_c20555_18262317, and CAP7_c2350_105) controlling TKW reported by Li et al. [8] and Su et al. [10]. For PH, a significant MTA (Kukri_c9285_762) was detected on chromosome 5D in this study, which was close to BS00089597_51 (known as GA20ox1 in rice) associated with PH reported by Zanke et al. [15]. In the present study, SNPs within 5 cM associated with the same trait were treated as one MTA/QTL. Therefore, the MTAs/QTLs for TKW (7A) and PH (5D) identified in the present study were considered the same loci as those previously reported by Zanke et al. [15], Li et al. [8] and Su et al. [10]. Notably, no MTA or QTL associated with PH has been reported on chromosome 7B, suggesting that the two MTAs on 7B (77.13 and 129.77 cM) identified in the present study are likely to be novel.
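The 5 cM rule used in this comparison can be sketched as a simple grouping of SNPs along each chromosome. The version below chains neighboring SNPs: a SNP joins the current group when it lies less than 5 cM from the previous SNP on the same chromosome. Whether groups were chained in this way or measured from a single reference marker is not stated in the article, so the chaining is an assumption; the marker names and positions are the PH-associated SNPs on 5B and 7B listed in the Results.

```python
def group_snps(snps, max_gap_cm=5.0):
    """Group SNPs into putative MTAs by genetic distance.

    snps: list of (name, chromosome, position_cM) tuples.
    Neighboring SNPs on the same chromosome that are less than
    max_gap_cm apart are placed in the same group (chained grouping).
    """
    groups = []
    for snp in sorted(snps, key=lambda s: (s[1], s[2])):
        previous = groups[-1][-1] if groups else None
        same_locus = (previous is not None
                      and previous[1] == snp[1]
                      and snp[2] - previous[2] < max_gap_cm)
        if same_locus:
            groups[-1].append(snp)
        else:
            groups.append([snp])
    return groups


ph_snps = [
    ("Excalibur_c1925_2569", "5B", 131.79),
    ("wsnp_Ex_c11003_17857272", "7B", 77.13),
    ("wsnp_Ex_rep_c68762_67626384", "7B", 129.77),
    ("Excalibur_c50612_409", "7B", 129.77),
    ("Tdurum_contig77073_193", "7B", 129.77),
]
ph_groups = group_snps(ph_snps)
```

The three co-located SNPs at 129.77 cM on 7B collapse into one group, while the SNP at 77.13 cM on 7B and the SNP on 5B each form their own, which matches the way these PH associations are counted in the Results.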

MTAs with pleiotropic effects

We detected an MTA on chromosome 6B associated with both PH (Kukri_rep_c106092_300, 113.67 cM) and GY (RAC875_c31299_1215, 110.45 cM), indicating the importance of PH to GY. In addition, several MTAs were detected at the same or neighboring positions as MTAs for different traits identified in previous studies. For example, wsnp_Ex_c24376_33618864 for RLW on chromosome 7B (52.18 cM) was also identified for PH by Zanke et al. [15]. The marker wsnp_Ex_c11003_17857272 for PH on chromosome 7B (77.13 cM) was at the same position as an MTA for TKW (Ex_c12057_797, 77.13 cM) [18]. The markers wsnp_Ex_c15944_24350833 and Excalibur_c15944_70 for ESN on chromosome 2A were located at the same position (62.57 cM) as Kukri_c21467_571 for TKW [18] and wsnp_JD_c8158_9193784 for KW [20]. Additionally, this MTA was also close to Kukri_c48750_1372 (61.89 cM) for chlorophyll content (measured as SPAD value) during grain filling [17] and BS00074688_51 (65.55 cM) for days-to-heading [19]. The marker Tdurum_contig14046_364 for RLW on chromosome 6B was only 0.84 cM from the QTLs for the number of spikes per square meter (wsnp_Ra_c14498_667649, wsnp_Ex_c34011_42398664, and wsnp_Ex_rep_c67012_65465394) [8]. The marker RAC875_c31299_1215 for GY on chromosome 6B was located only 0.59 cM from the QTLs for TKW (wsnp_Ex_c3025_5587183, wsnp_Ex_rep_c66342_64519823, wsnp_Ex_rep_c69373_68311942, and wsnp_Ex_rep_c69373_68312188) [8], and was also close to the MTA (RAC875_rep_c71463_98) for PH [15]. These results reveal the pleiotropism of QTLs/MTAs for GY and related traits, which may be due to the complex relationships among these traits.

Development of CAPS marker for TKW on chromosome 5A and its application in wheat breeding

The SNP (Tdurum_contig71499_211) on chromosome 5A was found to be significantly associated with TKW based on BLUP values, and a CAPS marker (Td99211) for the SNP was successfully developed. Using 372 wheat varieties (lines) and 188 lines from the RIL population of the Yangxiaomai × Zhongyou 9507 cross, we further validated the association of the CAPS marker with TKW. Moreover, the Td99211-G allele was associated with higher TKW than Td99211-A, and is thus considered a favorable allele. Notably, only 49 (13.17%) of the 372 wheat varieties (lines) harbored the Td99211-G allele, indicating that this allele has not been widely utilized in the genetic improvement of wheat yield. By contrast, the favorable variations of several genes controlling TKW, such as TaGW2 [41], TaCWI [42], TaGS-D1 [43], and TaGASR7-A1 [44], have been positively selected in wheat breeding. However, candidate genes controlling TKW on chromosome 5A have not been reported in previous studies. Therefore, cloning of the target gene controlling TKW on chromosome 5A is necessary for pyramiding breeding for wheat yield.

Supporting information

S1 Table. Distribution of 31,250 and 15,430 effective single nucleotide polymorphism markers throughout the wheat genome.

https://doi.org/10.1371/journal.pone.0188662.s001

(XLSX)

S2 Table. The phenotypic BLUPs of different varieties.

https://doi.org/10.1371/journal.pone.0188662.s002

(XLSX)

S3 Table. Correlations between best linear unbiased predictions and the measured values among different environments.

https://doi.org/10.1371/journal.pone.0188662.s003

(XLSX)

S4 Table. Correlations among best linear unbiased predictions of the grain yield and related traits.

https://doi.org/10.1371/journal.pone.0188662.s004

(XLSX)

S5 Table. Bayesian information criterion values for various traits analyzed using a model with Q + K* or PC8 + K* covariates.

https://doi.org/10.1371/journal.pone.0188662.s005

(XLSX)

S6 Table. The comparison of 90K markers identified in different studies.

https://doi.org/10.1371/journal.pone.0188662.s006

(XLSX)

S1 Fig. The GWAS panel of 105 winter wheat varieties with abundant phenotypic variation used in this study.

https://doi.org/10.1371/journal.pone.0188662.s007

(TIF)

S2 Fig. Frequency distributions of different phenotypic BLUPs.

https://doi.org/10.1371/journal.pone.0188662.s008

(TIF)

Acknowledgments

We thank Dr. Shi-He Xiao of CAAS for kindly providing some of the materials.

References

  1. Godfray C, Beddington J, Crute I, Haddad L, Lawrence D, Muir J, et al. Food Security: The Challenge of Feeding 9 Billion People. Science. 2010; 327: 812–818. pmid:20110467
  2. Mir RR, Kumar N, Jaiswal V, Girdharwal N, Prasad M, Balyan HS, et al. Genetic dissection of grain weight in bread wheat through quantitative trait locus interval and association mapping. Mol Breeding. 2012; 29: 963–972.
  3. Kumar N, Kulwal PL, Balyan HS, Gupta PK. QTL mapping for yield and yield contributing traits in two mapping populations of bread wheat. Mol Breeding. 2007; 19: 163–177.
  4. McIntyre CL, Mathews KL, Rattey A, Chapman SC, Drenth J, Ghaderi M, et al. Molecular detection of genomic regions associated with grain yield and yield-related components in an elite bread wheat cross evaluated under irrigated and rainfed conditions. Theor Appl Genet. 2010; 120: 527–541. pmid:19865806
  5. Wang JS, Liu WH, Wang H, Li LH, Wu J, Yang XM, et al. QTL mapping of yield-related traits in the wheat germplasm 3228. Euphytica. 2011; 177: 277–292.
  6. Mason RE, Hays DB, Mondal S, Ibrahim AMH, Basnet BR. QTL for yield, yield components and canopy temperature depression in wheat under late sown field conditions. Euphytica. 2013; 194: 243–259.
  7. Cui F, Zhao CH, Ding AM, Li J, Wang L, Li XF, et al. Construction of an integrative linkage map and QTL mapping of grain yield-related traits using three related wheat RIL populations. Theor Appl Genet. 2014; 127: 659–675. pmid:24326459
  8. Li CL, Bai GH, Carver BF, Chao SAM, Wang ZH. Single nucleotide polymorphism markers linked to QTL for wheat yield traits. Euphytica. 2015; 206: 89–101.
  9. Addison CK, Mason RE, Brown-Guedira G, Guedira M, Hao YF, Miller RG, et al. QTL and major genes influencing grain yield potential in soft red winter wheat adapted to the southern united states. Euphytica. 2016; 209: 665–677.
  10. Su ZQ, Jin SJ, Lu Y, Zhang GR, Chao SAM, Bai GH. Single nucleotide polymorphism tightly linked to a major QTL on chromosome 7A for both kernel length and kernel weight in wheat. Mol Breeding. 2016; 36: 1–11.
  11. Neumann K, Kobiljski B, Denčić S, Varshney RK, Börner A. Genome-wide association mapping: a case study in bread wheat (Triticum aestivum L.). Mol Breeding. 2011; 27: 37–58.
  12. Wang LF, Ge HM, Hao CY, Dong YS, Zhang XY. Identifying loci influencing 1,000-kernel weight in wheat by microsatellite screening for evidence of selection during breeding. PLoS ONE. 2012; 7: e29432. pmid:22328917
  13. Bordes J, Goudemand E, Duchalais L, Chevarin L, Oury FX, Heumez E, et al. Genome-wide association mapping of three important traits using bread wheat elite breeding populations. Mol Breeding. 2014; 33: 755–768.
  14. Edae EA, Byrne PF, Haley SD, Lopes MS, Reynolds MP. Genome-wide association mapping of yield and yield components of spring wheat under contrasting moisture regimes. Theor Appl Genet. 2014; 127: 791–807. pmid:24408378
  15. Zanke CD, Ling J, Plieske J, Kollers S, Ebmeyer E, Korzun V, et al. Whole genome association mapping of plant height in winter wheat (Triticum aestivum L.). PLoS ONE. 2014; 9: e113287. pmid:25405621
  16. Mora F, Castillo D, Lado B, Matus I, Poland J, Belzile F, et al. Genome-wide association mapping of agronomic traits and carbon isotope discrimination in a worldwide germplasm collection of spring wheat using SNP markers. Mol Breeding. 2015; 35: 69.
  17. Sukumaran S, Dreisigacker S, Lopes M, Chavez P, Reynolds MP. Genome-wide association study for grain yield and related traits in an elite spring wheat population grown in temperate irrigated environments. Theor Appl Genet. 2015; 128: 353–363. pmid:25490985
  18. Zanke CD, Ling J, Plieske J, Kollers S, Ebmeyer E, Korzun V, et al. Analysis of main effect QTL for thousand grain weight in European winter wheat (Triticum aestivum L.) by genome-wide association mapping. Front Plant Sci. 2015; 6: 644. pmid:26388877
  19. Ain Q, Rasheed A, Anwar A, Mahmood T, Imtiaz M, Mahmood T, et al. Genome-wide association for grain yield under rainfed conditions in historical wheat cultivars from Pakistan. Front Plant Sci. 2015; 6: 743. pmid:26442056
  20. Chen GF, Zhang H, Deng ZY, Wu RG, Li DM, Wang MY, et al. Genome-wide association study for kernel weight-related traits using SNPs in a Chinese winter wheat population. Euphytica. 2016; 212: 173–185.
  21. Zhu YL, Wang SX, Zhang HP, Zhao LX, Wu ZY, Jiang H, et al. Identification of major loci for seed dormancy at different post-ripening stages after harvest and validation of a novel locus on chromosome 2AL in common wheat. Mol Breeding. 2016; 36: 174.
  22. Wang SX, Zhu YL, Zhang HP, Chang C, Ma CX. Analysis of genetic diversity and relationship among wheat breeding parents by SSR markers. J Triticeae Crops. 2014; 34: 621–627 (in Chinese with English abstract).
  23. Mi XF, Wegenast T, Friedrich Utz H, Dhillon BS, Melchinger AE. Best linear unbiased prediction and optimum allocation of test resources in maize breeding with doubled haploids. Theor Appl Genet. 2011; 123: 1–10. pmid:21547486
  24. Robinson GK. That BLUP is a good thing: the estimation of random effects. Statistical Sci. 1991; 6: 15–32.
  25. Viana JMS, de Almeida ÍF, de Resende MDV, Faria VR, Fonseca e Silva F. BLUP for genetic evaluation of plants in non-inbred families of annual crops. Euphytica. 2010; 174: 31–39.
  26. Merk H L. Estimating Heritability and BLUPs for traits using tomato phenotypic data. Plant Breeding and Genomics. 2014. https://articles.extension.org/pages/61006/
  27. Bates D, Maechler M, Bolker BM, Walker S. lme4: linear mixed-effects models using Eigen and S4. Journal of Statistical Software. 2015. Available from: http://lme4.r-forge.r-project.org/
  28. Akhunov E, Nicolet C, Dvorak J. Single nucleotide polymorphism genotyping in polyploid wheat with the Illumina GoldenGate assay. Theor Appl Genet. 2009; 119: 507–517. pmid:19449174
  29. Wang SC, Wong D, Forrest K, Allen A, Chao S, Huang BE, et al. Characterization of polyploid wheat genomic diversity using a high-density 90000 single nucleotide polymorphism array. Plant Biotech J. 2014; 12: 787–796.
  30. Raj A, Stephens M, Pritchard JK. fastSTRUCTURE: variational inference of population structure in large SNP data sets. Genetics. 2014; 197: 573–589. pmid:24700103
  31. Kang HM, Sul JH, Service SK, Zaitlen NA, Kong SY, Freimer NB, et al. Variance component model to account for sample structure in genome-wide association studies. Nat Genet. 2010; 42: 348–354. pmid:20208533
  32. Lipka AE, Tian F, Wang QS, Peiffer J, Li M, Bradbury PJ, et al. GAPIT: genome association and prediction integrated tools. Bioinformatics. 2012; 28: 2397–2399. pmid:22796960
  33. Bradbury PJ, Zhang ZW, Kroon DE, Casstevens TM, Ramdoss Y, Buckler ES. TASSEL: software for association mapping of complex traits in diverse samples. Bioinformatics. 2007; 23: 2633–2635. pmid:17586829
  34. Zhang Z, Ersoz E, Lai CQ, Todhunter RJ, Tiwari HK, Gore MA, et al. Mixed linear model approach adapted for genome-wide association studies. Nat Genet. 2010; 42: 355–360. pmid:20208535
  35. George AW, Cavanagh C. Genome-wide association mapping in plants. Theor Appl Genet. 2015; 128: 1163–1174. pmid:25800009
  36. Yu JM, Pressoir G, Briggs WH, Bi IV, Yamasaki M, Doebley JF, et al. A unified mixed-model method for association mapping that accounts for multiple levels of relatedness. Nat Genet. 2006; 38: 203–208. pmid:16380716
  37. Schwarz G. Estimating the dimension of a model. The Annals of Statistics. 1978; 6: 461–464.
  38. Claeskens G. Statistical model choice. Working Papers Department of Decision Sciences & Information Management; 2015.
  39. Pritchard JK, Stephens M, Donnelly P. Inference of population structure using multilocus genotype data. Genetics. 2000; 155: 945–959. pmid:10835412
  40. Alexander DH, Novembre J, Lange K. Fast model-based estimation of ancestry in unrelated individuals. Genome Res. 2009; 19: 1655–1664. pmid:19648217
  41. Yang ZB, Bai ZY, Li XL, Wang P, Wu QX, Yang L, et al. SNP identification and allelic-specific PCR markers development for TaGW2, a gene linked to wheat kernel weight. Theor Appl Genet. 2012; 125: 1057–1068. pmid:22643902
  42. Jiang YM, Jiang QY, Hao CY, Hou J, Wang LF, Zhang HN, et al. A yield-associated gene TaCWI, in wheat: its function, selection and evolution in global breeding revealed by haplotype analysis. Theor Appl Genet. 2014; 128: 131–143. pmid:25367379
  43. Zhang YJ, Liu JD, Xia XC, He ZH. TaGS-D1, an ortholog of rice OsGS3, is associated with grain weight and grain length in common wheat. Mol Breeding. 2014; 34: 1097–1107.
  44. Dong LL, Wang FM, Liu T, Dong ZY, Li AL, Jing RL, et al. Natural variation of TaGASR7-A1 affects grain length in common wheat under multiple cultivation conditions. Mol Breeding. 2014; 34: 937–947.