Utilizing genomics and historical data to optimize gene pools for new breeding programs: A case study in winter wheat

dc.contributor.author: Ballen-Taborda, Carolina [en]
dc.contributor.author: Lyerly, Jeanette [en]
dc.contributor.author: Smith, Jared [en]
dc.contributor.author: Howell, Kimberly [en]
dc.contributor.author: Brown-Guedira, Gina [en]
dc.contributor.author: Babar, Md. Ali [en]
dc.contributor.author: Harrison, Stephen A. [en]
dc.contributor.author: Mason, Richard E. [en]
dc.contributor.author: Mergoum, Mohamed [en]
dc.contributor.author: Murphy, J. Paul [en]
dc.contributor.author: Sutton, Russell [en]
dc.contributor.author: Griffey, Carl A. [en]
dc.contributor.author: Boyles, Richard E. [en]
dc.description.abstract: With the rapid generation and preservation of both genomic and phenotypic information for many genotypes within crops and across locations, emerging breeding programs have a valuable opportunity to leverage these resources to 1) establish the most appropriate genetic foundation at program inception and 2) implement robust genomic prediction platforms that can effectively select future breeding lines. Integrating genomics-enabled breeding into cultivar development can save costs and allow resources to be reallocated towards advanced (i.e., later) stages of field evaluation, which can facilitate an increased number of testing locations and replicates within locations. In this context, a reestablished winter wheat breeding program was used as a case study to understand best practices to leverage and tailor existing genomic and phenotypic resources to determine optimal genetics for a specific target population of environments. First, historical multi-environment phenotype data, representing 1,285 advanced breeding lines, were compiled from multi-institutional testing as part of the SunGrains cooperative and used to produce GGE biplots and PCA for yield. Locations were clustered into 22 subsets based on highly correlated line performance among the target population of environments. For each of the subsets generated, EMMs and BLUPs were calculated using linear models with the 'lme4' R package. Second, for each subset, TPs representative of the new SC breeding lines were determined based on genetic relatedness using the 'STPGA' R package. Third, for each TP, phenotypic values and SNP data were incorporated into the 'rrBLUP' mixed models for generation of GEBVs of YLD, TW, HD and PH. Using a five-fold cross-validation strategy, an average accuracy of r = 0.42 was obtained for yield across all TPs. The validation performed with 58 SC elite breeding lines resulted in an accuracy of r = 0.62 when the TP included complete historical data. Lastly, QTL-by-environment interaction for 18 major effect genes across three geographic regions was examined. Lines harboring major QTL in the absence of disease could potentially underperform (e.g., Fhb1 R-gene), whereas it is advantageous to express a major QTL under biotic pressure (e.g., stripe rust R-gene). This study highlights the importance of genomics-enabled breeding and multi-institutional partnerships to accelerate cultivar development. [en]
dc.description.notes: This work was supported by the USDA NIFA AFRI Foundational project SC-2020-03599 awarded to REB (award no. 2021-67014-33941) and the Sun Grains cooperative breeding program. [en]
dc.description.sponsorship: USDA NIFA AFRI Foundational project; Sun Grains cooperative breeding program [SC-2020-03599, 2021-67014-33941] [en]
dc.description.version: Published version [en]
dc.rights: Creative Commons Attribution 4.0 International [en]
dc.subject: winter wheat (Triticum aestivum L.) [en]
dc.subject: historical data [en]
dc.subject: training populations [en]
dc.subject: genomic selection [en]
dc.subject: prediction accuracy [en]
dc.title: Utilizing genomics and historical data to optimize gene pools for new breeding programs: A case study in winter wheat [en]
dc.title.serial: Frontiers in Genetics [en]
dc.type: Article - Refereed [en]
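
The abstract reports prediction accuracy as a Pearson correlation (r) obtained with five-fold cross-validation of ridge-regression genomic predictions. The following is a minimal sketch of that general idea, not the authors' pipeline: the study used the 'rrBLUP' R package, whereas this example uses plain NumPy, simulated marker and phenotype data, and a fixed shrinkage parameter `lam` instead of one estimated from variance components. The function names `ridge_marker_effects` and `five_fold_accuracy`, the data dimensions, and all numeric settings are assumptions made for illustration only.

```python
import numpy as np


def ridge_marker_effects(Z, y, lam):
    """Estimate marker effects by ridge regression on centered data.

    Solves (Zc'Zc + lam * I) u = Zc'(y - mean(y)), where Zc is Z with
    column means removed. lam is a fixed, assumed shrinkage value.
    """
    col_means = Z.mean(axis=0)
    Zc = Z - col_means
    mu = y.mean()
    p = Z.shape[1]
    u = np.linalg.solve(Zc.T @ Zc + lam * np.eye(p), Zc.T @ (y - mu))
    return mu, col_means, u


def five_fold_accuracy(Z, y, lam, k=5, seed=0):
    """Mean Pearson r between predicted and observed values over k folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    correlations = []
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(len(y)), test_idx)
        mu, col_means, u = ridge_marker_effects(Z[train_idx], y[train_idx], lam)
        # Predicted genetic value for held-out lines
        predicted = mu + (Z[test_idx] - col_means) @ u
        correlations.append(np.corrcoef(predicted, y[test_idx])[0, 1])
    return float(np.mean(correlations))


# Simulated toy data (hypothetical, NOT the study's data)
rng = np.random.default_rng(42)
n_lines, n_snps = 200, 500
Z = rng.choice([-1.0, 0.0, 1.0], size=(n_lines, n_snps))  # SNP codes
true_effects = rng.normal(0.0, 0.1, n_snps)
y = Z @ true_effects + rng.normal(0.0, 1.0, n_lines)

# lam = 100 is an assumed value (residual variance / marker-effect
# variance in this simulation); real analyses estimate it from the data.
accuracy = five_fold_accuracy(Z, y, lam=100.0)
print(f"mean five-fold accuracy r = {accuracy:.2f}")
```

In this sketch each fold's correlation is computed on lines that were excluded from model fitting, which is what makes the averaged r an estimate of predictive ability and not of goodness of fit.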

