Integrating bioinformatic approaches to promote crop resilience



Virginia Tech


Even under the best management strategies, contemporary crops face yield losses from diverse threats such as pathogens, pests, and environmental stress. Compounding this management challenge, current global climate projections predict that these impacts will become even greater. Natural genetic variation, long used by traditional plant breeders, holds great promise for adapting high-performing agronomic lines to these stressors. Yet efforts to bolster crop plant resilience using wild relatives have been hindered by the time-consuming work needed to develop genomic tools and/or identify the genetic basis of agronomic traits. Thus, increasing crop plant resilience requires developing and deploying approaches that leverage current high-throughput sequencing technologies to develop genomic tools in these systems more rapidly and robustly. Here we report the integration of bioinformatic and statistical tools with high-throughput sequencing to 1) develop a machine learning approach to determine factors impacting transcriptome assembly and quantitatively evaluate transcriptome completeness, 2) dissect complex physiological pathway interactions in Solanum pimpinellifolium under combined stresses using comparative transcriptomics, and 3) develop a genome assembly pipeline that can be deployed to rapidly assemble a more contiguous genome, unraveling previously hidden complexity, using Phytophthora capsici as a model. As a result, we have generated strategic guidelines for transcriptome assembly and developed an orthologue- and reference-free, machine learning-based tool, "WWMT", to quantitatively score transcriptome completeness from short-read data. Secondly, we identified "hub genes" and describe genes involved in "cross-talk" between drought and herbivore stress response pathways.
Finally, we demonstrate a protocol for combining long-read sequencing from the Oxford Nanopore Technologies MinION with short-read data to rapidly and cost-effectively assemble a contiguous and relatively complete genome. In doing so, we uncovered hidden variation in a well-known plant pathogen, finding that the genome was 92% larger than previous estimates and contained more than 39% duplicated regions, supporting a hypothesized recent whole-genome duplication in this clade. This community resource will support new functional and evolutionary studies of this economically important pathogen.
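
As an illustration of how assembly contiguity and a relative size difference such as the one reported above might be quantified, the following is a minimal Python sketch. It assumes N50 as the contiguity measure, which is a common convention but is not stated in the abstract. The function names and the toy contig lengths are hypothetical and are not data from this work.

```python
def n50(contig_lengths):
    """Return the N50: the length of the contig at which the running
    total of lengths (sorted longest first) reaches half the assembly size."""
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length


def percent_larger(new_size, previous_size):
    """Return how much larger new_size is than previous_size, in percent."""
    return 100.0 * (new_size - previous_size) / previous_size


# Toy contig lengths (hypothetical, for illustration only).
toy_contigs = [2, 3, 4, 5, 6, 7, 8, 9, 10]
toy_n50 = n50(toy_contigs)

# An assembly of 192 units against a previous estimate of 100 units
# corresponds to a 92% increase (units are arbitrary here).
toy_increase = percent_larger(192, 100)
```

A higher N50 for a similar total assembly size indicates a more contiguous assembly, so the same function can be applied to a short-read-only assembly and a hybrid assembly to compare them.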



transcriptome, machine learning, stress tolerance, Nanopore, genome