Browsing by Author "Momen, Mehdi"
Now showing 1 - 9 of 9
- An assessment of genomic connectedness measures in Nellore cattle
Amorim, Sabrina T.; Yu, Haipeng; Momen, Mehdi; de Albuquerque, Lucia Galvao; Cravo Pereira, Angelica S.; Baldi, Fernando; Morota, Gota (2020-11)
An important criterion to consider in genetic evaluations is the extent of genetic connectedness across management units (MU), especially if they differ in their genetic mean. Reliable comparisons of genetic values across MU depend on the degree of connectedness: the higher the connectedness, the more reliable the comparison. Traditionally, genetic connectedness was calculated through pedigree-based methods; however, in the era of genomic selection, this can be better estimated utilizing new approaches based on genomics. Most procedures consider only additive genetic effects, which may not accurately reflect the underlying gene action of the evaluated trait, and little is known about the impact of non-additive gene action on connectedness measures. The objective of this study was to investigate the extent of genomic connectedness measures, for the first time, in Brazilian field data by applying additive and non-additive relationship matrices using a fatty acid profile data set from seven farms located in the three regions of Brazil, which are part of the three breeding programs. Myristic acid (C14:0) was used due to its importance for human health and reported presence of non-additive gene action. The pedigree included 427,740 animals and 925 of them were genotyped using the Bovine high-density genotyping chip. Six relationship matrices were constructed, parametrically and non-parametrically capturing additive and non-additive genetic effects from both pedigree and genomic data. We assessed genome-based connectedness across MU using the prediction error variance of difference (PEVD) and the coefficient of determination (CD). PEVD values ranged from 0.540 to 1.707, and CD from 0.146 to 0.456.
Genomic information consistently enhanced the measures of connectedness compared to the numerator relationship matrix by at least 63%. Combining additive and non-additive genomic kernel relationship matrices or a non-parametric relationship matrix increased the capture of connectedness. Overall, the Gaussian kernel yielded the largest measure of connectedness. Our findings showed that connectedness metrics can be extended to incorporate genomic information and non-additive genetic variation using field data. We propose that different genomic relationship matrices can be designed to capture additive and non-additive genetic effects, increase the measures of connectedness, and more accurately estimate the true state of connectedness in herds.
- Including Phenotypic Causal Networks in Genome-Wide Association Studies Using Mixed Effects Structural Equation Models
Momen, Mehdi; Mehrgardi, Ahmad Ayatollahi; Roudbar, Mahmoud Amiri; Kranis, Andreas; Pinto, Renan Mercuri; Valente, Bruno D.; Morota, Gota; Rosa, Guilherme J. M.; Gianola, Daniel (Frontiers, 2018-10-09)
Network-based statistical models accounting for putative causal relationships among multiple phenotypes can be used to infer single-nucleotide polymorphism (SNP) effects transmitted through a given causal path in genome-wide association studies (GWAS). In GWAS with multiple phenotypes, reconstructing underlying causal structures among traits and SNPs using a single statistical framework is essential for understanding the entirety of genotype-phenotype maps. A structural equation model (SEM) can be used for such purposes. We applied SEM to GWAS (SEM-GWAS) in chickens, taking into account putative causal relationships among breast meat (BM), body weight (BW), hen-house production (HHP), and SNPs. We assessed the performance of SEM-GWAS by comparing the model results with those obtained from traditional multi-trait association analyses (MTM-GWAS). Three different putative causal path diagrams were inferred from highest posterior density (HPD) intervals of 0.75, 0.85, and 0.95 using the inductive causation algorithm. A positive path coefficient was estimated for BM -> BW, and negative values were obtained for BM -> HHP and BW -> HHP in all implemented scenarios. Further, the application of SEM-GWAS enabled the decomposition of SNP effects into direct, indirect, and total effects, identifying whether a SNP effect is acting directly or indirectly on a given trait. In contrast, MTM-GWAS only captured overall genetic effects on traits, which is equivalent to combining the direct and indirect SNP effects from SEM-GWAS.
Although MTM-GWAS and SEM-GWAS use similar probabilistic models, we provide evidence that SEM-GWAS captures complex relationships in terms of causal meaning and mediation and delivers a more comprehensive understanding of SNP effects compared to MTM-GWAS. Our results showed that SEM-GWAS provides important insight regarding the mechanism by which identified SNPs control traits by partitioning them into direct, indirect, and total SNP effects.
- Modeling multiple phenotypes in wheat using data-driven genomic exploratory factor analysis and Bayesian network learning
Momen, Mehdi; Bhatta, Madhav; Hussain, Waseem; Yu, Haipeng; Morota, Gota (2021-01)
Inferring trait networks from a large volume of genetically correlated diverse phenotypes such as yield, architecture, and disease resistance can provide information on the manner in which complex phenotypes are interrelated. However, studies on statistical methods tailored to multidimensional phenotypes are limited, whereas numerous methods are available for evaluating the massive number of genetic markers. Factor analysis operates at the level of latent variables predicted to generate observed responses. The objectives of this study were to illustrate the manner in which data-driven exploratory factor analysis can map observed phenotypes into a smaller number of latent variables and infer a genomic latent factor network using 45 agro-morphological, disease, and grain mineral phenotypes measured in synthetic hexaploid wheat lines (Triticum aestivum L.). In total, eight latent factors including grain yield, architecture, flag leaf-related traits, grain minerals, yellow rust, two types of stem rust, and leaf rust were identified as common sources of the observed phenotypes. The genetic component of the factor scores for each latent variable was fed into a Bayesian network to obtain a trait structure reflecting the genetic interdependency among traits. Three directed paths were consistently identified by two Bayesian network algorithms. Flag leaf-related traits influenced leaf rust, and yellow rust and stem rust influenced grain yield. Additional paths that were identified included flag leaf-related traits to minerals and minerals to architecture.
This study shows that data-driven exploratory factor analysis can reveal smaller dimensional common latent phenotypes that are likely to give rise to numerous observed field phenotypes without relying on prior biological knowledge. The inferred genomic latent factor structure from the Bayesian network provides insights for plant breeding to simultaneously improve multiple traits, as an intervention on one trait will affect the values of focal phenotypes in an interrelated complex trait system.
- Multi-omic data integration for the study of production, carcass, and meat quality traits in Nellore cattle
de Novais, Francisco Jose; Yu, Haipeng; Cesar, Aline Silva Mello; Momen, Mehdi; Poleti, Mirele Daiana; Petry, Bruna; Mourao, Gerson Barreto; Regitano, Luciana Correia de Almeida; Morota, Gota; Coutinho, Luiz Lehmann (Frontiers, 2022-10)
Data integration using hierarchical analysis based on the central dogma or common pathway enrichment analysis may not reveal non-obvious relationships among omic data. Here, we applied factor analysis (FA) and Bayesian network (BN) modeling to integrate different omic data and complex traits by latent variables (production, carcass, and meat quality traits). A total of 14 latent variables were identified: five for phenotype, three for miRNA, four for protein, and two for mRNA data. Pearson correlation coefficients showed negative correlations between latent variables miRNA 1 (mirna1) and miRNA 2 (mirna2) (-0.47), ribeye area (REA) and protein 4 (prot4) (-0.33), REA and protein 2 (prot2) (-0.3), carcass and prot4 (-0.31), carcass and prot2 (-0.28), and backfat thickness (BFT) and miRNA 3 (mirna3) (-0.25). Positive correlations were observed among the four protein factors (0.45-0.83), between meat quality and fat content (0.71), fat content and carcass (0.74), fat content and REA (0.76), and REA and carcass (0.99). BN presented arcs from the carcass, meat quality, prot2, and prot4 latent variables to REA; from meat quality, REA, mirna2, and gene expression mRNA1 to fat content; from protein 1 (prot1) and mirna2 to protein 5 (prot5); and from prot5 and carcass to prot2. The relations of protein latent variables suggest new hypotheses about the impact of these proteins on REA. The network also showed relationships among miRNAs and nebulin proteins. REA seems to be the central node in the network, influencing carcass, prot2, prot4, mRNA1, and meat quality, suggesting that REA is a good indicator of meat quality.
The connection among miRNA latent variables, BFT, and fat content relates to the influence of miRNAs on lipid metabolism. The relationship between mirna1 and prot5 composed of isoforms of nebulin needs further investigation. The FA identified latent variables, decreasing the dimensionality and complexity of the data. The BN was capable of generating interrelationships among latent variables from different types of data, allowing the integration of omics and complex traits and identifying conditional independencies. Our framework based on FA and BN is capable of generating new hypotheses for molecular research, by integrating different types of data and exploring non-obvious relationships.
- Multi-trait random regression models increase genomic prediction accuracy for a temporal physiological trait derived from high-throughput phenotyping
Baba, Toshimi; Momen, Mehdi; Campbell, Malachy T.; Walia, Harkamal; Morota, Gota (PLOS, 2020-02-03)
Random regression models (RRM) are used extensively for genomic inference and prediction of time-valued traits in animal breeding, but only recently have been used in plant systems. High-throughput phenotyping (HTP) platforms provide a powerful means to collect high-dimensional phenotypes throughout the growing season for large populations. However, to date, selection of an appropriate statistical genomic framework to integrate multiple temporal traits for genomic prediction in plants remains unexplored. Here, we demonstrate the utility of a multi-trait RRM (MT-RRM) for genomic prediction of daily water usage (WU) in rice (Oryza sativa) through joint modeling with shoot biomass (projected shoot area, PSA). Three hundred and fifty-seven accessions were phenotyped daily for WU and PSA over 20 days using a greenhouse-based HTP platform. MT-RRMs that modeled additive genetic and permanent environmental effects for both traits using quadratic Legendre polynomials were used to assess genomic correlations between traits and genomic prediction for WU. Predictive abilities of the MT-RRMs were assessed using two cross-validation (CV) scenarios. The first scenario was designed to predict genetic values for WU at all time points for a set of accessions with unobserved WU. The second scenario was designed to forecast future genetic values for WU for a panel of known accessions with records for WU at earlier time periods. In each scenario we evaluated two MT-RRMs in which PSA records were absent or available for time points in the testing population. Weak to strong genomic correlations between WU and PSA were observed across the days of imaging (0.29-0.87; 0.38-0.80).
In both CV scenarios, MT-RRMs showed better predictive abilities compared to single-trait RRM, and prediction accuracies were greatly improved when PSA records were available for the testing population. In summary, these frameworks provide an effective approach to predict temporal physiological traits that are difficult or expensive to quantify in large populations.
- Predicting Longitudinal Traits Derived from High-Throughput Phenomics in Contrasting Environments Using Genomic Legendre Polynomials and B-Splines
Momen, Mehdi; Campbell, Malachy T.; Walia, Harkamal; Morota, Gota (Genetics Society of America, 2019-10)
Recent advancements in phenomics coupled with increased output from sequencing technologies can create the platform needed to rapidly increase abiotic stress tolerance of crops, which increasingly face productivity challenges due to climate change. In particular, high-throughput phenotyping (HTP) enables researchers to generate large-scale data with temporal resolution. Recently, a random regression model (RRM) was used to model a longitudinal rice projected shoot area (PSA) dataset in an optimal growth environment. However, the utility of RRM is still unknown for phenotypic trajectories obtained from stress environments. Here, we sought to apply RRM to forecast the rice PSA in control and water-limited conditions under various longitudinal cross-validation scenarios. To this end, genomic Legendre polynomials and B-spline basis functions were used to capture PSA trajectories. Prediction accuracy declined slightly for the water-limited plants compared to control plants. Overall, RRM delivered reasonable prediction performance and yielded better prediction than the baseline multi-trait model. The difference between the results obtained using Legendre polynomials and that using B-splines was small; however, the former yielded a higher prediction accuracy. Prediction accuracy for forecasting the last five time points was highest when the entire trajectory from earlier growth stages was used to train the basis functions. Our results suggested that it was possible to decrease phenotyping frequency by only phenotyping every other day in order to reduce costs while minimizing the loss of prediction accuracy.
This is the first study showing that RRM could be used to model changes in growth over time under abiotic stress conditions.
- Quantifying genomic connectedness and prediction accuracy from additive and non-additive gene actions
Momen, Mehdi; Morota, Gota (2018-09-17)
Background: Genetic connectedness is classically used as an indication of the risk associated with breeding value comparisons across management units because genetic evaluations based on best linear unbiased prediction rely for their success on sufficient linkage among different units. In the whole-genome prediction era, the concept of genetic connectedness can be extended to measure a connectedness level between reference and validation sets. However, little is known regarding (1) the impact of non-additive gene action on genomic connectedness measures and (2) the relationship between the estimated level of connectedness and prediction accuracy in the presence of non-additive genetic variation.
Results: We evaluated the extent to which non-additive kernel relationship matrices increase measures of connectedness and investigated its relationship with prediction accuracy in the cross-validation framework using best linear unbiased prediction and coefficients of determination. Simulated data assuming additive, dominance, and epistatic gene action scenarios and real swine data were analyzed. We found that the joint use of additive and non-additive genomic kernel relationship matrices or non-parametric relationship matrices led to increased capturing of connectedness, up to 25%, and improved prediction accuracies compared to those of baseline additive relationship counterparts in the presence of non-additive gene action.
Conclusions: Our findings showed that connectedness metrics can be extended to incorporate non-additive genetic variation of complex traits. Use of kernel relationship matrices designed to capture non-additive gene action increased measures of connectedness and improved whole-genome prediction accuracy, further broadening the scope of genomic connectedness studies.
- Structural equation modeling for investigating multi-trait genetic architecture of udder health in dairy cattle
Pegolo, Sara; Momen, Mehdi; Morota, Gota; Rosa, Guilherme J. M.; Gianola, Daniel; Bittante, Giovanni; Cecchinato, Alessio (Nature Publishing Group, 2020-05-08)
Mastitis is one of the most prevalent and costly diseases in dairy cattle. It results in changes in milk composition and quality, which are indicators of udder inflammation in the absence of clinical signs. We applied structural equation modeling (SEM)-GWAS to explore interrelated dependency relationships among phenotypes related to udder health, including milk yield (MY), somatic cell score (SCS), lactose (%, LACT), pH, and non-casein N (NCN, % of total milk N), in a cohort of 1,158 Brown Swiss cows. The phenotypic network inferred via the Hill-Climbing algorithm was used to estimate SEM parameters. Integration of multi-trait models-GWAS and SEM-GWAS identified six significant SNPs for SCS, and quantified the contribution of MY and LACT acting as mediator traits to total SNP effects. Functional analyses revealed that overrepresented pathways were often shared among traits and were consistent with biological knowledge (e.g., membrane transport activity for pH and MY or Wnt signaling for SCS and NCN). In summary, SEM-GWAS offered new insights on the relationships among udder health phenotypes and on the path of SNP effects, providing useful information for genetic improvement and management strategies in dairy cattle.
- Utilizing trait networks and structural equation models as tools to interpret multi-trait genome-wide association studies
Momen, Mehdi; Campbell, Malachy T.; Walia, Harkamal; Morota, Gota (2019-09-18)
Background: Plant breeders seek to develop cultivars with maximal agronomic value, which is often assessed using numerous, often genetically correlated traits. As intervention on one trait will affect the value of another, breeding decisions should consider the relationships among traits in the context of putative causal structures (i.e., trait networks). While multi-trait genome-wide association studies (MTM-GWAS) can infer putative genetic signals at the multivariate scale, standard MTM-GWAS does not accommodate the network structure of phenotypes, and therefore does not address how the traits are interrelated. We extended the scope of MTM-GWAS by incorporating trait network structures into GWAS using structural equation models (SEM-GWAS). Here, we illustrate the utility of SEM-GWAS using a digital metric for shoot biomass, root biomass, water use, and water use efficiency in rice.
Results: A salient feature of SEM-GWAS is that it can partition the total single nucleotide polymorphism (SNP) effects acting on a trait into direct and indirect effects. Using this novel approach, we show that for most QTL associated with water use, total SNP effects were driven by genetic effects acting directly on water use rather than genetic effects originating from upstream traits. Conversely, total SNP effects for water use efficiency were largely due to indirect effects originating from the upstream trait, projected shoot area.
Conclusions: We describe a robust framework that can be applied to multivariate phenotypes to understand the interrelationships between complex traits. This framework provides novel insights into how QTL act within a phenotypic network that would otherwise not be possible with conventional multi-trait GWAS approaches.
Collectively, these results suggest that the use of SEM may enhance our understanding of complex relationships among agronomic traits.
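The partition of a total SNP effect into direct and indirect parts, as described in the SEM-GWAS abstracts above, can be illustrated with a small numerical sketch. It assumes a recursive (acyclic) linear SEM of the form y = Λy + βx + e, so that the total effect of a SNP is (I - Λ)^{-1}β and the indirect effect is the total minus the direct effect. The function name `decompose_snp_effects`, the trait ordering, the path structure, and every coefficient below are hypothetical values chosen for illustration; none of them are taken from the listed papers.

```python
import numpy as np


def decompose_snp_effects(path_coefs, direct):
    """Split SNP effects in a recursive linear SEM into direct, indirect, total.

    path_coefs[i, j] is the path coefficient for trait j -> trait i
    (zero diagonal, acyclic structure assumed).
    direct[i] is the SNP effect acting directly on trait i.
    """
    path_coefs = np.asarray(path_coefs, dtype=float)
    direct = np.asarray(direct, dtype=float)
    n_traits = path_coefs.shape[0]
    # Total effect solves (I - Lambda) * total = direct.
    total = np.linalg.solve(np.eye(n_traits) - path_coefs, direct)
    indirect = total - direct
    return direct, indirect, total


# Hypothetical three-trait network: PSA -> WU, PSA -> WUE, WU -> WUE.
traits = ["PSA", "WU", "WUE"]
path_coefs = np.array([
    [0.0, 0.0, 0.0],   # PSA has no upstream trait
    [0.5, 0.0, 0.0],   # PSA -> WU
    [0.4, -0.2, 0.0],  # PSA -> WUE and WU -> WUE
])
# Hypothetical direct SNP effects on PSA, WU, and WUE.
snp_direct = np.array([1.0, 0.2, 0.0])

direct, indirect, total = decompose_snp_effects(path_coefs, snp_direct)
for name, d, i, t in zip(traits, direct, indirect, total):
    print(f"{name}: direct={d:.2f} indirect={i:.2f} total={t:.2f}")
```

In this toy setting the SNP has no direct effect on WUE, so its entire total effect on WUE arrives through the upstream traits, which is the kind of pattern a multi-trait model reporting only overall effects would not separate out.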