Browsing by Author "Baba, Toshimi"
Now showing 1 - 4 of 4
- Comparison of Single-Breed and Multi-Breed Training Populations for Infrared Predictions of Novel Phenotypes in Holstein Cows
  Mota, Lucio Flavio Macedo; Pegolo, Sara; Baba, Toshimi; Morota, Gota; Peñagaricano, Francisco; Bittante, Giovanni; Cecchinato, Alessio (MDPI, 2021-07-02)
  In general, Fourier-transform infrared (FTIR) predictions are developed using a single-breed population split into a training and a validation set. However, using populations formed of different breeds is an attractive way to design cross-validation scenarios aimed at improving prediction of difficult-to-measure traits in the dairy industry. This study aimed to evaluate the potential of FTIR prediction using a training set combining specialized and dual-purpose dairy breeds to predict different phenotypes divergent in terms of biological meaning, variability, and heritability, such as body condition score (BCS), serum β-hydroxybutyrate (BHB), and kappa casein (k-CN) in the major cattle breed, i.e., Holstein-Friesian. Data were obtained from specialized dairy breeds: Holstein (468 cows) and Brown Swiss (657 cows), and dual-purpose breeds: Simmental (157 cows), Alpine Grey (75 cows), and Rendena (104 cows), giving a total of 1461 cows from 41 multi-breed dairy herds.
The FTIR prediction model was developed using a gradient boosting machine (GBM), and predictive ability for the target phenotype in Holstein cows was assessed using different cross-validation (CV) strategies: a within-breed scenario using 10-fold cross-validation, for which the Holstein population was randomly split into 10 folds, one for validation and the remaining nine for training (10-fold_HO); an across-breed scenario (BS_HO), where the Brown Swiss cows were used as the training set and the Holstein cows as the validation set; a specialized multi-breed scenario (BS+HO_10-fold), where the entire Brown Swiss and Holstein populations were combined and then split into 10 folds; and a multi-breed scenario (Multi-breed), where the training set comprised specialized (Holstein and Brown Swiss) and dual-purpose (Simmental, Alpine Grey, and Rendena) dairy cows, combined with nine folds of the Holstein cows. Lastly, a Multi-breed CV2 scenario was implemented, assuming the same number of records as the reference scenario and using the same proportions as the multi-breed scenario. Within Holstein, FTIR predictions had a predictive ability of 0.63 for BCS, 0.81 for BHB, and 0.80 for k-CN. Using a specific breed (Brown Swiss) as the training set for prediction in the Holstein population reduced the prediction accuracy by 10% for BCS, 7% for BHB, and 11% for k-CN. Notably, the combination of Holstein and Brown Swiss cows in the training set increased the predictive ability of the model by 6%, to 0.66 for BCS, 0.85 for BHB, and 0.87 for k-CN. Using multiple specialized and dual-purpose animals in the training set outperformed the 10-fold_HO (standard) approach, with an increase in predictive ability of 8% for BCS, 7% for BHB, and 10% for k-CN. When the Multi-breed CV2 was implemented, no improvement was observed.
Our findings suggest that FTIR prediction of different phenotypes in the Holstein breed can be improved by including different specialized and dual-purpose breeds in the training population. Our study also shows that predictive ability is enhanced when the size of the training population and the phenotypic variability are increased.
- Integrating genomic and infrared spectral data improves the prediction of milk protein composition in dairy cattle
  Baba, Toshimi; Pegolo, Sara; Mota, Lucio Flavio Macedo; Peñagaricano, Francisco; Bittante, Giovanni; Cecchinato, Alessio; Morota, Gota (2021-03-16)
  Background: Over the past decade, Fourier transform infrared (FTIR) spectroscopy has been used to predict novel milk protein phenotypes. Genomic data might help predict these phenotypes when integrated with milk FTIR spectra. The objective of this study was to investigate prediction accuracy for milk protein phenotypes when heterogeneous on-farm, genomic, and pedigree data were integrated with the spectra. To this end, we used the records of 966 Italian Brown Swiss cows with milk FTIR spectra, on-farm information, medium-density genetic markers, and pedigree data. True and total whey protein, five casein, and two whey protein traits were analyzed. Multiple kernel learning constructed from spectral and genomic (pedigree) relationship matrices and multilayer BayesB assigning separate priors for FTIR and markers were benchmarked against a baseline partial least squares (PLS) regression. Seven combinations of covariates were considered, and their predictive abilities were evaluated by repeated random sub-sampling and herd cross-validations (CV). Results: Addition of on-farm effects such as herd, days in milk, and parity to spectral data improved predictions compared with those obtained using the spectra alone. Integrating genomics and/or the top three markers with a large effect further enhanced the predictions. Pedigree data also improved prediction, but to a lesser extent than genomic data. Multiple kernel learning and multilayer BayesB increased predictive performance, whereas PLS did not. Overall, multilayer BayesB provided better predictions than multiple kernel learning, and lower prediction performance was observed in herd CV compared to repeated random sub-sampling CV.
Conclusions: Integration of genomic information with milk FTIR spectra can enhance milk protein trait predictions by 25% and 7% on average for repeated random sub-sampling and herd CV, respectively. Multiple kernel learning and multilayer BayesB outperformed PLS when used to integrate heterogeneous data for phenotypic predictions.
- Longitudinal genome-wide association analysis using a single-step random regression model for height in Japanese Holstein cattle
  Baba, Toshimi; Morota, Gota; Kawakami, Junpei; Gotoh, Yusaku; Oka, Taro; Masuda, Yutaka; Brito, Luiz F.; Cockrum, Rebecca R.; Kawahara, Takayoshi (American Dairy Science Association, 2023-07-13)
  Growth traits, such as body weight and height, are essential in the design of genetic improvement programs of dairy cattle due to their relationship with feeding efficiency, longevity, and health. We investigated genomic regions influencing height across growth stages in Japanese Holstein cattle using a single-step random regression model. We used 72,921 records from birth to 60 mo of age from 4,111 animals born between 2000 and 2016. The analysis included 1,410 genotyped animals with 35,319 single nucleotide polymorphisms, consisting of 883 females with records and 527 bulls, and 30,745 animals with pedigree information. A single genomic region at 58.4 megabase pairs on chromosome 18 was consistently identified across 6 age points of 10, 20, 30, 40, 50, and 60 mo after multiple testing corrections for the significance threshold. Twelve candidate genes, previously reported for longevity and gestation length, were found near the identified genomic region. Another location near the identified region was also previously associated with body conformation, fertility, and calving difficulty. Functional Gene Ontology enrichment analysis suggested that the candidate genes regulate dephosphorylation and phosphatase activity. Our findings show that further study of the identified candidate genes will contribute to a better understanding of the genetic basis of height in Japanese Holstein cattle.
- Multi-trait random regression models increase genomic prediction accuracy for a temporal physiological trait derived from high-throughput phenotyping
  Baba, Toshimi; Momen, Mehdi; Campbell, Malachy T.; Walia, Harkamal; Morota, Gota (PLOS, 2020-02-03)
  Random regression models (RRM) are used extensively for genomic inference and prediction of time-valued traits in animal breeding, but only recently have been used in plant systems. High-throughput phenotyping (HTP) platforms provide a powerful means to collect high-dimensional phenotypes throughout the growing season for large populations. However, to date, selection of an appropriate statistical genomic framework to integrate multiple temporal traits for genomic prediction in plants remains unexplored. Here, we demonstrate the utility of a multi-trait RRM (MT-RRM) for genomic prediction of daily water usage (WU) in rice (Oryza sativa) through joint modeling with shoot biomass (projected shoot area, PSA). Three hundred and fifty-seven accessions were phenotyped daily for WU and PSA over 20 days using a greenhouse-based HTP platform. MT-RRMs that modeled additive genetic and permanent environmental effects for both traits using quadratic Legendre polynomials were used to assess genomic correlations between traits and genomic prediction for WU. Predictive abilities of the MT-RRMs were assessed using two cross-validation (CV) scenarios. The first scenario was designed to predict genetic values for WU at all time points for a set of accessions with unobserved WU. The second scenario was designed to forecast future genetic values for WU for a panel of known accessions with records for WU at earlier time periods. In each scenario we evaluated two MT-RRMs in which PSA records were absent or available for time points in the testing population. Weak to strong genomic correlations between WU and PSA were observed across the days of imaging (0.29-0.87; 0.38-0.80).
In both CV scenarios, MT-RRMs showed better predictive abilities compared to single-trait RRM, and prediction accuracies were greatly improved when PSA records were available for the testing population. In summary, these frameworks provide an effective approach to predict temporal physiological traits that are difficult or expensive to quantify in large populations.
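
As an illustration of the quadratic Legendre polynomials mentioned in the last abstract above, the following is a minimal sketch of how time covariates for a random regression model might be built for 20 daily measurements. It is not the authors' code. The function name `legendre_covariates`, the rescaling of days to [-1, 1], and the optional normalization factor sqrt((2k+1)/2) are assumptions chosen for illustration; the abstract does not state which scaling was used.

```python
import math


def legendre_covariates(days, normalized=True):
    """Return one row [phi0, phi1, phi2] of quadratic Legendre covariates per day.

    Assumption: days are rescaled linearly to the interval [-1, 1], and
    (optionally) each polynomial P_k is multiplied by sqrt((2k + 1) / 2).
    """
    t_min, t_max = min(days), max(days)
    if t_max == t_min:
        raise ValueError("need at least two distinct time points")
    rows = []
    for t in days:
        # Map the time point onto [-1, 1], the domain of Legendre polynomials.
        x = -1.0 + 2.0 * (t - t_min) / (t_max - t_min)
        # Legendre polynomials of order 0, 1 and 2.
        basis = [1.0, x, 0.5 * (3.0 * x * x - 1.0)]
        if normalized:
            basis = [math.sqrt((2 * k + 1) / 2.0) * p for k, p in enumerate(basis)]
        rows.append(basis)
    return rows


# Hypothetical usage: 20 consecutive days of imaging, numbered 1 to 20.
days = list(range(1, 21))
phi = legendre_covariates(days)
print(len(phi), len(phi[0]))  # number of time points, number of covariates
```

In a random regression model, each row of such a matrix multiplies a vector of per-individual regression coefficients (for example, additive genetic coefficients), so that the genetic value of an individual becomes a smooth function of time.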