Scholarly Works, Center for Advanced Innovation in Agriculture

Recent Submissions

  • Trends, Insights, and Future Prospects for Production in Controlled Environment Agriculture and Agrivoltaics Systems
    Dohlman, Erik; Maguire, Karen; Davis, Wilma V.; Husby, Megan; Bovay, John; Weber, Catharine; Lee, Yoonjung (2024-01-11)
    Investments in alternative food production systems by public and private entities have increased in recent years. Two systems, controlled environment agriculture (CEA) and agrivoltaics (AV), have been highlighted for their potential to provide socioeconomic benefits beyond food production. CEA is the use of enclosed structures, including hydroponic and vertical farming structures, for growing crops, primarily specialty crops. CEA may provide access to local production of nutritious food in communities that lack space for traditional outdoor production, improve access to local foods in urban areas, and serve as a potential tool for adapting to or mitigating climate change. The CEA sector is expanding in large part due to technological advancements. The number of CEA operations more than doubled between 2009 and 2019. Further, more than 60 percent of production for some prominent CEA crops (primarily vegetables) was grown using nontraditional technological systems in 2019. AV is the colocation of agricultural production and solar panels. AV may allow for expanded solar development to address climate change without the land use conflicts associated with traditional large-scale solar developments. As of 2021, most AV sites were solar farms planted with pollinator-friendly vegetative cover that, in some cases, were grazed by sheep. Funding for research on a variety of AV systems with specialty crop and/or livestock production continues to increase.
  • Evaluating metabolic and genomic data for predicting grain traits under high night temperature stress in rice
    Bi, Ye; Yassue, Rafael Massahiro; Paul, Puneet; Dhatt, Balpreet Kaur; Sandhu, Jaspreet; Do, Phuc Thi; Walia, Harkamal; Obata, Toshihiro; Morota, Gota (Oxford University Press, 2023-05)
    The asymmetric increase in average nighttime temperatures relative to increase in average daytime temperatures due to climate change is decreasing grain yield and quality in rice. Therefore, a better genome-level understanding of the impact of higher night temperature stress on the weight of individual grains is essential for future development of more resilient rice. We investigated the utility of metabolites obtained from grains to classify high night temperature (HNT) conditions of genotypes, and metabolites and single-nucleotide polymorphisms (SNPs) to predict grain length, width, and perimeter phenotypes using a rice diversity panel. We found that the metabolic profiles of rice genotypes alone could be used to classify control and HNT conditions with high accuracy using random forest or extreme gradient boosting. Best linear unbiased prediction and BayesC showed greater metabolic prediction performance than machine learning models for grain-size phenotypes. Metabolic prediction was most effective for grain width, resulting in the highest prediction performance. Genomic prediction performed better than metabolic prediction. Integrating metabolites and genomics simultaneously in a prediction model slightly improved prediction performance. We did not observe a difference in prediction between the control and HNT conditions. Several metabolites were identified as auxiliary phenotypes that could be used to enhance the multi-trait genomic prediction of grain-size phenotypes. Our results showed that, in addition to SNPs, metabolites collected from grains offer rich information to perform predictive analyses, including classification modeling of HNT responses and regression modeling of grain-size-related phenotypes in rice.
  • Genome-wide association analysis of hyperspectral reflectance data to dissect the genetic architecture of growth-related traits in maize under plant growth-promoting bacteria inoculation
    Yassue, Rafael Massahiro; Galli, Giovanni; Chen, Chun-Peng James; Fritsche-Neto, Roberto; Morota, Gota (Wiley, 2023-04)
    Plant growth-promoting bacteria (PGPB) may be of use for increasing crop yield and plant resilience to biotic and abiotic stressors. Using hyperspectral reflectance data to assess growth-related traits may shed light on the underlying genetics as such data can help assess biochemical and physiological traits. This study aimed to integrate hyperspectral reflectance data with genome-wide association analyses to examine maize growth-related traits under PGPB inoculation. A total of 360 inbred maize lines with 13,826 single nucleotide polymorphisms (SNPs) were evaluated with and without PGPB inoculation; 150 hyperspectral wavelength reflectances at 386-1021 nm and 131 hyperspectral indices were used in the analysis. Plant height, stalk diameter, and shoot dry mass were measured manually. Overall, hyperspectral signatures produced similar or higher genomic heritability estimates than those of manually measured phenotypes, and they were genetically correlated with manually measured phenotypes. Furthermore, several hyperspectral reflectance values and spectral indices were identified by genome-wide association analysis as potential markers for growth-related traits under PGPB inoculation. Eight SNPs were detected that were associated with both manually measured and hyperspectral phenotypes. Different genomic regions were found for plant growth and hyperspectral phenotypes depending on whether PGPB inoculation was applied. Moreover, the hyperspectral phenotypes were associated with genes previously reported as candidates for nitrogen uptake efficiency, tolerance to abiotic stressors, and kernel size. In addition, a Shiny web application was developed to explore multiphenotype genome-wide association results interactively. Taken together, our results demonstrate the usefulness of hyperspectral-based phenotyping for studying maize growth-related traits in response to PGPB inoculation.
  • Resources to Engage the Cyberbiosecurity Workforce Pipeline: Empowering Agricultural Educators and Middle School Girls in STEM
    Smilnak, David; Scherer, Hannah H.; Walz, Anita R.; Bonnett, Erika; Grey, Kindred (2023-03-16)
    Initiating the Rural Cyberbiosecurity Workforce Pipeline Through Empowering Agricultural Educators & Supporting Middle School Girls: Project Resources. About the Resources: The resources and activities of this project were piloted in middle school agriculture classes and 4-H learning environments and revised based on educator and learner feedback. Factsheets were evaluated by scientific and cybersecurity education experts and in part by the Center for Advanced Innovation in Agriculture Graduate Student Affiliates. The resources have been introduced to school-based agricultural educators and extension agents at state-level professional development conferences and to members of a national cybersecurity education network. Implementation by a new cohort of educators is underway to collect further input from the field. Their Purpose: To date, youth have found the activities engaging, educators are excited about the possibility of innovating their agricultural education programs, and the factsheets spark novel ideas for further activities that can be modified and/or developed. Produced as Open Educational Resources (OER), the materials are freely available online for educators to download and can be remixed for use in a variety of educational settings. Educators are encouraged to use our resources, revise them for their own setting, and contribute their new versions and ideas to the growing OER collection. All factsheets, facilitator guides, and handouts are available for free electronic download.
  • Workload Assessment of Tractor Operations with Ergonomic Transducers and Machine Learning Techniques
    Hota, Smrutilipi; Tewari, V. K.; Chandel, Abhilash K. (MDPI, 2023-01-27)
    Dynamic muscular workload assessments of tractor operators are rarely studied or documented, although they are critical to improving operator performance efficiency and safety. A study was conducted to assess and model dynamic load on muscles, physiological variations, and discomfort of the tractor operators arising from the repeated clutch and brake operations using wearable non-invasive ergonomic transducers and data-run techniques. Nineteen licensed tractor operators operated three different tractor types of varying power ranges at three operating speeds (4–5 km/h), and on two common operating surfaces (tarmacadam and farm roads). During these operations, ergonomic transducers were utilized to capture the load on foot muscles (gastrocnemius right [GR] and soleus right [SR] for brake operation, and gastrocnemius left [GL] and soleus left [SL] for clutch operation) using electromyography (EMG). Forces exerted by the feet during brake and clutch operations were measured using a custom-developed foot transducer. During the process, heart rate (HR) and oxygen consumption rates (OCR) were also measured using an HR monitor and K4b2 systems, and energy expenditure rate (EER) was determined using an empirical equation. After each tractor operation cycle, an overall discomfort rating (ODR) for that operation was manually recorded on a 10-point psychophysical scale. EMG-based maximum voluntary contraction (%MVC) measurements revealed higher strain on GR (%MVC = 43%), GL (%MVC = 38%), and SR (%MVC = 41%) muscles, which in normal conditions should be below 30%. The clutch and brake actuation forces were recorded in the ranges of 90–312 N and 105–332 N, respectively, and were significantly affected by the operating speed, tractor type, and operating surface (p < 0.05). EERs of the operators were measured in the moderate-heavy to heavy ranges (9–24 kJ/min) during the course of trials, suggesting the need to refine existing clutch and brake system designs.
Average operator ODR responses indicated 7.8% operations in light, 48.5% in light-moderate, 25.2% in moderate, 10.7% in moderate-high, and 4.9% operations in high discomfort categories. When evaluated for the possibility of minimizing the number of transducers for physical workload assessment, EER showed moderate-high correlations with the EMG signals (rGR = 0.78, rGL = 0.75, rSR = 0.68, rSL = 0.66). Similarly, actuation forces had higher correlations with EMG signals for all the selected muscles (r = 0.70–0.87), suggesting the use of simpler transducers for effective operator workload assessment. As a means to minimize subjectivity in ODR responses, machine learning algorithms, including K-nearest neighbor (KNN), random forest classifier (RFC), and support vector machine (SVM), predicted the ODR using body mass index (BMI), HR, EER, and EMG at high accuracies of 87–97%, with RFC being the most accurate. Such high-throughput and data-run ergonomic evaluations can be instrumental in reconsidering workplace designs and better fits for end-users in terms of agricultural tractors and machinery systems.
  • VTag: a semi-supervised pipeline for tracking pig activity with a single top-view camera
    Chen, Chun-Peng J.; Morota, Gota; Lee, Kiho; Zhang, Zhiwu; Cheng, Hao (Oxford University Press, 2022-06)
    Precision livestock farming has become an important research focus with the rising demand for meat production in the swine industry. Currently, the farming practice is widely conducted by the technology of computer vision (CV), which automates monitoring pig activity solely based on video recordings. Automation is fulfilled by deriving imagery features that can guide CV systems to recognize animals' body contours, positions, and behavioral categories. Nevertheless, the performance of the CV systems is sensitive to the quality of imagery features. When the CV system is deployed in a variable environment, its performance may decrease as the features are not generalized enough under different illumination conditions. Moreover, most CV systems are established by supervised learning, in which intensive effort in labeling ground truths for the training process is required. Hence, a semi-supervised pipeline, VTag, is developed in this study. The pipeline focuses on long-term tracking of pig activity without requiring any pre-labeled video, relying on only a few human supervisions to build a CV system. The pipeline can be rapidly deployed as only one top-view RGB camera is needed for the tracking task. Additionally, the pipeline was released as a software tool with a friendly graphical interface available to general users. Among the presented datasets, the average tracking error was 17.99 cm. Besides, with the prediction results, the pig moving distance per unit time can be estimated for activity studies. Finally, as the motion is monitored, a heat map showing spatial hot spots visited by the pigs can be useful guidance for farming management. The presented pipeline saves massive laborious work in preparing training datasets. The rapid deployment of the tracking system paves the way for pig behavior monitoring. Lay Summary: Collecting detailed measurements of animals through cameras has become an important focus with the rising demand for meat production in the swine industry.
Currently, researchers use computational approaches to train models to recognize pig morphological features and monitor pig behaviors automatically. Though little human effort is needed after model training, current solutions require a large amount of pre-selected images for the training process, and the expensive preparation work makes such practice difficult for many farms to implement. Hence, a pipeline, VTag, is presented to address these challenges in our study. With few supervisions, VTag can automatically track positions of multiple pigs from one single top-view RGB camera. No pre-labeled images are required to establish a robust pig tracking system. Additionally, the pipeline was released as a software tool with a friendly graphical user interface that is easy to learn for general users. Among the presented datasets, the average tracking error is 17.99 cm, which is shorter than one-third of the pig body length in the study. The estimated pig activity from VTag can serve as useful farming guidance. The presented strategy saves massive laborious work in preparing training datasets and setting up monitoring environments. The rapid deployment of the tracking system paves the way for pig behavior monitoring.
  • Water Stress Identification of Winter Wheat Crop with State-of-the-Art AI Techniques and High-Resolution Thermal-RGB Imagery
    Chandel, Narendra S.; Rajwade, Yogesh A.; Dubey, Kumkum; Chandel, Abhilash K.; Subeesh, A.; Tiwari, Mukesh K. (MDPI, 2022-12-02)
    Timely crop water stress detection can help precision irrigation management and minimize yield loss. A two-year study was conducted on non-invasive winter wheat water stress monitoring using state-of-the-art computer vision and thermal-RGB imagery inputs. Field treatment plots were irrigated using two irrigation systems (flood and sprinkler) at four rates (100, 75, 50, and 25% of crop evapotranspiration [ETc]). A total of 3200 images under different treatments were captured at critical growth stages, that is, 20, 35, 70, 95, and 108 days after sowing using a custom-developed thermal-RGB imaging system. Crop and soil response measurements of canopy temperature (Tc), relative water content (RWC), soil moisture content (SMC), and relative humidity (RH) were significantly affected by the irrigation treatments, showing the lowest Tc (22.5 ± 2 °C) and highest RWC (90%) and SMC (25.7 ± 2.2%) for 100% ETc, and the highest Tc (28 ± 3 °C) and lowest RWC (74%) and SMC (20.5 ± 3.1%) for 25% ETc. The RGB and thermal imagery were then used as inputs to feature-extraction-based deep learning models (AlexNet, GoogLeNet, Inception V3, MobileNet V2, ResNet50), while RWC, SMC, Tc, and RH were the inputs to function-approximation models (Artificial Neural Network (ANN), K-Nearest Neighbor (KNN), Logistic Regression (LR), Support Vector Machine (SVM), and Long Short-Term Memory (DL-LSTM)) to classify stressed/non-stressed crops. Among the feature-extraction-based models, ResNet50 outperformed other models, showing a discriminant accuracy of 96.9% with RGB and 98.4% with thermal imagery inputs. Overall, classification accuracy was higher for thermal imagery compared to RGB imagery inputs. The DL-LSTM had the highest discriminant accuracy of 96.7% and lower error among the function-approximation-based models for classifying stress/non-stress.
The study suggests that computer vision coupled with thermal-RGB imagery can be instrumental in high-throughput mitigation and management of crop water stress.
  • Securing the Food Industry: An Introduction to Cyberbiosecurity for Food Science
    Miller, Rebekah J.; Yun, Yin; Ray, Andrew; Duncan, Susan E. (2022-07-26)
    As technology becomes ever more integrated into our food system and everyday life, our food industry and supply become ever more vulnerable to attack. Cyber attacks continue to threaten large and small companies, government agencies, individuals, and food and agriculture. This module, ‘Securing the Food Industry,’ aims to introduce the idea of cyberbiosecurity through a lecture format along with three case studies allowing students to interact and think through the concepts and materials. This module was built for implementation into college level courses with connection or interest in the food industry, food science, and agriculture as well as technology courses focused on real world applications. The lecture starts by introducing the amount of technology in food science and the food industry then transitions into concerns about security. After discussing multiple subtypes of security already integrated into the food industry, cyberbiosecurity is introduced. The term and definition are discussed before the categories of cyber attacks are introduced. The lecture relates these ideas back to the food industry before sharing a few real-life examples of detrimental cyber attacks. The lecture concludes by explaining the impact a cyber attack can cause, who is responsible for preventing and recovering from these attacks, as well as suggested practices to reduce vulnerabilities. Three theoretical but realistic case studies with discussion questions follow the lecture. These studies were written to act as small group discussion starters but could be used for whole class discussion, individual writing assignments, or other applications. A list of additional resources can be found with the course material. This list provides a small sampling of additional documents which discuss cyberbiosecurity.
The resources listed at the end of the lecture are not included in the additional resources document but also provide helpful information in the exploration and understanding of cyberbiosecurity. Food science resources are also included in this document to provide additional background around the food industry portion of this course material. Securing the Food Industry is an open educational resource (OER). Instructors reviewing, adopting, or adapting the module should indicate their interest at https://forms.gle/orFRGhYs8owBP7gD6.
  • How to Count Bugs: A Method to Estimate the Most Probable Absolute Population Density and Its Statistical Bounds from a Single Trap Catch
    Onufrieva, Ksenia S.; Onufriev, Alexey V. (MDPI, 2021-10-13)
    Knowledge of insect population density is crucial for establishing management and conservation tactics and evaluating treatment efficacies. Here, we propose a simple and universal method for estimating the most probable absolute population density and its statistical bounds. The method is based on a novel relationship between experimentally measurable characteristics of insect trap systems and the probability to catch an insect located a given distance away from the trap. The generality of the proposed relationship is tested using 10 distinct trapping datasets collected for insects from 5 different orders and using major trapping methods, i.e., chemical-baited traps and light. For all datasets, the relationship faithfully (R = 0.91) describes the experiment. The proposed approach will take insect detection and monitoring to a new, rigorously quantitative level. It will improve conservation and management, while driving future basic and applied research in population and chemical ecology.
  • Forecasting dynamic body weight of nonrestrained pigs from images using an RGB-D sensor camera
    Yu, Haipeng; Lee, Kiho; Morota, Gota (Oxford University Press, 2021-01-01)
    Average daily gain is an indicator of the growth rate, feed efficiency, and current health status of livestock species including pigs. Continuous monitoring of daily gain in pigs aids producers to optimize their growth performance while ensuring animal welfare and sustainability, such as reducing stress reactions and feed waste. Computer vision has been used to predict live body weight from video images without direct handling of the pig. In most studies, videos were taken while pigs were immobilized at a weighing station or feeding area to facilitate data collection. An alternative approach is to capture videos while pigs are allowed to move freely within their own housing environment, which can be easily applied to the production system as no special imaging station needs to be established. The objective of this study was to establish a computer vision system by collecting RGB-D videos to capture top-view red, green, and blue (RGB) and depth images of nonrestrained, growing pigs to predict their body weight over time. Over a period of 38 d, eight growers were video recorded for approximately 3 min/d, at the rate of six frames per second, and manually weighed using an electronic scale. An image-processing pipeline in Python using OpenCV was developed to process the images. Specifically, each pig within the RGB frame was segmented by a thresholding algorithm, and the contour of the pig was identified to extract its length and width. The height of a pig was estimated from the depth images captured by the infrared depth sensor. Quality control included removing pigs that were touching the fence and sitting, as well as those showing extremely distorted shape or motion blur owing to their frequent movement. 
Fitting all of the morphological image descriptors simultaneously in linear mixed models yielded prediction coefficients of determination of 0.72-0.98, 0.65-0.95, 0.51-0.94, and 0.49-0.93 for 1-, 2-, 3-, and 4-d ahead forecasting, respectively, of body weight in time series cross-validation. Based on the results, we conclude that our RGB-D sensor-based imaging system coupled with the Python image-processing pipeline could potentially provide an effective approach to predict the live body weight of nonrestrained pigs from images.
  • Assessing the Role of Cyberbiosecurity in Agriculture: A Case Study
    Drape, Tiffany A.; Magerkorth, Noah; Sen, Anuradha; Simpson, Joseph; Seibel, Megan M.; Murch, Randall Steven; Duncan, Susan E. (Frontiers, 2021-08-19)
    Agriculture has adopted the use of smart technology to help meet growing food demands. This increased automation and associated connectivity increases the risk of farms being targeted by cyber-attacks. Increasing frequency of cybersecurity breaches in many industries illustrates the need for securing our food supply chain. The uniqueness of biological data, the complexity of integration across the food and agricultural system, and the importance of this system to the U.S. bioeconomy and public welfare suggest an urgency as well as unique challenges that are not common across all industries. To identify and address the gaps in awareness and knowledge as well as encourage collaborations, Virginia Tech hosted a virtual workshop consisting of professionals from agriculture, cybersecurity, government, and academia. During the workshop, thought leaders and influencers discussed 1) common food and agricultural system challenges, scenarios, outcomes and risks to various sectors of the system; 2) cyberbiosecurity strategies for the system, gaps in workforce and training, and research and policy needs. The meeting sessions were transcribed and analyzed using qualitative methodology. The most common themes that emerged were challenges, solutions, viewpoints, and common vocabulary. From the results of the analysis, it is evident that none of the participating groups had available cybersecurity training and resources. Participants were uncertain about future pathways for training, implementation, and outreach related to cyberbiosecurity. Recommendations include creating training and education, continued interdisciplinary collaboration, and recruiting government involvement to speed up better security practices related to cyberbiosecurity.
  • ASAS-NANP SYMPOSIUM: prospects for interactive and dynamic graphics in the era of data-rich animal science
    Morota, Gota; Cheng, Hao; Cook, Dianne; Tanaka, Emi (2021-02)
    Statistical graphics and data visualization play an essential but under-utilized role in data analysis in animal science, and in visually illustrating the concepts, ideas, or outputs of research and curricula. The recent rise in web technologies and the ubiquitous availability of web browsers enable easier sharing of interactive and dynamic graphics. Interactivity and dynamic feedback enhance human-computer interaction and data exploration. Web applications such as decision support systems coupled with multimedia tools synergize with interactive and dynamic graphics. However, the importance of graphics for effectively communicating data, understanding data uncertainty, and the state of the field of interactive and dynamic graphics is underappreciated in animal science. To address this gap, we describe the current state of graphical methodology and technology that might be more broadly adopted. This includes an explanation of a conceptual framework for effective graphics construction. The ideas and technology are illustrated using publicly available animal datasets. We foresee that the many new types of big and complex data being generated in precision livestock farming create exciting opportunities for applying interactive and dynamic graphics to improve data analysis and make data-supported decisions.
  • Comparison of Single-Breed and Multi-Breed Training Populations for Infrared Predictions of Novel Phenotypes in Holstein Cows
    Mota, Lucio Flavio Macedo; Pegolo, Sara; Baba, Toshimi; Morota, Gota; Peñagaricano, Francisco; Bittante, Giovanni; Cecchinato, Alessio (MDPI, 2021-07-02)
    In general, Fourier-transform infrared (FTIR) predictions are developed using a single-breed population split into a training and a validation set. However, using populations formed of different breeds is an attractive way to design cross-validation scenarios aimed at increasing prediction for difficult-to-measure traits in the dairy industry. This study aimed to evaluate the potential of FTIR prediction using a training set combining specialized and dual-purpose dairy breeds to predict different phenotypes divergent in terms of biological meaning, variability, and heritability, such as body condition score (BCS), serum β-hydroxybutyrate (BHB), and kappa casein (k-CN) in the major cattle breed, i.e., Holstein-Friesian. Data were obtained from specialized dairy breeds: Holstein (468 cows) and Brown Swiss (657 cows), and dual-purpose breeds: Simmental (157 cows), Alpine Grey (75 cows), and Rendena (104 cows), giving a total of 1461 cows from 41 multi-breed dairy herds. The FTIR prediction model was developed using a gradient boosting machine (GBM), and predictive ability for the target phenotype in Holstein cows was assessed using different cross-validation (CV) strategies: a within-breed scenario using 10-fold cross-validation, for which the Holstein population was randomly split into 10 folds, one for validation and the remaining nine for training (10-fold_HO); an across-breed scenario (BS_HO), where the Brown Swiss cows were used as the training set and the Holstein cows as the validation set; a specialized multi-breed scenario (BS+HO_10-fold), where the entire Brown Swiss and Holstein populations were combined and then split into 10 folds; and a multi-breed scenario (Multi-breed), where the training set comprised specialized (Holstein and Brown Swiss) and dual-purpose (Simmental, Alpine Grey, and Rendena) dairy cows, combined with nine folds of the Holstein cows.
Lastly a Multi-breed CV2 scenario was implemented, assuming the same number of records as the reference scenario and using the same proportions as the multi-breed. Within-Holstein, FTIR predictions had a predictive ability of 0.63 for BCS, 0.81 for BHB, and 0.80 for k-CN. Using a specific breed (Brown Swiss) as the training set for prediction in the Holstein population reduced the prediction accuracy by 10% for BCS, 7% for BHB, and 11% for k-CN. Notably, the combination of Holstein and Brown Swiss cows in the training set increased the predictive ability of the model by 6%, which was 0.66 for BCS, 0.85 for BHB, and 0.87 for k-CN. Using multiple specialized and dual-purpose animals in the training set outperforms the 10-fold_HO (standard) approach, with an increase in predictive ability of 8% for BCS, 7% for BHB, and 10% for k-CN. When the Multi-breed CV2 was implemented, no improvement was observed. Our findings suggest that FTIR prediction of different phenotypes in the Holstein breed can be improved by including different specialized and dual-purpose breeds in the training population. Our study also shows that predictive ability is enhanced when the size of the training population and the phenotypic variability are increased.
  • Integrating genomic and infrared spectral data improves the prediction of milk protein composition in dairy cattle
    Baba, Toshimi; Pegolo, Sara; Mota, Lucio Flavio Macedo; Peñagaricano, Francisco; Bittante, Giovanni; Cecchinato, Alessio; Morota, Gota (2021-03-16)
    Background: Over the past decade, Fourier transform infrared (FTIR) spectroscopy has been used to predict novel milk protein phenotypes. Genomic data might help predict these phenotypes when integrated with milk FTIR spectra. The objective of this study was to investigate prediction accuracy for milk protein phenotypes when heterogeneous on-farm, genomic, and pedigree data were integrated with the spectra. To this end, we used the records of 966 Italian Brown Swiss cows with milk FTIR spectra, on-farm information, medium-density genetic markers, and pedigree data. True and total whey protein, five casein traits, and two whey protein traits were analyzed. Multiple kernel learning constructed from spectral and genomic (pedigree) relationship matrices and multilayer BayesB assigning separate priors for FTIR and markers were benchmarked against a baseline partial least squares (PLS) regression. Seven combinations of covariates were considered, and their predictive abilities were evaluated by repeated random sub-sampling and herd cross-validations (CV). Results: Addition of the on-farm effects such as herd, days in milk, and parity to spectral data improved predictions as compared to those obtained using the spectra alone. Integrating genomics and/or the top three markers with a large effect further enhanced the predictions. Pedigree data also improved prediction, but to a lesser extent than genomic data. Multiple kernel learning and multilayer BayesB increased predictive performance, whereas PLS did not. Overall, multilayer BayesB provided better predictions than multiple kernel learning, and lower prediction performance was observed in herd CV compared to repeated random sub-sampling CV. Conclusions: Integration of genomic information with milk FTIR spectra can enhance milk protein trait predictions by 25% and 7% on average for repeated random sub-sampling and herd CV, respectively.
Multiple kernel learning and multilayer BayesB outperformed PLS when used to integrate heterogeneous data for phenotypic predictions.
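
Illustrative note on the last entry: the abstract contrasts two validation schemes, repeated random sub-sampling and herd cross-validation, and reports lower performance under herd CV. The sketch below shows one way such splits might be generated in plain Python. It is not the authors' code. The function names, the 20% test fraction, the number of repeats, and the reading of "herd CV" as holding out one whole herd at a time are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def random_subsampling_splits(n_records, n_repeats=3, test_fraction=0.2, seed=0):
    """Repeated random sub-sampling: every repeat draws a fresh random test set.

    Records from the same herd can land in both training and test sets.
    The default fraction and repeat count are illustrative assumptions.
    """
    rng = random.Random(seed)
    indices = list(range(n_records))
    n_test = max(1, round(n_records * test_fraction))
    splits = []
    for _ in range(n_repeats):
        test = set(rng.sample(indices, n_test))
        train = [i for i in indices if i not in test]
        splits.append((train, sorted(test)))
    return splits


def herd_splits(herd_labels):
    """Herd CV (assumed here to mean leave-one-herd-out).

    All records of one herd are held out together, so the model is always
    evaluated on a herd it has never seen during training.
    """
    by_herd = defaultdict(list)
    for index, herd in enumerate(herd_labels):
        by_herd[herd].append(index)
    splits = []
    for herd in sorted(by_herd):
        test = by_herd[herd]
        train = [i for i in range(len(herd_labels)) if herd_labels[i] != herd]
        splits.append((train, test))
    return splits


if __name__ == "__main__":
    # Hypothetical herd labels for six cows.
    herds = ["A", "A", "B", "B", "B", "C"]
    for train, test in herd_splits(herds):
        print("herd CV  train:", train, "test:", test)
    for train, test in random_subsampling_splits(len(herds), n_repeats=2):
        print("random   train:", train, "test:", test)
```

Because herd-level splits remove any shared herd environment between training and test records, they tend to give a more conservative estimate of predictive ability, which is consistent with the lower herd CV performance the abstract reports.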