Systems Integration of Biodefense Omics Data for Analysis of Pathogen-Host Interactions and Identification of Potential Targets
The NIAID (National Institute for Allergy and Infectious Diseases) Biodefense Proteomics program aims to identify targets for potential vaccines, therapeutics, and diagnostics for agents of concern in bioterrorism, including bacterial, parasitic, and viral pathogens. The program includes seven Proteomics Research Centers, generating diverse types of pathogen-host data, including mass spectrometry, microarray transcriptional profiles, protein interactions, protein structures and biological reagents. The Biodefense Resource Center (www.proteomicsresource.org) has developed a bioinformatics framework, employing a protein-centric approach to integrate and support mining and analysis of the large and heterogeneous data. Underlying this approach is a data warehouse with comprehensive protein + gene identifier and name mappings and annotations extracted from over 100 molecular databases. Value-added annotations are provided for key proteins from experimental findings using controlled vocabulary. The availability of pathogen and host omics data in an integrated framework allows global analysis of the data and comparisons across different experiments and organisms, as illustrated in several case studies presented here. (1) The identification of a hypothetical protein with differential gene and protein expressions in two host systems (mouse macrophage and human HeLa cells) infected by different bacterial (Bacillus anthracis and Salmonella typhimurium) and viral (orthopox) pathogens suggesting that this protein can be prioritized for additional analysis and functional characterization. (2) The analysis of a vaccinia-human protein interaction network supplemented with protein accumulation levels led to the identification of human Keratin, type II cytoskeletal 4 protein as a potential therapeutic target. (3) Comparison of complete genomes from pathogenic variants coupled with experimental information on complete proteomes allowed the identification and prioritization of ten potential diagnostic targets from Bacillus anthracis. The integrative analysis across data sets from multiple centers can reveal potential functional significance and hidden relationships between pathogen and host proteins, thereby providing a systems approach to basic understanding of pathogenicity and target identification.