Computational Analysis of LC-MS/MS Data for Metabolite Identification
Abstract
Metabolomics aims to detect and quantify the metabolites within a biological system. Because metabolite levels are the most direct representation of phenotypic changes, metabolomics is an important component of systems biology research. Recent developments in high-resolution, high-accuracy mass spectrometers enable the simultaneous study of hundreds or even thousands of metabolites in one experiment. Liquid chromatography-mass spectrometry (LC-MS) is a commonly used platform for metabolomic studies because of its high sensitivity and broad coverage of the metabolome.
However, the identification of metabolites remains a bottleneck for current metabolomic studies. This thesis focuses on using computational approaches to improve the accuracy and efficiency of metabolite identification in LC-MS/MS-based metabolomic studies. First, an outlier screening approach is developed to identify LC-MS runs with low analytical quality, so that they do not adversely affect the identification of metabolites. The approach is computationally simple but effective, and it does not depend on any particular preprocessing approach. Second, an integrated computational framework is proposed and implemented to improve the accuracy of metabolite identification and to prioritize the multiple putative identifications of a single peak in LC-MS data. With the framework, peaks are more likely to be assigned m/z values that yield appropriate putative identifications, and prioritizing those identifications provides important guidance for metabolite verification. Third, an MS/MS spectral matching algorithm based on support vector machine classification is proposed. The approach improves retrieval performance in spectral matching, especially in the presence of data heterogeneity caused by the different instruments or experimental settings used during MS/MS spectra acquisition.
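To illustrate why a single peak can carry several putative identifications, the following is a minimal sketch of a mass-based lookup within a ppm tolerance. It is not the framework proposed in the thesis. The tiny candidate table, the assumption of [M+H]+ ionization, the 10 ppm default tolerance, and the function name `putative_ids` are all illustrative assumptions.

```python
# Illustrative sketch only: mass-based putative identification within a ppm window.
# This is NOT the thesis framework; names and values below are assumptions.

PROTON_MASS = 1.007276  # mass shift of an [M+H]+ ion, in Da

# Hypothetical mini-database: metabolite name -> monoisotopic neutral mass (Da).
CANDIDATES = {
    "leucine": 131.094629,
    "isoleucine": 131.094629,
    "creatine": 131.069477,
    "glucose": 180.063388,
}


def putative_ids(observed_mz, tol_ppm=10.0):
    """Return sorted candidate names whose [M+H]+ m/z is within tol_ppm of observed_mz."""
    hits = []
    for name, neutral_mass in CANDIDATES.items():
        theoretical_mz = neutral_mass + PROTON_MASS
        # Relative mass error in parts per million.
        error_ppm = abs(observed_mz - theoretical_mz) / theoretical_mz * 1e6
        if error_ppm <= tol_ppm:
            hits.append(name)
    return sorted(hits)


if __name__ == "__main__":
    print(putative_ids(132.1019))
```

Leucine and isoleucine are isomers with the same monoisotopic mass, so a lookup at m/z 132.1019 returns both. This is the kind of ambiguity that prioritization and MS/MS spectral matching are meant to resolve.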