Browsing by Author "Gong, Ting"
- Computational Dissection of Composite Molecular Signatures and Transcriptional Modules
  Gong, Ting (Virginia Tech, 2009-12-14)
  This dissertation aims to develop a latent variable modeling framework for analyzing gene expression profiling data, with the goal of computationally dissecting molecular signatures and transcriptional modules. The first part of the dissertation focuses on extracting pure gene expression signals from tissue or cell mixtures. The main goal of gene expression profiling is to identify the pure signatures of different cell types (such as cancer cells, stromal cells and inflammatory cells) and estimate the concentration of each cell type. To accomplish this, a new blind source separation method, nonnegative partially independent component analysis (nPICA), is developed for tissue heterogeneity correction (THC). The THC problem is formulated as a constrained optimization problem and solved with a learning algorithm based on geometrical and statistical principles. The second part of the dissertation seeks to identify gene modules from gene expression data to uncover important biological processes in different types of cells. A new gene clustering approach, nonnegative independent component analysis (nICA), is developed for gene module identification. The nICA approach is complemented by an information-theoretic procedure for input sample selection and a novel stability analysis approach for proper dimension estimation. Experimental results show that the gene modules identified by the nICA approach appear to be significantly enriched in functional annotations in terms of gene ontology (GO) categories. The third part of the dissertation moves from the gene module level down to the DNA sequence level to identify gene regulatory programs by integrating gene expression data and protein-DNA binding data. A sparse hidden component model is first developed for this problem, taking into account a well-known biological principle, i.e., that a gene is most likely regulated by only a few regulators. This is followed by the development of a novel computational approach, motif-guided sparse decomposition (mSD), to integrate the binding information and gene expression data. These computational approaches are primarily developed for analyzing high-throughput gene expression profiling data. Nevertheless, the proposed methods should extend to other types of high-throughput data for biomedical research.
- Motif-guided sparse decomposition of gene expression data for regulatory module identification
  Gong, Ting; Xuan, Jianhua; Chen, Li; Riggins, Rebecca B.; Li, Huai; Hoffman, Eric P.; Clarke, Robert; Wang, Yue (2011-03-22)
  Background: Genes work coordinately as gene modules or gene networks. Various computational approaches have been proposed to find gene modules based on gene expression data; for example, gene clustering is a popular method for grouping genes with similar gene expression patterns. However, traditional gene clustering often yields unsatisfactory results for regulatory module identification because the resulting gene clusters are co-expressed but not necessarily co-regulated.
  Results: We propose a novel approach, motif-guided sparse decomposition (mSD), to identify gene regulatory modules by integrating gene expression data and DNA sequence motif information. The mSD approach is implemented as a two-step algorithm comprising estimates of (1) transcription factor activity and (2) the strength of the predicted gene regulation event(s). Specifically, a motif-guided clustering method is first developed to estimate the transcription factor activity of a gene module; sparse component analysis is then applied to estimate the regulation strength, and so predict the target genes of the transcription factors. The mSD approach was first tested for its improved performance in finding regulatory modules using simulated and real yeast data, revealing functionally distinct gene modules enriched with biologically validated transcription factors. We then demonstrated the efficacy of the mSD approach on breast cancer cell line data and uncovered several important gene regulatory modules related to endocrine therapy of breast cancer.
  Conclusion: We have developed a new integrated strategy, namely motif-guided sparse decomposition (mSD) of gene expression data, for regulatory module identification. The mSD method features a novel motif-guided clustering method for transcription factor activity estimation by finding a balance between co-regulation and co-expression. The mSD method further utilizes a sparse decomposition method for regulation strength estimation. The experimental results show that such a motif-guided strategy can provide context-specific regulatory modules in both yeast and breast cancer studies.
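
The two-step structure described in the abstract above (first estimate transcription factor activity, then estimate regulation strengths) can be illustrated with a toy sketch. This is not the authors' mSD algorithm: the averaging step is a crude stand-in for motif-guided clustering, and masked least squares is a stand-in for sparse component analysis, with sparsity coming only from the motif mask. The function names `estimate_tf_activity` and `motif_guided_strengths` and the boolean `motif_mask` input are hypothetical, introduced purely for illustration.

```python
import numpy as np


def estimate_tf_activity(expr, motif_mask):
    """Step 1 (stand-in): approximate each TF's activity profile as the mean
    expression of the genes carrying that TF's motif.

    expr:       (n_genes, n_samples) expression matrix
    motif_mask: (n_genes, n_tfs) boolean matrix, True where a gene has the motif
    """
    n_tfs = motif_mask.shape[1]
    activity = np.zeros((n_tfs, expr.shape[1]))
    for k in range(n_tfs):
        members = motif_mask[:, k]
        if members.any():
            activity[k] = expr[members].mean(axis=0)
    return activity


def motif_guided_strengths(expr, tf_activity, motif_mask):
    """Step 2 (stand-in): fit each gene's profile as a linear combination of
    the activities of only those TFs whose motif the gene carries.

    Returns an (n_genes, n_tfs) matrix that is zero outside the motif mask.
    """
    n_genes, n_tfs = motif_mask.shape
    strengths = np.zeros((n_genes, n_tfs))
    for g in range(n_genes):
        allowed = np.flatnonzero(motif_mask[g])
        if allowed.size == 0:
            continue  # no candidate regulators for this gene
        # Ordinary least squares restricted to the allowed TFs.
        coef, *_ = np.linalg.lstsq(tf_activity[allowed].T, expr[g], rcond=None)
        strengths[g, allowed] = coef
    return strengths
```

In use, one would call `estimate_tf_activity(expr, motif_mask)` and pass its result to `motif_guided_strengths`; genes with large estimated strengths for a given transcription factor would then be treated as its candidate targets. A real analysis would replace both steps with the methods described in the paper.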