Computational Analysis of Gene Expression Regulation from Cross Species Comparison to Single Cell Resolution
Abstract
Gene expression regulation is dynamic and specific to factors such as developmental stage, environmental conditions, and stimulation by pathogens. A tremendous number of transcriptome data sets are now available from diverse species, which enables comparative transcriptome analysis that identifies conserved or diverged gene expression responses across species. The goal of this dissertation is to develop and apply comparative transcriptomics approaches that transfer knowledge from model species to non-model species, with the hope that such approaches can contribute to improving crop yield and human health. First, we presented a comprehensive method to identify cross-species modules between two plant species. We adapted an unsupervised network-based module-finding method to identify conserved patterns of co-expression and functional conservation between Arabidopsis, a model species, and soybean, a crop species. Second, we compared drought-responsive genes across Arabidopsis, soybean, rice, corn, and Populus to explore the genomic characteristics that are conserved under drought stress across species. We identified hundreds of common gene families and conserved regulatory motifs between monocots and dicots. We also presented a BLS-based clustering method that takes into account evolutionary relationships among species to identify conserved co-expressed genes. Last, we analyzed single-cell RNA-seq (scRNA-seq) data from monocytes to understand the regulatory mechanisms of the innate immune system under low-grade inflammation. We identified novel subpopulations of cells treated with lipopolysaccharide (LPS) that show distinct expression patterns from pro-inflammatory genes. The data revealed that a promising therapeutic reagent, sodium 4-phenylbutyrate, masked the effect of LPS.
We inferred the existence of specific cellular transitions under different treatments and used random forest feature selection to prioritize the motifs that modulate these transitions. Genomics research has been shifting from bulk RNA-seq to single-cell RNA-seq, and scRNA-seq has become a widely used approach for transcriptome analysis. Building on the experience we gained from analyzing scRNA-seq data, we plan to conduct comparative single-cell transcriptome analysis across multiple species.