Supervised Inference of Gene Regulatory Networks

dc.contributor.authorSen, Malabika Ashiten
dc.contributor.committeechairMurali, T. M.en
dc.contributor.committeememberReddy, Chandan K.en
dc.contributor.committeememberHeath, Lenwood S.en
dc.contributor.departmentComputer Scienceen
dc.date.accessioned2023-03-04T07:00:18Zen
dc.date.available2023-03-04T07:00:18Zen
dc.date.issued2021-09-09en
dc.description.abstractA gene regulatory network (GRN) records the interactions among transcription factors and their target genes. GRNs are useful to study how transcription factors (TFs) control gene expression as cells transition between states during differentiation and development. Scientists usually construct GRNs by careful examination and study of the literature. This process is slow and painstaking and does not scale to large networks. In this thesis, we study the problem of inferring GRNs automatically from gene expression data. Recent data-driven approaches to infer GRNs increasingly rely on single-cell level RNA-sequencing (scRNA-seq) data. Most of these methods rely on unsupervised or association based strategies, which cannot leverage known regulatory interactions by design. To facilitate supervised learning, we propose a novel graph convolutional neural network (GCN) based autoencoder to infer new regulatory edges from a known GRN and scRNA-seq data. As the name suggests, a GCN-based autoencoder consists of an encoder that learns a low-dimensional embedding of the nodes (genes) in the input graph (the GRN) through a series of graph convolution operations and a decoder that aims to reconstruct the original graph as accurately as possible. We investigate several GCN-based architectures to determine the ideal encoder-decoder combination for GRN reconstruction. We systematically study the performance of these and other supervised learning methods on different mouse and human scRNA-seq datasets for two types of evaluation. We demonstrate that our GCN-based approach substantially outperforms traditional machine learning approaches.en
dc.description.abstractgeneralIn multi-cellular living organisms, stem cells differentiate into multiple cell types. Proteins called transcription factors (TFs) control the activity of genes to effect these transitions. It is possible to represent these interactions abstractly using a gene regulatory network (GRN). In a GRN, each node is a TF or a gene and each edge connects a TF to a gene or TF that it controls. New high-throughput technologies that can measure gene expression (activity) in individual cells provide rich data that can be used to construct GRNs. In this thesis, we take advantage of recent advances in the field of machine learning to develop a new computational method for computationally constructing GRNs. The distinguishing property of our technique is that it is supervised, i.e., it uses experimentally-known interactions to infer new regulatory connections. We investigate several variations of this approach to reconstruct a GRN as close to the original network as possible. We analyze and provide a rationale for the decisions made in designing, evaluating, and choosing the characteristics of our predictor. We show that our predictor has a reconstruction accuracy that is superior to other supervised-learning approaches.en
dc.description.degreeMaster of Scienceen
dc.format.mediumETDen
dc.identifier.othervt_gsexam:32441en
dc.identifier.urihttp://hdl.handle.net/10919/114037en
dc.publisherVirginia Techen
dc.rightsIn Copyrighten
dc.rights.urihttp://rightsstatements.org/vocab/InC/1.0/en
dc.subjectGene Regulatory Networksen
dc.subjectNetwork Inferenceen
dc.subjectLink Predictionen
dc.subjectGraph Convolutional Networksen
dc.subjectGraph Machine Learningen
dc.titleSupervised Inference of Gene Regulatory Networksen
dc.typeThesisen
thesis.degree.disciplineComputer Science and Applicationsen
thesis.degree.grantorVirginia Polytechnic Institute and State Universityen
thesis.degree.levelmastersen
thesis.degree.nameMaster of Scienceen

Files

Original bundle
Now showing 1 - 1 of 1
Loading...
Thumbnail Image
Name:
Sen_MA_T_2021.pdf
Size:
1.33 MB
Format:
Adobe Portable Document Format

Collections