Algorithms for regulatory network inference and experiment planning in systems biology

Date

2020-07-17

Publisher

Virginia Tech

Abstract

I present novel solutions to two classes of computational problems that arise in the study of complex cellular processes. The first problem is planning large-scale genetic cross experiments, which can be used to validate the predictions that mathematical models make about multigenic perturbations.

(i) I present CrossPlan, a novel methodology for systematically planning genetic crosses to make a set of target mutants from a set of source mutants. CrossPlan is based on a generic experimental workflow for performing genetic crosses in budding yeast. It uses an integer linear program (ILP) to maximize the number of target mutants that can be made under certain experimental constraints. I apply CrossPlan to a comprehensive mathematical model of the protein regulatory network controlling cell division in budding yeast.

(ii) I formulate several natural problems related to the efficient synthesis of a target mutant from source mutants. These formulations capture experimentally useful notions of verifiability (e.g., the need to confirm that a mutant contains mutations in the desired genes) and permissibility (e.g., the requirement that no intermediate mutant in the synthesis be inviable). I present several polynomial-time or fixed-parameter tractable algorithms for optimal synthesis of a target mutant in special cases of the problem that arise in practice.

The second problem I address is inferring gene regulatory networks (GRNs) from single-cell transcriptomic (scRNA-seq) data. These GRNs can serve as starting points for building mathematical models.

(iii) I present BEELINE, a comprehensive evaluation of state-of-the-art algorithms for inferring GRNs from single-cell gene expression data. The evaluations from BEELINE suggest that the area under the precision-recall curve and the early precision of these algorithms are moderate. Techniques that do not require pseudotime-ordered cells are generally more accurate. Based on these results, I present recommendations to end users of GRN inference methods. BEELINE will aid the development of GRN inference algorithms.
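
As a rough illustration of the two accuracy measures named above, the following sketch scores a ranked list of predicted regulatory edges against a reference network. It is not BEELINE's code. The function names and toy gene labels are hypothetical, early precision is assumed here to mean precision among the top k predictions with k equal to the number of reference edges, and average precision is used as a step-wise stand-in for the area under the precision-recall curve.

```python
def early_precision(ranked_edges, true_edges):
    """Precision among the top-k predictions, with k = number of true edges.

    Assumption: this is one common definition of early precision; the
    abstract itself does not define the measure.
    """
    k = len(true_edges)
    if k == 0:
        return 0.0
    hits = sum(1 for edge in ranked_edges[:k] if edge in true_edges)
    return hits / k


def average_precision(ranked_edges, true_edges):
    """Average precision: mean of the precision values at each true edge's rank.

    Used here as a step-wise estimate of the area under the
    precision-recall curve.
    """
    if not true_edges:
        return 0.0
    hits = 0
    total = 0.0
    for rank, edge in enumerate(ranked_edges, start=1):
        if edge in true_edges:
            hits += 1
            total += hits / rank
    return total / len(true_edges)


# Toy data (hypothetical): edges are (regulator, target) pairs.
true_edges = {("g1", "g2"), ("g2", "g3"), ("g3", "g1")}
# Predictions sorted from most to least confident.
ranked_edges = [
    ("g1", "g2"),
    ("g1", "g3"),
    ("g2", "g3"),
    ("g3", "g2"),
    ("g3", "g1"),
]

ep = early_precision(ranked_edges, true_edges)
ap = average_precision(ranked_edges, true_edges)
print(f"early precision = {ep:.3f}, average precision = {ap:.3f}")
```

Both measures depend only on the ranking of predicted edges, so algorithms that output confidence scores on different scales can be compared directly.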

(iv) Based on the insights gained from BEELINE, I propose a novel supervised algorithm for GRN inference from single-cell gene expression data, based on a graph convolutional neural network (GCN). This GCN-based model is considerably more accurate than existing supervised learning algorithms for GRN inference from scRNA-seq data and can infer cell-type-specific regulatory networks.

Keywords

network biology, experiment planning, gene regulatory networks, deep learning, single cell transcriptomics
