Graph-Based Computational Approaches for Modeling Viral Evolution
| dc.contributor.author | Das, Badhan | en |
| dc.contributor.committeechair | Heath, Lenwood S. | en |
| dc.contributor.committeemember | Cao, Young | en |
| dc.contributor.committeemember | Pritchard, Leighton | en |
| dc.contributor.committeemember | Ji, Bo | en |
| dc.contributor.committeemember | Vinatzer, Boris A. | en |
| dc.contributor.department | Computer Science & Applications | en |
| dc.date.accessioned | 2026-01-08T09:00:14Z | en |
| dc.date.available | 2026-01-08T09:00:14Z | en |
| dc.date.issued | 2026-01-07 | en |
| dc.description.abstract | Modeling viral evolution is essential for understanding how pathogens adapt, spread, and generate new variants of concern. Yet, it remains challenging due to high mutation rates, minimal sequence divergence, and the scale of modern genomic data. Most phylogenetic trees enforce a strictly bifurcating structure that struggles to represent recurrent mutations, recombination, convergent evolution, and intra-host diversity. In contrast, quasispecies theory describes viral populations as clouds of closely related mutants evolving within a high-dimensional sequence space, where evolutionary relationships are more naturally captured by graphs than trees. In this dissertation, I develop a sequence of graph-centered frameworks that integrate viral fitness, mutational distance, and mutational dynamics to model viral evolution from algorithmic and data-driven perspectives. First, ViraFit introduces a proof-of-concept model that couples epidemiological spread on contact networks with evolutionary dynamics on fitness landscapes, demonstrating how mutation, selection, and network structure jointly shape adaptive trajectories. Second, the Variant Evolution Graph (VEG) provides a scalable graph-based representation of SARS-CoV-2 evolution derived from mutational distances, allowing multiple ancestral relationships and capturing virus-specific evolutionary patterns that are difficult to represent with phylogenetic trees. A derived Disease Transmission Network further supports inference of likely transmission pathways and superspreaders. Finally, the Ancestor-Joining algorithm extends this representation into a predictive framework, Mutation Learning Graph (MLG), by inferring intermediate ancestral variants and enabling graph neural network–based lineage classification and mutational link prediction across geographically diverse SARS-CoV-2 cohorts. 
Together, ViraFit, VEG, and MLG form a unified methodological progression that links mechanistic modeling, evolutionary reconstruction, and predictive graph learning, providing a scalable, mutation-centric view of viral evolution that complements traditional phylogenetic approaches and supports future variant forecasting. | en |
| dc.description.abstractgeneral | Viruses such as SARS-CoV-2 evolve rapidly, generating many closely related variants as they spread through a population. These small genetic changes can influence how easily a virus spreads, how severe the disease becomes, and whether existing vaccines or treatments remain effective. Because of this, understanding how viruses change over time is essential for public health and pandemic preparedness. This dissertation develops new computational approaches to study viral evolution by using graphs, where each viral genome is represented as a node and edges show how one strain may have changed into another. Graphs offer a flexible way to capture the many possible paths a virus may take as it mutates, including patterns that traditional phylogenetic methods often miss. The first part of this work introduces ViraFit, a simulation framework that simultaneously models how viruses mutate and how disease is transmitted. It shows how the structure of human contact networks and the fitness of different viral strains work together to shape which variants become dominant. The second part presents the Variant Evolution Graph (VEG), a new method for organizing real viral genome data. VEG makes it easier to detect important virus-specific evolutionary events, such as repeated mutations, recombination, and diversity within infected individuals, that may not be clearly evident in standard evolutionary trees. The final part of the dissertation expands these ideas using machine learning. The Mutation Learning Graph (MLG) combines graph representations with advanced neural networks to learn patterns in viral evolution, predict relationships that have not yet been observed, and help anticipate how future variants might emerge. Together, these methods provide a more detailed and flexible picture of viral evolution, offering tools to support genomic surveillance and early detection of new variants. 
Although developed using SARS-CoV-2 data, the approaches are general and can be applied to many other rapidly evolving viruses and biological systems. | en |
| dc.description.degree | Doctor of Philosophy | en |
| dc.format.medium | ETD | en |
| dc.identifier.other | vt_gsexam:45479 | en |
| dc.identifier.uri | https://hdl.handle.net/10919/140658 | en |
| dc.language.iso | en | en |
| dc.publisher | Virginia Tech | en |
| dc.rights | In Copyright | en |
| dc.rights.uri | http://rightsstatements.org/vocab/InC/1.0/ | en |
| dc.subject | Viral evolution | en |
| dc.subject | Mutation | en |
| dc.subject | Fitness landscape | en |
| dc.subject | Edit distance | en |
| dc.subject | Mutation similarity | en |
| dc.subject | Variant Evolution Graph | en |
| dc.subject | Mutation Learning Graph | en |
| dc.title | Graph-Based Computational Approaches for Modeling Viral Evolution | en |
| dc.type | Dissertation | en |
| thesis.degree.discipline | Computer Science & Applications | en |
| thesis.degree.grantor | Virginia Polytechnic Institute and State University | en |
| thesis.degree.level | doctoral | en |
| thesis.degree.name | Doctor of Philosophy | en |