Graph-Based Computational Approaches for Modeling Viral Evolution

Date

2026-01-07

Publisher

Virginia Tech

Abstract

Modeling viral evolution is essential for understanding how pathogens adapt, spread, and generate new variants of concern. Yet it remains challenging because of high mutation rates, minimal sequence divergence, and the scale of modern genomic data. Most phylogenetic trees enforce a strictly bifurcating structure that struggles to represent recurrent mutations, recombination, convergent evolution, and intra-host diversity. In contrast, quasispecies theory describes viral populations as clouds of closely related mutants evolving within a high-dimensional sequence space, where evolutionary relationships are more naturally captured by graphs than by trees. In this dissertation, I develop a sequence of graph-centered frameworks that integrate viral fitness, mutational distance, and mutational dynamics to model viral evolution from algorithmic and data-driven perspectives. First, ViraFit introduces a proof-of-concept model that couples epidemiological spread on contact networks with evolutionary dynamics on fitness landscapes, demonstrating how mutation, selection, and network structure jointly shape adaptive trajectories. Second, the Variant Evolution Graph (VEG) provides a scalable graph-based representation of SARS-CoV-2 evolution derived from mutational distances, allowing multiple ancestral relationships and capturing virus-specific evolutionary patterns that are difficult to represent with phylogenetic trees. A Disease Transmission Network derived from the VEG further supports inference of likely transmission pathways and superspreaders. Finally, the Ancestor-Joining algorithm extends this representation into a predictive framework, the Mutation Learning Graph (MLG), by inferring intermediate ancestral variants and enabling graph neural network–based lineage classification and mutational link prediction across geographically diverse SARS-CoV-2 cohorts.
Together, ViraFit, VEG, and MLG form a unified methodological progression that links mechanistic modeling, evolutionary reconstruction, and predictive graph learning, providing a scalable, mutation-centric view of viral evolution that complements traditional phylogenetic approaches and supports future variant forecasting.

Keywords

Viral evolution, Mutation, Fitness landscape, Edit distance, Mutation similarity, Variant Evolution Graph, Mutation Learning Graph
