Metagenomic approaches for examining the diversity of large DNA viruses in the biosphere

Virginia Tech


The discovery of large DNA viruses has challenged the traditional perception of viral complexity because of their enormous genome sizes and physical dimensions. Until that discovery, viruses were considered small, filterable agents. Among large DNA viruses, members of the phylum Nucleocytoviricota, often called "giant viruses," have large genomes (up to 2.5 Mbp) and large virions (up to 1.5 µm). Because of their large virion and genome sizes, these viruses were often excluded from viral surveys and remained understudied for years. The advancement of metagenomic analysis has facilitated the study of large DNA viruses by allowing them to be analyzed directly from their environment without cultivation in the laboratory, which can be challenging for viruses. In the first chapter of this thesis, I investigated 11 metagenome-assembled genomes (MAGs) of giant viruses previously surveyed at Station ALOHA in the Pacific Ocean. Station ALOHA is located near Hawaii and is representative of the oligotrophic gyres that make up the majority of the ocean. I examined these 11 MAGs to gain insight into their phylogenetic characteristics, genomic repertoire, and global distribution patterns. Although metagenomic analysis has facilitated the study of the genetic material of microbes and viruses on a huge scale, it is essential to benchmark the performance of metagenomic tools and to understand the associated biases, particularly in viral metagenomics. In the second chapter, I evaluated the performance of metagenomic tools (a contig assembler and a binning tool) in recovering viral genomes from an annotated dataset. I used a metagenome simulator (CAMISIM) to generate simulated short reads of known composition to assess these processes. Moreover, I emphasize the importance of binning contigs to fully recover viral genomes, and I discuss how diversity metrics differed between contig and bin populations.



Giant Viruses, Nucleocytoviricota, viral metagenomics, metagenome-assembled genomes (MAGs), large DNA viruses