Methods for Analysis of Prokaryotic Genome Architecture
Warren, Andrew Scott
Research in comparative microbial genomics has largely been organized around the concept of reference genomes. Reference genomes provide a useful comparative touchstone for closely related organisms. However, they do not necessarily represent the biological diversity in a group of genomes well. As sampling through sequencing becomes both deeper and broader, reference-genome-based methods become less effective at characterizing groups of organisms. We present an algorithm for creating whole-genome multiple sequence comparisons and a model for representing the similarities and differences among sequences as a graph of syntenic gene families called a pan-synteny graph. As the evolutionary distance between organisms increases, sequence similarity and homology detection tend to break down. However, similarities in the functional characteristics of certain genes and gene modules may persist or have converged over time. Detecting and defining patterns in these functional similarities, in relation to conserved gene order, is a largely unexplored problem. To create a model for representing the architectural similarity of functional modules using ontologies and semantic similarity, we present a corpus-independent semantic similarity method and describe a computational framework for using semantic similarity and pan-synteny graphs.
- Doctoral Dissertations