Identifying The Structure Of Genomic Islands In Prokaryotes

Date

2022-08-03

Publisher

Virginia Tech

Abstract

Prokaryotic genomes evolve via horizontal gene transfer (HGT), mutations, and rearrangements. HGT plays a significant role in prokaryotic evolution and contributes to biodiversity in nature. An important component of HGT is the genomic island (GI), a subsequence of the genome acquired through HGT. This research aims to identify the structure of prokaryotic GIs, which play a fundamental role in the adaptation of prokaryotes and in the impact of these species on the environment. Previous computational biology research has focused on developing tools that detect GIs in prokaryotic genomes, while little research has investigated GI structure. This research pursues an idea that has not yet been addressed intensively: identifying additional structure in the GIs of prokaryotes. The research follows two main directions, each studying prokaryotic GI structure from a different perspective. The first direction investigates GI patterns and the existence of biological connections across bacterial phyla in terms of GIs on a large scale. It mainly pursues the novel idea of connecting GIs across prokaryotic and phage genomes via patterns of protein families across many species. A pattern is a sequence of protein families that occurs frequently in the genomes of a number of species. The large data set available from the IslandViewer4 database is combined with protein families from the Pfam database. A comprehensive strategy for identifying patterns is implemented using HMMER, BLAST, and MUSCLE, together with Python programs that link the analysis into a single pipeline. The results demonstrate that related GIs often exist in multiple species that are not evolutionarily related and indeed may be from multiple bacterial phyla.
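
To make the definition of a pattern concrete, the following is a minimal sketch, not the pipeline used in this research (which relies on HMMER, BLAST, and MUSCLE). It assumes each GI has already been reduced to an ordered list of protein family identifiers, and it treats a pattern as a contiguous run of k families found in at least a minimum number of species. The function name, the parameters k and min_species, and the toy species and family labels are all hypothetical.

```python
from collections import defaultdict


def frequent_family_patterns(islands, k=3, min_species=2):
    """Return contiguous runs of k protein families seen in >= min_species species.

    islands: iterable of (species_name, [family_id, ...]) pairs, one per GI.
    Gene orientation is ignored in this simplified sketch.
    """
    species_by_pattern = defaultdict(set)
    for species, families in islands:
        # Slide a window of k families along the island.
        for i in range(len(families) - k + 1):
            pattern = tuple(families[i:i + k])
            species_by_pattern[pattern].add(species)
    # Keep only the patterns shared by enough distinct species.
    return {
        pattern: sorted(species)
        for pattern, species in species_by_pattern.items()
        if len(species) >= min_species
    }


# Toy input with made-up species labels and placeholder family names.
toy_islands = [
    ("species_A", ["famA", "famB", "famC", "famD"]),
    ("species_B", ["famX", "famA", "famB", "famC"]),
    ("phage_1", ["famA", "famB", "famC"]),
    ("species_C", ["famB", "famC", "famD"]),
]
patterns = frequent_family_patterns(toy_islands, k=3, min_species=2)
```

In this toy data, the run famA, famB, famC is shared by a phage and two prokaryotic species, which is the kind of cross-genome occurrence that the abstract describes as a connection.
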
Analysis of the discovered patterns led to the identification of biological connections among prokaryotes and phages through their GIs. A connection is an HGT relation represented as a pattern that exists in a phage and a number of prokaryotic species. These connections suggest quite broad HGT across the bacterial kingdom and its associated phages. In addition, they provide the basis for further analysis of the breadth of HGT and for the identification of individual HGT events that span bacterial phyla. Moreover, these patterns can serve as a basis for discovering specific patterns in pathogenic GIs that could play a crucial role in antibiotic resistance. The second direction aims to identify the structure of the GIs in terms of their location within the genome. Prokaryotic GIs are analyzed according to the structure of the genome in which they are located, whether circular or linear. The analysis studies the location of the GIs in relation to the origin of replication (oriC), investigates the distances between GIs, and determines the distribution of GIs in the genome. It is performed on all of the GIs in the data set. In addition, to avoid bias, the GIs in one genome from each species and the GIs of the most frequent species in the data set are analyzed separately. Overall, the results show that there are preferred sites for GIs in the genome. In linear genomes, they are usually located in the area of the origin of replication and at the terminus; in circular genomes, they are located at the terminus.
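
The location analysis for circular genomes can be illustrated with a short sketch. It is not the code used in this research. It assumes that positions are base-pair coordinates, that the terminus lies roughly opposite oriC, and that GIs are given as non-overlapping (start, end) intervals. The function names and the example coordinates are hypothetical.

```python
def relative_distance_from_oric(position, oric, genome_length):
    """Shortest distance around a circular genome from oriC to a position,
    scaled so that 0.0 is at oriC and 1.0 is the point opposite oriC."""
    offset = abs(position - oric) % genome_length
    shortest = min(offset, genome_length - offset)
    return shortest / (genome_length / 2)


def gaps_between_islands(islands, genome_length):
    """Distances between consecutive GIs on a circular genome.

    islands: list of non-overlapping (start, end) intervals with start < end.
    The last gap wraps around from the final island back to the first.
    """
    ordered = sorted(islands)
    gaps = [
        next_start - end
        for (_, end), (next_start, _) in zip(ordered, ordered[1:])
    ]
    first_start = ordered[0][0]
    last_end = ordered[-1][1]
    gaps.append(first_start + genome_length - last_end)
    return gaps


# Made-up coordinates for illustration only.
at_terminus = relative_distance_from_oric(2_500_000, 0, 5_000_000)
near_origin = relative_distance_from_oric(4_900_000, 100_000, 5_000_000)
example_gaps = gaps_between_islands([(500, 600), (100, 200)], 1000)
```

For a linear genome, the wrap-around terms would be dropped and the plain absolute difference between the position and the origin of replication would be used instead.
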

Keywords

Prokaryotes, Genomic islands, Patterns, Bacteriophages, Connections
