Identifying The Structure Of Genomic Islands In Prokaryotes
Aldaihani, Reem A. A. H. S.
Prokaryotic genomes evolve via horizontal gene transfer (HGT), mutations, and rearrangements. HGT plays a significant role in prokaryotic evolution and contributes to biodiversity in nature. One important component of HGT is the genomic island (GI), a subsequence of the genome created by HGT. This research aims to identify the structures of prokaryotic GIs, which have a fundamental role in the adaptation of prokaryotes and in the impact of these species on the environment. Previous computational biology research has focused on developing tools that detect GIs in prokaryotic genomes, while little research has investigated GI structure. This research pursues an idea that has not yet been addressed intensively, namely identifying additional structures of the GIs in prokaryotes. The research follows two main directions, each studying prokaryotic GI structure from a different perspective. In the first direction, the aim is to investigate GI patterns and the existence of biological connections across bacterial phyla in terms of GIs on a large scale. This direction mainly pursues the novel idea of connecting GIs across prokaryotic and phage genomes via patterns of protein families across many species. A pattern is a sequence of protein families that occurs frequently in the genomes of a number of species. Here the large data set available from the IslandViewer4 database has been combined with protein families from the Pfam database. Furthermore, a comprehensive strategy to identify patterns has been implemented that makes use of HMMER, BLAST, and MUSCLE, together with Python programs that link the analysis into a single pipeline. The results demonstrate that related GIs often exist in multiple species that are not evolutionarily related and indeed may be from multiple bacterial phyla.
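The definition of a pattern given above (a sequence of protein families found in the genomes of several species) can be illustrated with a minimal sketch. This is not the dissertation's pipeline, which relies on HMMER, BLAST, and MUSCLE. The function name `frequent_patterns`, the fixed pattern length, the `min_species` threshold, and the toy family labels are all assumptions made for illustration. The sketch also assumes each GI has already been reduced to an ordered list of protein family labels.

```python
from collections import defaultdict


def frequent_patterns(islands, length=2, min_species=2):
    """Return contiguous runs of protein families shared by several species.

    islands: iterable of (species, [family, family, ...]) pairs, one per GI.
    length: number of consecutive families in a candidate pattern (assumed).
    min_species: how many distinct species must contain the pattern (assumed).
    """
    species_by_pattern = defaultdict(set)
    for species, families in islands:
        # Slide a window of the chosen length along the GI's family list.
        for i in range(len(families) - length + 1):
            pattern = tuple(families[i:i + length])
            species_by_pattern[pattern].add(species)
    # Keep only patterns seen in enough distinct species.
    return {
        pattern: sorted(species)
        for pattern, species in species_by_pattern.items()
        if len(species) >= min_species
    }


# Hypothetical toy input: the labels are placeholders, not real Pfam hits.
toy_islands = [
    ("species_A", ["famX", "famY", "famZ"]),
    ("species_B", ["famQ", "famX", "famY"]),
    ("phage_1", ["famX", "famY"]),
    ("species_C", ["famZ", "famQ"]),
]
shared = frequent_patterns(toy_islands, length=2, min_species=3)
```

In this toy input, only the run `("famX", "famY")` appears in three genomes, one of which is a phage. That mirrors the idea of a connection as a pattern shared by a phage and several prokaryotic species.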
Analysis of the discovered patterns led to the identification of biological connections among prokaryotes and phages through their GIs. A connection is an HGT relation represented as a pattern that exists in a phage and a number of prokaryotic species. These discovered connections suggest quite broad HGT connections across the bacterial kingdom and its associated phages. In addition, these connections provide the basis for additional analysis of the breadth of HGT and the identification of individual HGT events that span bacterial phyla. Moreover, these patterns can provide a basis for discovering specific patterns in pathogenic GIs that could play a crucial role in antibiotic resistance. The second direction aims to identify the structure of the GIs in terms of their location within the genome. Prokaryotic GIs have been analyzed according to the structure of the genome in which they are located, whether circular or linear. The analysis studies the GIs' location in relation to the oriC, investigates the nature of the distances between the GIs, and determines the distribution of GIs in the genome. The analysis has been performed on all of the GIs in the data set. In addition, to avoid bias, it has been performed on the GIs in one genome from each species and on the GIs of the most frequent species in the data set. Overall, the results showed that there are preferred sites for the GIs in the genome. In the linear genomes, they are usually located in the origin of replication area and at the terminus, and in the circular genomes they are located at the terminus.
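The notion of a GI's location in relation to the oriC can be made concrete with a small sketch. The abstract does not state how the dissertation measures this, so the midpoint-based distance below is an assumption, as are the function name `distance_to_oric` and the example coordinates. The sketch further assumes that the GI does not span the coordinate origin of a circular genome.

```python
def distance_to_oric(gi_start, gi_end, oric, genome_length, circular=True):
    """Distance in base pairs from a GI midpoint to the origin of replication.

    On a circular genome the shorter of the two arcs is used; on a linear
    genome the plain coordinate difference is used. All coordinates are
    assumed to lie on the same strand-independent scale.
    """
    midpoint = (gi_start + gi_end) / 2
    direct = abs(midpoint - oric)
    if circular:
        # The other arc around the circle may be shorter.
        return min(direct, genome_length - direct)
    return direct


# Hypothetical coordinates for illustration only.
circular_distance = distance_to_oric(880, 920, oric=100, genome_length=1000)
linear_distance = distance_to_oric(880, 920, oric=100, genome_length=1000,
                                   circular=False)
```

With these made-up coordinates the GI midpoint is 900, so the circular distance is 200 (the short arc through the coordinate origin) while the linear distance is 800. Dividing the circular distance by half the genome length would give a relative position between 0 (at the oriC) and 1 (opposite the oriC), under the simplifying assumption that the terminus lies opposite the origin.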
General Audience Abstract
Prokaryotes are among the most abundant organisms on earth and play an essential role in naturally shaping the planet and its life. This research aims to identify the structure of a component in these species that has a fundamental role in the adaptation of prokaryotes and in the impact of the species on the environment. This component is a part of the genome named the genomic island (GI). This dissertation aims to identify the structure of the GIs in two different ways that have not yet been addressed extensively. The first direction aims to discover patterns in the GIs and then use them to bring to light biological connections between prokaryotes and bacteriophages. In this direction, a comprehensive strategy has been utilized to identify patterns and connections. This strategy uses several tools such as BLAST, HMMER, and MUSCLE. Furthermore, Python programs that link the analysis into a single pipeline have been implemented. In the second direction, an investigation has been performed to understand the nature of the GIs' locations within the genome. This direction uses three different analyses to achieve its target: studying the GIs' location in relation to the origin of replication, investigating the nature of the distances between the GIs, and discovering the location distribution of GIs in the genome. The analysis is performed on linear genomes and circular genomes separately. For each group of GIs, the data set has been used to view the results from different perspectives. The overall analysis in both directions revealed several findings. In the first direction, the discovered patterns merit deep investigation based on the possibility that they are related to diseases. In addition, in prokaryotic genomes, there are specific sites where GIs are frequently seen; these sites need further study to understand the relation between a GI's location and its protein content.
- Doctoral Dissertations