Developing a Computational Pipeline for Detecting Multi-Functional Antibiotic Resistance Genes in Metagenomics Data

dc.contributor.author: Dang, Ngoc Khoi [en]
dc.contributor.committeechair: Zhang, Liqing [en]
dc.contributor.committeemember: Karpatne, Anuj [en]
dc.contributor.committeemember: Lourentzou, Ismini [en]
dc.contributor.department: Computer Science [en]
dc.date.accessioned: 2022-06-10T08:00:45Z [en]
dc.date.available: 2022-06-10T08:00:45Z [en]
dc.date.issued: 2022-06-09 [en]
dc.description.abstract: Antibiotic resistance is a global threat spanning clinical, environmental, and geopolitical research domains. The environment is increasingly recognized as a key node in the spread of antibiotic resistance genes (ARGs), which confer antibiotic resistance to bacteria. Detecting ARGs in the environment is the first step in monitoring and controlling antibiotic resistance. In recent years, next-generation sequencing of environmental samples (metagenomic sequencing) has become a widely used tool for surveillance. Metagenomic data are the nucleic acid sequences recovered from environmental samples, and they have been used for years to detect and analyze ARGs. An intriguing class of ARGs is the multi-functional ARG, in which one gene encodes two or more different antibiotic resistance functions. Because multi-functional ARGs provide resistance to two or more antibiotics, they should have an evolutionary advantage over ARGs that confer resistance to a single antibiotic. However, no tool is readily available to detect multi-functional ARGs in metagenomic data. In this study, we develop a computational pipeline for that purpose. The pipeline takes raw metagenomic data as input and generates a list of potential multi-functional ARGs. It also creates a plot for each potential multi-functional ARG, showing the location of each resistance function in the sequence and the sequencing coverage level. We collected samples from three different sources (influent samples of a wastewater treatment plant, hospital wastewater samples, and reclaimed water samples), ran the pipeline, and identified 19, 57, and 8 potentially bi-functional ARGs in each source, respectively. Manual inspection of the results identified three ARGs that are most likely bi-functional. Interestingly, one bi-functional ARG, encoding both aminoglycoside and tetracycline resistance, appeared in all three data sets, indicating its prevalence in different environments. As the amount of antibiotics in the environment keeps increasing, multi-functional ARGs might become more common. The pipeline will be a useful computational tool for initial screening and identification of multi-functional ARGs in metagenomic data. [en]
dc.description.abstractgeneral: Antibiotics are drugs used to fight bacterial infections. Since the first antibiotic was discovered in 1928, many antibiotic drugs have been developed. At the same time, scientists have discovered many genes responsible for resistance to antibiotic drugs. Today, antibiotic resistance is a global threat. Detecting antibiotic resistance genes (ARGs) in the environment is the first step toward monitoring and controlling antibiotic resistance. In recent years, next-generation sequencing has been widely used to obtain DNA sequences from environmental samples, and metagenomic analysis has been used to detect and analyze ARGs. The literature reports that a single gene can carry two sequence regions corresponding to two different ARGs, thus conferring resistance to two different antibiotics. This fusion might offer an evolutionary advantage. In this study, we developed a novel computational tool to detect such multi-functional ARGs. We collected data from three sources (wastewater treatment plant influent, hospital wastewater, and reclaimed water) and identified 19, 57, and 8 potential bi-functional ARGs in each source, respectively. After manually inspecting the results, we found three ARGs that are most likely bi-functional. We also found one bi-functional ARG that appears in all three data sets; this gene is responsible for aminoglycoside and tetracycline resistance. The tool will serve as an initial screening step for detecting multi-functional ARGs. [en]
dc.description.degree: Master of Science [en]
dc.format.medium: ETD [en]
dc.identifier.other: vt_gsexam:35014 [en]
dc.identifier.uri: http://hdl.handle.net/10919/110595 [en]
dc.language.iso: en [en]
dc.publisher: Virginia Tech [en]
dc.rights: In Copyright [en]
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/ [en]
dc.subject: multi-functional [en]
dc.subject: antibiotic resistance genes [en]
dc.subject: metagenomics [en]
dc.title: Developing a Computational Pipeline for Detecting Multi-Functional Antibiotic Resistance Genes in Metagenomics Data [en]
dc.type: Thesis [en]
thesis.degree.discipline: Computer Science and Applications [en]
thesis.degree.grantor: Virginia Polytechnic Institute and State University [en]
thesis.degree.level: masters [en]
thesis.degree.name: Master of Science [en]

Files

Original bundle
Name: Dang_N_T_2022.pdf
Size: 622.55 KB
Format: Adobe Portable Document Format
