Updating microbial genomic sequences: improving accuracy & innovation

Date

2014-11-07

Abstract

Background: Many bacterial genome sequences completed using the Sanger method may contain assembly errors, due in part to low sequence coverage driven by cost.

Findings: To illustrate the need for re-sequencing pre-next-generation genomes and to validate sequenced genomes, we conducted a series of experiments in which genomic DNA from Bacteroides fragilis NCTC 9343, Salmonella enterica subsp. enterica serovar Paratyphi A str. ATCC 9150, Vibrio cholerae O1 biovar El Tor str. N16961, Bacillus halodurans C-125 and Caulobacter crescentus CB15 was re-sequenced at high coverage on an Illumina MiSeq sequencer. All five genomes had previously been sequenced by the Sanger method during the early 2000s.

Conclusions: This study revealed a number of discrepancies between the published assemblies and the sequence read alignments for all five bacterial species, suggesting that continued use of these error-containing genomes and the genetic information derived from them may lead to false conclusions and incorrect future discoveries.

Citation

BioData Mining. 2014 Nov 07;7(1):25