PATtyFams: Protein Families for the Microbial Genomes in the PATRIC Database

dc.contributor.author: Davis, James J. [en]
dc.contributor.author: Gerdes, Svetlana [en]
dc.contributor.author: Olsen, Gary J. [en]
dc.contributor.author: Olson, Robert [en]
dc.contributor.author: Pusch, Gordon D. [en]
dc.contributor.author: Shukla, Maulik [en]
dc.contributor.author: Vonstein, Veronika [en]
dc.contributor.author: Wattam, Alice R. [en]
dc.contributor.author: Yoo, Hyunseung [en]
dc.date.accessioned: 2019-03-15T16:33:29Z [en]
dc.date.available: 2019-03-15T16:33:29Z [en]
dc.date.issued: 2016-02-08 [en]
dc.description.abstract: The ability to build accurate protein families is a fundamental operation in bioinformatics that influences comparative analyses, genome annotation, and metabolic modeling. For several years we have been maintaining protein families for all microbial genomes in the PATRIC database (Pathosystems Resource Integration Center, patricbrc.org) in order to drive many of the comparative analysis tools that are available through the PATRIC website. However, due to the burgeoning number of genomes, traditional approaches for generating protein families are becoming prohibitive. In this report, we describe a new approach for generating protein families, which we call PATtyFams. This method uses the k-mer-based function assignments available through RAST (Rapid Annotation using Subsystem Technology) to rapidly guide family formation, and then differentiates the function-based groups into families using a Markov Cluster algorithm (MCL). This new approach for generating protein families is rapid, scalable and has properties that are consistent with alignment-based methods. [en]
dc.format.mimetype: application/pdf [en]
dc.identifier.doi: https://doi.org/10.3389/fmicb.2016.00118 [en]
dc.identifier.issn: 1664-302X [en]
dc.identifier.other: 118 [en]
dc.identifier.pmid: 26903996 [en]
dc.identifier.uri: http://hdl.handle.net/10919/88459 [en]
dc.identifier.volume: 7 [en]
dc.language.iso: en_US [en]
dc.publisher: Frontiers [en]
dc.rights: Creative Commons Attribution 4.0 International [en]
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/ [en]
dc.title: PATtyFams: Protein Families for the Microbial Genomes in the PATRIC Database [en]
dc.title.serial: Frontiers in Microbiology [en]
dc.type: Article - Refereed [en]
dc.type.dcmitype: Text [en]

Files

Original bundle
Name: fmicb-07-00118.pdf
Size: 4.56 MB
Format: Adobe Portable Document Format