Accurate and Efficient Gene Function Prediction using a Multi-Bacterial Network

dc.contributor.author: Law, Jeffrey N. [en]
dc.contributor.author: Kale, Shiv D. [en]
dc.contributor.author: Murali, T. M. [en]
dc.date.accessioned: 2020-11-03T14:26:58Z [en]
dc.date.available: 2020-11-03T14:26:58Z [en]
dc.date.issued: 2019-05-24 [en]
dc.description.abstract: The rapid rise in newly sequenced genomes requires the development of computational methods to supplement experimental functional annotations. The challenge that arises is to develop methods for gene function prediction that integrate information for multiple species while also operating on a genomewide scale. We develop a label propagation algorithm called FastSinkSource and apply it to a sequence similarity network integrated with species-specific heterogeneous data for 19 pathogenic bacterial species. By using mathematically-provable bounds on the rate of progress of FastSinkSource during power iteration, we decrease the running time by a factor of 100 or more without sacrificing prediction accuracy. To demonstrate scalability, we expand to a 73-million edge network across 200 bacterial species while maintaining accuracy and efficiency improvements. Our results point to the feasibility and promise of multi-species, genomewide gene function prediction, especially as more experimental data and annotations become available for a diverse variety of organisms. [en]
dc.description.sponsorship: The research is based upon work supported by the Office of the Director of National Intelligence (ODNI), Intelligence Advanced Research Projects Activity (IARPA), via the Army Research Office (ARO) under cooperative Agreement Number [W911NF-17-2-0105]. [en]
dc.format.extent: 34 pages [en]
dc.format.mimetype: application/pdf [en]
dc.identifier.doi: https://doi.org/10.1101/646687 [en]
dc.identifier.uri: http://hdl.handle.net/10919/100773 [en]
dc.language.iso: en [en]
dc.rights: Creative Commons Attribution 4.0 International [en]
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/ [en]
dc.title: Accurate and Efficient Gene Function Prediction using a Multi-Bacterial Network [en]
dc.title.serial: Virginia Tech [en]
dc.type: Article [en]
dc.type.dcmitype: Text [en]

Files

Original bundle
Name: 646687v1.full.pdf
Size: 4.44 MB
Format: Adobe Portable Document Format