Improving somatic variant identification through integration of genome and exome data

dc.contributor.author | Vijayan, Vinaya | en
dc.contributor.author | Yiu, Siu-Ming | en
dc.contributor.author | Zhang, Liqing | en
dc.contributor.department | Computer Science | en
dc.date.accessioned | 2017-10-24T14:12:29Z | en
dc.date.available | 2017-10-24T14:12:29Z | en
dc.date.issued | 2017-10-16 | en
dc.date.updated | 2017-10-23T16:15:14Z | en
dc.description.abstract | Background: Cost-effective high-throughput sequencing technologies, together with efficient mapping and variant calling tools, have made it possible to identify somatic variants for cancer studies. However, integrating somatic variants from whole exome and whole genome studies poses a challenge to researchers, as the variants identified by whole genome analysis may not be identified by whole exome analysis and vice versa. Simply taking the union or intersection of the results may lead to too many false positives or too many false negatives. Results: To tackle this problem, we use machine learning models to integrate whole exome and whole genome calling results from two representative tools, VCMM (with the highest sensitivity but very low precision) and MuTect (with the highest precision). The evaluation results, based on both simulated and real data, show that our framework improves somatic variant calling and is more accurate in identifying somatic variants than either individual method used alone or than using variants identified from only whole genome data or only whole exome data. Conclusion: Using a machine learning approach to combine results from multiple calling methods on multiple data platforms (e.g., genome and exome) enables more accurate identification of somatic variants. | en
dc.description.version | Published version | en
dc.format.mimetype | application/pdf | en
dc.identifier.doi | https://doi.org/10.1186/s12864-017-4134-3 | en
dc.identifier.uri | http://hdl.handle.net/10919/79757 | en
dc.language.iso | en | en
dc.rights | Creative Commons Attribution 4.0 International | en
dc.rights.holder | The Author(s) | en
dc.rights.uri | http://creativecommons.org/licenses/by/4.0/ | en
dc.title | Improving somatic variant identification through integration of genome and exome data | en
dc.type | Article - Refereed | en
dc.type.dcmitype | Text | en

Files

Original bundle
Name: 12864_2017_Article_4134.pdf
Size: 730.06 KB
Format: Adobe Portable Document Format