Identifying multi-hit carcinogenic gene combinations: Scaling up a weighted set cover algorithm using compressed binary matrix representation on a GPU

dc.contributor.author: Al Hajri, Qais [en]
dc.contributor.author: Dash, Sajal [en]
dc.contributor.author: Feng, Wu-chun [en]
dc.contributor.author: Garner, Harold R. [en]
dc.contributor.author: Anandakrishnan, Ramu [en]
dc.contributor.department: Electrical and Computer Engineering [en]
dc.contributor.department: Computer Science [en]
dc.contributor.department: Biomedical Sciences and Pathobiology [en]
dc.date.accessioned: 2021-09-29T16:23:30Z [en]
dc.date.available: 2021-09-29T16:23:30Z [en]
dc.date.issued: 2020-02-06 [en]
dc.date.updated: 2021-09-29T16:23:22Z [en]
dc.description.abstract: Despite decades of research, effective treatments for most cancers remain elusive. One reason is that different instances of cancer result from different combinations of multiple genetic mutations (hits). Therefore, treatments that may be effective in some cases are not effective in others. We previously developed an algorithm for identifying combinations of carcinogenic genes with mutations (multi-hit combinations), which could suggest a likely cause for individual instances of cancer. Most cancers are estimated to require three or more hits. However, the computational complexity of the algorithm scales exponentially with the number of hits, making it impractical for identifying combinations of more than two hits. To identify combinations of greater than two hits, we used a compressed binary matrix representation, and optimized the algorithm for parallel execution on an NVIDIA V100 graphics processing unit (GPU). With these enhancements, the optimized GPU implementation was on average an estimated 12,144 times faster than the original integer matrix based CPU implementation, for the 3-hit algorithm, allowing us to identify 3-hit combinations. The 3-hit combinations identified using a training set were able to differentiate between tumor and normal samples in a separate test set with 90% overall sensitivity and 93% overall specificity. We illustrate how the distribution of mutations in tumor and normal samples in the multi-hit gene combinations can suggest potential driver mutations for further investigation. With experimental validation, these combinations may provide insight into the etiology of cancer and a rational basis for targeted combination therapy. [en]
dc.description.version: Published version [en]
dc.format.extent: 18 page(s) [en]
dc.format.mimetype: application/pdf [en]
dc.identifier: ARTN 2022 (Article number) [en]
dc.identifier.doi: https://doi.org/10.1038/s41598-020-58785-y [en]
dc.identifier.eissn: 2045-2322 [en]
dc.identifier.issn: 2045-2322 [en]
dc.identifier.issue: 1 [en]
dc.identifier.other: 10.1038/s41598-020-58785-y (PII) [en]
dc.identifier.pmid: 32029803 [en]
dc.identifier.uri: http://hdl.handle.net/10919/105105 [en]
dc.identifier.volume: 10 [en]
dc.language.iso: en [en]
dc.publisher: Nature Publishing Group [en]
dc.relation.uri: http://gateway.webofknowledge.com/gateway/Gateway.cgi?GWVersion=2&SrcApp=PARTNER_APP&SrcAuth=LinksAMR&KeyUT=WOS:000559759500020&DestLinkType=FullRecord&DestApp=ALL_WOS&UsrCustomerID=930d57c9ac61a043676db62af60056c1 [en]
dc.rights: Creative Commons Attribution 4.0 International [en]
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/ [en]
dc.subject: cancer driver genes [en]
dc.subject: somatic mutations [en]
dc.subject: breast-cancer [en]
dc.subject: p53 [en]
dc.subject: ovarian [en]
dc.subject: tp53 [en]
dc.subject: instability [en]
dc.subject: expression [en]
dc.subject: mutants [en]
dc.subject: gain [en]
dc.subject.mesh: Humans [en]
dc.subject.mesh: Neoplasms [en]
dc.subject.mesh: Antineoplastic Combined Chemotherapy Protocols [en]
dc.subject.mesh: Oligonucleotide Array Sequence Analysis [en]
dc.subject.mesh: Computational Biology [en]
dc.subject.mesh: Mutation [en]
dc.subject.mesh: Algorithms [en]
dc.subject.mesh: Time Factors [en]
dc.subject.mesh: Computer Graphics [en]
dc.subject.mesh: Molecular Targeted Therapy [en]
dc.subject.mesh: Carcinogenesis [en]
dc.subject.mesh: Datasets as Topic [en]
dc.subject.mesh: Biomarkers, Tumor [en]
dc.subject.mesh: Precision Medicine [en]
dc.title: Identifying multi-hit carcinogenic gene combinations: Scaling up a weighted set cover algorithm using compressed binary matrix representation on a GPU [en]
dc.title.serial: Scientific Reports [en]
dc.type: Article - Refereed [en]
dc.type.dcmitype: Text [en]
dc.type.other: Article [en]
dc.type.other: Journal [en]
dcterms.dateAccepted: 2020-01-20 [en]
pubs.organisational-group: /Virginia Tech [en]
pubs.organisational-group: /Virginia Tech/Engineering [en]
pubs.organisational-group: /Virginia Tech/Engineering/Computer Science [en]
pubs.organisational-group: /Virginia Tech/Faculty of Health Sciences [en]
pubs.organisational-group: /Virginia Tech/All T&R Faculty [en]
pubs.organisational-group: /Virginia Tech/Engineering/COE T&R Faculty [en]
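
Illustrative sketch (not part of the repository record, and not the authors' code): the abstract describes a weighted set cover search over a compressed binary gene-by-sample matrix. The Python below shows one way that idea can work on a CPU: each gene's row of 0/1 mutation flags is packed into a single integer, so the samples in which every gene of a combination is mutated fall out of a bitwise AND followed by a bit count. The function names, the toy scoring rule (true positives plus true negatives, unweighted) and the greedy loop are all assumptions made for illustration; the published work uses its own weighting and a CUDA implementation on an NVIDIA V100.

```python
from itertools import combinations


def pack_rows(matrix):
    """Pack each row of a 0/1 matrix (genes x samples) into one Python int.

    Bit j of the result is set when sample j carries a mutation in that gene.
    """
    packed = []
    for row in matrix:
        bits = 0
        for j, value in enumerate(row):
            if value:
                bits |= 1 << j
        packed.append(bits)
    return packed


def popcount(x):
    """Number of set bits in a non-negative integer."""
    return bin(x).count("1")


def best_combination(tumor, normal, n_normal, hits):
    """Return (score, combo, covered_tumor_bits) for the top-scoring combination.

    The score here is a hypothetical, unweighted TP + TN; ties keep the
    first combination in itertools.combinations order.
    """
    best = None
    for combo in combinations(range(len(tumor)), hits):
        t = tumor[combo[0]]
        n = normal[combo[0]]
        for g in combo[1:]:
            t &= tumor[g]   # tumor samples mutated in every gene so far
            n &= normal[g]  # normal samples mutated in every gene so far
        score = popcount(t) + (n_normal - popcount(n))
        if best is None or score > best[0]:
            best = (score, combo, t)
    return best


def greedy_cover(tumor_matrix, normal_matrix, hits=3):
    """Greedily pick combinations until no further tumor sample gets covered."""
    n_tumor = len(tumor_matrix[0])
    n_normal = len(normal_matrix[0])
    tumor = pack_rows(tumor_matrix)
    normal = pack_rows(normal_matrix)
    uncovered = (1 << n_tumor) - 1  # one bit per still-uncovered tumor sample
    chosen = []
    while uncovered:
        masked = [row & uncovered for row in tumor]
        _, combo, covered = best_combination(masked, normal, n_normal, hits)
        if covered == 0:
            break  # the top-scoring combination covers nothing new
        chosen.append(combo)
        uncovered &= ~covered
    return chosen
```

The exhaustive loop over `combinations` is what grows so quickly with the number of hits; packing rows into machine words only reduces the cost of evaluating each combination, which is the part a GPU can then evaluate in parallel.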

Files

Original bundle
Name: Identifying multi-hit carcinogenic gene combinations Scaling up a weighted set cover algorithm using compressed binary matri.pdf
Size: 10.55 MB
Format: Adobe Portable Document Format
Description: Published version