Identifying multi-hit carcinogenic gene combinations: Scaling up a weighted set cover algorithm using compressed binary matrix representation on a GPU
dc.contributor.author | Al Hajri, Qais | en |
dc.contributor.author | Dash, Sajal | en |
dc.contributor.author | Feng, Wu-chun | en |
dc.contributor.author | Garner, Harold R. | en |
dc.contributor.author | Anandakrishnan, Ramu | en |
dc.contributor.department | Electrical and Computer Engineering | en |
dc.contributor.department | Computer Science | en |
dc.contributor.department | Biomedical Sciences and Pathobiology | en |
dc.date.accessioned | 2021-09-29T16:23:30Z | en |
dc.date.available | 2021-09-29T16:23:30Z | en |
dc.date.issued | 2020-02-06 | en |
dc.date.updated | 2021-09-29T16:23:22Z | en |
dc.description.abstract | Despite decades of research, effective treatments for most cancers remain elusive. One reason is that different instances of cancer result from different combinations of multiple genetic mutations (hits). Therefore, treatments that may be effective in some cases are not effective in others. We previously developed an algorithm for identifying combinations of carcinogenic genes with mutations (multi-hit combinations), which could suggest a likely cause for individual instances of cancer. Most cancers are estimated to require three or more hits. However, the computational complexity of the algorithm scales exponentially with the number of hits, making it impractical for identifying combinations of more than two hits. To identify combinations of greater than two hits, we used a compressed binary matrix representation, and optimized the algorithm for parallel execution on an NVIDIA V100 graphics processing unit (GPU). With these enhancements, the optimized GPU implementation was on average an estimated 12,144 times faster than the original integer matrix based CPU implementation, for the 3-hit algorithm, allowing us to identify 3-hit combinations. The 3-hit combinations identified using a training set were able to differentiate between tumor and normal samples in a separate test set with 90% overall sensitivity and 93% overall specificity. We illustrate how the distribution of mutations in tumor and normal samples in the multi-hit gene combinations can suggest potential driver mutations for further investigation. With experimental validation, these combinations may provide insight into the etiology of cancer and a rational basis for targeted combination therapy. | en |
dc.description.version | Published version | en |
dc.format.extent | 18 page(s) | en |
dc.format.mimetype | application/pdf | en |
dc.identifier | ARTN 2022 (Article number) | en |
dc.identifier.doi | https://doi.org/10.1038/s41598-020-58785-y | en |
dc.identifier.eissn | 2045-2322 | en |
dc.identifier.issn | 2045-2322 | en |
dc.identifier.issue | 1 | en |
dc.identifier.other | 10.1038/s41598-020-58785-y (PII) | en |
dc.identifier.pmid | 32029803 | en |
dc.identifier.uri | http://hdl.handle.net/10919/105105 | en |
dc.identifier.volume | 10 | en |
dc.language.iso | en | en |
dc.publisher | Nature Publishing Group | en |
dc.relation.uri | http://gateway.webofknowledge.com/gateway/Gateway.cgi?GWVersion=2&SrcApp=PARTNER_APP&SrcAuth=LinksAMR&KeyUT=WOS:000559759500020&DestLinkType=FullRecord&DestApp=ALL_WOS&UsrCustomerID=930d57c9ac61a043676db62af60056c1 | en |
dc.rights | Creative Commons Attribution 4.0 International | en |
dc.rights.uri | http://creativecommons.org/licenses/by/4.0/ | en |
dc.subject | cancer driver genes | en |
dc.subject | somatic mutations | en |
dc.subject | breast-cancer | en |
dc.subject | p53 | en |
dc.subject | ovarian | en |
dc.subject | tp53 | en |
dc.subject | instability | en |
dc.subject | expression | en |
dc.subject | mutants | en |
dc.subject | gain | en |
dc.subject.mesh | Humans | en |
dc.subject.mesh | Neoplasms | en |
dc.subject.mesh | Antineoplastic Combined Chemotherapy Protocols | en |
dc.subject.mesh | Oligonucleotide Array Sequence Analysis | en |
dc.subject.mesh | Computational Biology | en |
dc.subject.mesh | Mutation | en |
dc.subject.mesh | Algorithms | en |
dc.subject.mesh | Time Factors | en |
dc.subject.mesh | Computer Graphics | en |
dc.subject.mesh | Molecular Targeted Therapy | en |
dc.subject.mesh | Carcinogenesis | en |
dc.subject.mesh | Datasets as Topic | en |
dc.subject.mesh | Biomarkers, Tumor | en |
dc.subject.mesh | Precision Medicine | en |
dc.title | Identifying multi-hit carcinogenic gene combinations: Scaling up a weighted set cover algorithm using compressed binary matrix representation on a GPU | en |
dc.title.serial | Scientific Reports | en |
dc.type | Article - Refereed | en |
dc.type.dcmitype | Text | en |
dc.type.other | Article | en |
dc.type.other | Journal | en |
dcterms.dateAccepted | 2020-01-20 | en |
pubs.organisational-group | /Virginia Tech | en |
pubs.organisational-group | /Virginia Tech/Engineering | en |
pubs.organisational-group | /Virginia Tech/Engineering/Computer Science | en |
pubs.organisational-group | /Virginia Tech/Faculty of Health Sciences | en |
pubs.organisational-group | /Virginia Tech/All T&R Faculty | en |
pubs.organisational-group | /Virginia Tech/Engineering/COE T&R Faculty | en |
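The abstract credits part of the speedup to a compressed binary matrix representation. The following is a minimal sketch of that general idea, not the authors' implementation: each gene's mutation flags across samples are packed into the bits of one integer, so testing whether several genes are mutated in the same sample becomes a bitwise AND followed by a bit count. The function names and the toy data are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def pack_row(bits):
    """Pack a list of 0/1 mutation flags (one per sample) into one integer."""
    word = 0
    for i, b in enumerate(bits):
        if b:
            word |= 1 << i
    return word


def samples_with_all_hits(packed_rows, genes):
    """Count samples in which every gene in `genes` is mutated."""
    acc = packed_rows[genes[0]]
    for g in genes[1:]:
        acc &= packed_rows[g]  # a bit survives only if all genes are mutated
    return bin(acc).count("1")  # population count of the surviving bits


# Toy data (made up): 4 genes x 6 tumor samples, 1 = gene mutated in sample.
tumor = [
    [1, 1, 0, 1, 0, 1],
    [1, 0, 0, 1, 1, 1],
    [1, 1, 1, 1, 0, 0],
    [0, 0, 1, 0, 1, 0],
]
packed = [pack_row(row) for row in tumor]

# Enumerate every 3-hit combination and count the samples it covers.
counts = {
    combo: samples_with_all_hits(packed, combo)
    for combo in combinations(range(len(packed)), 3)
}
```

The number of combinations grows rapidly with the number of hits, which is why the abstract describes moving this kind of enumeration to a GPU. On such hardware the same AND-and-count step would typically operate on fixed-width machine words rather than Python integers.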
Files
Original bundle
- Name: Identifying multi-hit carcinogenic gene combinations Scaling up a weighted set cover algorithm using compressed binary matri.pdf
- Size: 10.55 MB
- Format: Adobe Portable Document Format
- Description: Published version