Concurrency Optimization for Integrative Network Analysis
Virginia Tech's Computational Bioinformatics and Bio-imaging Laboratory (CBIL) is exploring integrative network analysis techniques to identify subnetworks or genetic pathways that contribute to various cancers. Chen et. al. developed a bagging Markov random field (BMRF)-based approach which examines gene expression data with prior biological information to reliably identify significant genes and proteins. Using random resampling with replacement (bootstrapping or bagging) is essential to confident results but is computationally demanding as multiple iterations of the network identification (by simulated annealing) is required. The MATLAB implementation is computationally demanding, employs limited concurrency, and thus time prohibitive. Using strong software development discipline we optimize BMRF using algorithmic, compiler, and concurrency techniques (including Nvidia GPUs) to alleviate the wall clock time needed for analysis of large-scale genomic data. Particularly, we decompose the BMRF algorithm into functional blocks, implement the algorithm in C/C++ and further explore the C/C++ implementation with concurrency optimization. Experiments are conducted with simulation and real data to demonstrate that a significant speedup of BMRF can be achieved by exploiting concurrency opportunities. We believe that the experience gained by this research shall help pave the way for us to develop computationally efficient algorithms leveraging concurrency, enabling researchers to efficiently analyze larger-scale data sets essential for furthering cancer research.