Programming High-Performance Clusters with Heterogeneous Computing Devices

dc.contributor.author: Aji, Ashwin M.
dc.contributor.committeechair: Feng, Wu-chun
dc.contributor.committeemember: Ribbens, Calvin J.
dc.contributor.committeemember: Bisset, Keith R.
dc.contributor.committeemember: Marathe, Madhav Vishnu
dc.contributor.committeemember: Balaji, Pavan
dc.contributor.department: Computer Science
dc.date.accessioned: 2015-05-20T08:00:08Z
dc.date.available: 2015-05-20T08:00:08Z
dc.date.issued: 2015-05-19
dc.description.abstract: Today's high-performance computing (HPC) clusters are seeing an increase in the adoption of accelerators such as GPUs, FPGAs, and co-processors, leading to heterogeneity in the computation and memory subsystems. To program such systems, application developers typically employ a hybrid programming model of MPI across the compute nodes in the cluster and an accelerator-specific library (e.g., CUDA, OpenCL, OpenMP, OpenACC) across the accelerator devices within each compute node. Such explicit management of disjoint computation and memory resources leads to reduced productivity and performance. This dissertation focuses on designing, implementing, and evaluating a runtime system for HPC clusters with heterogeneous computing devices. This work also explores extending existing programming models to make use of our runtime system for easier code modernization of existing applications. Specifically, we present MPI-ACC, an extension to the popular MPI programming model and runtime system for efficient data movement and automatic task mapping across the CPUs and accelerators within a cluster, and discuss the lessons learned. MPI-ACC's task-mapping runtime subsystem performs fast and automatic device selection for a given task. MPI-ACC's data-movement subsystem includes careful optimizations for end-to-end communication among CPUs and accelerators, which are seamlessly leveraged by the application developers. MPI-ACC provides a familiar, flexible, and natural interface for programmers to choose the right computation or communication targets, while its runtime system achieves efficient cluster utilization.
dc.description.degree: Ph. D.
dc.format.medium: ETD
dc.identifier.other: vt_gsexam:4292
dc.identifier.uri: http://hdl.handle.net/10919/52366
dc.publisher: Virginia Tech
dc.rights: In Copyright
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/
dc.subject: Runtime Systems
dc.subject: Programming Models
dc.subject: General Purpose Graphics Processing Units (GPGPUs)
dc.subject: Message Passing Interface (MPI)
dc.subject: CUDA
dc.subject: OpenCL
dc.title: Programming High-Performance Clusters with Heterogeneous Computing Devices
dc.type: Dissertation
thesis.degree.discipline: Computer Science and Applications
thesis.degree.grantor: Virginia Polytechnic Institute and State University
thesis.degree.level: doctoral
thesis.degree.name: Ph. D.

Files

Original bundle
Name: Aji_AM_D_2015.pdf
Size: 9.92 MB
Format: Adobe Portable Document Format