MPI-ACC: Accelerator-Aware MPI for Scientific Applications

dc.contributor.author: Aji, Ashwin M.
dc.contributor.author: Panwar, Lokendra S.
dc.contributor.author: Ji, Feng
dc.contributor.author: Murthy, Karthik
dc.contributor.author: Chabbi, Milind
dc.contributor.author: Balaji, Pavan
dc.contributor.author: Bisset, Keith R.
dc.contributor.author: Dinan, James
dc.contributor.author: Feng, Wu-chun
dc.contributor.author: Mellor-Crummey, John
dc.contributor.author: Ma, Xiaosong
dc.contributor.author: Thakur, Rajeev
dc.contributor.department: Computer Science
dc.contributor.department: Fralin Life Sciences Institute
dc.date.accessioned: 2017-03-17T09:01:39Z
dc.date.available: 2017-03-17T09:01:39Z
dc.date.issued: 2016-05-01
dc.description.abstract: Data movement in high-performance computing systems accelerated by graphics processing units (GPUs) remains a challenging problem. Data communication in popular parallel programming models, such as the Message Passing Interface (MPI), is currently limited to data stored in the CPU memory space. Auxiliary memory systems, such as GPU memory, are not integrated into such data movement standards, leaving applications with no direct mechanism to perform end-to-end data movement. We introduce MPI-ACC, an integrated and extensible framework that allows end-to-end data movement in accelerator-based systems. MPI-ACC provides productivity and performance benefits by integrating support for auxiliary memory spaces into MPI. It supports data transfer among CUDA, OpenCL, and CPU memory spaces and is extensible to other offload models as well. MPI-ACC's runtime system enables several key optimizations, including pipelining of data transfers, scalable memory management techniques, and balancing of communication based on accelerator and node architecture. MPI-ACC is designed to work concurrently with other GPU workloads with minimal contention. We describe how MPI-ACC can be used to design new communication-computation patterns in scientific applications from domains such as epidemiology simulation and seismology modeling, and we discuss the lessons learned. We present experimental results on a state-of-the-art cluster with hundreds of GPUs, and we compare the performance and productivity of MPI-ACC with MVAPICH, a popular CUDA-aware MPI solution. MPI-ACC encourages programmers to explore novel application-specific optimizations for improved overall cluster utilization.
dc.description.version: Published version
dc.format.extent: 1401 - 1414 page(s)
dc.format.mimetype: application/pdf
dc.identifier.doi: https://doi.org/10.1109/TPDS.2015.2446479
dc.identifier.issn: 1045-9219
dc.identifier.issue: 5
dc.identifier.uri: http://hdl.handle.net/10919/76661
dc.identifier.volume: 27
dc.language.iso: en
dc.rights: In Copyright
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/
dc.title: MPI-ACC: Accelerator-Aware MPI for Scientific Applications
dc.title.serial: IEEE Transactions on Parallel and Distributed Systems
dc.type: Article - Refereed
dc.type.dcmitype: Text
pubs.organisational-group: /Virginia Tech
pubs.organisational-group: /Virginia Tech/All T&R Faculty
pubs.organisational-group: /Virginia Tech/Engineering
pubs.organisational-group: /Virginia Tech/Engineering/COE T&R Faculty
pubs.organisational-group: /Virginia Tech/Engineering/Computer Science
pubs.organisational-group: /Virginia Tech/Faculty of Health Sciences
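
Illustrative example (not from the paper): the abstract describes passing accelerator-resident buffers directly to MPI communication calls instead of staging them through host memory. The sketch below shows that usage pattern in CUDA + MPI, assuming a CUDA-aware MPI build such as MPI-ACC or MVAPICH; the exact MPI-ACC interface for identifying CUDA or OpenCL buffers may differ from what is shown here.

    /* Hypothetical sketch: sending a GPU-resident buffer directly through MPI.
     * Assumes an accelerator-aware MPI library; without one, the buffer would
     * first have to be copied to host memory with cudaMemcpy on both ranks. */
    #include <mpi.h>
    #include <cuda_runtime.h>

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);
        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        const int n = 1 << 20;              /* 1M doubles */
        double *d_buf;                      /* device (GPU) pointer */
        cudaMalloc((void **)&d_buf, n * sizeof(double));

        if (rank == 0) {
            cudaMemset(d_buf, 0, n * sizeof(double));
            /* Device pointer passed directly to MPI; the accelerator-aware
             * runtime pipelines the GPU-to-network transfer internally. */
            MPI_Send(d_buf, n, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            MPI_Recv(d_buf, n, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
        }

        cudaFree(d_buf);
        MPI_Finalize();
        return 0;
    }

Without accelerator awareness, each rank would need explicit host staging buffers and extra cudaMemcpy calls around the MPI_Send/MPI_Recv pair, which is the productivity and performance cost the abstract says MPI-ACC removes.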

Files

Original bundle
Name: aji-mpi-acc-tpds15.pdf
Size: 2.44 MB
Format: Adobe Portable Document Format
Description: Accepted Version
License bundle
Name: VTUL_Distribution_License_2016_05_09.pdf
Size: 18.09 KB
Format: Adobe Portable Document Format
Description: