Multi-level Parallelism with MPI and OpenACC for CFD Applications

McCall, Andrew James

dc.contributor.author	McCall, Andrew James	en
dc.contributor.committeechair	Roy, Christopher J.	en
dc.contributor.committeemember	Paterson, Eric G.	en
dc.contributor.committeemember	de Sturler, Eric	en
dc.contributor.department	Aerospace and Ocean Engineering	en
dc.date.accessioned	2017-06-15T08:00:47Z	en
dc.date.available	2017-06-15T08:00:47Z	en
dc.date.issued	2017-06-14	en
dc.description.abstract	High-level parallel programming approaches, such as OpenACC, have recently become popular in complex fluid dynamics research since they are cross-platform and easy to implement. OpenACC is a directive-based programming model that, unlike low-level programming models, abstracts the details of implementation on the GPU. Although OpenACC generally limits the performance of the GPU, this model significantly reduces the work required to port an existing code to any accelerator platform, including GPUs. The purpose of this research is twofold: to investigate the effectiveness of OpenACC in developing a portable and maintainable GPU-accelerated code, and to determine the capability of OpenACC to accelerate large, complex programs on the GPU. In both of these studies, the OpenACC implementation is optimized and extended to a multi-GPU implementation while maintaining a unified code base. OpenACC is shown as a viable option for GPU computing with CFD problems. In the first study, a CFD code that solves incompressible cavity flows is accelerated using OpenACC. Overlapping communication with computation improves performance for the multi-GPU implementation by up to 21%, achieving up to 400 times faster performance than a single CPU and 99% weak scalability efficiency with 32 GPUs. The second study ports the execution of a more complex CFD research code to the GPU using OpenACC. Challenges using OpenACC with modern Fortran are discussed. Three test cases are used to evaluate performance and scalability. The multi-GPU performance using 27 GPUs is up to 100 times faster than a single CPU and maintains a weak scalability efficiency of 95%.	en
dc.description.abstractgeneral	The research and analysis performed in scientific computing today produces an ever-increasing demand for faster and more energy efficient performance. Parallel computing with supercomputers that use many central processing units (CPUs) is the current standard for satisfying these demands. The use of graphics processing units (GPUs) for scientific computing applications is an emerging technology that has gained a lot of popularity in the past decade. A single GPU can distribute the computations required by a program over thousands of processing units. This research investigates the effectiveness of a relatively new standard, called OpenACC, for offloading execution of a program to the GPU. The most widely used standards today are highly complex and require low-level, detailed knowledge of the GPU’s architecture. These issues significantly reduce the maintainability and portability of a program. OpenACC does not require rewriting a program for the GPU. Instead, the developer annotates regions of code to run on the GPU and only has to denote high-level information about how to parallelize the code. The results of this research found that even for a complex program that models air flows, using OpenACC to run the program on 27 GPUs increases performance by a factor of 100 over a single CPU and by a factor of 4 over 27 CPUs. Although higher performance is expected with other GPU programming standards, these results were accomplished with minimal change to the original program. Therefore, these results demonstrate the ability of OpenACC to improve performance while keeping the program maintainable and portable.	en
dc.description.degree	Master of Science	en
dc.format.medium	ETD	en
dc.identifier.other	vt_gsexam:10307	en
dc.identifier.uri	http://hdl.handle.net/10919/78203	en
dc.publisher	Virginia Tech	en
dc.rights	In Copyright	en
dc.rights.uri	http://rightsstatements.org/vocab/InC/1.0/	en
dc.subject	Graphics processing unit	en
dc.subject	Directive-based programming	en
dc.subject	OpenACC	en
dc.subject	Lid-driven cavity	en
dc.subject	Multi-GPU	en
dc.subject	Parallel computing	en
dc.title	Multi-level Parallelism with MPI and OpenACC for CFD Applications	en
dc.type	Thesis	en
thesis.degree.discipline	Aerospace Engineering	en
thesis.degree.grantor	Virginia Polytechnic Institute and State University	en
thesis.degree.level	masters	en
thesis.degree.name	Master of Science	en

Files

Name: McCall_AJ_T_2017.pdf
Size: 6.53 MB
Format: Adobe Portable Document Format

Collections

Masters Theses