Multi-level Parallelism with MPI and OpenACC for CFD Applications

dc.contributor.authorMcCall, Andrew Jamesen
dc.contributor.committeechairRoy, Christopher J.en
dc.contributor.committeememberPaterson, Eric G.en
dc.contributor.committeememberde Sturler, Ericen
dc.contributor.departmentAerospace and Ocean Engineeringen
dc.date.accessioned2017-06-15T08:00:47Zen
dc.date.available2017-06-15T08:00:47Zen
dc.date.issued2017-06-14en
dc.description.abstractHigh-level parallel programming approaches, such as OpenACC, have recently become popular in complex fluid dynamics research since they are cross-platform and easy to implement. OpenACC is a directive-based programming model that, unlike low-level programming models, abstracts the details of implementation on the GPU. Although OpenACC generally limits the performance of the GPU, this model significantly reduces the work required to port an existing code to any accelerator platform, including GPUs. The purpose of this research is twofold: to investigate the effectiveness of OpenACC in developing a portable and maintainable GPU-accelerated code, and to determine the capability of OpenACC to accelerate large, complex programs on the GPU. In both of these studies, the OpenACC implementation is optimized and extended to a multi-GPU implementation while maintaining a unified code base. OpenACC is shown as a viable option for GPU computing with CFD problems. In the first study, a CFD code that solves incompressible cavity flows is accelerated using OpenACC. Overlapping communication with computation improves performance for the multi-GPU implementation by up to 21%, achieving up to 400 times faster performance than a single CPU and 99% weak scalability efficiency with 32 GPUs. The second study ports the execution of a more complex CFD research code to the GPU using OpenACC. Challenges using OpenACC with modern Fortran are discussed. Three test cases are used to evaluate performance and scalability. The multi-GPU performance using 27 GPUs is up to 100 times faster than a single CPU and maintains a weak scalability efficiency of 95%.en
dc.description.abstractgeneralThe research and analysis performed in scientific computing today produces an ever-increasing demand for faster and more energy efficient performance. Parallel computing with supercomputers that use many central processing units (CPUs) is the current standard for satisfying these demands. The use of graphics processing units (GPUs) for scientific computing applications is an emerging technology that has gained a lot of popularity in the past decade. A single GPU can distribute the computations required by a program over thousands of processing units. This research investigates the effectiveness of a relatively new standard, called OpenACC, for offloading execution of a program to the GPU. The most widely used standards today are highly complex and require low-level, detailed knowledge of the GPU’s architecture. These issues significantly reduce the maintainability and portability of a program. OpenACC does not require rewriting a program for the GPU. Instead, the developer annotates regions of code to run on the GPU and only has to denote high-level information about how to parallelize the code. The results of this research found that even for a complex program that models air flows, using OpenACC to run the program on 27 GPUs increases performance by a factor of 100 over a single CPU and by a factor of 4 over 27 CPUs. Although higher performance is expected with other GPU programming standards, these results were accomplished with minimal change to the original program. Therefore, these results demonstrate the ability of OpenACC to improve performance while keeping the program maintainable and portable.en
dc.description.degreeMaster of Scienceen
dc.format.mediumETDen
dc.identifier.othervt_gsexam:10307en
dc.identifier.urihttp://hdl.handle.net/10919/78203en
dc.publisherVirginia Techen
dc.rightsIn Copyrighten
dc.rights.urihttp://rightsstatements.org/vocab/InC/1.0/en
dc.subjectGraphics processing uniten
dc.subjectDirective-based programmingen
dc.subjectOpenACCen
dc.subjectLid-driven cavityen
dc.subjectMulti-GPUen
dc.subjectParallel computingen
dc.titleMulti-level Parallelism with MPI and OpenACC for CFD Applicationsen
dc.typeThesisen
thesis.degree.disciplineAerospace Engineeringen
thesis.degree.grantorVirginia Polytechnic Institute and State Universityen
thesis.degree.levelmastersen
thesis.degree.nameMaster of Scienceen

Files

Original bundle
Now showing 1 - 1 of 1
Loading...
Thumbnail Image
Name:
McCall_AJ_T_2017.pdf
Size:
6.53 MB
Format:
Adobe Portable Document Format

Collections