Browsing by Author "Pickering, Brent P."
Now showing 1 - 2 of 2
Results Per Page
Sort Options
- Directive-based GPU programming for computational fluid dynamicsPickering, Brent P.; Jackson, Ccharles W.; Scogland, Thomas R. W.; Feng, Wu-chun; Roy, Christopher J. (Pergamon-Elsevier, 2015-07-02)Directive-based programming of graphics processing units (GPUs) has recently appeared as a viable alternative to using specialized low-level languages such as CUDA C and OpenCL for general-purpose GPU programming. This technique, which uses ‘‘directive’’ or ‘‘pragma’’ statements to annotate source codes written in traditional high-level languages, is designed to permit a unified code base to serve multiple computational platforms. In this work we analyze the popular OpenACC programming standard, as implemented by the PGI compiler suite, in order to evaluate its utility and performance potential in computational fluid dynamics (CFD) applications. We examine the process of applying the OpenACC Fortran API to a test CFD code that serves as a proxy for a full-scale research code developed at Virginia Tech; this test code is used to asses the performance improvements attainable for our CFD algorithm on common GPU platforms, as well as to determine the modifications that must be made to the original source code in order to run efficiently on the GPU. Performance is measured on several recent GPU architectures from NVIDIA and AMD (using both double and single precision arithmetic) and the accelerator code is benchmarked against a multithreaded CPU version constructed from the same Fortran source code using OpenMP directives. A single NVIDIA Kepler GPU card is found to perform approximately 20! faster than a single CPU core and more than 2! faster than a 16-core Xeon server. An analysis of optimization techniques for OpenACC reveals cases in which manual intervention by the programmer can improve accelerator performance by up to 30% over the default compiler heuristics, although these optimizations are relevant only for specific platforms. Additionally, the use of multiple accelerators with OpenACC is investigated, including an experimental high-level interface for multi-GPU programming that automates scheduling tasks across multiple devices. While the overall performance of the OpenACC code is found to be satisfactory, we also observe some significant limitations and restrictions imposed by the OpenACC API regarding certain useful features of modern Fortran (2003/8); these are sufficient for us to conclude that it would not be practical to apply OpenACC to our full research code at this time due to the amount of refactoring required.
- A Numerical Investigation of Matrix-Free Implicit Time-Stepping Methods for Large CFD SimulationsSarshar, Arash; Tranquilli, Paul; Pickering, Brent P.; McCall, Andrew; Sandu, Adrian; Roy, Christopher J. (2016)This paper is concerned with the development and testing of advanced time-stepping methods suited for the integration of time-accurate, real-world applications of computational fluid dynamics (CFD). The performance of several time discretization methods is studied numerically with regards to computational efficiency, order of accuracy, and stability, as well as the ability to treat effectively stiff problems. We consider matrix-free implementations, a popular approach for time-stepping methods applied to large CFD applications due to its adherence to scalable matrix-vector operations and a small memory footprint. We compare explicit methods with matrix-free implementations of implicit, linearly-implicit, as well as Rosenbrock-Krylov methods. We show that Rosenbrock-Krylov methods are competitive with existing techniques excelling for a number of prob- lem types and settings.