CPU/GPU Code Acceleration on Heterogeneous Systems and Code Verification for CFD Applications

TR Number



Journal Title

Journal ISSN

Volume Title


Virginia Tech


Computational Fluid Dynamics (CFD) applications usually involve intensive computations, which can be accelerated through using open accelerators, especially GPUs due to their common use in the scientific computing community. In addition to code acceleration, it is important to ensure that the code and algorithm are implemented numerically correctly, which is called code verification. This dissertation focuses on accelerating research CFD codes on multi-CPUs/GPUs using MPI and OpenACC, as well as the code verification for turbulence model implementation using the method of manufactured solutions and code-to-code comparisons. First, a variety of performance optimizations both agnostic and specific to applications and platforms are developed in order to 1) improve the heterogeneous CPU/GPU compute utilization; 2) improve the memory bandwidth to the main memory; 3) reduce communication overhead between the CPU host and the GPU accelerator; and 4) reduce the tedious manual tuning work for GPU scheduling. Both finite difference and finite volume CFD codes and multiple platforms with different architectures are utilized to evaluate the performance optimizations used. A maximum speedup of over 70 is achieved on 16 V100 GPUs over 16 Xeon E5-2680v4 CPUs for multi-block test cases. In addition, systematic studies of code verification are performed for a second-order accurate finite volume research CFD code. Cross-term sinusoidal manufactured solutions are applied to verify the Spalart-Allmaras and k-omega SST model implementation, both in 2D and 3D. This dissertation shows that the spatial and temporal schemes are implemented numerically correctly.



GPU, OpenACC, MPI, Domain Decomposition, Performance Optimization, GPUDirect, Code Verification, OOA, Discretization Error