Automated Runtime Analysis and Adaptation for Scalable Heterogeneous Computing

Helal, Ahmed Elmohamadi Mohamed

dc.contributor.author	Helal, Ahmed Elmohamadi Mohamed	en
dc.contributor.committeechair	Feng, Wu-chun	en
dc.contributor.committeemember	Nazhandali, Leyla	en
dc.contributor.committeemember	Jung, Changhee	en
dc.contributor.committeemember	Hanafy, Yasser Y.	en
dc.contributor.committeemember	Min, Chang Woo	en
dc.contributor.department	Electrical and Computer Engineering	en
dc.date.accessioned	2020-01-30T09:00:45Z	en
dc.date.available	2020-01-30T09:00:45Z	en
dc.date.issued	2020-01-29	en
dc.description.abstract	In the last decade, there have been tectonic shifts in computer hardware as sequential CPU performance reached its physical limits. As a consequence, current high-performance computing (HPC) systems integrate a wide variety of compute resources with different capabilities and execution models, ranging from multi-core CPUs to many-core accelerators. While such heterogeneous systems can enable dramatic acceleration of user applications, extracting optimal performance via manual analysis and optimization is a complicated and time-consuming process. This dissertation presents graph-structured program representations to reason about the performance bottlenecks on modern HPC systems and to guide novel automation frameworks for performance analysis, modeling, and runtime adaptation. The proposed program representations exploit domain knowledge and capture the inherent computation and communication patterns in user applications, at multiple levels of computational granularity, via compiler analysis and dynamic instrumentation. The empirical results demonstrate that the introduced modeling frameworks accurately estimate the realizable parallel performance and scalability of a given sequential code when ported to heterogeneous HPC systems. As a result, these frameworks enable efficient workload distribution schemes that utilize all the available compute resources in a performance-proportional way. In addition, the proposed runtime adaptation frameworks significantly improve the end-to-end performance of important real-world applications that suffer from limited parallelism and fine-grained data dependencies. Specifically, compared to state-of-the-art methods, this adaptive parallel execution achieves up to an order-of-magnitude speedup on the target HPC systems while preserving the inherent data dependencies of user applications.	en
dc.description.abstractgeneral	Current supercomputers integrate a massive number of heterogeneous compute units with varying speed, computational throughput, memory bandwidth, and memory access latency. This trend represents a major challenge for end users, as their applications have been designed from the ground up to primarily exploit homogeneous CPUs. While heterogeneous systems can deliver speedups of several orders of magnitude compared to traditional CPU-based systems, end users need extensive software and hardware expertise as well as significant time and effort to efficiently utilize all the available compute resources. To streamline such a daunting process, this dissertation presents automated frameworks for analyzing and modeling performance on parallel architectures and for transforming the execution of user applications at runtime. The proposed frameworks incorporate domain knowledge and adapt to the input data and the underlying hardware using novel static and dynamic analyses. The experimental results show the efficacy of the introduced frameworks across many important application domains, such as computational fluid dynamics (CFD) and computer-aided design (CAD). In particular, the adaptive execution approach on heterogeneous systems achieves up to an order-of-magnitude speedup over the optimized parallel implementations.	en
dc.description.degree	Doctor of Philosophy	en
dc.format.medium	ETD	en
dc.identifier.other	vt_gsexam:23625	en
dc.identifier.uri	http://hdl.handle.net/10919/96607	en
dc.language.iso	en	en
dc.publisher	Virginia Tech	en
dc.rights	In Copyright	en
dc.rights.uri	http://rightsstatements.org/vocab/InC/1.0/	en
dc.subject	Parallel Architectures	en
dc.subject	Accelerators	en
dc.subject	Heterogeneous Computing	en
dc.subject	Performance Modeling	en
dc.subject	Runtime Adaptation	en
dc.subject	Scheduling	en
dc.subject	Performance Portability	en
dc.subject	MPI	en
dc.subject	GPU	en
dc.subject	LLVM	en
dc.title	Automated Runtime Analysis and Adaptation for Scalable Heterogeneous Computing	en
dc.type	Dissertation	en
thesis.degree.discipline	Computer Engineering	en
thesis.degree.grantor	Virginia Polytechnic Institute and State University	en
thesis.degree.level	doctoral	en
thesis.degree.name	Doctor of Philosophy	en

Files

Original bundle

Name: Helal_AE_D_2020.pdf
Size: 6.24 MB
Format: Adobe Portable Document Format

Collections

Doctoral Dissertations