Show simple item record

dc.contributor.authorElteir, Marwa Khamisen_US

Nowadays, an increasing number of computational systems are equipped with heterogeneous compute resources, i.e., following different architecture. This applies to the level of a single chip, a single node and even supercomputers and large-scale clusters. With its impressive price-to-performance ratio as well as power efficiently compared to traditional multicore processors, graphics processing units (GPUs) has become an integrated part of these systems. GPUs deliver high peak performance; however efficiently exploiting their computational power requires the exploration of a multi-dimensional space of optimization methodologies, which is challenging even for the well-trained expert. The complexity of this multi-dimensional space arises not only from the traditionally well known but arduous task of architecture-aware GPU optimization at design and compile time, but it also arises in the partitioning and scheduling of the computation across these heterogeneous resources. Even with programming models like the Compute Unified Device Architecture (CUDA) and Open Computing Language (OpenCL), the developer still needs to manage the data transfer be- tween host and device and vice versa, orchestrate the execution of several kernels, and more arduously, optimize the kernel code.

In this dissertation, we aim to deliver a transparent parallel programming environment for heterogeneous resources by leveraging the power of the MapReduce programming model and OpenCL programming language. We propose a portable architecture-aware framework that efficiently runs an application across heterogeneous resources, specifically AMD GPUs and NVIDIA GPUs, while hiding complex architectural details from the developer. To further enhance performance portability, we explore approaches for asynchronously and efficiently distributing the computations across heterogeneous resources. When applied to benchmarks and representative applications, our proposed framework significantly enhances performance, including up to 58% improvement over traditional approaches to task assignment and up to a 45-fold improvement over state-of-the-art MapReduce implementations.

dc.publisherVirginia Techen_US
dc.rightsI hereby certify that, if appropriate, I have obtained and attached hereto a written permission statement from the owner(s) of each third party copyrighted matter to be included in my thesis, dissertation, or project report, allowing distribution as specified below. I certify that the version I submitted is the same as that approved by my advisory committee. I hereby grant to Virginia Tech or its agents the non-exclusive license to archive and make accessible, under the conditions specified below, my thesis, dissertation, or project report in whole or in part in all forms of media, now or hereafter known. I retain all other ownership rights to the copyright of the thesis, dissertation or project report. I also retain the right to use in future works (such as articles or books) all or part of this thesis, dissertation, or project report.en_US
dc.subjectGraphics Processing Unitsen_US
dc.subjectProgramming Modelsen_US
dc.subjectHeterogeneous Computingen_US
dc.titleA MapReduce Framework for Heterogeneous Computing Architecturesen_US
dc.contributor.departmentComputer Scienceen_US
dc.description.degreePh. D.en_US D.en_US Polytechnic Institute and State Universityen_US Scienceen_US
dc.contributor.committeechairFeng, Wu-chunen_US
dc.contributor.committeememberButt, Alien_US
dc.contributor.committeememberTilevich, Elien_US
dc.contributor.committeememberXiaosong, Mamen_US
dc.contributor.committeecochairLin, Heshanen_US

Files in this item


This item appears in the following Collection(s)

Show simple item record