Automatic Scheduling of Compute Kernels Across Heterogeneous Architectures

Lyerly, Robert Frantz

dc.contributor.author	Lyerly, Robert Frantz	en
dc.contributor.committeechair	Ravindran, Binoy	en
dc.contributor.committeemember	Plassmann, Paul	en
dc.contributor.committeemember	Patterson, Cameron D.	en
dc.contributor.department	Electrical and Computer Engineering	en
dc.date.accessioned	2017-06-13T19:44:10Z	en
dc.date.adate	2014-06-24	en
dc.date.available	2017-06-13T19:44:10Z	en
dc.date.issued	2014-05-07	en
dc.date.rdate	2014-06-24	en
dc.date.sdate	2014-05-20	en
dc.description.abstract	The world of high-performance computing has shifted from increasing single-core performance to extracting performance from heterogeneous multi- and many-core processors due to the power, memory, and instruction-level parallelism walls. All trends point towards increased processor heterogeneity as a means of increasing application performance, from smartphones to servers. These architectures are designed for different types of applications: traditional "big" CPUs (like the Intel Xeon) are optimized for low latency, while other architectures (such as the NVIDIA Tesla K20x) are optimized for high throughput. They have different tradeoffs and performance profiles, which can mean large performance gains for the right types of applications. However, applications that are ill-suited to a given architecture may experience significant slowdown; therefore, it is imperative that applications are scheduled onto the correct processor. In order to perform this scheduling, applications must be analyzed to determine their execution characteristics. Traditionally, this application-to-hardware mapping was determined statically by the programmer. However, this requires intimate knowledge of the application and underlying architecture, and precludes load balancing by the system. We demonstrate and empirically evaluate a system for automatically scheduling compute kernels by extracting program characteristics and applying machine learning techniques. We develop a machine learning process that is system-agnostic and works in a variety of contexts (e.g., embedded, desktop/workstation, server). Finally, we perform scheduling in a workload-aware and workload-adaptive manner for these compute kernels.	en
dc.description.degree	Master of Science	en
dc.identifier.other	etd-05202014-193503	en
dc.identifier.sourceurl	http://scholar.lib.vt.edu/theses/available/etd-05202014-193503/	en
dc.identifier.uri	http://hdl.handle.net/10919/78130	en
dc.language.iso	en_US	en
dc.publisher	Virginia Tech	en
dc.rights	In Copyright	en
dc.rights.uri	http://rightsstatements.org/vocab/InC/1.0/	en
dc.subject	High-Performance Computing	en
dc.subject	Runtime Systems	en
dc.subject	Heterogeneous Architectures	en
dc.subject	Compilers	en
dc.subject	Scheduling	en
dc.title	Automatic Scheduling of Compute Kernels Across Heterogeneous Architectures	en
dc.type	Thesis	en
dc.type.dcmitype	Text	en
thesis.degree.discipline	Electrical and Computer Engineering	en
thesis.degree.grantor	Virginia Polytechnic Institute and State University	en
thesis.degree.level	masters	en
thesis.degree.name	Master of Science	en

Files

Original bundle

Name: etd-05202014-193503_Lyerly_RF_T_2014_2.pdf
Size: 895.24 KB
Format: Adobe Portable Document Format

Collections

Masters Theses