An Automated Framework for Characterizing and Subsetting GPGPU Workloads

Adhinarayanan, VigneshFeng, Wu-chun2015-12-182015-12-182015-12-18http://hdl.handle.net/10919/64331Graphics processing units (GPUs) are becoming increasingly common in today’s computing systems due to their superior performance and energy efficiency relative to their cost. To further improve these desired characteristics, researchers have proposed several software and hardware techniques. Evaluation of these proposed techniques could be tricky due to the ad-hoc nature in which applications are selected for evaluation. Sometimes researchers spend unnecessary time evaluating redundant workloads, which is particularly problematic for time-consuming studies involving simulation. Other times, they fail to expose the shortcomings of their proposed techniques when too few workloads are chosen for evaluation. To overcome these problems, we propose an automated framework that characterizes and subsets GPGPU workloads, depending on a user-chosen set of performance metrics/counters. This framework internally uses principal component analysis (PCA) to reduce the dimensionality of the chosen metrics and then uses hierarchical clustering to identify similarity among the workloads. In this study, we use our framework to identify redundancy in the recently released SPEC ACCEL OpenCL benchmark suite using a few architecture-dependent metrics. Our analysis shows that a subset of eight applications provides most of the diversity in the 19-application benchmark suite. We also subset the Parboil, Rodinia, and SHOC benchmark suites and then compare them against each another to identify “gaps” in these suites. As an example, we show that SHOC has many applications that are similar to each other and could benefit from adding four applications from Parboil to improve its diversity.application/pdfenIn CopyrightComputer systemsAlgorithmsComputational science and engineeringParallel and distributed computingAn Automated Framework for Characterizing and Subsetting GPGPU WorkloadsTechnical reportTR-15-06