Scogland, Thomas R. W.
Feng, Wu-chun
Rountree, Barry
de Supinski, Bronis R.
2017-03-17
2017-03-17
2015-11-01
1045-9219
http://hdl.handle.net/10919/76657
Heterogeneity continues to increase at all levels of computing, with the rise of accelerators such as GPUs, FPGAs, and other coprocessors into everything from desktops to supercomputers. As a consequence, efficiently managing such disparate resources has become increasingly complex. CoreTSAR seeks to reduce this complexity by adaptively worksharing parallel-loop regions across computer resources without requiring any transformation of the code within the loop. Our results show performance improvements of up to three-fold over a current state-of-the-art heterogeneous task scheduler as well as linear performance scaling from a single GPU to four GPUs for many codes. In addition, CoreTSAR demonstrates a robust ability to adapt to both a variety of workloads and underlying system configurations.
2970 - 2983 page(s)
application/pdf
en
In Copyright
CoreTSAR: Core Task-Size Adapting Runtime
Article - Refereed
IEEE Transactions on Parallel and Distributed Systems
https://doi.org/10.1109/TPDS.2014.2365192
26
11