VTechWorks staff will be away for the Thanksgiving holiday beginning at noon on Wednesday, November 27, through Friday, November 29. We will resume normal operations on Monday, December 2. Thank you for your patience.
 

Bridging the Performance-Programmability Gap for FPGAs via OpenCL: A Case Study with OpenDwarfs

dc.contributor.authorKrommydas, Konstantinosen
dc.contributor.authorHelal, Ahmed E.en
dc.contributor.authorVerma, Anshumanen
dc.contributor.authorFeng, Wu-chunen
dc.contributor.departmentComputer Scienceen
dc.date.accessioned2016-05-13T20:41:40Zen
dc.date.available2016-05-13T20:41:40Zen
dc.date.issued2016-05-13en
dc.description.abstractFor decades, the streaming architecture of FPGAs has delivered accelerated performance across many application domains, such as option pricing solvers in finance, computational fluid dynamics in oil and gas, and packet processing in network routers and firewalls. However, this performance has come at the significant expense of programmability, i.e., the performance-programmability gap. In particular, FPGA developers use hardware design languages (HDLs) to implement the application data path and to design hardware modules for computation pipelines, memory management, synchronization, and communication. This process requires extensive low-level knowledge of the target FPGA architecture and consumes significant development time and effort. To address this lack of programmability of FPGAs, OpenCL provides an easy-to-use and portable programming model for CPUs, GPUs, APUs, and now, FPGAs. However, this significantly improved programmability can come at the expense of performance; that is, there still remains a performance-programmability gap. To improve the performance of OpenCL kernels on FPGAs, and thus, bridge the performance-programmability gap, we identify general techniques to optimize OpenCL kernels for FPGAs under device-specific hardware constraints. We then apply these optimization techniques to the OpenDwarfs benchmark suite, with its diverse parallelism profiles and memory access patterns, in order to evaluate the effectiveness of the optimizations in terms of performance and resource utilization. Finally, we present the performance of the optimized OpenDwarfs, along with their potential re-factoring, to bridge the performance gap from programming in OpenCL versus programming in a HDL. Index Terms—OpenDwarfs; FPGA; OpenCL; GPU; GPGPU; MIC; Accelerators; Performance Portabilityen
dc.format.mimetypeapplication/pdfen
dc.identifier.trnumberTR-16-03en
dc.identifier.urihttp://hdl.handle.net/10919/70968en
dc.language.isoenen
dc.publisherDepartment of Computer Science, Virginia Polytechnic Institute & State Universityen
dc.relation.ispartofComputer Science Technical Reportsen
dc.rightsIn Copyrighten
dc.rights.urihttp://rightsstatements.org/vocab/InC/1.0/en
dc.subjectArchitectureen
dc.subjectComputer systemsen
dc.subjectHigh performance computingen
dc.subjectParallel and distributed computingen
dc.titleBridging the Performance-Programmability Gap for FPGAs via OpenCL: A Case Study with OpenDwarfsen
dc.typeTechnical reporten
dc.type.dcmitypeTexten

Files

Original bundle
Now showing 1 - 1 of 1
Loading...
Thumbnail Image
Name:
fccm16-opendwarfs.pdf
Size:
1.02 MB
Format:
Adobe Portable Document Format
License bundle
Now showing 1 - 1 of 1
Name:
license.txt
Size:
1.5 KB
Format:
Item-specific license agreed upon to submission
Description: