Characterization of FPGA-based High Performance Computers

Pimenta Pereira, Karl Savio

dc.contributor.author	Pimenta Pereira, Karl Savio	en
dc.contributor.committeechair	Athanas, Peter M.	en
dc.contributor.committeemember	Schaumont, Patrick R.	en
dc.contributor.committeemember	Feng, Wu-chun	en
dc.contributor.department	Electrical and Computer Engineering	en
dc.date.accessioned	2014-03-14T20:43:18Z	en
dc.date.adate	2011-09-02	en
dc.date.available	2014-03-14T20:43:18Z	en
dc.date.issued	2011-08-09	en
dc.date.rdate	2011-09-02	en
dc.date.sdate	2011-08-11	en
dc.description.abstract	As CPU clock frequencies plateau and the doubling of CPU cores per processor exacerbates the memory wall, hybrid-core computing, which augments CPUs with FPGAs and/or GPUs, holds the promise of addressing high-performance computing demands, particularly with respect to performance, power, and productivity. While traditional approaches to benchmarking high-performance computers, such as SPEC, take an architecture-based approach, they do not completely express the parallelism that exists in FPGA and GPU accelerators. This thesis follows an application-centric approach, comparing the sustained performance of two key computational idioms with respect to performance, power, and productivity. Specifically, a complex, single-precision, floating-point, 1D Fast Fourier Transform (FFT) and a molecular dynamics modeling application are implemented on state-of-the-art FPGA and GPU accelerators. As the results show, FPGA floating-point FFT performance is highly sensitive to the mix of dedicated FPGA resources, in particular DSP48E slices, block RAMs, and FPGA I/O banks. Estimated results show that for the floating-point FFT benchmark on FPGAs, these resources are the performance-limiting factor. Fixed-point FFTs are important in many high-performance embedded applications. For an integer-point FFT, FPGAs exploit a flexible datapath width to trade off circuit cost against speed of computation, improving performance and resource utilization. GPUs, having a fixed data-width architecture, cannot fully take advantage of this. For the molecular dynamics application, FPGAs benefit from the flexibility to create a custom, tightly pipelined datapath and a highly optimized memory subsystem of the accelerator. This can provide a 250-fold improvement over an optimized CPU implementation and a 2-fold improvement over an optimized GPU implementation, along with massive power savings. Finally, to extract the maximum performance out of the FPGA, each implementation requires a balance between the formulation of the algorithm on the platform, the optimum use of available external memory bandwidth, and the availability of computational resources, at the expense of a greater programming effort.	en
dc.description.degree	Master of Science	en
dc.identifier.other	etd-08112011-192508	en
dc.identifier.sourceurl	http://scholar.lib.vt.edu/theses/available/etd-08112011-192508/	en
dc.identifier.uri	http://hdl.handle.net/10919/34483	en
dc.publisher	Virginia Tech	en
dc.relation.haspart	PimentaPereira_KS_T_2011.pdf	en
dc.relation.haspart	PimentaPereira_KS_T_2011_fairuse.pdf	en
dc.rights	In Copyright	en
dc.rights.uri	http://rightsstatements.org/vocab/InC/1.0/	en
dc.subject	FFT	en
dc.subject	molecular dynamics	en
dc.subject	integer-point	en
dc.subject	floating-point	en
dc.subject	GPU	en
dc.subject	HPC	en
dc.subject	Field programmable gate arrays	en
dc.title	Characterization of FPGA-based High Performance Computers	en
dc.type	Thesis	en
thesis.degree.discipline	Electrical and Computer Engineering	en
thesis.degree.grantor	Virginia Polytechnic Institute and State University	en
thesis.degree.level	masters	en
thesis.degree.name	Master of Science	en

Files

Original bundle

Name: PimentaPereira_KS_T_2011.pdf
Size: 8.04 MB
Format: Adobe Portable Document Format

Name: PimentaPereira_KS_T_2011_fairuse.pdf
Size: 6.84 MB
Format: Adobe Portable Document Format

Collections

Masters Theses