Characterization of FPGA-based High Performance Computers

dc.contributor.author: Pimenta Pereira, Karl Savio [en]
dc.contributor.committeechair: Athanas, Peter M. [en]
dc.contributor.committeemember: Schaumont, Patrick R. [en]
dc.contributor.committeemember: Feng, Wu-chun [en]
dc.contributor.department: Electrical and Computer Engineering [en]
dc.date.accessioned: 2014-03-14T20:43:18Z [en]
dc.date.adate: 2011-09-02 [en]
dc.date.available: 2014-03-14T20:43:18Z [en]
dc.date.issued: 2011-08-09 [en]
dc.date.rdate: 2011-09-02 [en]
dc.date.sdate: 2011-08-11 [en]
dc.description.abstract: As CPU clock frequencies plateau and the doubling of CPU cores per processor exacerbates the memory wall, hybrid-core computing, which augments CPUs with FPGAs and/or GPUs, holds the promise of addressing high-performance computing demands, particularly with respect to performance, power, and productivity. Traditional benchmarks for high-performance computers, such as SPEC, take an architecture-based approach and do not completely express the parallelism that exists in FPGA and GPU accelerators. This thesis follows an application-centric approach, comparing implementations of two key computational idioms with respect to sustained performance, power, and productivity. Specifically, a complex, single-precision, floating-point, 1D Fast Fourier Transform (FFT) and a molecular dynamics modeling application are implemented on state-of-the-art FPGA and GPU accelerators. The results show that FPGA floating-point FFT performance is highly sensitive to the mix of dedicated FPGA resources, in particular DSP48E slices, block RAMs, and FPGA I/O banks. Estimated results show that, for the floating-point FFT benchmark on FPGAs, these resources are the performance-limiting factor. Fixed-point FFTs are important in many high-performance embedded applications. For an integer-point FFT, FPGAs exploit a flexible datapath width to trade off circuit cost against speed of computation, improving performance and resource utilization. GPUs, having a fixed data-width architecture, cannot fully take advantage of this. For the molecular dynamics application, FPGAs benefit from the flexibility to create a custom, tightly pipelined datapath and from the accelerator's highly optimized memory subsystem. This can provide a 250-fold improvement over an optimized CPU implementation and a 2-fold improvement over an optimized GPU implementation, along with massive power savings.
Finally, extracting the maximum performance from the FPGA requires each implementation to balance the formulation of the algorithm on the platform, the optimal use of available external memory bandwidth, and the availability of computational resources, at the expense of greater programming effort. [en]
dc.description.degree: Master of Science [en]
dc.identifier.other: etd-08112011-192508 [en]
dc.identifier.sourceurl: http://scholar.lib.vt.edu/theses/available/etd-08112011-192508/ [en]
dc.identifier.uri: http://hdl.handle.net/10919/34483 [en]
dc.publisher: Virginia Tech [en]
dc.relation.haspart: PimentaPereira_KS_T_2011.pdf [en]
dc.relation.haspart: PimentaPereira_KS_T_2011_fairuse.pdf [en]
dc.rights: In Copyright [en]
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/ [en]
dc.subject: FFT [en]
dc.subject: molecular dynamics [en]
dc.subject: integer-point [en]
dc.subject: floating-point [en]
dc.subject: GPU [en]
dc.subject: HPC [en]
dc.subject: Field programmable gate arrays [en]
dc.title: Characterization of FPGA-based High Performance Computers [en]
dc.type: Thesis [en]
thesis.degree.discipline: Electrical and Computer Engineering [en]
thesis.degree.grantor: Virginia Polytechnic Institute and State University [en]
thesis.degree.level: masters [en]
thesis.degree.name: Master of Science [en]

Files

Original bundle
Name: PimentaPereira_KS_T_2011.pdf; Size: 8.04 MB; Format: Adobe Portable Document Format
Name: PimentaPereira_KS_T_2011_fairuse.pdf; Size: 6.84 MB; Format: Adobe Portable Document Format
