AnalyzeThis: An Analysis Workflow-Aware Storage System

dc.contributor.author: Sim, Hyogi [en]
dc.contributor.committeechair: Butt, Ali R. [en]
dc.contributor.committeemember: Vazhkudai, Sudharshan S. [en]
dc.contributor.committeemember: Jung, Changhee [en]
dc.contributor.department: Computer Science [en]
dc.date.accessioned: 2017-04-04T19:50:21Z [en]
dc.date.adate: 2015-01-13 [en]
dc.date.available: 2017-04-04T19:50:21Z [en]
dc.date.issued: 2014-12-17 [en]
dc.date.rdate: 2016-10-18 [en]
dc.date.sdate: 2014-12-19 [en]
dc.description.abstract: Supercomputing application simulations on hundreds of thousands of cores produce vast amounts of data that need to be analyzed on smaller-scale clusters to glean insights. The process is referred to as an end-to-end workflow. Extant workflow systems are stymied by the storage wall, resulting from both the disk-based parallel file system (PFS) failing to keep pace with the compute and memory subsystems and the inefficiencies in end-to-end workflow processing. In the post-petaflop era, supercomputers are provisioned with flash devices, as an intermediary between compute nodes and the PFS, enabling novel paradigms not just for expediting I/O, but also for the in-situ analysis of the simulation output data on the flash device. An array of such active flash elements allows us to fundamentally rethink the way data analysis workflows interact with storage systems. By blending the flash storage array and data analysis together in a seamless fashion, we create an analysis workflow-aware storage system, AnalyzeThis. Our guiding principle is that analysis-awareness be deeply ingrained in each and every layer of the storage system (active flash fabric, analysis object abstraction layer, scheduling layer within the storage, and an easy-to-use file system interface), thereby elevating data analyses as first-class citizens. Together, these concepts transform AnalyzeThis into a potent analytics-aware appliance. [en]
dc.description.degree: Master of Science [en]
dc.identifier.other: etd-12192014-112658 [en]
dc.identifier.sourceurl: http://scholar.lib.vt.edu/theses/available/etd-12192014-112658/ [en]
dc.identifier.uri: http://hdl.handle.net/10919/76927 [en]
dc.language.iso: en_US [en]
dc.publisher: Virginia Tech [en]
dc.rights: In Copyright [en]
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/ [en]
dc.subject: File System [en]
dc.subject: Distributed System [en]
dc.title: AnalyzeThis: An Analysis Workflow-Aware Storage System [en]
dc.type: Thesis [en]
dc.type.dcmitype: Text [en]
thesis.degree.discipline: Computer Science [en]
thesis.degree.grantor: Virginia Polytechnic Institute and State University [en]
thesis.degree.level: masters [en]
thesis.degree.name: Master of Science [en]

Files

Original bundle
Name: etd-12192014-112658_Sim_H_T_2014.pdf
Size: 468.96 KB
Format: Adobe Portable Document Format