Show simple item record

dc.contributor.author: Yeom, Jae-seung (en)
dc.date.accessioned: 2015-11-22T07:00:46Z (en)
dc.date.available: 2015-11-22T07:00:46Z (en)
dc.date.issued: 2014-05-30 (en)
dc.identifier.other: vt_gsexam:2796 (en)
dc.identifier.uri: http://hdl.handle.net/10919/64180 (en)
dc.description.abstract: Data-intensive scientific applications often process enormous amounts of data. The scalability of such applications depends critically on how data locality is managed. Our study explores two common types of applications that differ vastly in memory access pattern and workload variation. The first type comprises applications with multi-stride accesses in regular nested parallel loops. The second comprises applications that process large-scale irregular social network graphs. In the former case, the memory location or data item accessed in a loop is predictable, and the load of processing a unit of work (an array element) is relatively uniform. In the latter case, the data access per unit of work (a vertex) is highly irregular in both the number of accesses and the locations accessed. This property is further tied to the load and presents significant challenges to scaling application performance. Designing platforms to support extreme performance scaling requires an understanding of how application-specific information can be used to control locality and improve performance. Such insights are necessary to determine which controls and abstractions to provide for interfacing an underlying system with an application, as well as for designing a new system. Our goal is to expose the common scalability requirements of data-intensive scientific applications. For the former type of application, with regular accesses and uniform workload, we contribute new methods to improve the temporal locality of software-managed local memories and to optimize the critical path of scheduling data transfers for multi-dimensional arrays in nested loops. In particular, we provide a runtime framework allowing transparent optimization by source-to-source compilers or automatic fine tuning by programmers. Finally, we demonstrate the effectiveness of the approach by comparing it against a state-of-the-art language-based framework. For the latter type, with irregular accesses and non-uniform workload, we analyze how the heavy-tailed property of input graphs limits the scalability of the application. We then introduce an application-specific workload model and a decomposition method that allows us to optimize locality under the custom load-balancing constraints of the application. Finally, we demonstrate unprecedented strong scaling of a contagion simulation on two state-of-the-art high-performance computing platforms. (en)
dc.format.medium: ETD (en)
dc.publisher: Virginia Tech (en)
dc.rights: In Copyright (en)
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/ (en)
dc.subject: Parallel systems (en)
dc.subject: Software-managed memories (en)
dc.subject: Distributed memories (en)
dc.subject: Data locality (en)
dc.subject: Scalability (en)
dc.subject: Parallel discrete event simulation (en)
dc.subject: Social networks (en)
dc.subject: Contagion (en)
dc.title: Optimizing Data Accesses for Scaling Data-intensive Scientific Applications (en)
dc.type: Dissertation (en)
dc.contributor.department: Computer Science (en)
dc.description.degree: Ph. D. (en)
thesis.degree.name: Ph. D. (en)
thesis.degree.level: doctoral (en)
thesis.degree.grantor: Virginia Polytechnic Institute and State University (en)
thesis.degree.discipline: Computer Science and Applications (en)
dc.contributor.committeechair: Marathe, Madhav Vishnu (en)
dc.contributor.committeechair: Nikolopoulos, Dimitrios S. (en)
dc.contributor.committeemember: Ribbens, Calvin J. (en)
dc.contributor.committeemember: Schulz, Martin (en)
dc.contributor.committeemember: Bisset, Keith R. (en)

