Scalable Data Management for Object-based Storage Systems

dc.contributor.authorWadhwa, Bhartien
dc.contributor.committeechairButt, Ali R.en
dc.contributor.committeememberWang, Feiyien
dc.contributor.committeememberServant Cortes, Francisco Javieren
dc.contributor.committeememberTilevich, Elien
dc.contributor.committeememberMeng, Naen
dc.contributor.departmentComputer Scienceen
dc.date.accessioned2020-08-20T08:00:26Zen
dc.date.available2020-08-20T08:00:26Zen
dc.date.issued2020-08-19en
dc.description.abstractParallel I/O performance is crucial to sustain scientific applications on large-scale High-Performance Computing (HPC) systems. Large-scale distributed storage systems, in particular object-based storage systems, face severe challenges in managing data efficiently. Inefficient data management leads to poor I/O and storage performance in HPC applications and scientific workflows. Some of the main challenges for efficient data management arise from poor resource allocation, load imbalance in object storage targets, and inflexible data sharing between applications in a workflow. In addition, parallel I/O makes it challenging to shoehorn in new interfaces, such as those that take advantage of multiple layers of storage and support analysis in the data path. Solving these challenges to improve the performance and efficiency of object-based storage systems is crucial, especially for the upcoming era of exascale systems. This dissertation focuses on solving these major challenges in object-based storage systems by providing scalable data management strategies. In the first part of the dissertation (Chapter 3), we present a resource-contention-aware load balancing tool (iez) for large-scale distributed object-based storage systems. In Chapter 4, we extend iez to support Progressive File Layout for the Lustre object-based storage system. In the second part (Chapter 5), we present a technique to facilitate data sharing in scientific workflows using object-based storage, with our proposed tool, Workflow Data Communicator. In the last part of this dissertation, we present a solution for transparent data management in the multi-layer storage hierarchy of present and next-generation HPC systems. This dissertation shows that by intelligently employing scalable data management techniques, scientific applications' and workflows' flexibility and performance in object-based storage systems can be enhanced manyfold.
Our proposed data management strategies can guide next-generation HPC storage systems' software design to efficiently support data for scientific applications and workflows.en
dc.description.abstractgeneralLarge-scale object-based storage systems face severe challenges in managing data efficiently for HPC applications and workflows. These storage systems often manage and share data inflexibly, without considering the load imbalance and resource contention in the underlying multi-layer storage hierarchy. This dissertation first studies how resource contention and inflexible data sharing mechanisms impact HPC applications' storage and I/O performance, and then presents a series of techniques, tools, and algorithms that provide efficient and scalable data management for current and next-generation HPC storage systems.en
dc.description.degreeDoctor of Philosophyen
dc.format.mediumETDen
dc.identifier.othervt_gsexam:27105en
dc.identifier.urihttp://hdl.handle.net/10919/99791en
dc.publisherVirginia Techen
dc.rightsIn Copyrighten
dc.rights.urihttp://rightsstatements.org/vocab/InC/1.0/en
dc.subjectLustreen
dc.subjectCephen
dc.subjectHigh Performance Computingen
dc.subjectParallel File Systemsen
dc.subjectParallel I/O Optimizationen
dc.subjectLoad Imbalanceen
dc.subjectResource Contentionen
dc.titleScalable Data Management for Object-based Storage Systemsen
dc.typeDissertationen
thesis.degree.disciplineComputer Science and Applicationsen
thesis.degree.grantorVirginia Polytechnic Institute and State Universityen
thesis.degree.leveldoctoralen
thesis.degree.nameDoctor of Philosophyen

Files

Original bundle
Name: Wadhwa_B_D_2020.pdf
Size: 6.2 MB
Format: Adobe Portable Document Format