Show simple item record

dc.contributor.authorKumaraswamy Ravindranathan, Krishnarajen_US
dc.date.accessioned2016-01-09T09:00:13Z
dc.date.available2016-01-09T09:00:13Z
dc.date.issued2016-01-08en_US
dc.identifier.othervt_gsexam:6712en_US
dc.identifier.urihttp://hdl.handle.net/10919/64423
dc.description.abstractThe objective of this thesis is to address the challenges faced in sustaining efficient, high-performance and scalable Distributed Software Frameworks (DSFs), such as MapReduce, Hadoop, Dryad, and Pregel, for supporting data-intensive scientific and enterprise applications on emerging heterogeneous compute, storage and network infrastructure. Large DSF deployments in the cloud continue to grow both in size and number, given DSFs are cost-effective and easy to deploy. DSFs are becoming heterogeneous with the use of advanced hardware technologies and due to regular upgrades to the system. For instance, low-cost, power-efficient clusters that employ traditional servers along with specialized resources such as FPGAs, GPUs, powerPC, MIPS and ARM based embedded devices, and high-end server-on-chip solutions will drive future DSFs infrastructure. Similarly, high-throughput DSF storage is trending towards hybrid and tiered approaches that use large in-memory buffers, SSDs, etc., in addition to disks. However, the schedulers and resource managers of these DSFs assume the underlying hardware to be similar or homogeneous. Another problem faced in evolving applications is that they are typically complex workflows comprising of different kernels. The kernels can be diverse, e.g., compute-intensive processing followed by data-intensive visualization and each kernel will have a different affinity towards different hardware. Because of the inability of the DSFs to understand heterogeneity of the underlying hardware architecture and applications, existing resource managers cannot ensure appropriate resource-application match for better performance and resource usage. In this dissertation, we design, implement, and evaluate DerbyhatS, an application-characteristics-aware resource manager for DSFs, which predicts the performance of the application under different hardware configurations and dynamically manage compute and storage resources as per the application needs. We adopt a quantitative approach where we first study the detailed behavior of various Hadoop applications running on different hardware configurations and propose application-attuned dynamic system management in order to improve the resource-application match. We re-design the Hadoop Distributed File System (HDFS) into a multi-tiered storage system that seamlessly integrates heterogeneous storage technologies into the HDFS. We also propose data placement and retrieval policies to improve the utilization of the storage devices based on their characteristics such as I/O throughput and capacity. DerbyhatS workflow scheduler is an application-attuned workflow scheduler and is constituted by two components. phi-Sched coupled with epsilon-Sched manages the compute heterogeneity and DUX coupled with AptStore manages the storage substrate to exploit heterogeneity. DerbyhatS will help realize the full potential of the emerging infrastructure for DSFs, e.g., cloud data centers, by offering many advantages over the state of the art by ensuring application-attuned, dynamic heterogeneous resource management.en_US
dc.format.mediumETDen_US
dc.publisherVirginia Techen_US
dc.rightsThis Item is protected by copyright and/or related rights. Some uses of this Item may be deemed fair and permitted by law even without permission from the rights holder(s), or the rights holder(s) may have licensed the work for use under certain conditions. For other uses you need to obtain permission from the rights holder(s).en_US
dc.subjectDistributed Software Frameworken_US
dc.subjectMapReduceen_US
dc.subjectWorkflow Manageren_US
dc.subjectHeterogeneous Resource Managementen_US
dc.subjectStorage Management.en_US
dc.titleExploiting Heterogeneity in Distributed Software Frameworksen_US
dc.typeDissertationen_US
dc.contributor.departmentComputer Scienceen_US
dc.description.degreePh. D.en_US
thesis.degree.namePh. D.en_US
thesis.degree.leveldoctoralen_US
thesis.degree.grantorVirginia Polytechnic Institute and State Universityen_US
thesis.degree.disciplineComputer Science and Applicationsen_US
dc.contributor.committeechairButt, Ali Raza Ashrafen_US
dc.contributor.committeememberGniady, Chrisen_US
dc.contributor.committeememberYao, Danfengen_US
dc.contributor.committeememberTilevich, Elien_US
dc.contributor.committeememberWang, Chaoen_US
dc.contributor.committeememberRamakrishnan, Narendranen_US


Files in this item

Thumbnail

This item appears in the following Collection(s)

Show simple item record