A Workload-aware Resource Management and Scheduling System for Big Data Analysis
The big data era has driven the need for data analysis in every aspect of our daily lives. With the rapid growth in data size and in the complexity of data analysis models, modern big data analytic applications face the challenge of providing timely results, often with limited resources. Such demand drives the growth of new hardware resources, including GPUs and FPGAs, as well as storage devices such as SSDs and NVMs. It is challenging to manage the resources available in a cost-restricted environment to best serve applications with different characteristics. Extant approaches are agnostic to such heterogeneity in both underlying resources and workloads, and require user knowledge and manual configuration for best performance. In this dissertation, we design and implement a series of novel techniques, algorithms, and frameworks to realize workload-aware resource management and scheduling. We demonstrate our techniques for efficient resource management across memory resources for in-memory data analytic platforms and processing resources for compute-intensive machine learning applications, and finally we design and develop a workload- and heterogeneity-aware scheduler for general big data platforms. The dissertation demonstrates that designing an effective resource manager requires effort from both the application side and the system side. The presented approach joins the efforts on both sides to provide a holistic heterogeneity-aware resource management and scheduling system. We are able to avoid task failures due to resource unavailability through workload-aware resource management, and to improve the performance of data processing frameworks by carefully scheduling tasks according to task characteristics and the utilization and availability of resources.
General Audience Abstract
Clusters of multiple computers connected through the internet are often deployed in industry for large-scale data processing or computation that cannot be handled by standalone computers. In such a cluster, resources such as CPUs, memory, and disks are integrated to work together. It is important to manage this pool of resources so that they work together efficiently and provide better performance for the workloads running on top. This role is taken by a software component in the middle layer called the resource manager. The resource manager coordinates the resources in the computers and schedules tasks onto them for computation. This dissertation reveals that current resource managers often partition resources statically, and hence cannot capture the dynamic resource needs of workloads or the heterogeneous configurations of the underlying resources. For example, some computers in a cluster might be older than others, with slower CPUs, less memory, and so on. Workloads can also show different resource needs: watching YouTube requires a lot of network resources, while playing games demands powerful GPUs. To this end, the dissertation proposes novel approaches to resource management that capture the heterogeneity of resources and the dynamic needs of workloads, and on that basis achieve efficient resource management and schedule the right task to the right resource.
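To make "the right task to the right resource" concrete, here is a minimal sketch in Python. It is not the dissertation's scheduler: the Node and Task classes, the place function, and the best-fit rule are all hypothetical names and choices made for illustration. The sketch assumes each node reports its currently free CPU cores and memory and whether it has a GPU, and that each task declares its needs up front.

```python
from dataclasses import dataclass


@dataclass
class Node:
    # Hypothetical view of one machine in the cluster.
    name: str
    free_cpu: int       # free CPU cores
    free_mem_gb: int    # free memory in GB
    has_gpu: bool = False


@dataclass
class Task:
    # Hypothetical resource request declared by a task.
    name: str
    cpu: int
    mem_gb: int
    needs_gpu: bool = False


def place(task, nodes):
    """Pick a node for the task, or return None if nothing fits right now."""
    candidates = [
        n for n in nodes
        if n.free_cpu >= task.cpu
        and n.free_mem_gb >= task.mem_gb
        and (n.has_gpu or not task.needs_gpu)
    ]
    if not candidates:
        # Defer the task rather than launch it where it would fail.
        return None

    def cost(n):
        # Assumed policy: keep GPU nodes free for GPU tasks, then best fit
        # (least leftover CPU plus memory after placement).
        wastes_gpu = n.has_gpu and not task.needs_gpu
        slack = (n.free_cpu - task.cpu) + (n.free_mem_gb - task.mem_gb)
        return (wastes_gpu, slack)

    best = min(candidates, key=cost)
    # Track availability dynamically instead of using a static partition.
    best.free_cpu -= task.cpu
    best.free_mem_gb -= task.mem_gb
    return best
```

With an older 4-core, 8 GB node and a newer 16-core, 64 GB GPU node, a GPU training task would land on the newer node, a small CPU-only task would go to the older node so the GPU machine stays available, and a task larger than any node would be deferred instead of failing at launch.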
- Doctoral Dissertations