Optimizing Systems for Deep Learning Applications
dc.contributor.author | Albahar, Hadeel Ahmad | en |
dc.contributor.committeechair | Butt, Ali | en |
dc.contributor.committeemember | Anwar, Ali | en |
dc.contributor.committeemember | Chantem, Thidapat | en |
dc.contributor.committeemember | Min, Chang Woo | en |
dc.contributor.committeemember | Tilevich, Eli | en |
dc.contributor.department | Electrical and Computer Engineering | en |
dc.date.accessioned | 2023-03-02T09:00:08Z | en |
dc.date.available | 2023-03-02T09:00:08Z | en |
dc.date.issued | 2023-03-01 | en |
dc.description.abstract | Modern Machine Learning (ML) systems support heterogeneous workloads and resources. However, existing resource managers in these systems do not differentiate between heterogeneous GPU resources. Moreover, users are often unaware of the type and amount of GPU resources that are appropriate and sufficient for their ML jobs. In this dissertation, we analyze the performance of ML training and inference jobs and identify the ML model and GPU characteristics that impact this performance. We then propose ML-based prediction models that accurately determine appropriate and sufficient resource requirements, improving job latency and GPU utilization across the cluster. | en |
dc.description.abstractgeneral | We interact daily with software applications in areas such as social media, e-commerce, healthcare, and finance. These applications rely on a variety of computing systems, as well as artificial intelligence, to deliver the best possible service and user experience. In this dissertation, we present optimizations that improve the performance of these artificial intelligence applications while also improving the performance and utilization of the systems and heterogeneous resources they run on. We propose machine learning models that learn from historical application performance data, together with application and resource characteristics, to predict the resource requirements that are necessary and sufficient to ensure optimal performance for both the application and the underlying system. | en |
dc.description.degree | Doctor of Philosophy | en |
dc.format.medium | ETD | en |
dc.identifier.other | vt_gsexam:36650 | en |
dc.identifier.uri | http://hdl.handle.net/10919/114021 | en |
dc.language.iso | en | en |
dc.publisher | Virginia Tech | en |
dc.rights | In Copyright | en |
dc.rights.uri | http://rightsstatements.org/vocab/InC/1.0/ | en |
dc.subject | GPU heterogeneity | en |
dc.subject | Deep Learning and Inference | en |
dc.subject | Kubernetes | en |
dc.subject | GPU sharing | en |
dc.subject | Resource requirement prediction | en |
dc.title | Optimizing Systems for Deep Learning Applications | en |
dc.type | Dissertation | en |
thesis.degree.discipline | Computer Engineering | en |
thesis.degree.grantor | Virginia Polytechnic Institute and State University | en |
thesis.degree.level | doctoral | en |
thesis.degree.name | Doctor of Philosophy | en |
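
The abstract above describes ML-based prediction models that map job and GPU characteristics to appropriate resource requirements. The sketch below is a minimal illustration of that general idea, not the dissertation's actual implementation: it trains a gradient-boosted regressor on synthetic profiling data and uses it to pick a GPU type predicted to meet a latency target. The feature set, the GPU numbers, and the recommend_gpu helper are all assumptions introduced for illustration.

```python
# Minimal sketch (assumption, not the dissertation's code): predict a training
# job's per-step latency from model and GPU features, then pick the smallest
# GPU type whose predicted latency meets a target.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Hypothetical feature vector per profiled run:
# [parameter_count_millions, batch_size, gpu_memory_gb, gpu_peak_tflops]
rng = np.random.default_rng(0)
X = rng.uniform(low=[1, 8, 8, 5], high=[500, 256, 80, 300], size=(1000, 4))
# Synthetic latency: grows with model size and batch, shrinks with GPU speed.
y = (X[:, 0] * X[:, 1]) / (X[:, 3] * 50) + rng.normal(0, 0.5, 1000)

model = GradientBoostingRegressor().fit(X, y)

# Candidate GPU types as (memory_gb, peak_tflops); illustrative numbers only.
gpu_types = {"small": (16, 30), "medium": (40, 120), "large": (80, 300)}

def recommend_gpu(params_m, batch, latency_target_s):
    """Return the first GPU type predicted to meet the latency target."""
    for name, (mem_gb, tflops) in gpu_types.items():
        pred = model.predict([[params_m, batch, mem_gb, tflops]])[0]
        if pred <= latency_target_s:
            return name, pred
    # Fall back to the largest type if no candidate meets the target.
    return "large", model.predict([[params_m, batch, 80, 300]])[0]

print(recommend_gpu(params_m=120, batch=64, latency_target_s=1.0))
```

In practice, such a predictor would be trained on real profiling traces rather than synthetic data, and its recommendations could feed a scheduler (e.g., a Kubernetes device-aware scheduler) rather than a standalone helper.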