Optimizing Systems for Deep Learning Applications

Albahar, Hadeel Ahmad

dc.contributor.author	Albahar, Hadeel Ahmad	en
dc.contributor.committeechair	Butt, Ali	en
dc.contributor.committeemember	Anwar, Ali	en
dc.contributor.committeemember	Chantem, Thidapat	en
dc.contributor.committeemember	Min, Chang Woo	en
dc.contributor.committeemember	Tilevich, Eli	en
dc.contributor.department	Electrical and Computer Engineering	en
dc.date.accessioned	2023-03-02T09:00:08Z	en
dc.date.available	2023-03-02T09:00:08Z	en
dc.date.issued	2023-03-01	en
dc.description.abstract	Modern systems for Machine Learning (ML) workloads support heterogeneous workloads and resources. However, existing resource managers in these systems do not differentiate between heterogeneous GPU resources. Moreover, users are usually unaware of the sufficient and appropriate type and amount of GPU resources to request for their ML jobs. In this thesis, we analyze the performance of ML training and inference jobs and identify ML model and GPU characteristics that impact this performance. We then propose ML-based prediction models to accurately determine appropriate and sufficient resource requirements to ensure improved job latency and GPU utilization in the cluster.	en
dc.description.abstractgeneral	Every day, we interact with software applications in areas such as social media, e-commerce, healthcare, and finance. These applications rely on different computing systems as well as artificial intelligence to deliver the best service and experience to users. In this dissertation, we present optimizations that improve the performance of these artificial intelligence applications while also improving the performance and utilization of the systems and heterogeneous resources they run on. We propose using machine learning models that learn from historical application performance data, as well as application and resource characteristics, to predict the necessary and sufficient resource requirements for these applications, ensuring optimal performance for both the application and the underlying system.	en
dc.description.degree	Doctor of Philosophy	en
dc.format.medium	ETD	en
dc.identifier.other	vt_gsexam:36650	en
dc.identifier.uri	http://hdl.handle.net/10919/114021	en
dc.language.iso	en	en
dc.publisher	Virginia Tech	en
dc.rights	In Copyright	en
dc.rights.uri	http://rightsstatements.org/vocab/InC/1.0/	en
dc.subject	GPU heterogeneity	en
dc.subject	Deep Learning and Inference	en
dc.subject	Kubernetes	en
dc.subject	GPU sharing	en
dc.subject	Resource requirement prediction	en
dc.title	Optimizing Systems for Deep Learning Applications	en
dc.type	Dissertation	en
thesis.degree.discipline	Computer Engineering	en
thesis.degree.grantor	Virginia Polytechnic Institute and State University	en
thesis.degree.level	doctoral	en
thesis.degree.name	Doctor of Philosophy	en

Files

Original bundle

Name: Albahar_HA_D_2023.pdf
Size: 1.85 MB
Format: Adobe Portable Document Format

Collections

Doctoral Dissertations