Optimizing Systems for Deep Learning Applications

dc.contributor.author: Albahar, Hadeel Ahmad (en)
dc.contributor.committeechair: Butt, Ali (en)
dc.contributor.committeemember: Anwar, Ali (en)
dc.contributor.committeemember: Chantem, Thidapat (en)
dc.contributor.committeemember: Min, Chang Woo (en)
dc.contributor.committeemember: Tilevich, Eli (en)
dc.contributor.department: Electrical and Computer Engineering (en)
dc.date.accessioned: 2023-03-02T09:00:08Z (en)
dc.date.available: 2023-03-02T09:00:08Z (en)
dc.date.issued: 2023-03-01 (en)
dc.description.abstract: Modern systems for Machine Learning (ML) workloads support heterogeneous workloads and resources. However, existing resource managers in these systems do not differentiate between heterogeneous GPU resources. Moreover, users often do not know which type and amount of GPU resources are appropriate and sufficient to request for their ML jobs. In this thesis, we analyze the performance of ML training and inference jobs and identify the ML model and GPU characteristics that affect this performance. We then propose ML-based prediction models that accurately determine appropriate and sufficient resource requirements, improving job latency and GPU utilization in the cluster. (en)
dc.description.abstractgeneral: Every day, we interact with software applications in areas such as social media, e-commerce, healthcare, and finance. These applications rely on different computing systems as well as artificial intelligence to deliver the best service and experience to users. In this dissertation, we present optimizations that improve the performance of these artificial intelligence applications while also improving the performance and utilization of the systems and heterogeneous resources they run on. We propose using machine learning models that learn from historical application performance data, as well as application and resource characteristics, to predict the necessary and sufficient resource requirements for these applications, ensuring optimal performance for both the application and the underlying system. (en)
dc.description.degree: Doctor of Philosophy (en)
dc.format.medium: ETD (en)
dc.identifier.other: vt_gsexam:36650 (en)
dc.identifier.uri: http://hdl.handle.net/10919/114021 (en)
dc.language.iso: en (en)
dc.publisher: Virginia Tech (en)
dc.rights: In Copyright (en)
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/ (en)
dc.subject: GPU heterogeneity (en)
dc.subject: Deep Learning and Inference (en)
dc.subject: Kubernetes (en)
dc.subject: GPU sharing (en)
dc.subject: Resource requirement prediction (en)
dc.title: Optimizing Systems for Deep Learning Applications (en)
dc.type: Dissertation (en)
thesis.degree.discipline: Computer Engineering (en)
thesis.degree.grantor: Virginia Polytechnic Institute and State University (en)
thesis.degree.level: doctoral (en)
thesis.degree.name: Doctor of Philosophy (en)

Files

Original bundle
Name: Albahar_HA_D_2023.pdf
Size: 1.85 MB
Format: Adobe Portable Document Format