Stacking Ensemble for auto_ml

dc.contributor.author: Ngo, Khai Thoi (en)
dc.contributor.committeechair: Ernst, Joseph M. (en)
dc.contributor.committeechair: Tokekar, Pratap (en)
dc.contributor.committeemember: Broadwater, Robert P. (en)
dc.contributor.department: Electrical and Computer Engineering (en)
dc.date.accessioned: 2018-06-14T08:01:10Z (en)
dc.date.available: 2018-06-14T08:01:10Z (en)
dc.date.issued: 2018-06-13 (en)
dc.description.abstract: Machine learning has been a subject of intense study across many different industries and academic research areas. Companies and researchers have taken full advantage of various machine learning approaches to solve their problems; however, a broad understanding of the field is required for developers to fully harness the potential of different machine learning models and to achieve efficient results. Therefore, this thesis begins by comparing auto_ml with other hyper-parameter optimization techniques. auto_ml is a fully autonomous framework that lowers the knowledge required to accomplish complicated machine learning tasks. The auto_ml framework automatically selects the best features from a given dataset and chooses the best model to fit and predict the data. Across multiple tests, auto_ml outperforms MLP and other similar frameworks on various datasets while using a small amount of processing time. The thesis then proposes and implements a stacking ensemble technique to build protection against over-fitting on small datasets into the auto_ml framework. Stacking is a technique that combines the predictions of a collection of machine learning models to arrive at a final prediction. The stacked auto_ml ensemble results are more stable and consistent than those of the original framework across different training sizes for all analyzed small datasets. (en)
dc.description.abstractgeneral: Machine learning is the concept of using known past data to predict unknown future data. Many different industries use machine learning: hospitals use it to find mutations in DNA, online retailers use it to recommend items, and advertisers use it to show interesting ads to viewers. With the increasing adoption of machine learning in various fields, a significant number of developers want to take advantage of this concept but are not deeply familiar with the techniques used in machine learning. This thesis introduces the auto_ml framework, which reduces the need for a deep understanding of these techniques. auto_ml automatically selects the best technique for each individual process used to train on and predict from given datasets. In addition, the thesis implements a stacking ensemble technique, which helps to yield consistently good predictions on small datasets. As a result, auto_ml performs better than MLP and other frameworks, and auto_ml with the stacking ensemble technique performs more consistently than auto_ml without it. (en)
dc.description.degree: Master of Science (en)
dc.format.medium: ETD (en)
dc.identifier.other: vt_gsexam:15478 (en)
dc.identifier.uri: http://hdl.handle.net/10919/83547 (en)
dc.publisher: Virginia Tech (en)
dc.rights: In Copyright (en)
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/ (en)
dc.subject: Machine learning (en)
dc.subject: Stacking Ensemble (en)
dc.subject: Model Selection (en)
dc.subject: Hyper-parameter optimization (en)
dc.subject: auto_ml (en)
dc.title: Stacking Ensemble for auto_ml (en)
dc.type: Thesis (en)
thesis.degree.discipline: Computer Engineering (en)
thesis.degree.grantor: Virginia Polytechnic Institute and State University (en)
thesis.degree.level: masters (en)
thesis.degree.name: Master of Science (en)

Files

Original bundle
Name: Ngo_KT_T_2018.pdf
Size: 1.84 MB
Format: Adobe Portable Document Format

Collections