Stacking Ensemble for auto_ml

Ngo, Khai Thoi

dc.contributor.author	Ngo, Khai Thoi	en
dc.contributor.committeechair	Ernst, Joseph M.	en
dc.contributor.committeechair	Tokekar, Pratap	en
dc.contributor.committeemember	Broadwater, Robert P.	en
dc.contributor.department	Electrical and Computer Engineering	en
dc.date.accessioned	2018-06-14T08:01:10Z	en
dc.date.available	2018-06-14T08:01:10Z	en
dc.date.issued	2018-06-13	en
dc.description.abstract	Machine learning has been a subject undergoing intense study across many different industries and academic research areas. Companies and researchers have taken full advantage of various machine learning approaches to solve their problems; however, extensive understanding and study of the field are required for developers to fully harness the potential of different machine learning models and to achieve efficient results. Therefore, this thesis begins by comparing auto_ml with other hyper-parameter optimization techniques. auto_ml is a fully autonomous framework that lessens the knowledge prerequisite to accomplish complicated machine learning tasks. The auto_ml framework automatically selects the best features from a given dataset and chooses the best model to fit and predict the data. In multiple tests, auto_ml outperforms MLP and other similar frameworks on various datasets using a small amount of processing time. The thesis then proposes and implements a stacking ensemble technique in order to build protection against over-fitting for small datasets into the auto_ml framework. Stacking is a technique used to combine the predictions of a collection of machine learning models to arrive at a final prediction. The stacked auto_ml ensemble results are more stable and consistent than those of the original framework across different training sizes of all analyzed small datasets.	en
dc.description.abstractgeneral	Machine learning is the concept of using known past data to predict unknown future data. Many different industries use machine learning: hospitals use it to find mutations in DNA, online retailers use it to recommend items, and advertisers use it to show interesting ads to viewers. With the increasing adoption of machine learning in various fields, a significant number of developers want to take advantage of this concept but are not deeply familiar with the techniques used in machine learning. This thesis introduces the auto_ml framework, which reduces the need for a deep understanding of these techniques. auto_ml automatically selects the best technique to use for each individual process involved in training on and predicting given datasets. In addition, the thesis implements a stacking ensemble technique that helps to yield consistently good predictions on small datasets. As a result, auto_ml performs better than MLP and other frameworks, and auto_ml with the stacking ensemble technique performs more consistently than auto_ml without it.	en
dc.description.degree	Master of Science	en
dc.format.medium	ETD	en
dc.identifier.other	vt_gsexam:15478	en
dc.identifier.uri	http://hdl.handle.net/10919/83547	en
dc.publisher	Virginia Tech	en
dc.rights	In Copyright	en
dc.rights.uri	http://rightsstatements.org/vocab/InC/1.0/	en
dc.subject	Machine learning	en
dc.subject	Stacking Ensemble	en
dc.subject	Model Selection	en
dc.subject	Hyper-parameter optimization	en
dc.subject	auto_ml	en
dc.title	Stacking Ensemble for auto_ml	en
dc.type	Thesis	en
thesis.degree.discipline	Computer Engineering	en
thesis.degree.grantor	Virginia Polytechnic Institute and State University	en
thesis.degree.level	masters	en
thesis.degree.name	Master of Science	en

Files

Original bundle

Name: Ngo_KT_T_2018.pdf
Size: 1.84 MB
Format: Adobe Portable Document Format

Collections

Masters Theses