Stacking Ensemble for auto_ml
dc.contributor.author | Ngo, Khai Thoi | en |
dc.contributor.committeechair | Ernst, Joseph M. | en |
dc.contributor.committeechair | Tokekar, Pratap | en |
dc.contributor.committeemember | Broadwater, Robert P. | en |
dc.contributor.department | Electrical and Computer Engineering | en |
dc.date.accessioned | 2018-06-14T08:01:10Z | en |
dc.date.available | 2018-06-14T08:01:10Z | en |
dc.date.issued | 2018-06-13 | en |
dc.description.abstract | Machine learning has been a subject of intense study across many different industries and academic research areas. Companies and researchers have taken full advantage of various machine learning approaches to solve their problems; however, extensive understanding and study of the field are required for developers to fully harness the potential of different machine learning models and to achieve efficient results. Therefore, this thesis begins by comparing auto_ml with other hyper-parameter optimization techniques. auto_ml is a fully autonomous framework that lessens the knowledge prerequisite for accomplishing complicated machine learning tasks. The auto_ml framework automatically selects the best features from a given dataset and chooses the best model to fit and predict the data. Through multiple tests, auto_ml outperforms MLP and other similar frameworks on various datasets using a small amount of processing time. The thesis then proposes and implements a stacking ensemble technique in order to build protection against over-fitting on small datasets into the auto_ml framework. Stacking is a technique used to combine the predictions of a collection of machine learning models to arrive at a final prediction. The stacked auto_ml ensemble results are more stable and consistent than those of the original framework across different training sizes for all analyzed small datasets. | en |
dc.description.abstractgeneral | Machine learning is the concept of using known past data to predict unknown future data. Many different industries use machine learning: hospitals use it to find mutations in DNA, online retailers use it to recommend items, and advertisers use it to show interesting ads to viewers. With the increasing adoption of machine learning in various fields, a significant number of developers want to take advantage of this concept but are not deeply familiar with the techniques used in machine learning. This thesis introduces the auto_ml framework, which reduces the need for a deep understanding of these techniques. auto_ml automatically selects the best technique for each individual process used to train on and predict from given datasets. In addition, the thesis implements a stacking ensemble technique that helps to yield consistently good predictions on small datasets. As a result, auto_ml performs better than MLP and other frameworks, and auto_ml with the stacking ensemble technique performs more consistently than auto_ml without it. | en |
dc.description.degree | Master of Science | en |
dc.format.medium | ETD | en |
dc.identifier.other | vt_gsexam:15478 | en |
dc.identifier.uri | http://hdl.handle.net/10919/83547 | en |
dc.publisher | Virginia Tech | en |
dc.rights | In Copyright | en |
dc.rights.uri | http://rightsstatements.org/vocab/InC/1.0/ | en |
dc.subject | Machine learning | en |
dc.subject | Stacking Ensemble | en |
dc.subject | Model Selection | en |
dc.subject | Hyper-parameter optimization | en |
dc.subject | auto_ml | en |
dc.title | Stacking Ensemble for auto_ml | en |
dc.type | Thesis | en |
thesis.degree.discipline | Computer Engineering | en |
thesis.degree.grantor | Virginia Polytechnic Institute and State University | en |
thesis.degree.level | masters | en |
thesis.degree.name | Master of Science | en |