Active Learning for Microarray based Leukemia Classification

Zhu, Kecheng

Files

Published version (634.57 KB)

Date

2021-11-12

Authors

Zhu, Kecheng

Publisher

ACM

Abstract

Machine learning work often assumes that data labeling is easy and cheap. In real-world cases, however, and especially in the clinical field, data sets are rare and expensive to obtain. Active learning offers an alternative that addresses this concern by querying the most informative data for training. The sampling method is one of the key parts of active learning because it can minimize the training cost of the classifier. Different query methods can produce considerably different models, and those differences can lead to significant differences in training cost and final accuracy. The approaches used in this experiment are uncertainty sampling, diversity sampling, and query by committee. In the experiment, active learning is applied to microarray data, with improved results. Classification of two types of leukemia (acute myeloid leukemia and acute lymphoblastic leukemia) shows a boost in accuracy compared to passive machine learning with the same number of samples. The experiments lead to the conclusion that, with a small number of samples with randomness in the field of leukemia classification, active learning produces a more accurate model. Additionally, active learning with query by committee finds the most informative sample in the fewest trials.
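
To make two of the query strategies named in the abstract concrete, here is a minimal Python sketch of uncertainty sampling and query by committee. It is not the paper's implementation: the function names, the least-confidence and vote-entropy criteria, and the toy probabilities and votes are all assumptions chosen for illustration. Diversity sampling is not covered.

```python
import math
from collections import Counter


def least_confident_index(probabilities):
    """Uncertainty sampling (least-confidence variant, an assumption here):
    return the index of the unlabeled sample whose highest predicted class
    probability is lowest, i.e. the one the model is least sure about."""
    return min(range(len(probabilities)), key=lambda i: max(probabilities[i]))


def vote_entropy(votes):
    """Entropy of the label votes cast by a committee for one sample."""
    total = len(votes)
    return -sum((c / total) * math.log(c / total) for c in Counter(votes).values())


def most_disputed_index(committee_votes):
    """Query by committee (vote-entropy variant, an assumption here):
    return the index of the sample the committee disagrees on most."""
    return max(range(len(committee_votes)),
               key=lambda i: vote_entropy(committee_votes[i]))


# Hypothetical predicted probabilities for (AML, ALL) on three unlabeled samples.
probs = [[0.95, 0.05], [0.55, 0.45], [0.20, 0.80]]
uncertain = least_confident_index(probs)

# Hypothetical votes from a three-member committee on the same three samples.
votes = [["AML", "AML", "AML"], ["AML", "ALL", "AML"], ["ALL", "ALL", "ALL"]]
disputed = most_disputed_index(votes)
```

In an active learning loop, the selected index would be sent to an annotator for labeling, added to the training set, and the classifier (or committee) retrained before the next query.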

Persistent link

http://hdl.handle.net/10919/112239

Collections

Journal Articles, Association for Computing Machinery (ACM)
Scholarly Works, Computer Science
