Mining Truth Tables and Straddling Biclusters in Binary Datasets

Owens, Clifford Conley

dc.contributor.author	Owens, Clifford Conley	en
dc.contributor.committeechair	Ramakrishnan, Naren	en
dc.contributor.committeecochair	Murali, T. M.	en
dc.contributor.committeemember	Brown, Ezra A.	en
dc.contributor.department	Computer Science	en
dc.date.accessioned	2014-03-14T20:48:05Z	en
dc.date.adate	2010-01-07	en
dc.date.available	2014-03-14T20:48:05Z	en
dc.date.issued	2009-11-05	en
dc.date.rdate	2010-01-07	en
dc.date.sdate	2009-11-18	en
dc.description.abstract	As the world swims deeper into a deluge of data, binary datasets relating objects to properties can be found in many different fields. Such datasets abound in practically any area of interest, including biology, politics, entertainment, and education. This explosion calls for the definition of new types of patterns in binary data, as well as algorithms to find these patterns efficiently. In this work, we introduce truth tables as a new class of patterns to be mined in binary datasets. A truth table represents a subset of properties that exhibit maximal variability (and hence suggest independence) in their occurrence patterns over the underlying objects. Unlike other measures of independence, truth tables possess anti-monotone features that can be exploited to mine them effectively. We present a level-wise algorithm that takes advantage of these features, and show results on real and synthetic data that demonstrate the scalability of our algorithm. We also introduce new methods of mining straddling biclusters. Biclusters relate subsets of objects to subsets of properties they share within a single dataset. Straddling biclusters extend biclusters by relating a subset of objects to subsets of properties they share in two datasets. We present two level-wise algorithms, named UnionMiner and TwoMiner, which discover straddling biclusters efficiently by treating multiple datasets as a single dataset. We show results on real and synthetic data, and explore the advantages and limitations of each algorithm. We develop guidelines that suggest which of these algorithms is likely to perform better based on features of the datasets.	en
dc.description.degree	Master of Science	en
dc.identifier.other	etd-11182009-172742	en
dc.identifier.sourceurl	http://scholar.lib.vt.edu/theses/available/etd-11182009-172742/	en
dc.identifier.uri	http://hdl.handle.net/10919/35745	en
dc.publisher	Virginia Tech	en
dc.relation.haspart	Owens_CA_T_2009	en
dc.rights	In Copyright	en
dc.rights.uri	http://rightsstatements.org/vocab/InC/1.0/	en
dc.subject	data mining	en
dc.subject	binary datasets	en
dc.title	Mining Truth Tables and Straddling Biclusters in Binary Datasets	en
dc.type	Thesis	en
thesis.degree.discipline	Computer Science	en
thesis.degree.grantor	Virginia Polytechnic Institute and State University	en
thesis.degree.level	masters	en
thesis.degree.name	Master of Science	en

Files

Original bundle

Name: Owens_CA_T_2009.pdf
Size: 954.38 KB
Format: Adobe Portable Document Format

Collections

Masters Theses