Exploratory Data Analysis using Clusters and Stories

dc.contributor.author | Hossain, Mahmud Shahriar | en
dc.contributor.committeechair | Ramakrishnan, Naren | en
dc.contributor.committeemember | North, Christopher L. | en
dc.contributor.committeemember | Watson, Layne T. | en
dc.contributor.committeemember | Davidson, Ian | en
dc.contributor.committeemember | Fox, Edward A. | en
dc.contributor.department | Computer Science | en
dc.date.accessioned | 2014-03-14T20:13:24Z | en
dc.date.adate | 2012-07-25 | en
dc.date.available | 2014-03-14T20:13:24Z | en
dc.date.issued | 2012-06-08 | en
dc.date.rdate | 2012-07-25 | en
dc.date.sdate | 2012-06-19 | en
dc.description.abstract | Exploratory data analysis aims to study datasets through the use of iterative, investigative, and visual analytic algorithms. Because the growing volume of unstructured data is difficult to manage and access, exploratory analysis of datasets has become harder than ever and is of increasing interest to data mining researchers. In this dissertation, we study new algorithms for exploratory analysis of data collections using clusters and stories. Clustering brings together similar entities, whereas stories connect dissimilar objects. The former helps organize datasets into regions of interest, and the latter explores latent information by connecting the dots between disjoint instances. This dissertation focuses on five research aspects to demonstrate the applicability and usefulness of clusters and stories as exploratory data analysis tools. In the area of clustering, we investigate whether clustering algorithms can be automatically "alternatized" and how they can be guided to obtain alternative results using flexible constraints as "scatter-gather" operations. We demonstrate the application of these ideas in many application domains, including studying the bat biosonar system and designing sustainable products. In the area of storytelling, we develop algorithms that can generate stories using distance, clique, and syntactic constraints. We explore the use of storytelling for studying document collections in the biomedical literature and intelligence analysis domains. | en
dc.description.degree | Ph. D. | en
dc.identifier.other | etd-06192012-223659 | en
dc.identifier.sourceurl | http://scholar.lib.vt.edu/theses/available/etd-06192012-223659/ | en
dc.identifier.uri | http://hdl.handle.net/10919/28085 | en
dc.publisher | Virginia Tech | en
dc.relation.haspart | Hossain_MS_D_2012.pdf | en
dc.rights | In Copyright | en
dc.rights.uri | http://rightsstatements.org/vocab/InC/1.0/ | en
dc.subject | Alternative clustering | en
dc.subject | Guided clustering | en
dc.subject | Storytelling | en
dc.subject | Connecting the dots | en
dc.title | Exploratory Data Analysis using Clusters and Stories | en
dc.type | Dissertation | en
thesis.degree.discipline | Computer Science | en
thesis.degree.grantor | Virginia Polytechnic Institute and State University | en
thesis.degree.level | doctoral | en
thesis.degree.name | Ph. D. | en

Files

Original bundle
Name: Hossain_MS_D_2012.pdf
Size: 15.83 MB
Format: Adobe Portable Document Format