Approaches to the Label-Switching Problem of Classification, Based on Partition-Space Relabeling and Label-Invariant Visualization

dc.contributor.author: Farrar, David
dc.contributor.department: Statistics
dc.date.accessioned: 2019-05-08T19:46:15Z
dc.date.available: 2019-05-08T19:46:15Z
dc.date.issued: 2006-07-15
dc.description.abstract: In the context of interest, a method of cluster analysis is used to classify a set of units into a fixed number of classes. Simulation procedures with various conceptual foundations may be used to evaluate uncertainty, stability, or sampling error of such a classification. However, simulation approaches may be subject to a label-switching problem when a likelihood function, posterior density, or some objective function is invariant under permutation of class labels. We suggest a relabeling algorithm that maximizes a simple measure of agreement among classifications. It is also known that effective summaries and visualization tools can be based on sample concurrence fractions, which we define as sample fractions with given pairs of units falling in the same cluster, and which are invariant under permutation of class labels. We expand the study of concurrence fractions by presenting a matrix theory, which is employed in relabeling as well as in elaboration of visualization tools. We explore an ordination approach treating concurrence fractions as similarities between pairs of units. A matrix result supports straightforward application of the method of principal coordinates, leading to ordination plots in which Euclidean distances between pairs of units have a simple relationship to concurrence fractions. The use of concurrence fractions complements relabeling by providing an efficient initial labeling.
dc.description.sponsorship: EPA: STAR Grant RD 83136801-0
dc.format.extent: 25 pages
dc.format.mimetype: application/pdf
dc.identifier.sourceurl: https://www.stat.vt.edu/content/dam/stat_vt_edu/graphics-and-pdfs/research-papers/Technical_Reports/TechReport06-7.pdf
dc.identifier.uri: http://hdl.handle.net/10919/89398
dc.language.iso: en
dc.publisher: Virginia Tech
dc.relation.ispartofseries: Technical Report No. 06-7
dc.rights: In Copyright
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/
dc.subject: Consensus matrix
dc.subject: label-switching
dc.subject: model-based clustering
dc.subject: Monte Carlo simulation
dc.subject: principal coordinates analysis
dc.subject: similarity and dissimilarity
dc.title: Approaches to the Label-Switching Problem of Classification, Based on Partition-Space Relabeling and Label-Invariant Visualization
dc.type: Technical report
dc.type.dcmitype: Text
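
The abstract defines sample concurrence fractions as the fraction of simulated classifications in which a given pair of units falls in the same cluster, and notes that they are invariant under permutation of class labels. The following is a minimal Python sketch of that definition, not the report's implementation. The function names, the toy data, and the agreement measure used in `relabel` (the count of units whose label matches a reference classification, maximized by brute force over permutations) are assumptions made for illustration; the report's own agreement measure and algorithm are not given in this record.

```python
from itertools import permutations


def concurrence_fractions(classifications):
    """Return the matrix of sample concurrence fractions.

    Entry [i][j] is the fraction of classifications in which
    units i and j carry the same class label.
    """
    n_draws = len(classifications)
    n_units = len(classifications[0])
    counts = [[0] * n_units for _ in range(n_units)]
    for labels in classifications:
        for i in range(n_units):
            for j in range(n_units):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / n_draws for c in row] for row in counts]


def relabel(labels, reference, n_classes):
    """Permute the class labels of `labels` to best match `reference`.

    Hypothetical agreement measure: number of units with equal labels.
    Brute force over all permutations, so only practical for few classes.
    """
    best = max(
        permutations(range(n_classes)),
        key=lambda perm: sum(perm[a] == b for a, b in zip(labels, reference)),
    )
    return [best[a] for a in labels]


# Toy example: three simulated classifications of four units into two classes.
# The second draw is the first draw with its labels switched.
draws = [
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 1, 0],
]

fractions = concurrence_fractions(draws)
relabeled = [relabel(d, draws[0], 2) for d in draws]

# Relabeling changes the labels but leaves the concurrence fractions unchanged.
same_after_relabeling = concurrence_fractions(relabeled) == fractions
```

In this toy example the label-switched second draw is mapped back onto the labels of the first, while the concurrence matrix is the same before and after relabeling, which is the invariance property the abstract relies on.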

Files

Original bundle
Name: TechReport06-7.pdf
Size: 144.27 KB
Format: Adobe Portable Document Format