Bridging Cognitive Gaps Between User and Model in Interactive Dimension Reduction
High-dimensional data is prevalent across domains, and analyzing and exploring it is important to people in numerous fields, yet such data is challenging to explore. To help people explore and understand high-dimensional data, Andromeda, an interactive visual analytics tool, has been developed. However, our analysis uncovered several cognitive gaps relating to the Andromeda system: users do not realize that they must explicitly highlight all the relevant data points; users are unclear about the dimensional information in the Andromeda visualization; and the Andromeda model cannot capture user intentions when constructing and deconstructing clusters. In this study, we designed and implemented solutions to address these gaps. Specifically, for the gap in highlighting all the relevant data points, we introduced foreground and background views and distance lines. Our user study with a group of undergraduate students revealed that the foreground and background views and distance lines could significantly alleviate the highlighting issue. For the gap in understanding visualization dimensions, we implemented a dimension-assist feature. The results of a second user study with students from various backgrounds suggested that the dimension-assist feature could make it easier for users to find the extremum in one dimension and to describe correlations among multiple dimensions; however, it had only a small impact on characterizing the data distribution and on helping users understand the meanings of the weighted multidimensional scaling (WMDS) plot axes. Regarding the gap in constructing and deconstructing clusters, we implemented a solution based on random sampling. A quantitative analysis of the random sampling strategy demonstrated that it improved Andromeda's capabilities in constructing and deconstructing clusters. We also applied random sampling to two-point manipulations, making the Andromeda system more flexible and adaptable to differing data exploration tasks. Limitations are discussed, and potential future research directions are identified.