Scholarly Works, Center for Human-Computer Interaction (CHCI)
Research articles, presentations, and other scholarship
Recent Submissions
- TAGGAR: General-Purpose Task Guidance from Natural Language in Augmented Reality using Vision-Language Models
  Stover, Daniel; Bowman, Douglas A. (ACM, 2024-10-07)
  Augmented reality (AR) task guidance systems provide assistance for procedural tasks by rendering virtual guidance visuals within the real-world environment. Current AR task guidance systems are limited in that they require AR system experts to manually place visuals, require models of real-world objects, or only function for limited tasks or environments. We propose a general-purpose AR task guidance approach for tasks defined by natural language. Our approach allows an operator to take pictures of relevant objects and write task instructions for an end user, which are used by the system to determine where to place guidance visuals. Then, an end user can receive and follow guidance even if objects change locations or environments. Our approach utilizes current vision-language machine learning models for text and image semantic understanding and object localization. We built a proof-of-concept system called TAGGAR using our approach and tested its accuracy and usability in a user study. We found that all operators were able to generate clear guidance for tasks and end users were able to follow the guidance visuals to complete the expected action 85.7% of the time without any knowledge of the tasks.
- Photo Steward: A Deliberative Collective Intelligence Workflow for Validating Historical Archives
  Mohanty, Vikram; Luther, Kurt (ACM, 2023-11-06)
  Historical photographs of people generate significant cultural and economic value, but correctly identifying the subjects of photos can be a difficult task, requiring careful attention to detail while synthesizing large amounts of data from diverse sources. When photos are misidentified, the negative consequences can include financial losses and inaccuracies in the historical record, and even the spread of mis- and disinformation. To address this challenge, we introduce Photo Steward, an information stewardship architecture that leverages a deliberative workflow for validating historical photo IDs. We explored Photo Steward in the context of Civil War Photo Sleuth (CWPS), a popular online community dedicated to identifying photos from the American Civil War era (1861–65) using facial recognition and crowdsourcing. While the platform has been successful in identifying hundreds of unknown photographs, there have been concerns about unverified identifications and misidentifications. Our exploratory evaluation of Photo Steward on CWPS showed that its validation workflow encouraged users to deliberate while making photo ID decisions. Further, its stewardship visualizations helped users to assess photo ID information accurately, while fostering diverse forms of stigmergic collaboration.
- Exploring Effect of Level of Storytelling Richness on Science Learning in Interactive and Immersive Virtual Reality
  Zhang, Lei; Bowman, Douglas A. (ACM, 2022-06-21)
  Immersive and interactive storytelling in virtual reality (VR) is an emerging creative practice that has been thriving in recent years. Educational applications using immersive VR storytelling to explain complex science concepts have very promising pedagogical benefits because on the one hand, storytelling breaks down the complexity of science concepts by bridging them to people’s everyday experiences and familiar cognitive models, and on the other hand, the learning process is further reinforced through rich interactivity afforded by the VR experiences. However, it is unclear how different amounts of storytelling in an interactive VR storytelling experience may affect learning outcomes due to a paucity of literature on educational VR storytelling research. This preliminary study aims to add to the literature through an exploration of variations in the designs of essential storytelling elements in educational VR storytelling experiences and their impact on the learning of complex immunology concepts.
- Exploring Spatial UI Transition Mechanisms with Head-Worn Augmented Reality
  Lu, Feiyu; Xu, Yan (ACM, 2022-04-29)
  Imagine a future in which people comfortably wear augmented reality (AR) displays all day: how do we design interfaces that adapt to the contextual changes as people move around? In current operating systems, the majority of AR content defaults to staying at a fixed location until being manually moved by the users. However, this approach puts the burden of user interface (UI) transition solely on users. In this paper, we first ran a bodystorming design workshop to capture the limitations of existing manual UI transition approaches in spatially diverse tasks. Then we addressed these limitations by designing and evaluating three UI transition mechanisms with different levels of automation and controllability (low-effort manual, semi-automated, fully-automated). Furthermore, we simulated imperfect contextual awareness by introducing prediction errors with different costs to correct them. Our results provide valuable lessons about the trade-offs between UI automation levels, controllability, user agency, and the impact of prediction errors.
- Virtual replicas of real places: Experimental investigations
  Skarbez, Richard; Bowman, Douglas A.; Ogle, J. Todd; Tucker, Thomas; Gabbard, Joseph L. (2021-07-13)
  The emergence of social virtual reality (VR) experiences, such as Facebook Spaces, Oculus Rooms, and Oculus Venues, will generate increased interest from users who want to share real places (both personal and public) with their fellow users in VR. At the same time, advances in scanning and reconstruction technology are making the realistic capture of real places more and more feasible. These complementary pressures mean that the representation of real places in virtual reality will be an increasingly common use case for VR. Despite this, there has been very little research into how users perceive such replicated spaces. This paper reports the results from a series of three user studies investigating this topic. Taken together, these results show that getting the scale of the space correct is the most important factor for generating a "feeling of reality", that it is important to avoid incoherent behaviors (such as floating objects), and that lighting makes little difference to perceptual similarity.
- Using Place as Provocation: In Situ Collaborative Narrative Construction
  Schaefer, Matthew; Tatar, Deborah; Harrison, Steve; Crandell, Alli (Research Center for Educational Technology, 2008)
  This paper describes a unique model for mobile, collaborative learning embodied in the use of a new software tool called PlaceMark©. The model is overtly intended to help learners reflect on their relationship to particular places, and the relationship between their own experience and other people’s experiences of those spaces. PlaceMark does this not by telling people what a place is, but by instead asking them, as they reflect and write about their own experiences of place. This paper describes how PlaceMark facilitates distributed control in coordinated classroom activities. We believe that the stance to knowledge embodied in the system encourages student responsibility within the learning process and helps teach about multiplicity of perspective in a visceral way. Additionally, as cell phones and other technologies become part of ordinary life, it is increasingly important that children (and all of us) come to have a deeper consciousness of place. This work reports on a pilot study of the software conducted with middle school students, and provides an analysis of the study activity.
- Learning When Less is More: “Bootstrapping” Undergraduate Programmers as Coordination Designers
  Lin, Strong; Tatar, Deborah; Harrison, Steve; Roschelle, Jeremy; Patton, Charles (Computer Professionals for Social Responsibility, 2006)
  In this paper, we describe an undergraduate computer science class in the United States that we started with the intention of creating a participatory design experience to create distributed mobile collaborative technologies for education. The case highlights the ways in which programmer understanding of an innovative new technology can depend on understanding the context of use. The students were to use Tuple-spaces, a language for coordination. However, it soon became clear that while the coordination of machines may be thought of as a computer science problem, the students could not understand the technical system without richer models of how, why, or when coordination is desirable. We were in the ironic position of teaching human coordination at the same time as describing the technical properties of a system to support it. To “bootstrap” the learning process, we asked the students to draw on their own coordination expertise by implementing familiar coordinative games. We propose games as an addition to the PD toolkit when implementers need help in stepping outside their everyday mindset.
- Aegis Audio Engine: Integrating Real-Time Analog Signal Processing, Pattern Recognition, and a Procedural Soundtrack in a Live Twelve-Performer Spectacle With Crowd Participation
  Bukvic, Ivica Ico; Matthews, Michael (Georgia Institute of Technology, 2015-07)
  In the following paper we present Aegis: a procedural networked soundtrack engine driven by real-time analog signal analysis and pattern recognition. Aegis was originally conceived as part of Drummer Game, a game-performance-spectacle hybrid research project focusing on the depiction of a battle portrayed using terracotta soldiers. In it, each of the twelve cohorts—divided into two armies of six—is led by a drummer-performer who issues commands by accurately drumming precomposed rhythmic patterns on an original Chinese war drum. The ensuing spectacle is envisioned to also accommodate large audience participation whose input determines the morale of the two armies. An analog signal analyzer utilizes efficient pattern recognition to decipher the desired action and feed it both into the game and the soundtrack engine. The soundtrack engine then uses this action, as well as messages from the gaming simulation, to determine the most appropriate soundtrack parameters while ensuring minimal repetition and seamless transitions between various clips that account for tempo, meter, and key changes. The ensuing simulation offers a comprehensive system for pattern-driven input, holistic situation assessment, and a soundtrack engine that aims to generate a seamless musical experience without having to resort to cross-fades and other simplistic transitions that tend to disrupt a soundtrack’s continuity.
- L2OrkMote: Reimagining a Low-Cost Wearable Controller for a Live Gesture-Centric Music Performance
  Tsoukalas, Kyriakos D.; Kubalak, Joseph R.; Bukvic, Ivica Ico (ACM, 2018-06)
  Laptop orchestras create music, although digitally produced, in a collaborative live performance not unlike a traditional orchestra. The recent increase in interest and investment in this style of music creation has paved the way for novel methods for musicians to create and interact with music. To this end, a number of nontraditional instruments have been constructed that enable musicians to control sound production beyond pitch and volume, integrating filtering, musical effects, etc. Wii Remotes (WiiMotes) have seen heavy use in maker communities, including laptop orchestras, for their robust sensor array and low cost. The placement of sensors and the form factor of the device itself are suited for video games, not necessarily live music creation. In this paper, the authors present a new controller design, based on the WiiMote hardware platform, to address usability in gesture-centric music performance. Based on the pilot-study data, the new controller offers unrestricted two-hand gesture production, smaller footprint, and lower muscle strain.
- Introducing D⁴: An Interactive 3D Audio Rapid Prototyping and Transportable Rendering Environment Using High Density Loudspeaker Arrays
  Bukvic, Ivica Ico (University of Michigan, 2016)
  With a growing number of multimedia venues and research spaces equipped with High Density Loudspeaker Arrays, there is a need for an integrative 3D audio spatialization system that offers both a scalable spatialization algorithm and a battery of supporting rapid prototyping tools for time-based editing, rendering, and interactive low-latency manipulation. The D⁴ library aims to fill this newfound whitespace by introducing a Layer Based Amplitude Panning algorithm and a collection of rapid prototyping tools for 3D time-based audio spatialization and data sonification. The ensuing ecosystem is designed to be transportable and scalable. It supports a broad array of configurations, from monophonic to as many channels as the hardware can handle. D⁴’s rapid prototyping tools leverage oculocentric strategies for importing and spatially rendering multidimensional data and offer an array of new approaches to time-based spatial parameter manipulation and representation. The following paper presents unique affordances of D⁴’s rapid prototyping tools.
- Introducing a K-12 Mechatronic NIME Kit
  Tsoukalas, Kyriakos D.; Bukvic, Ivica Ico (ACM, 2018-06)
  The following paper introduces a new mechatronic NIME kit that uses new additions to the Pd-L2Ork visual programing environment and its K-12 learning module. It is designed to facilitate the creation of simple mechatronics systems for physical sound production in K-12 and production scenarios. The new set of objects builds on the existing support for the Raspberry Pi platform to also include the use of electric actuators via the microcomputer’s GPIO system. Moreover, we discuss implications of the newly introduced kit in the creative and K-12 education scenarios by sharing observations from a series of pilot workshops, with particular focus on using mechatronic NIMEs as a catalyst for the development of programing skills.
- NIMEhub: Toward a Repository for Sharing and Archiving Instrument Designs
  McPherson, Andrew P.; Berdahl, Edgar; Lyons, Michael J.; Jensenius, Alexander Refsum; Bukvic, Ivica Ico; Knudson, Arve (ACM, 2016-07)
  This workshop will explore the potential creation of a community database of digital musical instrument (DMI) designs. In other research communities, reproducible research practices are common, including open-source software, open datasets, established evaluation methods and community standards for research practice. NIME could benefit from similar practices, both to share ideas amongst geographically distant researchers and to maintain instrument designs after their first performances. However, the needs of NIME are different from other communities on account of NIME's reliance on custom hardware designs and the interdependence of technology and arts practice. This half-day workshop will promote a community discussion of the potential benefits and challenges of a DMI repository and plan concrete steps toward its implementation.
- Introducing Locus: a NIME for Immersive Exocentric Aural Environments
  Sardana, Disha; Joo, Woohun; Bukvic, Ivica Ico; Earle, Gregory D. (ACM, 2019-06)
  Locus is a NIME designed specifically for an interactive, immersive high density loudspeaker array environment. The system is based on a pointing mechanism to interact with a sound scene comprising 128 speakers. Users can point anywhere to interact with the system, and the spatial interaction utilizes motion capture, so it does not require a screen. Instead it is completely controlled via hand gestures using a glove that is populated with motion-tracking markers. The main purpose of this system is to offer intuitive physical interaction with the perimeter-based spatial sound sources. Further, its goal is to minimize user-worn technology and thereby enhance freedom of motion by utilizing environmental sensing devices, such as motion capture cameras or infrared sensors. The ensuing creativity enabling technology is applicable to a broad array of possible scenarios, from researching limits of human spatial hearing perception to facilitating learning and artistic performances, including dance. Below we describe our NIME design and implementation, its preliminary assessment, and offer a Unity-based toolkit to facilitate its broader deployment and adoption.
- 3D Time-Based Aural Data Representation Using D⁴ Library’s Layer Based Amplitude Panning Algorithm
  Bukvic, Ivica Ico (Georgia Institute of Technology, 2016-07)
  The following paper introduces a new Layer Based Amplitude Panning algorithm and supporting D⁴ library of rapid prototyping tools for the 3D time-based data representation using sound. The algorithm is designed to scale and support a broad array of configurations, with particular focus on High Density Loudspeaker Arrays (HDLAs). The supporting rapid prototyping tools are designed to leverage oculocentric strategies to importing, editing, and rendering data, offering an array of innovative approaches to spatial data editing and representation through the use of sound in HDLA scenarios. The ensuing D⁴ ecosystem aims to address the shortcomings of existing approaches to spatial aural representation of data, offers unique opportunities for furthering research in the spatial data audification and sonification, as well as transportable and scalable spatial media creation and production.
- Cinemacraft: Immersive Live Machinima as an Empathetic Musical Storytelling Platform
  Narayanan, Siddharth; Bukvic, Ivica Ico (University of Michigan, 2016)
  In the following paper we present Cinemacraft, a technology-mediated immersive machinima platform for collaborative performance and musical human-computer interaction. To achieve this, Cinemacraft innovates upon a reverse-engineered version of Minecraft, offering a unique collection of live machinima production tools and a newly introduced Kinect HD module that allows for embodied interaction, including posture, arm movement, facial expressions, and lip syncing based on captured voice input. The result is a malleable and accessible sensory fusion platform capable of delivering compelling live immersive and empathetic musical storytelling that through the use of low fidelity avatars also successfully sidesteps the uncanny valley.
- OPERAcraft: Blurring the Lines between Real and Virtual
  Bukvic, Ivica Ico; Cahoon, Cody; Wyatt, Ariana; Cowden, Tracy; Dredger, Katie (University of Michigan, 2014-09)
  In the following paper we present an innovative approach to coupling gaming, telematics, machinima, and opera to produce a hybrid performance art form and an arts+technology education platform. To achieve this, we leverage a custom Minecraft video game sandbox mod and the pd-l2ork real-time digital signal processing environment. The result is a malleable telematic-ready platform capable of supporting a broad array of artistic forms beyond its original intent, including theatre, cinema, as well as machinima and other experimental genres.
- New Interfaces for Spatial Musical Expression
  Bukvic, Ivica Ico; Sardana, Disha; Joo, Woohun (ACM, 2020-07)
  With the proliferation of venues equipped with high-density loudspeaker arrays, there is a growing interest in developing new interfaces for spatial musical expression (NISME). Of particular interest are interfaces that focus on the emancipation of the spatial domain as the primary dimension for musical expression. Here we present the Monet NISME, which leverages a multitouch pressure-sensitive surface and the D⁴ library’s spatial mask, thereby allowing for a unique approach to interactive spatialization. Further, we present a study with 22 participants designed to assess its usefulness and compare it to Locus, a NISME introduced in 2019 as part of a localization study and built on the same design principles of using natural gestural interaction with the spatial content. Lastly, we briefly discuss the utilization of both NISMEs in two artistic performances and propose a set of guidelines for further exploration in the NISME domain.
- Consistency of Sedentary Behavior Patterns among Office Workers with Long-Term Access to Sit-Stand Workstations
  Huysmans, Maaike A.; Srinivasan, Divya; Mathiassen, Svend Erik (Oxford University Press, 2019-04-22)
  Introduction: Sit-stand workstations are a popular intervention to reduce sedentary behavior (SB) in office settings. However, the extent and distribution of SB in office workers long-term accustomed to using sit-stand workstations as a natural part of their work environment are largely unknown. In the present study, we aimed to describe patterns of SB in office workers with long-term access to sit-stand workstations and to determine the extent to which these patterns vary between days and workers.
  Methods: SB was objectively monitored using thigh-worn accelerometers for a full week in 24 office workers who had been equipped with a sit-stand workstation for at least 10 months. A comprehensive set of variables describing SB was calculated for each workday and worker, and distributions of these variables between days and workers were examined.
  Results: On average, workers spent 68% of work time sitting [standard deviation (SD) between workers and between days (within worker): 10.4 and 18.2%]; workers changed from sitting to standing/walking 3.2 times per hour (SDs 0.6 and 1.2 h⁻¹), with bouts of sitting being 14.9 min long (SDs 4.2 and 8.5 min). About one-third of the workers spent >75% of their workday sitting. Between-workers variability was significantly different from zero only for percent work time sitting, while between-days (within-worker) variability was substantial for all SB variables.
  Conclusions: Office workers accustomed to using sit-stand workstations showed homogeneous patterns of SB when averaged across several days, except for percent work time seated. However, SB differed substantially between days for any individual worker. The finding that many workers were extensively sedentary suggests that access to sit-stand workstations alone may not be a sufficient remedy against SB; additional personalized interventions reinforcing use may be needed. To this end, differences in SB between days should be acknowledged as a potentially valuable source of variation.
- Read-Agree-Predict: A Crowdsourced Approach to Discovering Relevant Primary Sources for Historians
  Wang, Nai-Ching; Hicks, David; Quigley, Paul; Luther, Kurt (Human Computation Institute, 2019)
  Historians spend significant time looking for relevant, high-quality primary sources in digitized archives and through web searches. One reason this task is time-consuming is that historians’ research interests are often highly abstract and specialized. These topics are unlikely to be manually indexed and are difficult to identify with automated text analysis techniques. In this article, we investigate the potential of a new crowdsourcing model in which the historian delegates to a novice crowd the task of labeling the relevance of primary sources with respect to her unique research interests. The model employs a novel crowd workflow, Read-Agree-Predict (RAP), that allows novice crowd workers to label relevance as well as expert historians do. As a useful byproduct, RAP also reveals and prioritizes crowd confusions as targeted learning opportunities. We demonstrate the value of our model with two experiments with paid crowd workers (n=170), with the future goal of extending our work to classroom students and public history interventions. We also discuss broader implications for historical research and education.
- The Effects of Incorrect Occlusion Cues on the Understanding of Barehanded Referencing in Collaborative Augmented Reality
  Li, Yuan; Hu, Donghan; Wang, Boyuan; Bowman, Douglas A.; Lee, Sang Won (Frontiers, 2021-07-01)
  In many collaborative tasks, the need for joint attention arises when one of the users wants to guide others to a specific location or target in space. If the collaborators are co-located and the target position is in close range, it is almost instinctual for users to refer to the target location by pointing with their bare hands. While such pointing gestures can be efficient and effective in real life, performance will be impacted if the target is in augmented reality (AR), where depth cues like occlusion may be missing if the pointer’s hand is not tracked and modeled in 3D. In this paper, we present a study utilizing head-worn AR displays to examine the effects of incorrect occlusion cues on spatial target identification in a collaborative barehanded referencing task. We found that participants’ performance in AR was reduced compared to a real-world condition, but also that they developed new strategies to cope with the limitations of AR. Our work also identified mixed results of the effect of spatial relationships between users.