AI-Driven Interpretation of Nonverbal Communication in AR-Enhanced Real-Time Captions: Effects on Cognitive Load, Comprehension, and User Engagement

Date

2025-06-23

Publisher

ACM

Abstract

Current real-time captioning systems focus on transcribing speech, often overlooking the facial expressions, body language, and vocal prosody that convey essential communicative cues. We present an AI-driven augmented reality (AR) captioning system that interprets nonverbal signals in real time and renders them as dynamic visual cues within the user’s view. Grounded in Cognitive Load Theory, cross-modal plasticity, and computational creativity, our approach supports Deaf and Hard of Hearing (DHH) and neurodiverse learners by transforming captions into creative, expressive media. We explore: (RQ1) how nonverbal cues affect comprehension, engagement, and creative interpretation; (RQ2) how cultural differences influence cue perception; and (RQ3) which AI and design strategies enable low-latency, customizable AR captions without increasing cognitive load. A user study shows a 45% gain in comprehension and a 25% reduction in mental demand when emotional indicators are included in captions. Future work includes building a cross-cultural cue corpus, an open-source AR captioning pipeline, and design guidelines for inclusive STEM education, advancing accessibility and fostering creativity-driven communication.
