AI-Driven Affective Captioning for Equitable STEM Access Among Deaf and Hard-of-Hearing Students

Ubur, Sunday David

Files

Ubur_SD_D_2026.pdf (21.98 MB)

Date

2026-06-01

Authors

Ubur, Sunday David

Publisher

Virginia Tech

Abstract

This dissertation investigates how Artificial Intelligence (AI)- and Augmented Reality (AR)-supported captioning can improve communication access for Deaf and Hard-of-Hearing (DHH) learners in STEM contexts. Traditional real-time captions provide essential access to spoken language, but they often omit nonverbal and contextual information such as speaker identity, tone, emphasis, affect, and conversational intent. Across a preliminary design study and three empirical studies, this work examines how caption augmentations can preserve these missing layers of meaning while maintaining readability, timeliness, trust, and user control. The preliminary study compared traditional captions with emotion-augmented caption designs and showed that affective and visual cues can support comprehension when they are lightweight and text-centered, but may increase workload when they compete with the main transcript or visual scene. Study 1, a qualitative study with DHH participants, found that users valued emotion-aware captions when they clarified tone, emphasis, or speaker intent, but only when cues were timely, legible, optional, and subordinate to the transcript. Study 2 evaluated culturally adaptive emotive captioning in AR by comparing two cue formats across high- and low-context cultural cohorts: compact symbolic cues, implemented as emoji/icon indicators, and explicit textual affect labels, implemented as inline text tags. Compact symbolic cues produced a robust cross-cultural preference, while qualitative findings showed that participants valued the cues differently: some emphasized speed and reduced distraction, while others emphasized easier access to speaker emotion. Study 3 evaluated Speaker-Aware Affective Captioning, a multi-speaker captioning interface that combined speaker-attributed captions, confidence-gated affect tags, and an on-demand AI Describe feature. The study showed that speaker attribution was the most consistently valued support, while AI Describe helped users recover from missed or unclear information. Affect tags showed promise, but their usefulness depended on timing, persistence, interpretability, and trust.
Across these studies, findings show that accessible captioning should not simply add more expressive information. Instead, next-generation captioning systems should reduce users' inferential burden through layered support: preserving the transcript first, identifying speakers, supporting recovery from missed information, and adding affective interpretation only when it is accurate, low-burden, and user-controllable. This dissertation contributes empirical evidence and design guidelines for trustworthy, culturally sensitive, and readable affective captioning systems for inclusive STEM learning.

Keywords

Affective Captioning, Deaf or Hard-of-Hearing (DHH), Automatic Speech Recognition (ASR), Nonverbal Communication, Speaker-Aware Captioning

Persistent link

https://hdl.handle.net/10919/143221

Collections

Doctoral Dissertations
