Sonification of the Scene in the Image Environment and Metaverse Using Natural Language

dc.contributor.author: Wasi, Mohd Sheeban
dc.contributor.committeechair: Polys, Nicholas F.
dc.contributor.committeemember: McCrickard, D. Scott
dc.contributor.committeemember: Bukvic, Ivica
dc.contributor.department: Computer Science and Applications
dc.date.accessioned: 2023-01-18T09:00:47Z
dc.date.available: 2023-01-18T09:00:47Z
dc.date.issued: 2023-01-17
dc.description.abstract: This metaverse- and computer-vision-powered application is designed to serve people with low vision or a visual impairment, from young adults to older adults. Specifically, we aim to improve users' situational awareness in a scene by narrating its visual content from their point of view. Users receive this information through the auditory channel: the system narrates the scene's description using speech technology. This can increase the accessibility of visual-spatial information for users in a metaverse and, later, in the physical world. The solution was designed and developed under the hypothesis that enabling narration of a scene's visual content increases understanding of, and access to, that scene. This study paves the way for VR technology to be used as a training and exploration tool, not limited to blind people in generic environments but applicable to specific domains such as the military, healthcare, or architecture and planning. We ran a user study to evaluate our hypothesis about which set of algorithms performs better for a specific category of tasks, such as search or survey, and evaluated the narration algorithms through users' ratings of naturalness, correctness, and satisfaction. The tasks and algorithms are discussed in detail in the chapters of this thesis.
dc.description.abstractgeneral: The solution is built using an object detection algorithm and virtual environments that run in the web browser using X3DOM. It helps improve situational awareness, through speech, for sighted as well as low-vision individuals. On a broader scale, we seek to contribute to accessibility solutions. We have designed four algorithms that help the user understand scene information through the auditory channel: the system narrates the scene's description using speech technology. This approach can increase the accessibility of visual-spatial information for users in a metaverse and, later, in the physical world.
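The general abstract describes a pipeline from object detection to spoken narration in the browser. A minimal sketch of that idea follows, in plain JavaScript; the `describeScene` helper, the detection fields (`label`, `direction`), and the sentence template are assumptions for illustration, not the four narration algorithms from the thesis. The browser's Web Speech API (`speechSynthesis`) is used to voice the result when available.

```javascript
// Hypothetical sketch: turn object-detection results into a single
// spoken scene description. The input shape and phrasing are assumed;
// the thesis's actual narration algorithms differ.
function describeScene(detections) {
  if (detections.length === 0) {
    return "No objects detected in the scene.";
  }
  // One clause per detected object, e.g. "a chair to your left".
  const parts = detections.map((d) => `a ${d.label} to your ${d.direction}`);
  return `You are facing ${parts.join(", ")}.`;
}

// Example detections as an object detector might report them.
const detections = [
  { label: "chair", direction: "left" },
  { label: "table", direction: "front" },
];
const narration = describeScene(detections);

// In a browser, the Web Speech API voices the narration; elsewhere,
// fall back to printing it.
if (typeof speechSynthesis !== "undefined") {
  speechSynthesis.speak(new SpeechSynthesisUtterance(narration));
} else {
  console.log(narration);
}
```

The same pattern would apply inside an X3DOM scene: run detection (or read scene-graph metadata), build one sentence, and hand it to the speech synthesizer.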
dc.description.degree: Master of Science
dc.format.medium: ETD
dc.identifier.other: vt_gsexam:36223
dc.identifier.uri: http://hdl.handle.net/10919/113216
dc.language.iso: en
dc.publisher: Virginia Tech
dc.rights: In Copyright
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/
dc.subject: Metaverse
dc.subject: Machine Learning
dc.subject: Computer Vision
dc.subject: X3DOM
dc.title: Sonification of the Scene in the Image Environment and Metaverse Using Natural Language
dc.type: Thesis
thesis.degree.discipline: Computer Science and Applications
thesis.degree.grantor: Virginia Polytechnic Institute and State University
thesis.degree.level: masters
thesis.degree.name: Master of Science

Files

Original bundle (2 files):

- Wasi_MS_T_2023.pdf (12.31 MB, Adobe Portable Document Format)
- Wasi_MS_T_2023_support_1.pdf (41.99 KB, Adobe Portable Document Format): Supporting documents
