A Voice-based Multimodal User Interface for VTQuest

Virginia Tech


The original VTQuest web-based software system requires users to interact through a mouse or keyboard, which keeps their hands and eyes constantly occupied while they communicate with the system. Users therefore cannot perform other tasks that require their hands or eyes at the same time. This restriction on multitasking is unnecessary, and the VTQuest Voice web-based software system removes it. VTQuest Voice extends the original VTQuest functionality with a voice interface built on Speech Application Language Tags (SALT) technology. Through the voice interface, users can navigate the site, submit queries, browse query results, and receive hints on how to make better use of the voice system. Individuals with a disability that prevents them from using their arms or hands, users who are unfamiliar with mouse-and-keyboard interaction, and those whose hands are otherwise occupied need an alternative interface that does not require their hands. All of these users benefit from the addition of a voice interface to VTQuest. With the voice interface, every feature of the system can be accessed by voice alone, without the use of the hands. A voice interface also frees the user's eyes from the task of selecting an option or link on a page, so the user can look at the system less often. VTQuest Voice is implemented and tested on computers running Microsoft Windows and Microsoft Internet Explorer with the required SALT and Adobe Scalable Vector Graphics (SVG) Viewer plug-ins installed. VTQuest Voice offers a variety of features, including an extensive grammar and out-of-turn interaction, both of which can be extended to accommodate future growth.
The grammar offers several ways for users to begin or end a query, to better accommodate the variety of ways in which users may phrase their queries. To accommodate abbreviations and alternate pronunciations of building names, the grammar also includes nicknames for the buildings. Out-of-turn interaction combines multiple steps into one spoken sentence, which shortens the interaction and makes the process more natural for the user. Adding a voice interface is recommended for web applications that users may need to operate while their eyes and hands are busy with other tasks. Functionality that could be added to VTQuest Voice later includes touch-screen support and access from cell phones, Personal Digital Assistants (PDAs), and other mobile devices.
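
The grammar and out-of-turn ideas described above can be illustrated with a minimal Python sketch. It is not the VTQuest Voice implementation, which is built on SALT; the nickname table, the action phrases, the two-slot dialog, and the function names `interpret` and `next_prompt` are all assumptions made for illustration. Filler words at the start or end of an utterance are ignored, nicknames map to a canonical building name, and one sentence may fill several dialog steps at once.

```python
import re

# Hypothetical spoken forms -> canonical building names (illustrative only,
# not taken from the actual VTQuest Voice grammar).
BUILDINGS = {
    "torgersen hall": "Torgersen Hall",
    "torgersen": "Torgersen Hall",
    "torg": "Torgersen Hall",
    "mcbryde hall": "McBryde Hall",
    "mcbryde": "McBryde Hall",
}

# Hypothetical action phrases -> canonical actions.
ACTIONS = {
    "find": "find",
    "locate": "find",
    "where is": "find",
    "describe": "describe",
    "tell me about": "describe",
}


def _match(text, table):
    """Return the canonical value of the longest phrase found in text."""
    for phrase in sorted(table, key=len, reverse=True):
        if re.search(r"\b" + re.escape(phrase) + r"\b", text):
            return table[phrase]
    return None


def interpret(utterance, slots=None):
    """Fill whichever dialog slots the utterance supplies.

    Words that match no known phrase (for example "please" or
    "thank you") are simply ignored, so queries may begin or end freely.
    """
    slots = dict(slots or {"action": None, "building": None})
    text = re.sub(r"[^a-z\s]", " ", utterance.lower())
    slots["action"] = _match(text, ACTIONS) or slots["action"]
    slots["building"] = _match(text, BUILDINGS) or slots["building"]
    return slots


def next_prompt(slots):
    """Return the next question to ask, or None when every slot is filled."""
    if slots["action"] is None:
        return "What would you like to do?"
    if slots["building"] is None:
        return "Which building?"
    return None
```

In this sketch, a step-by-step user would say "find" and then answer the follow-up prompt with "McBryde", while an out-of-turn user could say "Could you please find Torg, thank you" and fill both slots in a single sentence.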



human-computer interaction, voice user interface, multimodal user interface, client/server software