
The Art of Deep Connection - Towards Natural and Pragmatic Conversational Agent Interactions

dc.contributor.author | Ray, Arijit | en
dc.contributor.committeechair | Huang, Jia-Bin | en
dc.contributor.committeechair | Parikh, Devi | en
dc.contributor.committeemember | Abbott, A. Lynn | en
dc.contributor.department | Electrical and Computer Engineering | en
dc.date.accessioned | 2017-07-13T08:00:23Z | en
dc.date.available | 2017-07-13T08:00:23Z | en
dc.date.issued | 2017-07-12 | en
dc.description.abstract | As research in Artificial Intelligence (AI) advances, it is crucial to focus on seamless communication between humans and machines in order to accomplish tasks effectively. Smooth human-machine communication requires the machine to be sensible and human-like while interacting with humans, while simultaneously being capable of extracting the maximum information it needs to accomplish the desired task. Since many of the tasks machines are required to solve today involve understanding images, training machines to have human-like and effective image-grounded conversations with humans is one important step towards achieving this goal. Although we now have agents that can answer questions asked about images, they are prone to failure on confusing input and cannot, in turn, ask clarifying questions to extract the desired information from humans. Hence, as a first step, we direct our efforts towards making Visual Question Answering agents human-like by making them resilient to confusing inputs that do not otherwise confuse humans. Not only is it crucial for a machine to answer questions reasonably, it should also know how to ask questions sequentially to extract the information it needs from a human. Hence, we introduce a novel game called the Visual 20 Questions Game, where a machine tries to figure out a secret image a human has picked by having a natural language conversation with the human. Using deep learning techniques like recurrent neural networks and sequence-to-sequence learning, we demonstrate scalable and reasonable performance on both tasks. | en
dc.description.abstractgeneral | Research in Artificial Intelligence has reached a point where computers can answer natural free-form questions asked about arbitrary images in a somewhat reasonable manner. These machines are called Visual Question Answering agents. However, they are prone to failure on even slightly confusing input. For example, given an obviously irrelevant question about an image, they answer with something nonsensical instead of recognizing that the question is irrelevant. Furthermore, they cannot ask humans questions in turn for clarification or for more information. These shortcomings not only harm their efficacy, but also reduce the trust human users place in them. To remedy these problems, we first direct our efforts towards making Visual Question Answering agents capable of identifying a question that is irrelevant to an image. Next, we train machines to ask questions that extract more information from humans so that they can make an informed decision. We do this by introducing a novel game called the Visual 20 Questions game, where a machine tries to figure out a secret image a human has picked by having a natural language conversation with the human. Deep learning techniques such as sequence-to-sequence learning using recurrent neural networks make it possible for machines to learn how to converse from a series of conversational exchanges between two humans. Techniques like reinforcement learning make it possible for machines to improve based on the rewards they receive for accomplishing a task in a certain way. Using such algorithms, we demonstrate promise towards scalable and reasonable performance on both tasks. | en
dc.description.degree | Master of Science | en
dc.format.medium | ETD | en
dc.identifier.other | vt_gsexam:11441 | en
dc.identifier.uri | http://hdl.handle.net/10919/78335 | en
dc.publisher | Virginia Tech | en
dc.rights | In Copyright | en
dc.rights.uri | http://rightsstatements.org/vocab/InC/1.0/ | en
dc.subject | Computer Vision | en
dc.subject | Natural Language Processing | en
dc.subject | Conversational Agents | en
dc.subject | Chatbots | en
dc.subject | Deep learning (Machine learning) | en
dc.subject | Machine Learning | en
dc.subject | Artificial Intelligence | en
dc.title | The Art of Deep Connection - Towards Natural and Pragmatic Conversational Agent Interactions | en
dc.type | Thesis | en
thesis.degree.discipline | Computer Engineering | en
thesis.degree.grantor | Virginia Polytechnic Institute and State University | en
thesis.degree.level | masters | en
thesis.degree.name | Master of Science | en

Files

Original bundle
Name: Ray_A_T_2017.pdf
Size: 7.53 MB
Format: Adobe Portable Document Format
Name: Ray_A_T_2017_support_3.pdf
Size: 409.38 KB
Format: Adobe Portable Document Format
Description: Supporting documents
