The Art of Deep Connection - Towards Natural and Pragmatic Conversational Agent Interactions

Ray, Arijit

dc.contributor.author	Ray, Arijit	en
dc.contributor.committeechair	Huang, Jia-Bin	en
dc.contributor.committeechair	Parikh, Devi	en
dc.contributor.committeemember	Abbott, A. Lynn	en
dc.contributor.department	Electrical and Computer Engineering	en
dc.date.accessioned	2017-07-13T08:00:23Z	en
dc.date.available	2017-07-13T08:00:23Z	en
dc.date.issued	2017-07-12	en
dc.description.abstract	As research in Artificial Intelligence (AI) advances, seamless communication between humans and machines becomes crucial for accomplishing tasks effectively. Smooth human-machine communication requires the machine to be sensible and human-like when interacting with humans while still extracting the information it needs to accomplish the desired task. Since many of the tasks machines are asked to solve today involve understanding images, training machines to hold human-like and effective image-grounded conversations with humans is an important step towards this goal. Although we now have agents that can answer questions about images, they are prone to failure on confusing input and cannot ask clarifying questions in turn to extract the desired information from humans. Hence, as a first step, we direct our efforts towards making Visual Question Answering agents more human-like by making them resilient to inputs that confuse the agents but do not confuse humans. It is crucial not only that a machine answer questions reasonably, but also that it know how to ask questions sequentially to extract the information it needs from a human. We therefore introduce a novel game called the Visual 20 Questions Game, in which a machine tries to figure out a secret image a human has picked by having a natural language conversation with the human. Using deep learning techniques such as recurrent neural networks and sequence-to-sequence learning, we demonstrate scalable and reasonable performance on both tasks.	en
dc.description.abstractgeneral	Research in Artificial Intelligence has reached a point where computers can answer natural, free-form questions about arbitrary images in a somewhat reasonable manner. These machines are called Visual Question Answering agents. However, they are prone to failure on even slightly confusing input. For example, given an obviously irrelevant question about an image, they answer with something nonsensical instead of recognizing that the question is irrelevant. Furthermore, they cannot ask humans questions in turn for clarification or for more information. These shortcomings harm not only their efficacy but also the trust human users place in them. To remedy these problems, we first direct our efforts towards making Visual Question Answering agents capable of identifying when a question is irrelevant to an image. Next, we also try to train machines to ask questions that extract more information from humans so that they can make an informed decision. We do this by introducing a novel game called the Visual 20 Questions game, in which a machine tries to figure out a secret image a human has picked by having a natural language conversation with the human. Deep learning techniques such as sequence-to-sequence learning with recurrent neural networks make it possible for machines to learn how to converse from a series of conversational exchanges between two humans. Techniques such as reinforcement learning make it possible for machines to improve based on the rewards they receive for accomplishing a task in a certain way. Using such algorithms, we demonstrate promise towards scalable and reasonable performance on both tasks.	en
dc.description.degree	Master of Science	en
dc.format.medium	ETD	en
dc.identifier.other	vt_gsexam:11441	en
dc.identifier.uri	http://hdl.handle.net/10919/78335	en
dc.publisher	Virginia Tech	en
dc.rights	In Copyright	en
dc.rights.uri	http://rightsstatements.org/vocab/InC/1.0/	en
dc.subject	Computer Vision	en
dc.subject	Natural Language Processing	en
dc.subject	Conversational Agents	en
dc.subject	Chatbots	en
dc.subject	Deep learning (Machine learning)	en
dc.subject	Machine Learning	en
dc.subject	Artificial Intelligence	en
dc.title	The Art of Deep Connection - Towards Natural and Pragmatic Conversational Agent Interactions	en
dc.type	Thesis	en
thesis.degree.discipline	Computer Engineering	en
thesis.degree.grantor	Virginia Polytechnic Institute and State University	en
thesis.degree.level	masters	en
thesis.degree.name	Master of Science	en

Files

Original bundle

Name: Ray_A_T_2017.pdf
Size: 7.53 MB
Format: Adobe Portable Document Format

Name: Ray_A_T_2017_support_3.pdf
Size: 409.38 KB
Format: Adobe Portable Document Format
Description: Supporting documents

Collections

Masters Theses