Building a Trustworthy Question Answering System for Covid-19 Tracking

Date

2021-09-02

Publisher

Virginia Tech

Abstract

During the unprecedented global Covid-19 pandemic, the general public has suffered from inaccurate Covid-19 information, including outdated information and fake news. The most widely used media (TV, social media, newspapers, and radio) fall short in providing the certainty and rapid updates that people seek. To cope with this challenge, several public data resources dedicated to providing Covid-19 information have emerged. They bring together experts from different fields to provide authoritative and up-to-date pandemic information. However, the general public still cannot make full use of these resources because the learning curve is too steep, especially for elderly and underage users.

To address this problem, in this thesis we propose a question answering system that users can interact with through simple natural-language sentences. While building this system, we investigate qualified public data resources and, based on the data they provide, collect a set of frequently asked questions for Covid-19 tracking. We further build a dedicated dataset named CovidQA for evaluating the performance of the question answering system with different models. Based on the new dataset, we assess multiple machine learning-based models built for retrieving relevant information from databases, and then propose two empirical models that use pre-defined templates to generate SQL queries. In our experiments, we report both quantitative and qualitative results and provide a comprehensive comparison between the different types of methods. The results show that the proposed template-based methods are simple but effective for building question answering systems for domain-specific problems.
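
As a rough illustration of the general idea of template-based SQL generation, the sketch below maps a question onto a pre-defined pattern and fills a parameterized query. It is not the thesis's implementation: the table `covid_stats`, its columns, the two question patterns, the helper names, and the sample figures are all hypothetical and chosen only for the example.

```python
import re
import sqlite3

# Hypothetical templates: each pairs a question pattern with a SQL skeleton.
# The metric is limited to a whitelist by the regex, so it is safe to
# interpolate as a column name; the region is passed as a bound parameter.
TEMPLATES = [
    (
        re.compile(r"^how many (?P<metric>cases|deaths) in (?P<region>[a-z ]+?)\??$",
                   re.IGNORECASE),
        "SELECT SUM({metric}) FROM covid_stats WHERE region = ?",
    ),
    (
        re.compile(r"^which region has the most (?P<metric>cases|deaths)\??$",
                   re.IGNORECASE),
        "SELECT region FROM covid_stats "
        "GROUP BY region ORDER BY SUM({metric}) DESC LIMIT 1",
    ),
]


def question_to_sql(question):
    """Return (sql, params) for the first matching template, or None."""
    text = question.strip()
    for pattern, sql_template in TEMPLATES:
        match = pattern.match(text)
        if match is None:
            continue
        groups = match.groupdict()
        sql = sql_template.format(metric=groups["metric"].lower())
        # Only the first template captures a region.
        params = (groups["region"].strip().title(),) if "region" in groups else ()
        return sql, params
    return None


def answer(conn, question):
    """Translate the question, run the query, and return a single value."""
    translated = question_to_sql(question)
    if translated is None:
        return None  # no template covers this question
    sql, params = translated
    row = conn.execute(sql, params).fetchone()
    return row[0] if row else None


def build_demo_db():
    """Create an in-memory database with made-up numbers for the demo."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE covid_stats "
        "(region TEXT, day TEXT, cases INTEGER, deaths INTEGER)"
    )
    conn.executemany(
        "INSERT INTO covid_stats VALUES (?, ?, ?, ?)",
        [
            ("Virginia", "2021-01-01", 10, 1),
            ("Virginia", "2021-01-02", 15, 0),
            ("Maryland", "2021-01-01", 7, 2),
        ],
    )
    return conn


if __name__ == "__main__":
    demo = build_demo_db()
    print(answer(demo, "How many cases in Virginia?"))  # 25 with the demo rows
```

The appeal of this kind of approach is that every query the system can emit is known in advance, which keeps answers predictable; the cost is that questions outside the template set go unanswered, as the `None` return shows.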

Keywords

Information Retrieval, Question Answering, Database, Machine Learning, Natural Language Processing, Healthcare, Covid-19 Dashboard
