Covid-19 Fake News Detection

Abstract

Covid-19 is a respiratory illness that has caused people around the world to isolate and retreat. People wanted real-time updates about the virus: which regions were affected and to what degree (heavily infected, no infections, etc.), how to avoid catching the virus, and what cures exist. Social media became a popular platform for sharing information, news, and opinions about the virus. For all the accurate information shared on social media, just as much misinformation, if not more, can be spread on the same platforms. Misinformation is harmful because it can directly affect the health of the people who fall victim to it. For example, suppose a Twitter user tweets medical advice about Covid-19, and people who see the tweet choose to follow it. Now suppose that user was intentionally spreading false information that is the opposite of what one should do. Those who followed the Twitter troll's medical advice may have put their own health at risk, along with the health of anyone in their sphere of influence. Our aim is to understand the types of misinformation spread on social media and to help people identify Covid-19 misinformation spread on Twitter. We will do this by extracting relevant information, such as the content of a tweet (the tweet text itself) and the author of the tweet. Then, we will identify whether each tweet contains true or fabricated information. Once this is done, we will train and test an AI model to identify whether a tweet is spreading misinformation or real information. After the model is trained, we will categorize each tweet by the topic it was trying to spread information about. Our end goal is to integrate the preprocessing script and the AI model with a website that shows the analysis of the tweets.
We want users to be able to submit a Covid-19-related tweet to our website and receive the relevant classification of that tweet. Users will also be able to download a Web ARChive (WARC) file of the archived tweet. Overall, we think the combination of these tasks will help users identify misinformation related to Covid-19.
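
The classification step described above can be illustrated with a minimal sketch. The abstract does not say which model or features the project uses, so the multinomial naive Bayes classifier below is a stand-in chosen for illustration, not the project's actual method. The names `tokenize` and `NaiveBayesTweetClassifier`, the labels `"real"` and `"fake"`, and the four training tweets are all hypothetical; the tweets are invented toy examples, not data from the project.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a tweet, drop URLs and @mentions, and split into word tokens."""
    text = re.sub(r"https?://\S+|@\w+", " ", text.lower())
    return re.findall(r"[a-z0-9']+", text)


class NaiveBayesTweetClassifier:
    """Multinomial naive Bayes with add-one (Laplace) smoothing.

    An illustrative stand-in: the source does not name the model it trains.
    """

    def fit(self, texts, labels):
        # Count documents per label and word occurrences per label.
        self.label_counts = Counter(labels)
        self.word_counts = {label: Counter() for label in self.label_counts}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        # The vocabulary is every token seen in any training tweet.
        self.vocab = set()
        for counts in self.word_counts.values():
            self.vocab.update(counts)
        return self

    def predict(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.label_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(doc_count / total_docs)
            for tok in tokens:
                score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy tweets with hypothetical labels, for illustration only.
train_texts = [
    "Wash your hands often and wear a mask in crowded places",
    "Health officials report new cases and urge vaccination",
    "Drinking bleach cures the virus overnight",
    "Miracle garlic cure kills the virus, doctors hide it",
]
train_labels = ["real", "real", "fake", "fake"]

clf = NaiveBayesTweetClassifier().fit(train_texts, train_labels)
print(clf.predict("this miracle cure kills the virus"))
print(clf.predict("wear a mask and wash your hands"))
```

A real system would need a far larger labeled dataset and a held-out test set to measure accuracy; this sketch only shows the shape of the preprocess, train, and classify steps.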

Description

Keywords

COVID-19, Fake News, Fake News Detection, Python, Machine Learning, TWARC, MySQL, Data Processing, Text Classifier, Tweets, Twitter

Citation