RUHate-MM: Identification of Hate Speech and Targets using Multimodal Data from Russia-Ukraine Crisis

Date

2024-05-13

Publisher

ACM

Abstract

During the conflict between Ukraine and Russia, hate speech targeting specific groups was widespread across social media platforms. Because most platforms allow multimodal content, such content is frequently used to express hate speech online. Although there has been considerable research on detecting hate speech in unimodal content, multimodal content remains underexplored. The limited availability of annotated multimodal datasets further restricts our ability to explore new methods to interpret and identify hate speech and its targets. Annotated datasets for hate speech detection during political events, such as invasions, are even more limited. To fill this gap, we introduce a comprehensive multimodal dataset consisting of 20,675 posts related to the Russia-Ukraine crisis, which were manually annotated as either ‘Hate Speech’ or ‘No Hate Speech’. Additionally, we categorize the hate speech data into three targets: ‘Individual’, ‘Organization’, and ‘Community’. Our benchmark evaluations show that there is still room for improvement in accurately identifying hate speech and its targets. We hope that the availability of this dataset and the evaluations performed on it will encourage the development of new methods for identifying hate speech and its targets during political events like invasions and wars. The dataset and resources are made available at https://github.com/Farhan-jafri/Russia-Ukraine.
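
The two-level annotation scheme described in the abstract can be pictured with a small Python sketch. This is only an illustration under stated assumptions: the class names, the `post_id` field, and the rule that a target accompanies only posts labeled ‘Hate Speech’ are hypothetical and are not taken from the released dataset files, whose actual layout may differ.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class HateLabel(Enum):
    # Label names as quoted in the abstract.
    HATE_SPEECH = "Hate Speech"
    NO_HATE_SPEECH = "No Hate Speech"


class Target(Enum):
    # Target categories as quoted in the abstract.
    INDIVIDUAL = "Individual"
    ORGANIZATION = "Organization"
    COMMUNITY = "Community"


@dataclass(frozen=True)
class AnnotatedPost:
    """Hypothetical record for one annotated post (not the dataset's real schema)."""

    post_id: str  # hypothetical identifier field
    label: HateLabel
    target: Optional[Target] = None

    def __post_init__(self) -> None:
        # Assumption: targets are assigned only to posts labeled as hate speech.
        if self.label is HateLabel.NO_HATE_SPEECH and self.target is not None:
            raise ValueError("a target is only meaningful for 'Hate Speech' posts")


# Usage: a hate speech post aimed at a community.
example = AnnotatedPost("post-1", HateLabel.HATE_SPEECH, Target.COMMUNITY)
```

Keeping the target optional reflects the abstract's wording that only the hate speech portion of the data is categorized by target.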