Browsing by Author "Rauniyar, Kritesh"
- CHUNAV: Analyzing Hindi Hate Speech and Targeted Groups in Indian Election Discourse
  Jafri, Farhan; Rauniyar, Kritesh; Thapa, Surendrabikram; Siddiqui, Mohammad; Khushi, Matloob; Naseem, Usman (ACM, 2024)
  In the ever-evolving landscape of online discourse and political dialogue, the rise of hate speech poses a significant challenge to maintaining a respectful and inclusive digital environment. The context becomes particularly complex when considering the Hindi language, a low-resource language with limited available data. To address this pressing concern, we introduce the CHUNAV dataset, a collection of 11,457 Hindi tweets gathered during assembly elections in various states. CHUNAV is purpose-built for hate speech categorization and the identification of target groups. The dataset is a valuable resource for exploring hate speech within the distinctive socio-political context of Indian elections. The tweets within CHUNAV have been meticulously categorized into "Hate" and "Non-Hate" labels, and further subdivided to pinpoint the specific targets of hate speech, including "Individual", "Organization", and "Community" labels (as shown in Figure 1). Furthermore, this paper presents multiple benchmark models for hate speech detection, along with an innovative ensemble and oversampling-based method. The paper also delves into the results of topic modeling, all aimed at effectively addressing hate speech and target identification in the Hindi language. This contribution seeks to advance the field of hate speech analysis and foster a safer and more inclusive online space within the distinctive realm of Indian Assembly Elections. The dataset is available at https://github.com/Farhan-jafri/Chunav
- RUHate-MM: Identification of Hate Speech and Targets using Multimodal Data from Russia-Ukraine Crisis
  Thapa, Surendrabikram; Jafri, Farhan; Rauniyar, Kritesh; Nasim, Mehwish; Naseem, Usman (ACM, 2024-05-13)
  During the conflict between Ukraine and Russia, hate speech targeted toward specific groups was widespread on different social media platforms. With most social platforms allowing multimodal content, the use of multimodal content to express hate speech is widespread on the Internet. Although there has been considerable research in detecting hate speech within unimodal content, the investigation into multimodal content remains insufficient. The limited availability of annotated multimodal datasets further restricts our ability to explore new methods to interpret and identify hate speech and its targets. The availability of annotated datasets for hate speech detection during political events, such as invasions, is even more limited. To fill this gap, we introduce a comprehensive multimodal dataset consisting of 20,675 posts related to the Russia-Ukraine crisis, which were manually annotated as either ‘Hate Speech’ or ‘No Hate Speech’. Additionally, we categorize the hate speech data into three targets: ‘Individual’, ‘Organization’, and ‘Community’. Our benchmarked evaluations show that there is still room for improvement in accurately identifying hate speech and its targets. We hope that the availability of this dataset and the evaluations performed on it will encourage the development of new methods for identifying hate speech and its targets during political events like invasions and wars. The dataset and resources are made available at https://github.com/Farhan-jafri/Russia-Ukraine.