Cross Platform MultiModal Retrieval Augmented Distillation for Code-Switched Content Understanding
Abstract
In the era of digital communication and social media, multimodal content such as code-switched memes has become a ubiquitous form of expression. This phenomenon is especially significant for low-resource languages like Nepali, where the need for sentiment analysis and hate speech detection remains unmet due to the lack of publicly available datasets. To address this gap, we present ENeMeme, an annotated dataset of 4,211 code-switched Nepali-English memes labeled for sentiment and hate speech. Previous state-of-the-art meme analysis methods focus primarily on high-resource languages and perform poorly on low-resource ones. To bridge this gap, our paper also builds on existing literature to adapt a novel multimodal model, MM-RAD, designed to understand code-switched Nepali-English memes by leveraging both textual and visual content. The model's effectiveness is analyzed across various retrieval platforms. Our proposed MM-RAD demonstrates superior performance in sentiment analysis and hate speech detection compared to individual baseline models. The dataset is available at https://github.com/therealthapa/crossplatform