Browsing by Author "Thapa, Surendrabikram"
Now showing 1 - 7 of 7
- CHUNAV: Analyzing Hindi Hate Speech and Targeted Groups in Indian Election Discourse. Jafri, Farhan; Rauniyar, Kritesh; Thapa, Surendrabikram; Siddiqui, Mohammad; Khushi, Matloob; Naseem, Usman (ACM, 2024). In the ever-evolving landscape of online discourse and political dialogue, the rise of hate speech poses a significant challenge to maintaining a respectful and inclusive digital environment. The context becomes particularly complex when considering the Hindi language—a low-resource language with limited available data. To address this pressing concern, we introduce the CHUNAV dataset—a collection of 11,457 Hindi tweets gathered during assembly elections in various states. CHUNAV is purpose-built for hate speech categorization and the identification of target groups. The dataset is a valuable resource for exploring hate speech within the distinctive socio-political context of Indian elections. The tweets within CHUNAV have been meticulously categorized into "Hate" and "Non-Hate" labels, and further subdivided to pinpoint the specific targets of hate speech, including "Individual", "Organization", and "Community" labels (as shown in Figure 1). Furthermore, this paper presents multiple benchmark models for hate speech detection, along with an innovative ensemble and oversampling-based method. The paper also delves into the results of topic modeling, all aimed at effectively addressing hate speech and target identification in the Hindi language. This contribution seeks to advance the field of hate speech analysis and foster a safer and more inclusive online space within the distinctive realm of Indian Assembly Elections. The dataset is available at https://github.com/Farhan-jafri/Chunav
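The abstract above mentions an ensemble and oversampling-based method for the imbalanced "Hate"/"Non-Hate" classification task. As a minimal illustrative sketch (not the paper's actual pipeline; the function names and label strings here are assumptions), random oversampling duplicates minority-class examples until classes balance, and an ensemble combines several classifiers' predictions by majority vote:

```python
import random
from collections import Counter

def oversample(texts, labels, seed=0):
    """Duplicate minority-class examples until every class matches the
    majority-class count. Illustrative stand-in for an oversampling step."""
    rng = random.Random(seed)
    by_class = {}
    for text, label in zip(texts, labels):
        by_class.setdefault(label, []).append(text)
    target = max(len(items) for items in by_class.values())
    out_texts, out_labels = [], []
    for label, items in by_class.items():
        # Keep all originals, then sample extras (with replacement) to reach target.
        extras = rng.choices(items, k=target - len(items))
        out_texts.extend(items + extras)
        out_labels.extend([label] * target)
    return out_texts, out_labels

def ensemble_vote(predictions):
    """Majority vote across the per-example predictions of several classifiers.
    `predictions` is a list of label lists, one list per classifier."""
    return [Counter(column).most_common(1)[0][0] for column in zip(*predictions)]
```

Oversampling is applied only to the training split (oversampling before splitting would leak duplicated examples into the test set), and the vote is taken across independently trained models.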
- Deidentification of Face Videos in Naturalistic Driving Scenarios. Thapa, Surendrabikram (Virginia Tech, 2023-09-05). The sharing of data has become integral to advancing scientific research, but it introduces challenges related to safeguarding personally identifiable information (PII). This thesis addresses the specific problem of sharing drivers' face videos for transportation research while ensuring privacy protection. To tackle this issue, we leverage recent advancements in generative adversarial networks (GANs) and demonstrate their effectiveness in deidentifying individuals by swapping their faces with those of others. Extensive experimentation is conducted using a large-scale dataset from ORNL, enabling the quantification of errors associated with head movements, mouth movements, eye movements, and other human factors cues. Additionally, qualitative analysis using metrics such as PERCLOS (Percentage of Eye Closure) and human evaluators provides valuable insights into the quality and fidelity of the deidentified videos. To enhance privacy preservation, we propose the utilization of synthetic faces as substitutes for real faces. Moreover, we introduce practical guidelines, including the establishment of thresholds and spot checking, to incorporate human-in-the-loop validation, thereby improving the accuracy and reliability of the deidentification process. The thesis also presents mitigation strategies to effectively handle reidentification risks. By considering the potential exploitation of soft biometric identifiers or non-biometric cues, we highlight the importance of implementing comprehensive measures such as robust data user licenses and privacy protection protocols.
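The abstract above uses PERCLOS (Percentage of Eye Closure) as one evaluation metric. As a hedged sketch of the general idea (not the thesis's implementation; the closure threshold and input representation here are assumptions), PERCLOS can be computed as the fraction of frames in a window during which the eye is considered closed:

```python
def perclos(eye_openness, closed_threshold=0.2):
    """Fraction of frames in which the eye is considered closed.

    eye_openness: per-frame openness values (e.g., derived from the eye
    aspect ratio), one value per video frame in the analysis window.
    closed_threshold: openness below this counts as a closed eye. The
    value 0.2 is illustrative; drowsiness studies typically define
    closure as the eyelid covering a large fraction (e.g., 80%) of the pupil.
    """
    if not eye_openness:
        raise ValueError("eye_openness must contain at least one frame")
    closed_frames = sum(1 for v in eye_openness if v < closed_threshold)
    return closed_frames / len(eye_openness)
```

Comparing PERCLOS traces computed on the original and de-identified videos gives a simple check of whether the face swap preserved drowsiness-related cues.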
- Face De-identification of Drivers from NDS Data and Its Effectiveness in Human Factors. Thapa, Surendrabikram; Cook, Julie; Sarkar, Abhijit (National Surface Transportation Safety Center for Excellence, 2023-08-08). Advancements in artificial intelligence (AI) and the Internet of Things (IoT) have made data the foundation for most technological innovations. As we embark on the era of big data analysis, broader access to quality data is essential for consistent advancement in research. Therefore, data sharing has become increasingly important in all fields, including transportation safety research. Data sharing can accelerate research by providing access to more data and the ability to replicate experiments and validate results. However, data sharing also poses challenges, such as the need to protect the privacy of research participants and address ethical and safety concerns when data contains personally identifiable information (PII). This report focuses on the problem of sharing drivers’ face videos for transportation research. Driver video collected either through naturalistic driving studies (NDS) or simulator-based experiments contains useful information for driver behavior and human factors research. The report first gives an overview of the multitude of problems associated with sharing driver videos. Then, it demonstrates possible directions for data sharing by de-identifying drivers’ faces using AI-based techniques. We have demonstrated that recent developments in generative adversarial networks (GANs) can effectively help in de-identifying a person by swapping their face with that of another person. The results achieved through the proposed techniques were then evaluated qualitatively and quantitatively to prove the validity of such a system.
Specifically, the report demonstrates how face-swapping algorithms can effectively de-identify faces while still preserving important attributes related to human factors research, including eye movements, head movements, and mouth movements. The experiments were designed to assess the validity of GAN-based face de-identification on faces with varied anthropometric measurements, and the participants in the data had varied physical features as well. The dataset covered lighting conditions ranging from normal to extreme, which helped to check the robustness of the GAN-based techniques. The experiment was carried out over 115,000 frames to account for most naturalistic driving conditions. Error metrics for head movements, such as differences in roll angle, pitch angle, and yaw angle, were calculated. Similarly, errors in eye aspect ratio, lip aspect ratio, and pupil circularity were calculated, as these are important in the assessment of various secondary behaviors of drivers while driving. We also calculated additional error metrics to compare the de-identified and original pairs more quantitatively. Next, we showed that a face can be swapped with faces that are artificially generated. We used GAN-based techniques to generate faces that were not present in the dataset used for training the model and were not known to exist before the generation process. These faces were then used for swapping with the original faces in our experiments. This gives researchers additional flexibility in choosing the type of face they want to swap in. The report concludes by discussing possible measures for sharing such de-identified videos with the greater research community. Data sharing across disciplines helps to build collaboration and advance research, but it is important to ensure that ethical and safety concerns are addressed when data contains PII.
The proposed techniques in this report provide a way to share driver face videos while still protecting the privacy of research participants; however, we recommend that such sharing still be done under proper guidance from institutional review boards and under a proper data use license.
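The report above compares original and de-identified frames using measurements such as the eye aspect ratio and per-frame error metrics. As a minimal sketch under stated assumptions (the six-point landmark ordering follows the widely used eye-aspect-ratio convention; this is not the report's own code), these two measurements can be computed as:

```python
import math

def eye_aspect_ratio(pts):
    """Eye aspect ratio from six (x, y) eye landmarks in the common
    6-point ordering: EAR = (|p2-p6| + |p3-p5|) / (2 * |p1-p4|).
    Low values indicate a closed or closing eye."""
    p1, p2, p3, p4, p5, p6 = pts
    return (math.dist(p2, p6) + math.dist(p3, p5)) / (2.0 * math.dist(p1, p4))

def mean_abs_error(original, deidentified):
    """Mean absolute error between matched per-frame measurements,
    e.g., roll/pitch/yaw angles or aspect ratios of original vs.
    face-swapped frames."""
    if len(original) != len(deidentified):
        raise ValueError("sequences must be the same length")
    return sum(abs(a - b) for a, b in zip(original, deidentified)) / len(original)
```

Running `eye_aspect_ratio` on corresponding frames of the original and de-identified video, then summarizing the differences with `mean_abs_error`, gives one quantitative check that eye-movement cues survive the face swap.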
- MDKG: Graph-Based Medical Knowledge-Guided Dialogue Generation. Naseem, Usman; Thapa, Surendrabikram; Zhang, Qi; Hu, Liang; Nasim, Mehwish (ACM, 2023-07-19). Medical dialogue systems (MDS) have shown promising abilities to diagnose through a conversation with a patient, much as a human doctor would. However, current systems are mostly based on sequence modeling, which does not account for medical knowledge. This makes the systems more prone to misdiagnosis in the case of diseases with limited information. To overcome this issue, we present MDKG, an end-to-end dialogue system for medical dialogue generation (MDG) specifically designed to adapt to new diseases by quickly learning and evolving a meta-knowledge graph that allows it to reason about disease-symptom correlations. Our approach relies on a medical knowledge graph to extract disease-symptom relationships and uses a dynamic graph-based meta-learning framework to learn how to evolve the given knowledge graph to reason about disease-symptom correlations. Our approach incorporates medical knowledge and hence reduces the need for a large number of dialogues. Evaluations show that our system outperforms existing approaches when tested on benchmark datasets.
- RUHate-MM: Identification of Hate Speech and Targets using Multimodal Data from Russia-Ukraine Crisis. Thapa, Surendrabikram; Jafri, Farhan; Rauniyar, Kritesh; Nasim, Mehwish; Naseem, Usman (ACM, 2024-05-13). During the conflict between Ukraine and Russia, hate speech targeted toward specific groups was widespread on different social media platforms. With most social platforms allowing multimodal content, the use of multimodal content to express hate speech is widespread on the Internet. Although there has been considerable research in detecting hate speech within unimodal content, the investigation into multimodal content remains insufficient. The limited availability of annotated multimodal datasets further restricts our ability to explore new methods to interpret and identify hate speech and its targets. The availability of annotated datasets for hate speech detection during political events, such as invasions, is even more limited. To fill this gap, we introduce a comprehensive multimodal dataset consisting of 20,675 posts related to the Russia-Ukraine crisis, which were manually annotated as either ‘Hate Speech’ or ‘No Hate Speech’. Additionally, we categorize the hate speech data into three targets: ‘Individual’, ‘Organization’, and ‘Community’. Our benchmarked evaluations show that there is still room for improvement in accurately identifying hate speech and its targets. We hope that the availability of this dataset and the evaluations performed on it will encourage the development of new methods for identifying hate speech and its targets during political events like invasions and wars. The dataset and resources are made available at https://github.com/Farhan-jafri/Russia-Ukraine.
- Using Artificial Intelligence/Machine Learning Tools to Analyze Safety, Road Scene, Near-Misses and Crashes. Yang, Gary; Sarkar, Abhijit; Ridgeway, Christie; Thapa, Surendrabikram; Jain, Sandesh; Miller, Andrew M. (National Surface Transportation Safety Center for Excellence, 2024-11-18). Artificial intelligence (AI) and machine learning technologies have the potential to enhance road safety by monitoring driver behavior and analyzing road scene and safety-critical events (SCEs). This study combined a detailed literature review on the application of AI to driver monitoring systems (DMS) and road scene perception, a market scan of commercially available AI tools for transportation safety, and an experiment to study the capability of large vision language models (LVLMs) to describe road scenes. Finally, the report provides recommendations, focusing on integrating advanced AI methods, data sharing, and collaboration between industry and academia. The report emphasizes the importance of ethical considerations and the potential of AI to significantly enhance road safety through innovative applications and continuous advancements. Future research directions include improving the robustness of AI models, addressing ethical and privacy concerns, and fostering industry-academic collaborations to advance AI applications in road safety.
- Vision-Language Models for Biomedical Applications. Thapa, Surendrabikram; Naseem, Usman; Zhou, Luping; Kim, Jinman (ACM, 2024-10-28). Vision-language models (VLMs) are transforming the landscape of biomedical research and healthcare by enabling the seamless integration and interpretation of complex multimodal data, including medical images and clinical texts. Recognizing the growing impact of these models, the first international workshop on Vision-Language Models for Biomedicine (VLM4Bio) was held in conjunction with ACM Multimedia 2024. The workshop aimed to address the critical need for advanced techniques that can leverage VLMs in applications such as medical imaging, diagnostics, and personalized treatment. As healthcare data increasingly involves both visual and textual information, VLM4Bio provided a platform for interdisciplinary collaboration between experts in natural language processing, computer vision, biomedical engineering, and AI ethics. This paper provides an overview of the inaugural edition of the VLM4Bio workshop, summarizing the key discussions, contributions, and future directions for expanding the workshop’s scope and influence in subsequent editions.