Browsing by Author "Shah, Aditya"
Now showing 1 - 2 of 2
- End-to-End Multimodal Fact-Checking and Explanation Generation: A Challenging Dataset and Models
  Yao, Barry; Shah, Aditya; Sun, Lichao; Cho, Jin-Hee; Huang, Lifu (ACM, 2023-07-19)
  We propose end-to-end multimodal fact-checking and explanation generation. The input is a claim and a large collection of web sources, including articles, images, videos, and tweets. The goal is to assess the truthfulness of the claim by retrieving relevant evidence, predicting a truthfulness label (e.g., support, refute, or not enough information), and generating a statement that summarizes and explains the reasoning and ruling process. To support this research, we construct Mocheg, a large-scale dataset of 15,601 claims, each annotated with a truthfulness label and a ruling statement, together with 33,880 textual paragraphs and 12,112 images as evidence. To establish baseline performance on Mocheg, we experiment with several state-of-the-art neural architectures on the three pipelined subtasks: multimodal evidence retrieval, claim verification, and explanation generation. We show that the performance of state-of-the-art end-to-end multimodal fact-checking does not yet provide satisfactory outcomes. To the best of our knowledge, we are the first to build a benchmark dataset and solutions for end-to-end multimodal fact-checking and explanation generation. The dataset, source code, and model checkpoints are available at https://github.com/VT-NLP/Mocheg.
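  The three pipelined subtasks named in the abstract (evidence retrieval, claim verification, explanation generation) can be sketched as a minimal pipeline. This is a hypothetical illustration only: the overlap-based retriever, threshold verifier, and template explainer below are toy stand-ins invented for this sketch, not the Mocheg models.

  ```python
  from dataclasses import dataclass

  @dataclass
  class Evidence:
      text: str
      score: float

  def retrieve_evidence(claim: str, sources: list[str], k: int = 2) -> list[Evidence]:
      """Subtask 1: rank sources by naive word overlap with the claim
      (a toy stand-in for the paper's multimodal retriever)."""
      claim_words = set(claim.lower().split())
      ranked = sorted(
          (Evidence(s, len(claim_words & set(s.lower().split()))) for s in sources),
          key=lambda e: e.score,
          reverse=True,
      )
      return ranked[:k]

  def verify_claim(evidence: list[Evidence]) -> str:
      """Subtask 2: map evidence strength to one of the abstract's
      labels (support / refute / not enough information)."""
      total = sum(e.score for e in evidence)
      if total >= 3:
          return "support"
      if total >= 1:
          return "refute"
      return "not enough information"

  def explain(evidence: list[Evidence], label: str) -> str:
      """Subtask 3: emit a ruling statement citing the used evidence
      (a template stand-in for the generation model)."""
      cited = "; ".join(e.text for e in evidence if e.score > 0)
      return f"Claim label: {label}. Evidence considered: {cited or 'none'}."
  ```

  Chaining the three functions mirrors the pipelined evaluation the abstract describes, where errors in retrieval propagate into verification and explanation.
  
  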
- Team 4: Segmentation, Summarization, and Classification
  Ganesan, Kaushik; Nanjundan, Deepak; Srivastava, Deval; Neog, Abhilash; Jayaprakash, Dharneeshkar; Shah, Aditya (Virginia Tech, 2023-01-11)
  Under the guidance of Dr. Edward A. Fox, the class of the CS5604 Fall 2022 semester at Virginia Tech was assigned the task of building an Information Retrieval and Analysis System that can support a collection of at least 5000 Electronic Theses and Dissertations (ETDs). The system would act as a search engine supporting a number of features, such as searching, providing recommendations, ranking search results, and browsing. To achieve this, the class was divided into five teams, each assigned separate tasks with the intent of collaborating through CI/CD. The roles can be described as follows: Content and Representation; End-user Recommendation and Search; Object Detection and Topic Models; Classification and Summarization with Language Models; and Integration and Coordination. The intent of this report is to outline the contribution of Team 4, which focuses on language models, classification, summarization, and segmentation. In this project, Team 4 was successful in reproducing Akbar Javaid Manzoor's pipeline to segment ETDs into chapters, summarize the segmented chapters using extractive and abstractive summarization techniques, and classify the chapters using deep learning and language models. Using the APIs developed by Team 1, Team 4 was also tasked with storing the outcomes for 5000 ETDs in the file system and database. Team 4 containerized the services and assisted Team 5 with workflow automation. The project's main lessons were effective team collaboration, efficient code maintenance, containerization of services, upkeep of a CI/CD workflow, and effective information storage and retrieval at scale.
  The report describes the goals, tasks, and achievements, along with our coordination with the other teams in completing the higher-level tasks concerning the entire project.
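  The segment-summarize-classify stages Team 4 describes can be illustrated with a minimal sketch. Everything here is assumed for illustration: the "Chapter N" heading pattern, the first-sentences extractive summarizer, and the keyword lookup standing in for the team's deep-learning classifier are not the project's actual components.

  ```python
  import re

  def segment_chapters(text: str) -> list[str]:
      """Stage 1: split an ETD-like text on 'Chapter N' headings
      (a toy heading pattern assumed for this sketch)."""
      parts = re.split(r"(?m)^Chapter \d+\s*", text)
      return [p.strip() for p in parts if p.strip()]

  def extractive_summary(chapter: str, n_sentences: int = 1) -> str:
      """Stage 2: extractive stand-in that keeps the first n sentences."""
      sentences = re.split(r"(?<=[.!?])\s+", chapter)
      return " ".join(sentences[:n_sentences])

  def classify_chapter(chapter: str, keyword_labels: dict[str, str]) -> str:
      """Stage 3: keyword lookup standing in for the language-model
      classifier; returns the first matching label."""
      lowered = chapter.lower()
      for keyword, label in keyword_labels.items():
          if keyword in lowered:
              return label
      return "other"
  ```

  Running the three stages in sequence over each document mirrors the per-ETD pipeline, with the stage outputs then persisted via the storage APIs the report mentions.
  
  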