End-to-End Multimodal Fact-Checking and Explanation Generation: A Challenging Dataset and Models
dc.contributor.author | Yao, Barry | en |
dc.contributor.author | Shah, Aditya | en |
dc.contributor.author | Sun, Lichao | en |
dc.contributor.author | Cho, Jin-Hee | en |
dc.contributor.author | Huang, Lifu | en |
dc.date.accessioned | 2023-08-02T17:46:48Z | en |
dc.date.available | 2023-08-02T17:46:48Z | en |
dc.date.issued | 2023-07-19 | en |
dc.date.updated | 2023-08-01T07:57:47Z | en |
dc.description.abstract | We propose end-to-end multimodal fact-checking and explanation generation, where the input is a claim and a large collection of web sources, including articles, images, videos, and tweets, and the goal is to assess the truthfulness of the claim by retrieving relevant evidence and predicting a truthfulness label (e.g., support, refute, or not enough information), and to generate a statement to summarize and explain the reasoning and ruling process. To support this research, we construct Mocheg, a large-scale dataset consisting of 15,601 claims, where each claim is annotated with a truthfulness label and a ruling statement, together with 33,880 textual paragraphs and 12,112 images as evidence. To establish baseline performance on Mocheg, we experiment with several state-of-the-art neural architectures on the three pipelined subtasks: multimodal evidence retrieval, claim verification, and explanation generation, and demonstrate that state-of-the-art end-to-end multimodal fact-checking does not yet achieve satisfactory performance. To the best of our knowledge, we are the first to build a benchmark dataset and solutions for end-to-end multimodal fact-checking and explanation generation. The dataset, source code, and model checkpoints are available at https://github.com/VT-NLP/Mocheg. | en |
dc.description.version | Published version | en |
dc.format.mimetype | application/pdf | en |
dc.identifier.doi | https://doi.org/10.1145/3539618.3591879 | en |
dc.identifier.uri | http://hdl.handle.net/10919/115965 | en |
dc.language.iso | en | en |
dc.publisher | ACM | en |
dc.rights | Creative Commons Attribution 4.0 International | en |
dc.rights.holder | The author(s) | en |
dc.rights.uri | http://creativecommons.org/licenses/by/4.0/ | en |
dc.title | End-to-End Multimodal Fact-Checking and Explanation Generation: A Challenging Dataset and Models | en |
dc.type | Article - Refereed | en |
dc.type.dcmitype | Text | en |