M3D: Multimodal MultiDocument Fine-Grained Inconsistency Detection

dc.contributor.author: Tang, Chia-Wei
dc.contributor.committeechair: Thomas, Christopher Lee
dc.contributor.committeemember: Lourentzou, Ismini
dc.contributor.committeemember: Huang, Lifu
dc.contributor.department: Computer Science & Applications
dc.date.accessioned: 2024-06-11T08:01:10Z
dc.date.available: 2024-06-11T08:01:10Z
dc.date.issued: 2024-06-10
dc.description.abstract: Validating claims against misinformation is a highly challenging task that requires understanding how each factual assertion within the claim relates to a set of trusted source materials. Existing approaches often make coarse-grained predictions and fail to identify which specific aspects of the claim are problematic and which specific evidence they rely upon. In this paper, we introduce a method and a new benchmark for this challenging task. Our method predicts the fine-grained logical relationship of each aspect of the claim to a set of multimodal documents, which may include text, images, videos, and audio. We also introduce a new benchmark (M^3DC) of claims requiring multimodal multidocument reasoning, which we construct using a novel claim synthesis technique. Experiments show that our approach significantly outperforms state-of-the-art baselines on two benchmarks while providing finer-grained predictions, explanations, and evidence.
dc.description.abstractgeneral: In today's world, we are constantly bombarded with information from various sources, making it difficult to distinguish between what is true and what is false. Validating claims and determining their truthfulness is an essential task that helps us separate facts from fiction, but it can be a time-consuming and challenging process. Current methods often fail to pinpoint the specific parts of a claim that are problematic and the evidence used to support or refute them. In this study, we present a new method and benchmark for fact-checking claims using multiple types of information sources, including text, images, videos, and audio. Our approach analyzes each aspect of a claim and predicts how it logically relates to the available evidence from these diverse sources. This allows us to provide more detailed and accurate assessments of the claim's validity. We also introduce a new benchmark dataset called M^3DC, which consists of claims that require reasoning across multiple sources and types of information. To create this dataset, we developed a novel technique for synthesizing claims that mimic real-world scenarios. Our experiments show that our method significantly outperforms existing state-of-the-art approaches on two benchmarks while providing more fine-grained predictions, explanations, and evidence. This research contributes to the ongoing effort to combat misinformation and fake news by providing a more comprehensive and effective approach to fact-checking claims.
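
The abstracts above describe predicting a fine-grained logical relationship for each aspect of a claim against a set of multimodal documents. The Python sketch below illustrates only that per-aspect idea; the comma-based aspect splitter, word-overlap relevance scorer, and three-way label set are hypothetical stand-ins for exposition, not the model or benchmark pipeline developed in the thesis.

# Hypothetical sketch of per-aspect claim verification against a multimodal
# document set. The aspect splitter, relevance score, and label set are
# illustrative assumptions, NOT the method proposed in the thesis.
from dataclasses import dataclass
from typing import List, Literal

Label = Literal["supported", "refuted", "not_enough_info"]

@dataclass
class Evidence:
    doc_id: str
    modality: str   # "text", "image", "video", or "audio"
    content: str    # raw text, or a caption/transcript standing in for media

@dataclass
class AspectVerdict:
    aspect: str
    label: Label
    evidence: List[Evidence]

def split_into_aspects(claim: str) -> List[str]:
    """Toy decomposition: treat each comma-separated clause as one aspect."""
    return [part.strip() for part in claim.split(",") if part.strip()]

def relevance(aspect: str, ev: Evidence) -> float:
    """Stand-in scorer using word overlap; a real system would use a
    trained multimodal entailment model here."""
    a = set(aspect.lower().split())
    e = set(ev.content.lower().split())
    return len(a & e) / max(len(a), 1)

def verify(claim: str, corpus: List[Evidence]) -> List[AspectVerdict]:
    """Label every aspect of the claim and keep the evidence used, so the
    output is fine-grained rather than a single coarse verdict."""
    verdicts: List[AspectVerdict] = []
    for aspect in split_into_aspects(claim):
        ranked = sorted(corpus, key=lambda ev: relevance(aspect, ev),
                        reverse=True)
        support = [ev for ev in ranked[:2] if relevance(aspect, ev) >= 0.5]
        label: Label = "supported" if support else "not_enough_info"
        verdicts.append(AspectVerdict(aspect, label, support))
    return verdicts

if __name__ == "__main__":
    docs = [
        Evidence("d1", "text", "the bridge opened in 1932 in sydney"),
        Evidence("d2", "image", "caption: aerial photo of sydney harbour bridge"),
    ]
    for v in verify("the bridge opened in 1932, it spans the harbour", docs):
        print(v.aspect, "->", v.label, [e.doc_id for e in v.evidence])

Running the sketch labels the first aspect "supported" (it overlaps the text document) and the second "not_enough_info", returning the retained evidence alongside each per-aspect verdict.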
dc.description.degree: Master of Science
dc.format.medium: ETD
dc.identifier.other: vt_gsexam:40024
dc.identifier.uri: https://hdl.handle.net/10919/119382
dc.language.iso: en
dc.publisher: Virginia Tech
dc.rights: In Copyright
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/
dc.subject: multi-modality reasoning
dc.subject: fine-grained reasoning
dc.subject: multi-document understanding
dc.subject: text
dc.subject: image
dc.subject: video
dc.subject: audio
dc.title: M3D: Multimodal MultiDocument Fine-Grained Inconsistency Detection
dc.type: Thesis
thesis.degree.discipline: Computer Science & Applications
thesis.degree.grantor: Virginia Polytechnic Institute and State University
thesis.degree.level: masters
thesis.degree.name: Master of Science

Files

Original bundle
Name: Tang_C_T_2024.pdf
Size: 48.81 MB
Format: Adobe Portable Document Format
