Toward Deliberative AI: Multi-Agent LLMs for Real-World Reasoning
Abstract
Multi-agent debate has emerged as a promising strategy for improving the reasoning abilities of large language models (LLMs). However, existing approaches often fall short due to inefficiencies, shallow agreement, and a lack of real-world applicability. In this thesis, we introduce two novel frameworks, CONSENSAGENT and CCAGENT, designed to improve both the effectiveness and efficiency of LLM debates across objective and real-world tasks.

CONSENSAGENT tackles key limitations such as sycophancy (models blindly agreeing with each other) and ambiguous prompts by introducing a trigger-based architecture that automatically refines prompts using past agent discussions. This results in better reasoning, fewer debate rounds, and reduced computational cost. We evaluate the framework across six benchmark datasets and show that CONSENSAGENT consistently outperforms baselines.

CCAGENT extends this work to real-world decision-making. We introduce two new datasets, one from interviews with city planners and another from U.S. Senate voting records, and propose structured debate strategies (e.g., moderation, nudging) along with behavioral metrics (e.g., sycophancy, vote switching). A lightweight few-shot DPO training method is used to align agent behavior with collaborative reasoning goals. Together, these contributions demonstrate how we can move from toy benchmarks to deliberative, scalable systems that better reflect how human decision-making works, and how AI can meaningfully assist it.