Evaluating Function Calling in Language Models through a Conversational Agent

Sharma Dulal, Sandesh

Files

Sharma_Dulal_S_T_2026.pdf (4.19 MB)

Date

2026-05-07

Authors

Sharma Dulal, Sandesh

Publisher

Virginia Tech

Abstract

This study investigates the effectiveness of function calling in Large Language Models (LLMs) for data-driven conversational systems. While LLMs excel at natural language understanding, reliably mapping user queries to structured computational functions remains a key challenge. To address this, the study develops the India Policy Insights (IPI) Chatbot, a GenAI-powered system that enables natural language interaction with complex, spatiotemporal public policy datasets. The chatbot integrates LLMs with backend functions to translate user queries into structured parameters, execute database operations, and generate multi-modal outputs, including text, charts, and maps. The proposed workflow demonstrates how natural language queries are converted into actionable analytical tasks. A systematic evaluation of GPT-4o and GPT-4o-mini is conducted across prompt specificity, query difficulty, and query type. Results show that function-calling accuracy improves significantly with more explicit and structured prompts, with GPT-4o achieving the highest performance. Spatial and constraint-based queries yield consistently high accuracy, while complex multi-indicator tasks remain challenging, particularly for smaller models. Overall, this study highlights the importance of prompt design, parameter clarity, and model selection in optimizing function-calling performance. It also demonstrates the potential of conversational AI to improve accessibility to policy-relevant data, contributing to more inclusive and data-informed decision-making.
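
As an illustration of what "function-calling accuracy" can mean in practice, the following is a minimal sketch, not the thesis's actual evaluation code. It assumes the model's output is a JSON object with a function name and an arguments dictionary, and it scores a call as correct only when both match the expected call exactly. The function name `get_indicator_map`, the parameter names, and the exact-match criterion are all hypothetical choices made for this example.

```python
import json
from dataclasses import dataclass, field


@dataclass
class FunctionCall:
    """A structured call: a function name plus its keyword arguments."""
    name: str
    arguments: dict = field(default_factory=dict)


def parse_call(raw: str) -> FunctionCall:
    """Parse a JSON-encoded call of the assumed form {"name": ..., "arguments": {...}}."""
    payload = json.loads(raw)
    return FunctionCall(payload["name"], payload.get("arguments", {}))


def call_matches(predicted: FunctionCall, expected: FunctionCall) -> bool:
    """Exact match (an assumed criterion): same function and identical arguments."""
    return predicted.name == expected.name and predicted.arguments == expected.arguments


def accuracy(pairs: list[tuple[FunctionCall, FunctionCall]]) -> float:
    """Fraction of (predicted, expected) pairs that match exactly; 0.0 if empty."""
    if not pairs:
        return 0.0
    return sum(call_matches(p, e) for p, e in pairs) / len(pairs)


# Hypothetical expected call for some user query (names are placeholders).
expected = FunctionCall(
    "get_indicator_map",
    {"indicator": "example_indicator", "region": "example_state"},
)

# Two hypothetical model outputs: one correct, one with a wrong region.
good = parse_call(
    '{"name": "get_indicator_map", '
    '"arguments": {"indicator": "example_indicator", "region": "example_state"}}'
)
bad = parse_call(
    '{"name": "get_indicator_map", '
    '"arguments": {"indicator": "example_indicator", "region": "other_state"}}'
)

print(accuracy([(good, expected), (bad, expected)]))  # prints 0.5
```

Exact matching is the strictest possible criterion; a real evaluation might instead score the function name and each parameter separately, which would show whether errors come from choosing the wrong function or from filling in the wrong values.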

Keywords

Chatbot; Dashboard; Generative AI; India Policy Insights (IPI); Large Language Model (LLM)

Persistent link

https://hdl.handle.net/10919/143048

Collections

Masters Theses
