Performance Evaluation of Large Language Models for High-Performance Code Generation: A Multi-Agent Approach (MARCO)

Rahman, Asif; Cvetkovic, Veljko; Reece, Kathleen; Walters, Aidan; Hassan, Yasir; Tummeti, Aneesh; Torres, Brian; Cooney, Denise; Ellis, Margaret; Nikolopoulos, Dimitrios

dc.contributor.author	Rahman, Asif	en
dc.contributor.author	Cvetkovic, Veljko	en
dc.contributor.author	Reece, Kathleen	en
dc.contributor.author	Walters, Aidan	en
dc.contributor.author	Hassan, Yasir	en
dc.contributor.author	Tummeti, Aneesh	en
dc.contributor.author	Torres, Brian	en
dc.contributor.author	Cooney, Denise	en
dc.contributor.author	Ellis, Margaret	en
dc.contributor.author	Nikolopoulos, Dimitrios	en
dc.date.accessioned	2025-05-07T17:22:43Z	en
dc.date.available	2025-05-07T17:22:43Z	en
dc.date.issued	2025-05-07	en
dc.description.abstract	Large language models (LLMs) have transformed software development through code generation capabilities, yet their effectiveness for high-performance computing (HPC) remains limited. HPC code requires specialized optimizations for parallelism, memory efficiency, and architecture-specific considerations that general-purpose LLMs often overlook. We present MARCO (Multi-Agent Reactive Code Optimizer), a novel framework that enhances LLM-generated code for HPC through a specialized multi-agent architecture. MARCO employs separate agents for code generation and performance evaluation, connected by a feedback loop that progressively refines optimizations. A key innovation is MARCO's web-search component that retrieves real-time optimization techniques from recent conference proceedings and research publications, bridging the knowledge gap in pre-trained LLMs. Our extensive evaluation on the LeetCode 75 problem set demonstrates that MARCO achieves a 14.6% average runtime reduction compared to Claude 3.5 Sonnet alone, while the integration of the web-search component yields a 30.9% performance improvement over the base MARCO system. These results highlight the potential of multi-agent systems to address the specialized requirements of high-performance code generation, offering a cost-effective alternative to domain-specific model fine-tuning.	en
dc.description.version	Submitted version	en
dc.format.extent	9 page(s)	en
dc.format.mimetype	application/pdf	en
dc.identifier.orcid	Nikolopoulos, Dimitrios [0000-0003-0217-8307]	en
dc.identifier.uri	https://hdl.handle.net/10919/129385	en
dc.language.iso	en	en
dc.relation.ispartof	Performance Evaluation of Large Language Models for High-Performance Code Generation: A Multi-Agent Approach (MARCO)	en
dc.relation.uri	https://www.cs.vt.edu/~dsn	en
dc.relation.uri	http://arxiv.org/	en
dc.rights	In Copyright	en
dc.rights.uri	http://rightsstatements.org/vocab/InC/1.0/	en
dc.title	Performance Evaluation of Large Language Models for High-Performance Code Generation: A Multi-Agent Approach (MARCO)	en
dc.type	Article	en
dc.type.dcmitype	Text	en
pubs.organisational-group	Virginia Tech	en
pubs.organisational-group	Virginia Tech/Engineering	en
pubs.organisational-group	Virginia Tech/Engineering/Computer Science	en
pubs.organisational-group	Virginia Tech/All T&R Faculty	en
pubs.organisational-group	Virginia Tech/Engineering/COE T&R Faculty	en
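
Illustrative sketch (not part of the deposited record). The abstract describes separate agents for code generation and performance evaluation, connected by a feedback loop, and reports results as percentage runtime reductions. The Python below is a minimal sketch of that loop shape under stated assumptions, not the authors' implementation: `stub_generator`, `time_candidate`, `refine`, and `runtime_reduction_pct` are hypothetical names, the "generation agent" is a stand-in that hands back pre-written candidates instead of calling an LLM, and the web-search component is omitted entirely.

```python
import time
from typing import Callable, List, Optional, Tuple

Candidate = Callable[[int], int]


def sum_loop(n: int) -> int:
    """Naive candidate: add 0..n one term at a time."""
    total = 0
    for i in range(n + 1):
        total += i
    return total


def sum_formula(n: int) -> int:
    """Optimized candidate: closed-form sum of 0..n."""
    return n * (n + 1) // 2


def stub_generator(candidates: List[Candidate]) -> Callable[[str], Candidate]:
    """Hypothetical stand-in for a generation agent.

    A real agent would condition on the feedback string; this stub ignores it
    and returns the next pre-written candidate on each call.
    """
    remaining = iter(candidates)

    def generate(feedback: str) -> Candidate:
        return next(remaining)

    return generate


def time_candidate(candidate: Candidate, n: int = 100_000, repeats: int = 3) -> float:
    """Hypothetical evaluation step: best wall-clock time over a few runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        candidate(n)
        best = min(best, time.perf_counter() - start)
    return best


def refine(
    generate: Callable[[str], Candidate],
    evaluate: Callable[[Candidate], float],
    rounds: int,
) -> Tuple[Optional[Candidate], float]:
    """Generate, evaluate, and feed the measurement back for a fixed number of rounds."""
    best_candidate: Optional[Candidate] = None
    best_time = float("inf")
    feedback = ""
    for _ in range(rounds):
        candidate = generate(feedback)
        elapsed = evaluate(candidate)
        if elapsed < best_time:
            best_candidate, best_time = candidate, elapsed
        # The evaluator's measurement becomes the next round's feedback.
        feedback = f"last runtime {elapsed:.6f}s, best so far {best_time:.6f}s"
    return best_candidate, best_time


def runtime_reduction_pct(baseline: float, optimized: float) -> float:
    """Percentage runtime reduction of `optimized` relative to `baseline`."""
    return 100.0 * (baseline - optimized) / baseline
```

For example, `refine(stub_generator([sum_loop, sum_formula]), time_candidate, rounds=2)` returns whichever candidate measured faster along with its time, and `runtime_reduction_pct` turns a baseline and an optimized runtime into a percentage, the form in which the abstract states its results.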

Files

Original bundle

Name: BURGS_LLM_HPC.pdf
Size: 482.19 KB
Format: Adobe Portable Document Format
Description: Submitted version

Collections

All Faculty Deposits
Scholarly Works, Computer Science