Performance Evaluation of Large Language Models for High-Performance Code Generation: A Multi-Agent Approach (MARCO)
dc.contributor.author | Rahman, Asif | en |
dc.contributor.author | Cvetkovic, Veljko | en |
dc.contributor.author | Reece, Kathleen | en |
dc.contributor.author | Walters, Aidan | en |
dc.contributor.author | Hassan, Yasir | en |
dc.contributor.author | Tummeti, Aneesh | en |
dc.contributor.author | Torres, Brian | en |
dc.contributor.author | Cooney, Denise | en |
dc.contributor.author | Ellis, Margaret | en |
dc.contributor.author | Nikolopoulos, Dimitrios | en |
dc.date.accessioned | 2025-05-07T17:22:43Z | en |
dc.date.available | 2025-05-07T17:22:43Z | en |
dc.date.issued | 2025-05-07 | en |
dc.description.abstract | Large language models (LLMs) have transformed software development through code generation capabilities, yet their effectiveness for high-performance computing (HPC) remains limited. HPC code requires specialized optimizations for parallelism, memory efficiency, and architecture-specific considerations that general-purpose LLMs often overlook. We present MARCO (Multi-Agent Reactive Code Optimizer), a novel framework that enhances LLM-generated code for HPC through a specialized multi-agent architecture. MARCO employs separate agents for code generation and performance evaluation, connected by a feedback loop that progressively refines optimizations. A key innovation is MARCO's web-search component that retrieves real-time optimization techniques from recent conference proceedings and research publications, bridging the knowledge gap in pre-trained LLMs. Our extensive evaluation on the LeetCode 75 problem set demonstrates that MARCO achieves a 14.6% average runtime reduction compared to Claude 3.5 Sonnet alone, while the integration of the web-search component yields a 30.9% performance improvement over the base MARCO system. These results highlight the potential of multi-agent systems to address the specialized requirements of high-performance code generation, offering a cost-effective alternative to domain-specific model fine-tuning. | en |
dc.description.version | Submitted version | en |
dc.format.extent | 9 page(s) | en |
dc.format.mimetype | application/pdf | en |
dc.identifier.orcid | Nikolopoulos, Dimitrios [0000-0003-0217-8307] | en |
dc.identifier.uri | https://hdl.handle.net/10919/129385 | en |
dc.language.iso | en | en |
dc.relation.ispartof | Performance Evaluation of Large Language Models for High-Performance Code Generation: A Multi-Agent Approach (MARCO) | en |
dc.relation.uri | https://www.cs.vt.edu/~dsn | en |
dc.relation.uri | http://arxiv.org/ | en |
dc.rights | In Copyright | en |
dc.rights.uri | http://rightsstatements.org/vocab/InC/1.0/ | en |
dc.title | Performance Evaluation of Large Language Models for High-Performance Code Generation: A Multi-Agent Approach (MARCO) | en |
dc.type | Article | en |
dc.type.dcmitype | Text | en |
pubs.organisational-group | Virginia Tech | en |
pubs.organisational-group | Virginia Tech/Engineering | en |
pubs.organisational-group | Virginia Tech/Engineering/Computer Science | en |
pubs.organisational-group | Virginia Tech/All T&R Faculty | en |
pubs.organisational-group | Virginia Tech/Engineering/COE T&R Faculty | en |