Are Bigger LLMs Always Better? A Study of Open and Closed-Source Models in Code Generation and Translation
dc.contributor.author | Shiung, Tian-Yu | en |
dc.contributor.committeechair | Brown, Dwayne Christian | en |
dc.contributor.committeemember | Tilevich, Eli | en |
dc.contributor.committeemember | Seyam, Mohammed Saad Mohamed Elmahdy | en |
dc.contributor.department | Computer Science & Applications | en |
dc.date.accessioned | 2025-04-30T08:00:17Z | en |
dc.date.available | 2025-04-30T08:00:17Z | en |
dc.date.issued | 2025-04-29 | en |
dc.description.abstract | As Large Language Models (LLMs) advance, their roles in both code generation and translation are gaining increasing attention in software engineering. Evaluating their effectiveness across different programming languages remains a critical challenge. This paper presents the results of a study that evaluates LLMs in generating and translating code snippets across Java, Go, and Python, with a focus on accuracy, efficiency, and quality. We conduct a comparative analysis of both open-source and closed-source LLMs, including GPT-3.5, Google Gemini, Gemma 2, and Llama-3.1, using a curated dataset of LeetCode solutions. Problems were selected across three difficulty levels (easy, medium, and hard), with solutions randomly sourced from GitHub and verified on the LeetCode platform. Our investigation assesses the feasibility and cost-effectiveness of code translation tasks, particularly under resource constraints, and examines different methodologies suitable for such conditions. Our findings indicate that both open-source and closed-source LLMs exhibit hallucinations in solving LeetCode problems and translating code. However, some closed-source LLMs produce more unhelpful explanations, particularly by generating non-existent programming constructs. We identify the specific instances and language pairs in which LLMs fail to translate code correctly, uncovering novel insights. Notably, smaller, open-source models demonstrate unexpectedly commendable performance on some LeetCode problems. Although LLMs show great promise for modernizing legacy codebases, our results suggest that these models in their current form may lack the necessary accuracy and speed for real-world applications. | en |
dc.description.abstractgeneral | As software development advances, enterprises and developers increasingly leverage sophisticated tools to enhance efficiency. Among these, large language models (LLMs) such as ChatGPT have gained significant attention. As LLMs grow in size and capability, more developers and enterprises incorporate them into their workflows for various tasks, including code generation, which helps accelerate the development process and improve efficiency. However, many enterprises and novice programmers hold misconceptions about the coding capabilities of closed-source LLMs. Closed-source models (e.g., GPT-3.5, Gemini) are proprietary, while open-source models (e.g., Llama-3.1, Gemma) provide transparency and flexibility. Many assume that closed-source LLMs and larger models inherently outperform their open-source and smaller counterparts in code generation and translation. These assumptions may lead organizations to invest in expensive models without fully evaluating their real-world performance. In this research, we systematically evaluate LLMs in generating and translating code across Java, Go, and Python. Our comparative analysis examines both open-source and closed-source models, considering their architectures, parameter sizes, and accessibility. Using LeetCode, a well-known platform for technical assessments, we assess code generation and translation. Our findings reveal that while large closed-source models often achieve higher accuracy, some smaller open-source models perform comparably at lower computational cost. These insights help developers and enterprises choose LLMs wisely, balancing accuracy, cost, and efficiency. | en |
dc.description.degree | Master of Science | en |
dc.format.medium | ETD | en |
dc.identifier.other | vt_gsexam:42884 | en |
dc.identifier.uri | https://hdl.handle.net/10919/127260 | en |
dc.language.iso | en | en |
dc.publisher | Virginia Tech | en |
dc.rights | In Copyright | en |
dc.rights.uri | http://rightsstatements.org/vocab/InC/1.0/ | en |
dc.subject | Efficient Large Language Model | en |
dc.subject | Code Generation | en |
dc.subject | Code Translation | en |
dc.title | Are Bigger LLMs Always Better? A Study of Open and Closed-Source Models in Code Generation and Translation | en |
dc.type | Thesis | en |
thesis.degree.discipline | Computer Science & Applications | en |
thesis.degree.grantor | Virginia Polytechnic Institute and State University | en |
thesis.degree.level | masters | en |
thesis.degree.name | Master of Science | en |