Transforming Free-Form Sentences into Sequence of Unambiguous Sentences with Large Language Model

Yeole, Nikita Kiran

dc.contributor.author	Yeole, Nikita Kiran	en
dc.contributor.committeechair	Huang, Lifu	en
dc.contributor.committeechair	Hsiao, Michael S.	en
dc.contributor.committeemember	Viswanath, Bimal	en
dc.contributor.department	Computer Science & Applications	en
dc.date.accessioned	2024-12-18T09:00:11Z	en
dc.date.available	2024-12-18T09:00:11Z	en
dc.date.issued	2024-12-17	en
dc.description.abstract	In the realm of natural language programming, translating free-form natural-language sentences into a functional, machine-executable program remains difficult because of four challenges: first, the inherent ambiguity of natural language; second, the high-level, verbose nature of user descriptions; third, the complexity of the sentences; and fourth, invalid or semantically unclear sentences. Our first solution is a Large Language Model (LLM) based, AI-driven assistant that processes free-form sentences and decomposes them into sequences of simplified, unambiguous sentences that abide by a set of rules, thereby stripping away the complexities embedded in the original sentences. The resulting sentences are then used to generate code. We applied the proposed approach to a set of free-form sentences written by middle-school students to describe the logic behind video games. More than 60% of the free-form sentences containing these problems were sufficiently converted to sequences of simple, unambiguous object-oriented sentences by our approach. The thesis also presents "IntentGuide," a neuro-symbolic integration framework that enhances the clarity and executability of human intentions expressed in free-form sentences. IntentGuide integrates the rule-based error detection capabilities of symbolic AI with the adaptive learning abilities of Large Language Models to convert ambiguous or complex sentences into clear, machine-understandable instructions. An empirical evaluation of IntentGuide on natural language sentences written by middle-school students for designing video games shows a significant improvement in error correction and code generation compared with the previous approach, attaining an accuracy rate of 90%.	en
dc.description.abstractgeneral	Imagine if you could talk to machines in everyday language and they could understand exactly what you meant, turning your words into programs that do exactly what you describe. That's the goal of the thesis. We've developed a system that helps machines make sense of the kind of free-form language that people, especially students, use when they describe what they want a computer to do. Understanding and converting everyday language into computer code is a complex challenge, primarily because the way we naturally speak can be vague, overly detailed, or just complex. This thesis presents a new tool using artificial intelligence that helps break down and simplify these sentences. By transforming them into clearer, rule-following instructions, this tool makes it easier for machines to understand and execute the tasks we describe. The technology was tested using descriptions from middle-school students on how video games should work. Over 60% of these complex or unclear descriptions were sufficiently converted into straightforward instructions that a machine could use. Additionally, a new system called "IntentGuide" was introduced, combining traditional AI methods with advanced language models to improve how effectively machines can interpret and act on human instructions. This improved system showed a 90% accuracy in understanding and correcting errors in the students' game descriptions, marking a significant step forward in helping computers better understand us.	en
dc.description.degree	Master of Science	en
dc.format.medium	ETD	en
dc.identifier.other	vt_gsexam:42030	en
dc.identifier.uri	https://hdl.handle.net/10919/123829	en
dc.language.iso	en	en
dc.publisher	Virginia Tech	en
dc.rights	In Copyright	en
dc.rights.uri	http://rightsstatements.org/vocab/InC/1.0/	en
dc.subject	Natural language programming	en
dc.subject	decomposition	en
dc.subject	Natural language processing	en
dc.subject	neuro-symbolic	en
dc.title	Transforming Free-Form Sentences into Sequence of Unambiguous Sentences with Large Language Model	en
dc.type	Thesis	en
thesis.degree.discipline	Computer Science & Applications	en
thesis.degree.grantor	Virginia Polytechnic Institute and State University	en
thesis.degree.level	masters	en
thesis.degree.name	Master of Science	en