Measurement and Development for Automated Secure Coding Solutions

Date

2024-09-09

Publisher

Virginia Tech

Abstract

With the rise of development efforts, there has also been a rise in source code vulnerabilities. Advanced security tools have been created to identify these vulnerabilities throughout the lifetime of the developer's ecosystem and afterward, before the vulnerabilities are exposed. One popular method is Static Code Analysis (SCA), which scans developers' source code to identify potential vulnerabilities. My Ph.D. work aims to help reduce exposed vulnerabilities through efforts to YIELD, ENHANCE, and EVALUATE (EYE) SCA tools that identify vulnerabilities while the developer writes the code. We first evaluate tools that support developers with their source code by determining how accurately they identify vulnerability information. Large Language Models (LLMs) have been on the rise recently with the introduction of Chat Generative Pre-trained Transformer (ChatGPT) 3.5, ChatGPT 4.1, Google Gemini, and many more. Using a common framework, we created a zero-shot prompt instructing the LLM to identify whether there is a vulnerability in the provided source code and which Common Weakness Enumeration (CWE) value represents the vulnerability. Using our Python cryptographic benchmark PyCryptoBench, we sent vulnerable samples to four different LLMs and two different versions of the ChatGPT Application Program Interface (API). The samples allow us to measure how reliably each LLM identifies and classifies vulnerabilities. The ChatGPT APIs include multiple reproducibility-related fields, which allowed us to measure how reproducible the responses are. Next, we yield a new SCA tool to apply what we learned to a current gap in increasingly complex source code. Cryptolation, our state-of-the-art (SOA) Python SCA tool, uses constant propagation-supported variable inference to gain insight into the data flow state throughout the program's execution. Python source code is growing ever more complex, yet has fewer SCA tools available than Java.
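As a rough illustration of what constant propagation-supported variable inference can mean in practice, the following minimal sketch uses Python's standard `ast` module to follow string constants through simple assignments and flag weak algorithms passed to `hashlib.new`. It is not Cryptolation's implementation: the class name `ConstantPropagator`, the helper `scan`, and the `WEAK_HASHES` set are all hypothetical, and the sketch assumes straight-line module-level code with no branches, scopes, or aliased imports.

```python
import ast

# Assumption: an illustrative list only, not any tool's actual rule set.
WEAK_HASHES = {"md5", "sha1"}


class ConstantPropagator(ast.NodeVisitor):
    """Track names bound to string constants and flag weak hashlib.new() calls."""

    def __init__(self):
        self.constants = {}  # variable name -> inferred string value
        self.findings = []   # (line number, resolved algorithm name)

    def resolve(self, node):
        """Return the string value of an expression if it can be inferred."""
        if isinstance(node, ast.Constant) and isinstance(node.value, str):
            return node.value
        if isinstance(node, ast.Name):
            return self.constants.get(node.id)
        return None

    def visit_Assign(self, node):
        value = self.resolve(node.value)
        for target in node.targets:
            if isinstance(target, ast.Name):
                if value is None:
                    # The name is rebound to something we cannot infer.
                    self.constants.pop(target.id, None)
                else:
                    self.constants[target.id] = value
        self.generic_visit(node)

    def visit_Call(self, node):
        func = node.func
        is_hashlib_new = (
            isinstance(func, ast.Attribute)
            and func.attr == "new"
            and isinstance(func.value, ast.Name)
            and func.value.id == "hashlib"
        )
        if is_hashlib_new and node.args:
            algorithm = self.resolve(node.args[0])
            if algorithm is not None and algorithm.lower() in WEAK_HASHES:
                self.findings.append((node.lineno, algorithm))
        self.generic_visit(node)


def scan(source):
    """Parse the source text and return the list of weak-hash findings."""
    propagator = ConstantPropagator()
    propagator.visit(ast.parse(source))
    return propagator.findings


SAMPLE = '''
import hashlib
algo = "md5"
chosen = algo
digest = hashlib.new(chosen, b"data").hexdigest()
safe = hashlib.new("sha256", b"data").hexdigest()
'''

print(scan(SAMPLE))  # prints [(5, 'md5')]
```

A purely syntactic check would miss the call on line 5 of the sample because the argument is a variable rather than a literal; resolving `chosen` back to `"md5"` is the kind of inference the abstract refers to.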
We compare Cryptolation with the other SOA SCA tools Bandit, Semgrep, and Dlint. To verify the Precision of our tool, we created the benchmark PyCryptoBench, which contains 1,836 test cases and encompasses five different language features. Next, we crawled over 1,000 cryptographic-related Python projects on GitHub and scanned each with each tool. Finally, we reviewed all PyCryptoBench results and sampled over 10,000 cryptographic-related Python projects. The results reveal that Cryptolation has 100% Precision on the benchmark and the second-highest Precision on the cryptographic-related projects. Finally, we look at enhancing SCA tools. The SOA tools already compete to have the highest Precision, Recall, and Accuracy. However, we examine several developer surveys to determine developers' reasons for not adopting such tools. Developers generally want better aesthetics, usability, and customization, and a low effort cost for consistent use. To address these needs, we enhance the SOA Java SCA tool CryptoGuard with the following: integrated build tools, modern terminal Command Line Interface (CLI) usage, customizable and vendor-specific output formats, and no-install demos.
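For readers unfamiliar with the metrics named above, the following minimal sketch computes Precision, Recall, and Accuracy from confusion-matrix counts using their standard definitions. The function name `confusion_metrics` and the counts in the usage line are hypothetical and are not results from this work.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Compute standard detection metrics from confusion-matrix counts.

    tp: vulnerable cases correctly flagged
    fp: safe cases incorrectly flagged
    tn: safe cases correctly left unflagged
    fn: vulnerable cases missed
    """
    flagged = tp + fp
    vulnerable = tp + fn
    total = tp + fp + tn + fn
    return {
        "precision": tp / flagged if flagged else 0.0,
        "recall": tp / vulnerable if vulnerable else 0.0,
        "accuracy": (tp + tn) / total if total else 0.0,
    }


# Hypothetical counts, chosen only to illustrate the arithmetic.
example = confusion_metrics(tp=8, fp=2, tn=5, fn=5)
print(example["precision"], example["accuracy"])  # prints 0.8 0.65
```

Under these definitions, 100% Precision means a tool raised no false positives on the cases it flagged; it does not by itself say how many vulnerable cases were missed, which is what Recall captures.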

Keywords

Large Language Models, Machine Learning, Static Code Analysis, Python, Java, Benchmark
