Browsing by Author "Gulzar, Muhammad Ali"
Now showing 1 - 20 of 22
- 2D Jupyter: Design and Evaluation of 2D Computational Notebooks. Christman, Elizabeth (Virginia Tech, 2023-06-12). Computational notebooks are a popular tool for data analysis. However, the 1D linear structure used by many computational notebooks can lead to challenges and pain points in data analysis, including messiness, tedious navigation, inefficient use of screen space, and presentation of non-linear narratives. To address these problems, we designed a prototype Jupyter Notebooks extension called 2D Jupyter that enables a 2D organization of code cells in a multi-column layout, as well as freeform cell placement. We conducted a user study using this extension to evaluate the usability of 2D computational notebooks and to understand the advantages and disadvantages they provide over a 1D layout. The study produced evidence that the 2D layout provides enhanced usability and efficiency in computational notebooks. Additionally, we gathered feedback on the design of the prototype that can inform future work. Overall, 2D Jupyter was positively received: users not only enjoyed using the extension but also expressed a desire to use 2D notebook environments in the future.
- Android Game Testing using Reinforcement Learning. Khurana, Suhani (Virginia Tech, 2023-06-30). Android is the most popular operating system and occupies close to 70% of the market share. With the growth in the usage of Android OS, the number of games also increased, and the Android play store now hosts over 500,000 games. Testing of Android games is done either manually or through existing tools that automate parts of the process. Manual testing requires a great deal of effort and can be expensive. The existing tools that automate testing do not make use of any domain knowledge, which can make testing ineffective when a game involves complex strategies, intricate details, widgets, etc. Tools like Android Monkey and TimeMachine generate random Android events, including gestures like touch, swipe, and clicks, and other system-level events across the application. Deep learning methods like Wuji were created only for combat-type games. These limitations make it imperative to create a testing paradigm that uses domain knowledge and is easy to use by a developer without machine or deep learning expertise. In this work, we develop a tool called DRAG (Deep Reinforcement learning based Android Gamer) that leverages Reinforcement Learning to learn the requisite domain knowledge and play a game much as a human would. DRAG uses a unified Reinforcement Learning agent and a unified Reinforcement Learning environment; it only customizes the action space for each game. This generalization is done in two steps: (1) record an 8-minute demo video of the game and capture the underlying Android action log; (2) analyze the recorded video and the action log to generate an action space for the Reinforcement Learning agent. The unified RL agent is trained with score and coverage as the reward and screenshots of the game as the observed states. We chose a set of 19 open-source games to evaluate the tool. These games differ in the action set each requires: some require tapping icons, some require swiping in random directions, and some require more complex actions that combine different gestures. In our evaluation, the tool outperformed the state-of-the-art TimeMachine on all 19 games and outperformed Monkey on 16 of the 19, which supports the claim that Deep Reinforcement Learning can be used to test Android games and can provide better results than tools that make no use of domain knowledge.
- Automatic Restoration and Management of Computational Notebooks. Venkatesan, Satish (Virginia Tech, 2022-03-03). Computational notebook platforms are very commonly used by programmers and data scientists. However, due to the interactive development environment of notebooks, developers struggle to maintain effective code organization, which has an adverse effect on their productivity. In this thesis, we research and develop techniques to help solve code-organization issues that developers face, in an effort to improve productivity. Notebooks are often executed out of order, which adversely affects their portability. To determine cell execution orders in computational notebooks, we develop a technique that determines the execution order for a given cell and, if need be, attempts to rearrange the cells to match the intended execution order. With such a tool, users need not determine execution orders manually. In a user study with 9 participants, our approach on average saves users about 95% of the time required to determine execution orders by hand. We also developed a technique that supports inserting cells in rows, in addition to the standard column insertion, to better represent multiple contexts. In a user study with 9 participants, this technique was rated on average 8.44 on a ten-point scale for representing multiple contexts, as opposed to 4.77 for the standard view. (A sketch of the execution-order idea appears after this list.)
- Blocking Tracking JavaScript at the Function Granularity. Amjad, Abdul Haddi; Munir, Shaoor; Shafiq, Zubair; Gulzar, Muhammad Ali (ACM, 2024-12-02). Modern websites extensively rely on JavaScript to implement both functionality and tracking. Existing privacy-enhancing content blocking tools struggle against mixed scripts, which simultaneously implement both functionality and tracking. Blocking such scripts would break functionality, and not blocking them would allow tracking. We propose NoT.js, a fine-grained JavaScript blocking tool that operates at the function-level granularity. NoT.js's strengths lie in analyzing the dynamic execution context, including the call stack and calling context of each JavaScript function, and then encoding this context to build a rich graph representation. NoT.js trains a supervised machine learning classifier on a webpage's graph representation to first detect tracking at the function level and then automatically generate surrogate scripts that preserve functionality while removing tracking. Our evaluation of NoT.js on the top-10K websites demonstrates that it achieves high precision (94%) and recall (98%) in detecting tracking functions, outperforming the state of the art while being robust against off-the-shelf JavaScript obfuscation. Fine-grained detection of tracking functions allows NoT.js to automatically generate surrogate scripts, which, as our evaluation shows, successfully remove tracking functions without causing major breakage. Our deployment of NoT.js shows that mixed scripts are present on 62.3% of the top-10K websites, with 70.6% of the mixed scripts being third-party scripts that engage in tracking activities such as cookie ghostwriting.
- A Characterization Study of Merge Conflicts in Java Projects. Shen, Bowen; Gulzar, Muhammad Ali; He, Fei; Meng, Na (ACM, 2022). In collaborative software development, programmers create branches to add features and fix bugs, and merge branches to integrate edits. When edits from different branches textually overlap (i.e., textual conflicts) or lead to compilation and runtime errors (i.e., build and test conflicts), it is challenging for developers to remove the conflicts. Prior work proposed tools to detect and solve conflicts. However, many questions are not fully investigated, such as what types of conflicts exist in practice and how developers or tools handle them. For this paper, we used automated textual merge, compilation, and testing to reveal three types of conflicts in 208 open-source repositories: textual conflicts, build conflicts (i.e., conflicts causing build errors), and test conflicts (i.e., conflicts triggering test failures). We manually inspected 538 conflicts and their resolutions to characterize merge conflicts. Our analysis revealed three phenomena. First, higher-order conflicts (i.e., build and test conflicts) are harder to handle, while existing tools mainly focus on textual conflicts. Second, developers resolved most higher-order conflicts by applying similar edits to multiple program locations. Third, developers resolved 64% of true textual conflicts by keeping complete edits from either the left or the right branch. Our work sheds light on future research on software merge.
- Co-dependence Aware Fuzzing for Dataflow-Based Big Data Analytics. Humayun, Ahmad; Kim, Miryung; Gulzar, Muhammad Ali (ACM, 2023-11-30). Data-intensive scalable computing has become popular due to the increasing demands of analyzing big data. For example, Apache Spark and Hadoop allow developers to write dataflow-based applications with user-defined functions to process data with custom logic. Testing such applications is difficult. (1) These applications often take multiple datasets as input. (2) Unlike in SQL, there is no explicit schema for these datasets, and each unstructured (or semi-structured) dataset is segmented and parsed at runtime. (3) Dataflow operators (e.g., join) create implicit co-dependence constraints between the fields of multiple datasets. An efficient and effective testing technique must analyze co-dependence among different regions of multiple datasets at the level of rows and columns and orchestrate input mutations jointly on co-dependent regions. We propose DepFuzz to increase the effectiveness and efficiency of fuzz testing dataflow-based big data applications. The key insight behind DepFuzz is twofold. It keeps track of which code segments operate on which datasets, which rows, and which columns. By analyzing the use of dataflow operators (e.g., join and groupByKey) in tandem with the semantics of UDFs, DepFuzz generates test data that reaches hard-to-reach regions of the application code. In real-world big data applications, DepFuzz finds 3.4× more faults, achieving 29% more statement coverage in half the time of Jazzer, a state-of-the-art commercial fuzzer for Java bytecode. It outperforms prior DISC testing by exposing deeper semantic faults, beyond simpler input formatting errors, especially when multiple datasets have complex interactions through dataflow operators. (A toy illustration of jointly mutating co-dependent join keys appears after this list.)
- Deep Learning for Code Generation using Snippet Level Parallel Data. Jain, Aneesh (Virginia Tech, 2023-01-05). In the last few years, interest in the application of deep learning methods to software engineering tasks has surged. A variety of approaches, including transformer-based methods, statistical machine translation models, and models inspired by natural language settings, have been proposed and shown to be effective at tasks like code summarization, code synthesis, and code translation. Multiple benchmark datasets have also been released, but all suffer from one limitation or another: some support only a select few programming languages, while others support only certain tasks. These limitations restrict researchers' ability to perform thorough analyses of their proposed methods. In this work, we aim to alleviate some of the limitations faced by researchers who work on deep learning applications for software engineering tasks. We introduce a large, parallel, multi-lingual programming language dataset that supports code summarization, code translation, code synthesis, and code search in 7 different languages. We provide benchmark results for current state-of-the-art models on all these tasks, and we also explore some limitations of current evaluation metrics for code-related tasks. We provide a detailed analysis of the compilability of code generated by deep learning models, because compilability is a better measure of the usability of code than scores like BLEU and CodeBLEU. Motivated by our findings about compilability, we also propose a reinforcement learning based method that incorporates code compilability and syntax-level feedback as rewards, and we demonstrate its effectiveness in generating code with fewer syntax errors than baselines. In addition, we develop a web portal that hosts the models we have trained for code translation. The portal allows translation between 42 possible language pairs and also lets users check the compilability of the generated code. The intent of this website is to give researchers and other audiences a chance to interact with and probe our work in a user-friendly way, without requiring them to write their own code to load the models and run inference.
- DeSQL: Interactive Debugging of SQL in Data-Intensive Scalable Computing. Haroon, Sabaat; Brown, Chris; Gulzar, Muhammad Ali (ACM, 2024-07-12). SQL is the most commonly used front-end language for data-intensive scalable computing (DISC) applications due to its broad presence in new and legacy workflows and its shallow learning curve. However, DISC-backed SQL introduces several layers of abstraction that significantly reduce the visibility and transparency of workflows, making it challenging for developers to find and fix errors in a query. When a query returns incorrect outputs, it takes non-trivial effort to comprehend every stage of the query execution and find the root cause among the input data and the complex SQL query. We aim to bring the benefits of step-through interactive debugging to DISC-powered SQL with DeSQL. Due to the declarative nature of SQL, there are no ordered atomic statements on which to place a breakpoint to monitor the flow of data. DeSQL's automated query decomposition breaks a SQL query into its constituent subqueries, offering natural locations for setting breakpoints and monitoring intermediate data. However, due to advanced query optimization and translation in DISC systems, a user query rarely matches the physical execution, making it challenging to associate subqueries with their intermediate data. DeSQL performs fine-grained taint analysis to dynamically map the subqueries to their intermediate data, while also recognizing subqueries removed by the optimizers. For such subqueries, DeSQL efficiently regenerates the intermediate data from a nearby subquery's data. On the popular TPC-DS benchmark, DeSQL provides a complete debugging view in 13% less time than the original job time, while incurring an average overhead of 10% and retaining Apache Spark's scalability. In a user study comprising 15 participants engaged in two debugging tasks, we find that participants using DeSQL identify the root cause behind a wrong query output in 74% less time than with de facto manual debugging. (A sketch of the query-decomposition breakpoint idea appears after this list.)
- Detecting Build Conflicts in Software Merge for Java Programs via Static Analysis. Towqir, Sheikh Shadab; Shen, Bowen; Gulzar, Muhammad Ali; Meng, Na (ACM, 2022-10-10). In software merge, the edits from different branches can textually overlap (i.e., textual conflicts) or cause build and test errors (i.e., build and test conflicts), jeopardizing programmer productivity and software quality. Existing tools primarily focus on textual conflicts; few tools detect higher-order conflicts (i.e., build and test conflicts), and existing detectors of build conflicts are limited. Due to their heavy reliance on automatic builds, current detectors (e.g., Crystal) only report build errors instead of identifying the root causes, so developers have to locate conflicting edits manually. These detectors also only help when the branches-to-merge have no textual conflict. We present a new static analysis-based approach, Bucond ("build conflict detector"). Given the three code versions in a merging scenario, base b, left l, and right r, Bucond models each version as a graph and compares the graphs to extract entity-related edits (e.g., class renaming) in l and r. We believe that build conflicts occur when certain edits are co-applied to related entities between branches. Bucond realizes this insight via pattern matching to identify any cross-branch edit combination that can trigger a build conflict (e.g., one branch adds a reference to field F while the other branch removes F). We systematically explored and devised 57 patterns, covering 97% of the build conflicts in our experiments. Our evaluation shows Bucond to complement build-based detectors, as it (1) detects conflicts with 100% precision and 88%-100% recall, (2) locates conflicting edits, and (3) works well when those detectors do not. (A sketch of one such cross-branch pattern appears after this list.)
- Empirical Investigations of More Practical Fault Localization Approaches. Dao, Tung Manh (Virginia Tech, 2023-10-18). Developers often spend much of their valuable development time on software debugging and bug finding. In addition, software defects cost the software industry as a whole hundreds of billions or even a trillion US dollars. As a result, many fault localization (FL) techniques for localizing bugs automatically have been proposed. Despite its popularity, adopting FL in industrial environments has been impractical due to its unsatisfactory accuracy and high runtime overhead. Motivated by the real-world challenges of FL applicability, this dissertation addresses these issues by proposing two main enhancements to existing FL. First, it explores different strategies for combining a variety of program execution information with Information Retrieval-based fault localization (IRFL) techniques to increase FL's accuracy. Second, this dissertation invents and experiments with the unconventional techniques of Instant Fault Localization (IFL) using the innovative concept of triggering modes. Our empirical evaluations of the proposed approaches on various types of bugs in a real software development environment show that FL's accuracy increases and its runtime is reduced significantly. We find that execution information helps increase IRFL's Top-10 by 17-33% at the class level, and 62-100% at the method level. Another finding is that IFL achieves as much as 100% runtime cost reduction while gaining comparable or better accuracy. For example, on single-location bugs, IFL scores 73% MAP, compared with 56% for the conventional approach. For multi-location bugs, IFL's Top-1 performance on real bugs is 22%, just below the 24% of existing FL approaches. We hope the results and findings from this dissertation help make the adoption of FL in industry more practical and prevalent. (A sketch of blending IR relevance with execution evidence appears after this list.)
- An Empirical Study of API Breaking Changes in Bioconductor. Chowdhury, Hemayet Ahmed (Virginia Tech, 2023-01-10). Bioconductor is the second largest R software package repository and is primarily used for the analysis of genomic and biological data. With downloads exceeding millions in recent years, the widespread growth of the repository's adoption can be attributed to its diverse selection of community-created packages, written in the programming language R, that provide statistical methodologies for the analysis and modelling of data. However, as these packages evolve, their APIs go through changes that can break existing user code. Fixing these API breaking changes whenever a package is updated can be frustrating and time-consuming, especially since a large fraction of the user community are researchers without a software engineering background. In that context, we first present a tool that can detect syntactic API breaking changes between two released versions of a library written in R through static analysis of the package source code. This tool can be of utility to R package developers, so that they can more comprehensively report or handle the breaking changes in their releases, and to R package users, who want to be aware of the API differences between two releases before upgrading the libraries in their code. Through the use of this tool and manual inspection, we also conducted an empirical study of breaking changes and backward incompatibility in Bioconductor packages. We studied the 100 most downloaded packages in the repository and found that 28% of all package releases are backward incompatible. We also found that 55% of these breaking changes go undocumented and that developers do not maintain semantic versioning for 22% of the releases. Finally, we manually inspected 10 library releases containing breaking changes and found that 2% of the APIs affected 31 client projects. (A sketch of signature-diff-based detection appears after this list.)
- Exploring the Evolution of the TLS Certificate Ecosystem. Farhan, Syed Muhammad (Virginia Tech, 2022-06-01). A vast majority of popular communication protocols for the internet employ TLS (Transport Layer Security) to secure communication. As a result, there have been numerous efforts, including the introduction of Certificate Transparency logs and free automated CAs, to improve the SSL certificate ecosystem. Our work highlights the effectiveness of these efforts using the Certificate Transparency dataset as well as certificates collected via full IPv4 scans. We show that a large proportion of invalid certificates still exists, and we outline the reasons why these certificates are invalid and where they are hosted. Moreover, we show that the incorrect use of template certificates has led to incorrect SCTs being embedded in the certificates. Taken together, our results emphasize the need for continued involvement by the research community to improve the web's PKI ecosystem.
- FedDefender: Backdoor Attack Defense in Federated Learning. Gill, Waris; Anwar, Ali; Gulzar, Muhammad Ali (ACM, 2023-12-04). Federated Learning (FL) is a privacy-preserving distributed machine learning technique that enables individual clients (e.g., user participants, edge devices, or organizations) to train a model on their local data in a secure environment and then share the trained model with an aggregator to build a global model collaboratively. In this work, we propose FedDefender, a defense mechanism against targeted poisoning attacks in FL that leverages differential testing. FedDefender first applies differential testing on clients' models using a synthetic input. Instead of comparing the output (predicted label), which is unavailable for a synthetic input, FedDefender fingerprints the neuron activations of clients' models to identify a potentially malicious client containing a backdoor. We evaluate FedDefender using the MNIST and FashionMNIST datasets with 20 and 30 clients, and our results demonstrate that FedDefender effectively mitigates such attacks, reducing the attack success rate (ASR) to 10% without deteriorating the global model's performance. (A sketch of activation fingerprinting appears after this list.)
- Impact of using Suggestion Bot while code reviewing. Palvannan, Nivishree (Virginia Tech, 2023-07-03). Peer code reviews play a critical role in maintaining code quality, and GitHub has introduced several new features to assist with the review process. One of these features is suggested changes, which allows precise code modifications in pull requests to be proposed in review comments. Despite the availability of such helpful features, many pull requests remain unattended due to lower priority. To address this issue, we developed a bot called "Suggestion Bot" that automatically reviews the codebase using GitHub's suggested changes functionality. We also conducted an empirical study to compare the effectiveness of this bot with manual reviews. The findings suggest that using this bot can expedite response times and improve the quality of pull request comments in pull-based software development projects, providing valuable, concise, and targeted feedback alongside the automated suggestions.
- An Investigation into Code Search Engines: The State of the Art Versus Developer Expectations. Li, Shuangyi (Virginia Tech, 2022-07-15). An essential software development tool, code search engines are expected to provide superior accuracy, usability, and performance. However, prior research has neither (1) summarized, categorized, and compared representative code search engines, nor (2) analyzed the actual expectations that developers have for code search engines. Supplying this missing knowledge can empower developers to fully benefit from search engines, academic researchers to uncover promising research directions, and industry practitioners to properly marshal their efforts. This thesis fills the aforementioned gaps by drawing a comprehensive picture of code search engines, including their definition, standard processes, existing solutions, common alternatives, and developers' perspectives. We first study the state of the art in code search engines by analyzing academic papers, industry releases, and open-source projects. We then survey more than 100 software developers to ascertain their usage of and preferences for code search engines. Finally, we juxtapose the results of our study and survey to synthesize a call to action for researchers and industry practitioners to better meet the demands software developers place on code search engines. We present the first comprehensive overview of state-of-the-art code search engines by categorizing and comparing them based on their respective search strategies, applicability, and performance. Our user survey revealed a surprising lack of awareness among many developers with respect to code search engines, with a high preference for using general-purpose search engines (e.g., Google) or code repositories (e.g., GitHub) to search for code. Our results also clearly identify typical usage scenarios and sought-after properties of code search engines. Our findings can guide software developers in selecting the code search engines most suitable for their programming pursuits, suggest new research directions for researchers, and help programming tool builders create effective code search engine solutions.
- Methodologies, Techniques, and Tools for Understanding and Managing Sensitive Program Information. Liu, Yin (Virginia Tech, 2021-05-20). Exfiltrating or tampering with certain business logic, algorithms, and data can harm the security and privacy of both organizations and end users. Collectively referred to as sensitive program information (SPI), these building blocks are part and parcel of modern software systems in domains ranging from enterprise applications to cyberphysical setups. Hence, protecting SPI has become one of the most salient challenges of modern software development. However, several fundamental obstacles stand in the way of effective SPI protection: (1) understanding and locating the SPI in any realistically sized codebase by hand is hard; (2) manually isolating SPI to protect it is burdensome and error-prone; (3) if SPI is passed across distributed components within and across devices, it becomes vulnerable to security and privacy attacks. To address these problems, this dissertation research innovates in the realm of automated program analysis, code transformation, and novel programming abstractions to improve the state of the art in SPI protection. Specifically, this dissertation comprises three interrelated research thrusts that: (1) design and develop program analysis and programming support for inferring the usage semantics of program constructs, with the goal of helping developers understand and identify SPI; (2) provide powerful programming abstractions and tools that transform code automatically, with the goal of helping developers effectively isolate SPI from the rest of the codebase; (3) provide programming mechanisms for distributed managed execution environments that hide SPI, with the goal of enabling components to exchange SPI safely and securely. The novel methodologies, techniques, and software tools of this dissertation research, supported by programming abstractions, automated program analysis, and code transformation, lay the groundwork for establishing a secure, understandable, and efficient foundation for protecting SPI. This dissertation is based on 4 conference papers, presented at TrustCom'20, GPCE'20, GPCE'18, and ManLang'17, as well as 1 journal paper, published in the Journal of Computer Languages (COLA).
- Natural Symbolic Execution-Based Testing for Big Data Analytics. Wu, Yaoxuan; Humayun, Ahmad; Gulzar, Muhammad Ali; Kim, Miryung (ACM, 2024-07-12). Symbolic execution is an automated test input generation technique that models individual program paths as logical constraints. However, the realism of concrete test inputs generated by SMT solvers often comes into question. Existing symbolic execution tools only seek arbitrary solutions for given path constraints. These constraints do not incorporate the naturalness of inputs that observe statistical distributions, range constraints, or preferred string constants. This results in unnatural-looking inputs that fail to emulate real-world data. In this paper, we extend symbolic execution to incorporate naturalness. Our key insight is that users typically understand the semantics of program inputs, such as the distribution of height or the possible values of zipcode, and this knowledge can be leveraged to advance the ability of symbolic execution to produce natural test inputs. We instantiate this idea in NaturalSym, a symbolic execution-based test generation tool for data-intensive scalable computing (DISC) applications. NaturalSym generates natural-looking data that mimics real-world distributions by utilizing user-provided input semantics to drastically enhance the naturalness of inputs while preserving strong bug-finding potential. On DISC applications and commercial big data test benchmarks, NaturalSym achieves a higher degree of realism, as evidenced by a median perplexity score 35.1 points lower, and detects 1.29× as many injected faults as the state-of-the-art symbolic executor for DISC, BigTest. This is because BigTest draws inputs purely based on the satisfiability of path constraints constructed from branch predicates, while NaturalSym is able to draw natural concrete values based on user-specified semantics and prioritize these values in input generation. Our empirical results demonstrate that NaturalSym finds 47.8× as many injected faults as NaturalFuzz (a coverage-guided fuzzer) and 19.1× as many as ChatGPT, while TestMiner (a mining-based approach) fails to detect any injected faults. NaturalSym is the first symbolic executor to combine the notion of input naturalness with symbolic path constraints during SMT-based input generation. We make our code available at https://github.com/UCLA-SEAL/NaturalSym. (A sketch of naturalness-aware constraint solving appears after this list.)
- Reinforcement Learning for Self-adapting Time Discretizations of Complex Systems. Gallagher, Conor Dietrich (Virginia Tech, 2021-08-27). The overarching goal of this project is to develop intelligent, self-adapting numerical algorithms for the time discretization of complex real-world problems using Q-Learning methodologies. The specific application is ordinary differential equations, which can model problems in mathematics and the social and natural sciences but usually require approximate solutions because direct analytical solutions are rare. Using the traditional Brusselator and Lorenz differential equations as test beds, this research develops models to determine reward functions and dynamically tunes controller parameters that minimize both the error and the number of steps required for approximate mathematical solutions. Our best reward function is based on an error measure that does not overly punish rejected states. The Alpha-Beta Adjustment and Safety Factor Adjustment Model is the most efficient and accurate method for solving these mathematical problems: allowing the model to change the alpha/beta value and safety factor by small amounts provides better results than having the model choose values from discrete lists. This method shows potential for training dynamic controllers with Reinforcement Learning. (A toy Q-learning step-size controller appears after this list.)
- Secure Coding Practice in Java: Automatic Detection, Repair, and Vulnerability Demonstration. Zhang, Ying (Virginia Tech, 2023-10-12). The Java platform and third-party open-source libraries provide various Application Programming Interfaces (APIs) to facilitate secure coding. However, using these APIs securely is challenging for developers who lack cybersecurity training. Prior studies show that many developers use APIs insecurely, thereby introducing vulnerabilities in their software. Despite the availability of various tools designed to identify insecure API usage, their effectiveness in helping developers with secure coding practices remains unclear. This dissertation focuses on two main objectives: (1) exploring the strengths and weaknesses of the existing automated detection tools for API-related vulnerabilities, and (2) creating better tools that detect, repair, and demonstrate these vulnerabilities. Our research started by investigating the effectiveness of current tools in helping with developers' secure coding practices. We systematically explored the strengths and weaknesses of existing automated tools for detecting API-related vulnerabilities. Through comprehensive analysis, we observed that most existing tools merely report misuses without suggesting customized fixes. Moreover, developers often rejected tool-generated vulnerability reports due to concerns about the correctness of detection and the exploitability of the reported issues. To address these limitations, the second work proposed SEADER, an example-based approach to detect and repair security-API misuses. Given an exemplar ⟨insecure, secure⟩ code pair, SEADER compares the snippets to infer any API-misuse template and the corresponding fixing edit. Based on the inferred information, and given a program, SEADER performs inter-procedural static analysis to search for security-API misuses and to propose customized fixes. The third work leverages ChatGPT-4.0 to automatically generate security test cases that demonstrate how vulnerable API usage facilitates supply chain attacks on specific software applications. By running such test cases during software development and maintenance, developers can gain more relevant information about exposed vulnerabilities and may better create secure-by-design and secure-by-default software.
- Theory and Patterns for Avoiding Regex Denial of Service. Hassan, Sk Adnan (Virginia Tech, 2022-06-01). Regular expressions are ubiquitous. They are used for diverse purposes, including input validation and firewalls. Unfortunately, they can also lead to a security vulnerability called ReDoS (Regular Expression Denial of Service), caused by a super-linear worst-case execution time during regex matching. ReDoS has a serious and wide impact: since applications written in most programming languages can be vulnerable to it, ReDoS has caused outages at prominent web services including Cloudflare and Stack Overflow. Due to the severity and prevalence of ReDoS, past work proposed mechanisms to identify and repair regexes. In this work, we set a different goal: helping developers avoid introducing regexes that could trigger ReDoS in the first place. A necessary condition for a regex to trigger ReDoS is to be infinitely ambiguous (IA). We propose a theory and a collection of anti-patterns to characterize IA regexes. We evaluate our proposed anti-patterns in two complementary ways: quantitatively, over a dataset of 209,188 regexes from open-source software, and qualitatively, by observing humans using them in practice. In our large-scale evaluation, our anti-patterns characterized IA regexes with 100% precision and 99% recall, showing that they can capture the large majority of IA regexes even though they are a simplified version of our theory. In our human experiment, practitioners applying our anti-patterns correctly assessed whether the regex they were composing was IA in all of our studied regex-composition tasks. (A sketch of one anti-pattern check appears after this list.)
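
The sketches below illustrate, under stated assumptions, core ideas from several of the entries above. They are minimal illustrations, not the authors' implementations, and every name, dataset, and parameter in them is hypothetical unless the corresponding abstract mentions it. All sketches are in Python.

For the notebook restoration thesis: a minimal sketch of inferring a cell execution order from variable definitions and uses, assuming each cell is parseable Python and treating any cell that defines a name another cell reads as that cell's prerequisite.

```python
import ast
from graphlib import TopologicalSorter

def defs_and_uses(src: str):
    """Collect names a cell assigns (defs) and names it reads (uses)."""
    defs, uses = set(), set()
    for node in ast.walk(ast.parse(src)):
        if isinstance(node, ast.Name):
            (defs if isinstance(node.ctx, ast.Store) else uses).add(node.id)
    return defs, uses

def execution_order(cells):
    """Topologically order cells so every name is defined before it is used."""
    info = [defs_and_uses(src) for src in cells]
    deps = {i: set() for i in range(len(cells))}
    for i, (_, uses) in enumerate(info):
        for j, (defs, _) in enumerate(info):
            if j != i and uses & defs:
                deps[i].add(j)  # cell i reads a name that cell j defines
    return list(TopologicalSorter(deps).static_order())

cells = ["plot(model)", "model = fit(data)", "data = load('x.csv')"]
print(execution_order(cells))  # [2, 1, 0]: load, then fit, then plot
```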
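For DepFuzz's co-dependence aware fuzzing: a toy illustration of why co-dependent regions of multiple datasets must be mutated jointly. If a join key is mutated in only one dataset, the mutated row never survives the join; mutating it on both sides keeps the rows matched so the fuzzer can reach the code behind the join. The datasets and columns here are invented.

```python
import random

random.seed(7)

def mutate(v: str) -> str:
    """A toy mutation: replace one character with a random one."""
    i = random.randrange(len(v))
    return v[:i] + random.choice("xyz123") + v[i + 1:]

def mutate_join_key(orders, users, col_o=0, col_u=0):
    """Mutate a join key in *both* datasets so mutated rows still match."""
    old = random.choice([r[col_o] for r in orders])
    new = mutate(old)
    for r in orders:
        if r[col_o] == old:
            r[col_o] = new
    for r in users:
        if r[col_u] == old:
            r[col_u] = new
    return orders, users

orders = [["u1", "book"], ["u2", "pen"]]
users = [["u1", "alice"], ["u2", "bob"]]
print(mutate_join_key(orders, users))  # same mutated key on both sides
```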
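For DeSQL: a minimal sketch of the breakpoint idea, materializing an inner subquery so its intermediate rows can be inspected before the enclosing query runs. It assumes a local pyspark installation, and the decomposition is hard-coded here, whereas DeSQL derives it automatically and maps subqueries to intermediate data via taint analysis.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("desql-sketch").getOrCreate()
spark.createDataFrame([(1, "a", 5), (2, "b", 50)], ["id", "name", "price"]) \
     .createOrReplaceTempView("items")

# Full query: SELECT name FROM (SELECT * FROM items WHERE price > 10) t
inner = "SELECT * FROM items WHERE price > 10"
spark.sql(inner).createOrReplaceTempView("t")  # materialize the sub-result
spark.sql(inner).show()                        # "breakpoint": inspect rows
spark.sql("SELECT name FROM t").show()         # continue to the full query
```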
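For Bucond: a sketch of one cross-branch edit pattern from the abstract, "one branch removes an entity while the other adds a reference to it", over a hypothetical flat edit representation. Bucond itself matches 57 such patterns against graph models of Java entities.

```python
def build_conflicts(left_edits, right_edits):
    """Flag entities deleted on one branch and newly referenced on the other."""
    conflicts = []
    for a, b in [(left_edits, right_edits), (right_edits, left_edits)]:
        deleted = {e["entity"] for e in a if e["kind"] == "delete"}
        for e in b:
            if e["kind"] == "add_reference" and e["entity"] in deleted:
                conflicts.append(e["entity"])
    return conflicts

left = [{"kind": "delete", "entity": "Account.balance"}]
right = [{"kind": "add_reference", "entity": "Account.balance"}]
print(build_conflicts(left, right))  # ['Account.balance']
```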
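For the fault-localization dissertation: a sketch of the general idea of combining an information-retrieval relevance score with program-execution evidence. The linear blend and the binary coverage signal are illustrative assumptions, not the dissertation's exact formula.

```python
def suspiciousness(ir_score, covered, alpha=0.5):
    """Blend IR relevance with a binary execution-coverage signal."""
    return alpha * ir_score + (1 - alpha) * (1.0 if covered else 0.0)

candidates = {
    "PaymentService":  (0.82, True),   # (IR score, covered by failing run)
    "ReportFormatter": (0.90, False),  # textually similar but never executed
    "CacheManager":    (0.10, True),
}
ranked = sorted(candidates, key=lambda c: suspiciousness(*candidates[c]),
                reverse=True)
print(ranked)  # PaymentService outranks the never-executed ReportFormatter
```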
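For the Bioconductor study: a sketch of detecting syntactic API breaking changes by diffing exported function signatures of two releases. The signature maps stand in for what the tool extracts from R source, and treating any parameter-list change as potentially breaking is a deliberate over-approximation.

```python
def breaking_changes(old_api, new_api):
    """Report exported functions that were removed or re-parameterized."""
    report = []
    for fn, params in old_api.items():
        if fn not in new_api:
            report.append(f"{fn}: removed")
        elif new_api[fn] != params:
            report.append(f"{fn}: {params} -> {new_api[fn]}")
    return report

v1 = {"normalize": ["counts", "method"], "plotQC": ["object"]}
v2 = {"normalize": ["counts", "method", "log"]}  # plotQC dropped in v2
print(breaking_changes(v1, v2))
```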
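For FedDefender: a numpy sketch of differential testing via activation fingerprints. Each client model is reduced to a single ReLU layer, the backdoored client's weights are artificially shifted, and the client whose fingerprint lies farthest from the consensus is flagged; the real system fingerprints full neural networks.

```python
import numpy as np

rng = np.random.default_rng(0)

def fingerprint(weights, x):
    """Neuron activations of a toy one-layer model on a synthetic input."""
    return np.maximum(0.0, weights @ x)  # ReLU activations

synthetic = rng.uniform(0.5, 1.5, size=16)            # synthetic probe input
clients = [rng.normal(size=(8, 16)) * 0.1 for _ in range(4)]
clients.append(rng.normal(size=(8, 16)) * 0.1 + 0.9)  # backdoored client drifts

prints = np.stack([fingerprint(w, synthetic) for w in clients])
dist = np.linalg.norm(prints - np.median(prints, axis=0), axis=1)
print("suspected malicious client:", int(np.argmax(dist)))  # index 4
```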
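For NaturalSym: a sketch of biasing SMT-based input generation toward natural values with z3 (`pip install z3-solver`), encoding naturalness as hard range constraints plus a soft preference. The path constraint and input semantics are invented for illustration; NaturalSym derives them from the program and user-provided specifications.

```python
from z3 import Int, Optimize, sat

age = Int("age")
zipcode = Int("zipcode")

opt = Optimize()
opt.add(age > 18)                            # path constraint from a branch
opt.add(age >= 0, age <= 110)                # naturalness: plausible human ages
opt.add(zipcode >= 10000, zipcode <= 99999)  # naturalness: 5-digit zipcodes
opt.add_soft(age == 35, weight=1)            # soft preference for a typical age

if opt.check() == sat:
    m = opt.model()
    print("age =", m[age], "zipcode =", m[zipcode])
```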
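For the RL time-discretization thesis: a toy tabular Q-learner, reduced to a one-state bandit, that chooses a step-size scaling for y' = -y and is rewarded for a low embedded-error estimate without taking needlessly small steps. The reward shape and Euler/Heun pair are illustrative stand-ins for the thesis's reward functions and integrators.

```python
import random

random.seed(1)

actions = [0.5, 1.0, 1.5]          # factors by which to scale the step h
Q = {a: 0.0 for a in actions}      # one-state tabular Q-values

def embedded_error(y, h):
    """Error estimate from an Euler/Heun embedded pair on y' = -y."""
    euler = y + h * (-y)                 # 1st-order step
    heun = y + (h / 2) * (-y - euler)    # 2nd-order step
    return abs(heun - euler)

y, h, eps, lr = 1.0, 0.2, 0.1, 0.1
for _ in range(2000):
    a = random.choice(actions) if random.random() < eps \
        else max(actions, key=Q.get)                    # epsilon-greedy choice
    reward = -embedded_error(y, h * a) - 0.01 / a       # error + step penalty
    Q[a] += lr * (reward - Q[a])                        # tabular Q-update

print("learned scaling:", max(actions, key=Q.get), Q)
```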
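For the ReDoS anti-patterns: a sketch of one check in the spirit of the paper, flagging an unbounded repeat nested inside another unbounded repeat (e.g., (a+)+), a classic shape behind infinite ambiguity. It leans on CPython's internal sre_parse module (a deprecated alias on Python 3.11+), ignores alternation branches for brevity, and shows only one of the paper's anti-patterns.

```python
import sre_parse  # CPython-internal regex parser; DeprecationWarning on 3.11+

def has_nested_unbounded_repeat(pattern: str) -> bool:
    """Flag an unbounded repeat (*, +, {n,}) nested inside another one."""
    def walk(tokens, inside):
        for op, av in tokens:
            if op in (sre_parse.MAX_REPEAT, sre_parse.MIN_REPEAT):
                _, hi, body = av
                unbounded = hi == sre_parse.MAXREPEAT
                if inside and unbounded:
                    return True
                if walk(body, inside or unbounded):
                    return True
            elif op is sre_parse.SUBPATTERN:
                if walk(av[-1], inside):  # av[-1] is the group's body
                    return True
        return False
    return walk(sre_parse.parse(pattern), False)

print(has_nested_unbounded_repeat(r"(a+)+"))  # True: classic ReDoS shape
print(has_nested_unbounded_repeat(r"a+b+"))   # False: sequential repeats
```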