Theory and Patterns for Avoiding Regex Denial of Service

Hassan, Sk Adnan

dc.contributor.author	Hassan, Sk Adnan	en
dc.contributor.committeechair	Servant Cortes, Francisco Javier	en
dc.contributor.committeemember	Meng, Na	en
dc.contributor.committeemember	Gulzar, Muhammad Ali	en
dc.contributor.department	Computer Science	en
dc.date.accessioned	2022-06-02T08:00:14Z	en
dc.date.available	2022-06-02T08:00:14Z	en
dc.date.issued	2022-06-01	en
dc.description.abstract	Regular expressions are ubiquitous. They are used for diverse purposes, including input validation and firewalls. Unfortunately, they can also lead to a security vulnerability called ReDoS (Regular Expression Denial of Service), caused by a super-linear worst-case execution time during regex matching. ReDoS has a serious and wide impact: since applications written in most programming languages can be vulnerable to it, ReDoS has caused outages at prominent web services including Cloudflare and Stack Overflow. Due to the severity and prevalence of ReDoS, past work proposed mechanisms to identify and repair regexes. In this work, we set a different goal: helping developers avoid introducing regexes that could trigger ReDoS in the first place. A necessary condition for a regex to trigger ReDoS is to be infinitely ambiguous (IA). We propose a theory and a collection of anti-patterns to characterize IA regexes. We evaluate our proposed anti-patterns in two complementary ways: quantitatively, over a dataset of 209,188 regexes from open-source software; and qualitatively, by observing humans using them in practice. In our large-scale evaluation, our anti-patterns characterized IA regexes with 100% precision and 99% recall, showing that they can capture the large majority of IA regexes, even though they are a simplified version of our theory. In our human experiment, practitioners applying our anti-patterns correctly assessed whether the regex that they were composing was IA or not in all of our studied regex-composition tasks.	en
dc.description.abstractgeneral	Regular expressions are used by developers for different purposes, including input validation and firewalls. Unfortunately, they can also lead to a security vulnerability called ReDoS (Regular Expression Denial of Service), caused by a super-linear worst-case execution time during regex matching. ReDoS has caused outages at prominent web services including Cloudflare and Stack Overflow. Its impact is serious and wide, since applications written in most programming languages can be vulnerable to it. With this work, we wanted to help developers avoid introducing regexes that could trigger ReDoS in the first place. A necessary condition for a regex to trigger ReDoS is to be infinitely ambiguous (IA). We propose a theory and a collection of anti-patterns to characterize IA regexes.	en
dc.description.degree	Master of Science	en
dc.format.medium	ETD	en
dc.identifier.other	vt_gsexam:34920	en
dc.identifier.uri	http://hdl.handle.net/10919/110392	en
dc.language.iso	en	en
dc.publisher	Virginia Tech	en
dc.rights	In Copyright	en
dc.rights.uri	http://rightsstatements.org/vocab/InC/1.0/	en
dc.subject	security	en
dc.subject	denial of service	en
dc.subject	redos	en
dc.title	Theory and Patterns for Avoiding Regex Denial of Service	en
dc.type	Thesis	en
thesis.degree.discipline	Computer Science and Applications	en
thesis.degree.grantor	Virginia Polytechnic Institute and State University	en
thesis.degree.level	masters	en
thesis.degree.name	Master of Science	en

Files

Original bundle

Name: Hassan_S_T_2022.pdf
Size: 351.3 KB
Format: Adobe Portable Document Format

Collections

Masters Theses