Synthesizing Deployable Suricata Rules from Raw Network Traffic

Date

2026-06-11

Publisher

Virginia Tech

Abstract

Rule-based intrusion detection systems such as Suricata are central to network security because they are effective and explainable, yet writing good detection rules demands deep expert knowledge, and manual authoring cannot keep pace with emerging threats. Existing approaches based on large language models can reduce analyst effort, but they either rely on curated threat intelligence that is produced only after the underlying traffic artifacts already exist, or they require costly model use without sufficient quality control. This thesis presents RulePilot, an end-to-end agentic framework that generates deployable Suricata rules directly and efficiently from raw malware network traffic, with no prior threat intelligence required. A key challenge is noise: network traffic captures often contain a small amount of security-relevant traffic mixed with large volumes of background traffic, which reduces model reasoning quality and increases cost. RulePilot addresses this challenge with a Benign Traffic Fingerprinting stage that removes known benign background flows before model processing. It then synthesizes rules and iteratively refines them. Rules that fail syntax checks, do not trigger on the source malware traffic, or generate false positives on a large benign traffic corpus are automatically repaired using structured feedback from RulePilot's repair agent. Evaluated on 1,296 malware PCAPs across 192 malware families, spanning samples from 2014 to 2026, along with 1,172 benign PCAPs across fingerprinting and evaluation corpora, RulePilot produces syntactically valid and behaviorally meaningful rules with strong precision and recall, demonstrating the potential of automated intrusion detection rule generation at scale.
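
The two stages described in the abstract, benign-flow filtering followed by a check-and-repair loop, can be sketched roughly as follows. This is an illustrative outline only, not the thesis's implementation: the flow key (destination host, port, protocol), the names filter_benign, CheckResult, and refine_rule, and the max_rounds limit are all assumptions introduced here. The checks and the repair step are passed in as plain callables, standing in for syntax validation, replay against the source malware traffic, replay against a benign corpus, and the repair agent.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional, Set, Tuple

# Hypothetical flow key: (destination host, destination port, protocol).
Flow = Tuple[str, int, str]


def filter_benign(flows: Iterable[Flow], benign_fingerprints: Set[Flow]) -> List[Flow]:
    """Drop flows whose key matches a known benign fingerprint."""
    return [flow for flow in flows if flow not in benign_fingerprints]


@dataclass
class CheckResult:
    ok: bool
    feedback: str = ""  # structured feedback handed to the repair step


def refine_rule(
    rule: str,
    checks: List[Callable[[str], CheckResult]],
    repair: Callable[[str, str], str],
    max_rounds: int = 3,  # assumed limit, not taken from the thesis
) -> Optional[str]:
    """Run the checks in order; on the first failure, repair and retry."""
    for attempt in range(max_rounds + 1):
        # Checks are evaluated lazily, so later checks are skipped after a failure.
        failure = next(
            (result for result in (check(rule) for check in checks) if not result.ok),
            None,
        )
        if failure is None:
            return rule  # every check passed
        if attempt < max_rounds:
            rule = repair(rule, failure.feedback)
    return None  # still failing after max_rounds repairs
```

In this sketch a rule is returned only once every check passes; a rule that still fails after the allowed number of repairs is discarded by returning None.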

Keywords

Intrusion detection, Suricata, rule generation, large language models, agentic AI, network traffic analysis
