Streamlining SOC Analyst Diagnosis Through Workflow Extraction from Demonstrations


Date

2026-05-29

Publisher

Virginia Tech

Abstract

Security Operations Center (SOC) analysts triage SIEM alerts and determine which alerts belong to which kind of attack. This triage is a bottleneck in incident response because alert volume far exceeds what analysts can read. Hand-authored detection rules are brittle; single-shot Large Language Model (LLM) calls are non-deterministic and unauditable. This thesis introduces demonstration-driven workflow generation (DDW): a four-step pipeline (demonstration → chat log → distillation → executable workflow) that converts one analyst's investigative reasoning into a deterministic, reusable detector under an input policy that excludes labels and phase names. The analyst uses deterministic analytical tools (rule-frequency aggregation, record sampling, full-text inspection, MITRE ATT&CK lookup) once to identify the attack phase of an anonymised log subset; the chat log is then distilled into a three-node JSON workflow (aggregate, sample, classify) that runs on new scenarios with one parameter. On eight AIT-ADS scenarios and a 136-cell matrix totalling 2.6 million Wazuh records, DDW achieves macro-F1 = 0.971 on the two demonstrable phases (dirb, wpscan), cell accuracy 99.3%, and Wilson 95% CI [67.6%, 100.0%]. On the same two phases Sigma reaches 0.733 and ReAct reaches 0.899; a within-method ablation isolates the distillation pathway as decisive (agent-loop 0.971 vs single-shot 0.322). DDW also surfaces demonstrability as a dataset-level signal property: phases on which demonstration fails are the same phases on which both reference baselines underperform.
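
The abstract does not say which counts underlie the reported interval and cell accuracy. As a rough sanity check, and not a statement of the thesis's actual computation, the sketch below evaluates the standard Wilson score interval under two assumptions: that the interval refers to 8 successes out of the 8 scenarios, and that the cell accuracy corresponds to 135 correct cells out of 136. The function name `wilson_interval` and both counts are illustrative assumptions.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.959964) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z defaults to the 95% level)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p_hat = successes / trials
    z2 = z * z
    denom = 1.0 + z2 / trials
    centre = (p_hat + z2 / (2 * trials)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / trials + z2 / (4 * trials**2)) / denom
    # Clamp to [0, 1] to absorb floating-point overshoot at the boundaries.
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


# Assumption (not stated in the abstract): 8 correct out of 8 scenarios.
low, high = wilson_interval(8, 8)
print(f"Wilson 95% CI: [{low:.1%}, {high:.1%}]")  # [67.6%, 100.0%]

# Assumption (not stated in the abstract): 135 correct cells out of 136.
cell_accuracy = 135 / 136
print(f"Cell accuracy: {cell_accuracy:.1%}")  # 99.3%
```

Under these assumed counts the output matches the figures quoted in the abstract, which suggests the wide interval reflects the small number of scenarios rather than the number of cells or records.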

Keywords

Security Operations Center, SIEM, AIT-ADS, Workflow Generation, Programming by Demonstration, Large Language Models, Empirical Evaluation
