Streamlining SOC Analyst Diagnosis Through Workflow Extraction from Demonstrations
Abstract
Security Operations Center (SOC) analysts triage SIEM alerts and determine which alerts belong to which kind of attack. This step is a bottleneck in incident response because alert volume far exceeds what analysts can read. Hand-authored detection rules are brittle; single-shot Large Language Model (LLM) calls are non-deterministic and unauditable. This thesis introduces demonstration-driven workflow generation (DDW): a four-step pipeline (demonstration → chat log → distillation → executable workflow) that converts one analyst's investigative reasoning into a deterministic, reusable detector under an input policy that excludes labels and phase names. The analyst uses deterministic analytical tools (rule-frequency aggregation, record sampling, full-text inspection, MITRE ATT&CK lookup) to identify the attack phase of an anonymised log subset once; the chat log is then distilled into a three-node JSON workflow (aggregate, sample, classify) that runs on new scenarios with one parameter. On eight AIT-ADS scenarios and a 136-cell matrix totalling 2.6 million Wazuh records, DDW achieves macro-F1 = 0.971 on the two demonstrable phases (dirb, wpscan), a cell accuracy of 99.3%, and a Wilson 95% CI of [67.6%, 100.0%]. On the same two phases, Sigma reaches 0.733 and ReAct reaches 0.899; a within-method ablation isolates the distillation pathway as decisive (agent-loop 0.971 vs single-shot 0.322). DDW also surfaces demonstrability as a dataset-level signal property: phases on which demonstration fails are the same phases on which both reference baselines underperform.
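The abstract reports a Wilson 95% confidence interval but does not say which count it is computed over. As a minimal sketch, the standard Wilson score interval for a binomial proportion can be computed as below. The function name `wilson_interval` and the choice of 8 successes out of 8 trials are assumptions made for illustration: that input is one reading that reproduces the reported bounds, not a statement of how the thesis derived them.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, trials: int, z: float = 1.959964) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion.

    z defaults to the two-sided 95% normal quantile.
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    if not 0 <= successes <= trials:
        raise ValueError("successes must lie between 0 and trials")

    p_hat = successes / trials
    z2 = z * z
    denom = 1.0 + z2 / trials
    centre = (p_hat + z2 / (2.0 * trials)) / denom
    half_width = z * math.sqrt(
        p_hat * (1.0 - p_hat) / trials + z2 / (4.0 * trials * trials)
    ) / denom

    # Clamp to [0, 1] to absorb floating-point rounding at the boundaries.
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


# Assumed input: 8 correct outcomes out of 8 trials (hypothetical reading).
low, high = wilson_interval(8, 8)
print(f"{low:.1%} to {high:.1%}")
```

Under that assumed input the sketch prints `67.6% to 100.0%`. The wide interval reflects the small number of trials: even a perfect score on eight trials leaves the lower bound well below the point estimate.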