"A defender who practices network security monitoring has accepted that prevention will eventually fail, and has decided to invest in the ability to detect intrusions quickly and respond effectively. The measure of success is not whether intrusions occur; it is how fast you find them." -- Richard Bejtlich, The Practice of Network Security Monitoring, Ch 1
Lecture (100 min -- first of two NSM weeks)
6.1 NET-201 NSM and the Scale Gap
NET-201's NSM module introduced the core tools: Suricata (signature-based IDS/IPS), Zeek (protocol-aware log generation), and the Bejtlich four-data-types framework (full-content, session, statistical, alert data). A Belt-4 student who completes NET-201 can write Suricata signatures and Zeek scripts and run them against a single sensor monitoring a small network.
The scale gap becomes apparent when the network has:
- Multiple sites, each requiring a sensor
- Traffic volume that exceeds a single sensor's packet-capture capacity
- Log volume that requires a pipeline to aggregate, normalize, correlate, and store
- Multiple analysts who need concurrent access to query historical and live data
Week 6 addresses the scale gap. The architecture goal is a deployment where sensor failures do not create blind spots, log ingestion keeps up with traffic volume, and the SIEM provides a single query surface across all sensors.
6.2 Bejtlich's Four-Data-Types Framework at Scale
The four data types from The Practice of Network Security Monitoring remain the organizing principle at scale:
| Data type | What it is | NSM tool | Scale concern |
|---|---|---|---|
| Full content | Complete packet capture | Suricata pcap logging (`pcap-log` output) | Storage; capture rate limits at high throughput |
| Session data | Connection metadata | Zeek conn.log | Log volume; need streaming pipeline |
| Statistical data | Aggregate traffic metrics | NetFlow/IPFIX; Zeek weird.log | Summarization vs retention trade-off |
| Alert data | Signature-matched events | Suricata alert in EVE JSON | False-positive noise; correlation required |
At scale, the challenge is not generating these data types (Suricata + Zeek handle that at single-sensor level) but delivering them to a unified query surface reliably, at volume, with retention policies that match the organization's threat-hunting window.
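The retention side of that trade-off can be estimated with simple arithmetic. The sketch below is a rough sizing aid, not a capacity-planning tool: the function name, the 1 Gbit/s average, and the 7-day window are illustrative assumptions, and it ignores pcap per-packet headers and compression.

```python
def pcap_storage_bytes(avg_bits_per_second: float, retention_days: float) -> float:
    """Raw bytes needed to keep full-content capture for the retention window.

    Assumption: sustained average throughput; pcap record overhead and
    compression are ignored.
    """
    seconds = retention_days * 86_400  # seconds per day
    return avg_bits_per_second / 8 * seconds


if __name__ == "__main__":
    # Hypothetical site: 1 Gbit/s sustained average, 7-day threat-hunting window.
    needed = pcap_storage_bytes(1e9, 7)
    print(f"{needed / 1e12:.1f} TB")  # prints "75.6 TB"
```

Session and alert data are far smaller per connection, which is why many deployments keep them much longer than full-content capture.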
6.3 Suricata at Scale: AF_PACKET Cluster Mode
Single-threaded Suricata runs one capture thread. A 10G interface at full load with minimum-size (64-byte) frames carries approximately 14.88 million packets per second; a single thread can process roughly 1-2 million, depending on hardware and ruleset. Suricata's cluster mode closes this gap via AF_PACKET in a cluster hash model:
AF_PACKET cluster hash: the kernel hashes each packet's flow tuple and delivers every packet of a given flow to the same ring. Suricata runs multiple capture threads, each bound to a different ring. Because all packets in a flow reach the same thread, the per-flow state needed for TCP stream reassembly stays consistent.
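The property that matters is that both directions of a connection land on the same ring. The following sketch illustrates that idea only; it is not the kernel's hash algorithm. The function name `flow_ring` and the use of CRC32 are assumptions chosen for a deterministic, self-contained example.

```python
import zlib


def flow_ring(src_ip: str, src_port: int, dst_ip: str, dst_port: int,
              proto: str, n_rings: int) -> int:
    """Map a packet's 5-tuple to a ring index, identically for both directions.

    Illustrative only: sorting the two endpoints makes the key symmetric,
    so client->server and server->client packets share a ring.
    """
    a, b = sorted([(src_ip, src_port), (dst_ip, dst_port)])
    key = f"{proto}|{a[0]}:{a[1]}|{b[0]}:{b[1]}".encode()
    return zlib.crc32(key) % n_rings


if __name__ == "__main__":
    forward = flow_ring("10.0.0.5", 51000, "203.0.113.7", 443, "tcp", 8)
    reverse = flow_ring("203.0.113.7", 443, "10.0.0.5", 51000, "tcp", 8)
    print(forward == reverse)  # both directions share a ring
```

A hash that was not symmetric would split a TCP conversation across two threads, and neither thread could reassemble the stream.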
# suricata.yaml: multi-threaded AF_PACKET cluster
af-packet:
  - interface: eth0
    threads: auto # one thread per CPU core
    cluster-id: 99
    cluster-type: cluster_flow # 5-tuple hash
    defrag: yes
    use-mmap: yes
    ring-size: 200000
    block-size: 32768
    buffer-size: 128mb
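To judge how many capture threads a link needs, the 14.88 million packets-per-second figure can be reproduced from Ethernet framing. This is a worked-arithmetic sketch: the function names are hypothetical, and the per-thread rates passed in are the rough 1-2 million estimate from this section, not measured values.

```python
import math

PREAMBLE_AND_SFD_BYTES = 8   # sent on the wire before every Ethernet frame
INTERFRAME_GAP_BYTES = 12    # mandatory idle time between frames


def max_packets_per_second(link_bps: float, frame_bytes: int = 64) -> float:
    """Theoretical packet rate of a saturated link for a given frame size."""
    wire_bits = (frame_bytes + PREAMBLE_AND_SFD_BYTES + INTERFRAME_GAP_BYTES) * 8
    return link_bps / wire_bits


def threads_needed(pps: float, per_thread_pps: float) -> int:
    """Capture threads required if each thread sustains per_thread_pps."""
    return math.ceil(pps / per_thread_pps)


if __name__ == "__main__":
    worst_case = max_packets_per_second(10e9)           # 10G, 64-byte frames
    print(f"{worst_case / 1e6:.2f} Mpps")               # prints "14.88 Mpps"
    print(threads_needed(worst_case, 2e6))              # prints 8
    print(threads_needed(worst_case, 1e6))              # prints 15
```

Real traffic has a much larger average frame size, so the worst-case figure is an upper bound rather than a typical load.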
Suricata's EVE JSON: Suricata's unified log format. All events (alerts, DNS, HTTP, TLS, SMTP, file info, etc.) are emitted as JSON to a configurable output. In a clustered deployment, EVE JSON goes to a log shipper (Filebeat, Fluent Bit) for downstream aggregation.
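Because EVE JSON is one JSON object per line, downstream consumers can filter it with very little code. The sketch below pulls alert events out of a stream of EVE lines. The field names (`event_type`, `src_ip`, `dest_ip`, `alert.signature`, `alert.signature_id`) follow the EVE format; the sample records and the helper name `iter_alerts` are made up for illustration.

```python
import json
from typing import Iterable, Iterator


def iter_alerts(lines: Iterable[str]) -> Iterator[dict]:
    """Yield a flattened summary for each EVE record whose event_type is 'alert'."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip partially written lines
        if event.get("event_type") != "alert":
            continue
        alert = event.get("alert", {})
        yield {
            "timestamp": event.get("timestamp"),
            "src_ip": event.get("src_ip"),
            "dest_ip": event.get("dest_ip"),
            "signature": alert.get("signature"),
            "signature_id": alert.get("signature_id"),
        }


# Fabricated sample records, shaped like EVE output, for demonstration only.
SAMPLE = [
    '{"timestamp": "2024-01-01T00:00:00.000000+0000", "event_type": "dns", '
    '"src_ip": "10.0.0.5", "dest_ip": "10.0.0.1"}',
    '{"timestamp": "2024-01-01T00:00:01.000000+0000", "event_type": "alert", '
    '"src_ip": "10.0.0.5", "dest_ip": "203.0.113.7", '
    '"alert": {"signature_id": 1000001, "signature": "EXAMPLE local rule"}}',
]

if __name__ == "__main__":
    for summary in iter_alerts(SAMPLE):
        print(summary)
```

In a deployment this filtering would normally happen in the log shipper or the SIEM; the sketch only shows how little structure a consumer has to assume.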
Multi-sensor coordination: in a multi-site deployment, each site has its own Suricata cluster. Rule updates must be synchronized across all clusters; the standard tool is Suricata-Update, which pulls from configured sources (for example Emerging Threats Open, Proofpoint's ET Pro, and local rulesets) and generates a merged rule file. A Git repository + CI pipeline (same automation discipline as Week 4) handles synchronization.
6.4 Zeek at Scale: Log-Pipeline Integration
Zeek generates structured logs (TSV or JSON format) for every protocol it observes: conn.log (all connections), dns.log, http.log, ssl.log, files.log, x509.log, and custom logs from Zeek scripts. In a production deployment, Zeek's log volume requires a log-aggregation pipeline.
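Zeek's TSV format carries its own schema in `#fields` header lines, so a pipeline stage can turn rows into dictionaries without hard-coding column positions. The parser below is a minimal sketch: it handles the `#fields` header and the `-` unset marker, and ignores the other header lines. The sample log content is fabricated, though the column names are standard conn.log fields.

```python
from typing import Iterable, Iterator, Optional


def parse_zeek_tsv(lines: Iterable[str]) -> Iterator[dict[str, Optional[str]]]:
    """Yield one dict per data row of a Zeek TSV log.

    Values stay as strings; Zeek's unset marker '-' becomes None.
    """
    fields: list[str] = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#fields"):
            fields = line.split("\t")[1:]   # column names follow the tag
        elif line and not line.startswith("#"):
            values = line.split("\t")
            yield {k: (None if v == "-" else v) for k, v in zip(fields, values)}


# Fabricated two-row conn.log excerpt for demonstration.
SAMPLE_CONN_LOG = [
    "#separator \\x09",
    "#fields\tts\tuid\tid.orig_h\tid.orig_p\tid.resp_h\tid.resp_p\tproto\tduration",
    "1704067200.000000\tCabc123\t10.0.0.5\t51000\t203.0.113.7\t443\ttcp\t1.5",
    "1704067260.000000\tCdef456\t10.0.0.5\t51001\t203.0.113.7\t443\ttcp\t-",
    "#close\t2024-01-01-00-05-00",
]

if __name__ == "__main__":
    for row in parse_zeek_tsv(SAMPLE_CONN_LOG):
        print(row["id.orig_h"], row["id.resp_h"], row["duration"])
```

When Zeek is configured for JSON output, this step is unnecessary: each line is already a self-describing object, which is one reason JSON is the common choice for shipped logs.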
Log-shipping pipeline:
Zeek sensor → Filebeat → (Kafka/Redis message bus) → Logstash/Vector → Elasticsearch
Or the simpler path for smaller deployments:
Zeek sensor → Filebeat → Elasticsearch (ingest pipelines)
Kafka as the backbone: in large-scale NSM deployments (Security Onion at scale, the CERN NSMD deployment), Kafka serves as the log bus. Multiple Zeek sensors write to Kafka topics; multiple consumers (Elasticsearch indexer, custom threat-hunting scripts, alert correlation engine) consume from the same topics. This decouples sensor throughput from consumer processing speed.
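The decoupling comes from each consumer tracking its own read position in an append-only log. The class below is an in-memory stand-in that illustrates the idea; it is not the Kafka client API, and the names `TopicLog`, `poll`, and `lag` are hypothetical.

```python
class TopicLog:
    """Toy model of one topic partition: an append-only list plus per-consumer offsets."""

    def __init__(self) -> None:
        self._records: list[str] = []
        self._offsets: dict[str, int] = {}

    def append(self, record: str) -> None:
        """Producer side: sensors only ever append."""
        self._records.append(record)

    def poll(self, consumer: str, max_records: int = 10) -> list[str]:
        """Return the next batch for this consumer and advance only its offset."""
        start = self._offsets.get(consumer, 0)
        batch = self._records[start:start + max_records]
        self._offsets[consumer] = start + len(batch)
        return batch

    def lag(self, consumer: str) -> int:
        """Records written but not yet read by this consumer."""
        return len(self._records) - self._offsets.get(consumer, 0)


if __name__ == "__main__":
    topic = TopicLog()
    for i in range(5):
        topic.append(f"conn-record-{i}")        # sensor keeps writing

    topic.poll("es-indexer", max_records=5)      # fast consumer catches up
    topic.poll("hunting-script", max_records=2)  # slow consumer reads less
    print(topic.lag("es-indexer"), topic.lag("hunting-script"))  # prints "0 3"
```

A slow consumer accumulates lag without slowing the sensors or the other consumers, which is the behavior the message bus is there to provide.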
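Zeek cluster mode: for a single sensor monitoring a high-throughput link (10G+), Zeek itself can be clustered on a single machine:
- One manager process: coordinates the cluster and aggregates state from workers
- One logger process: optional; receives logs from workers and writes them to disk (the manager does this when no logger is configured)
- One proxy (more if needed): offloads shared state and inter-worker communication
- Multiple worker processes: each analyzes a subset of traffic, split per flow by the configured load-balancing method rather than by the manager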
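# node.cfg (read by zeekctl) for a single-machine cluster
[manager]
type=manager
host=localhost
[logger]
type=logger
host=localhost
[proxy-1]
type=proxy
host=localhost
# One worker entry with lb_procs=2 starts two load-balanced worker
# processes on eth0. Comments go on their own lines: an inline comment
# after a value may be read as part of the value.
# See the ZeekControl documentation for the supported lb_method values.
[worker-1]
type=worker
host=localhost
interface=eth0
lb_method=pf_ring
lb_procs=2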
6.5 Architecture Comparison Sidebar: Snort vs Suricata vs Zeek
| Property | Snort | Suricata | Zeek |
|---|---|---|---|
| Primary paradigm | Signature-based IDS | Signature-based IDS/IPS + protocol dissection | Protocol-aware log generation + scripting |
| Rule language | Snort rules (original) | Snort rules + Suricata extensions | Zeek scripts (Turing-complete) |
| Multi-threading | Limited (Snort 2 is single-threaded; Snort 3 is multi-threaded) | Native; AF_PACKET cluster | Multi-process cluster (each worker is single-threaded) |
| Protocol dissection | Limited | Yes (application layer: TLS, HTTP, DNS, etc.) | Extensive (40+ protocols; byte-level) |
| Output format | Text logs | EVE JSON (unified) | TSV / JSON (per-protocol logs) |
| Community corpus | ET rules (partially; Suricata now primary) | ET Open + ET Pro (Proofpoint); community | Zeek Package Manager (dozens of community scripts) |
| Alert noise | High without tuning | High without tuning | Lower (behavior-based; not signature-based) |
| Primary use at scale | Legacy deployments; SOAR integration | IDS/IPS + EVE-JSON pipeline | Protocol telemetry + threat-hunting |
| Open source | Yes (GPL) | Yes (GPL2) | Yes (BSD) |
The practitioner's deployment pattern: Suricata handles signature-based alerting and flow-level protocol extraction; Zeek handles deep protocol logging and custom behavioral detection via scripts. They are complementary tools, not alternatives. In the NSM corpus (the academy's four threat scenarios), Zeek's logs (conn.log, dns.log, http.log) surface the anomalous patterns; Suricata's alert events confirm the specific signatures.
6.6 SIEM Integration: Wazuh and Elasticsearch/Kibana
A SIEM (Security Information and Event Management) system provides:
- Log ingestion from heterogeneous sources
- Index and search interface
- Correlation rules: "if alert A occurs within 5 minutes of connection B from the same source IP, generate incident C"
- Dashboards and reporting
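The correlation rule quoted in the list above can be sketched in a few lines. This is an illustration of the logic, not how any particular SIEM implements it: the record shapes (`src_ip`, `ts`, `signature`, `dest_ip`) and the function name `correlate` are assumptions, and the sample events are fabricated.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=5)


def correlate(alerts: list[dict], connections: list[dict],
              window: timedelta = WINDOW) -> list[dict]:
    """Emit an incident for each alert/connection pair that shares a
    source IP and falls within the time window."""
    by_src: dict[str, list[dict]] = {}
    for conn in connections:
        by_src.setdefault(conn["src_ip"], []).append(conn)

    incidents = []
    for alert in alerts:
        for conn in by_src.get(alert["src_ip"], []):
            if abs(alert["ts"] - conn["ts"]) <= window:
                incidents.append({
                    "src_ip": alert["src_ip"],
                    "signature": alert["signature"],
                    "conn_dest_ip": conn["dest_ip"],
                })
    return incidents


if __name__ == "__main__":
    alerts = [{"src_ip": "10.0.0.5", "ts": datetime(2024, 1, 1, 12, 3),
               "signature": "EXAMPLE beacon rule"}]
    connections = [
        {"src_ip": "10.0.0.5", "ts": datetime(2024, 1, 1, 12, 0), "dest_ip": "203.0.113.7"},
        {"src_ip": "10.0.0.5", "ts": datetime(2024, 1, 1, 11, 0), "dest_ip": "198.51.100.9"},
    ]
    print(correlate(alerts, connections))  # only the 12:00 connection matches
```

Indexing connections by source IP first keeps the comparison from becoming a full cross product of alerts and connections, which matters once volumes grow.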
Wazuh + OpenSearch stack:
Wazuh (open source; cloud and self-hosted) ingests Suricata EVE JSON and Zeek logs via Filebeat shippers. Wazuh active response rules can trigger automated actions (block IP in Suricata rules, add firewall entry) based on correlated events.
Elasticsearch + Kibana + Security Onion: the academy NSM corpus labs use this stack. Security Onion bundles Suricata, Zeek, Elasticsearch, Kibana, and the Kibana network-monitoring dashboards into a single deployment.
Key correlation patterns for the academy corpus:
| Threat scenario | Zeek log signature | Suricata alert | Correlation rule |
|---|---|---|---|
| C2 beaconing | conn.log: periodic connections same dst:port; duration pattern | ET MALWARE Beacon sig | Same source IP; periodic interval < 5% jitter |
| DNS exfiltration | dns.log: unusually long QNAME; high entropy labels; TXT record bursts | ET DNS Exfil query sig | DNS query rate + entropy score threshold |
| Lateral movement | conn.log: horizontal sweep from internal host; SMB/WinRM | ET POLICY SMB scan | Internal-to-internal scan from single source |
| Credential harvest | http.log: POST to /login; high 401 rate | ET BRUTE force sig | HTTP 401 rate > threshold from source |
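Two of the correlation inputs in the table, beacon jitter and DNS label entropy, reduce to short calculations. The sketch below shows one plausible way to compute each. Defining jitter as the standard deviation of inter-arrival gaps divided by their mean is an assumption, as are the function names; the academy corpus may define its thresholds differently.

```python
import math
from collections import Counter
from statistics import mean, pstdev
from typing import Optional


def interval_jitter(timestamps: list[float]) -> Optional[float]:
    """Relative jitter of connection start times: stdev(gaps) / mean(gaps).

    0.0 means perfectly periodic. Returns None when there are too few
    connections to judge.
    """
    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    if len(gaps) < 2:
        return None
    average = mean(gaps)
    if average == 0:
        return None
    return pstdev(gaps) / average


def shannon_entropy(label: str) -> float:
    """Shannon entropy in bits per character of a DNS label."""
    if not label:
        return 0.0
    total = len(label)
    counts = Counter(label)
    return -sum(n / total * math.log2(n / total) for n in counts.values())


if __name__ == "__main__":
    beacon_like = [0.0, 60.0, 120.0, 180.0, 240.0]   # every 60 s exactly
    jitter = interval_jitter(beacon_like)
    print(jitter is not None and jitter < 0.05)      # within the 5% threshold
    print(shannon_entropy("abcd"))                   # 2.0 bits per character
```

Neither score is a detection on its own: legitimate software also polls on fixed intervals, and CDN hostnames can have high-entropy labels, which is why the table pairs each score with a second signal.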
Bejtlich Weave
Bejtlich's Ch 5 (NSM in Practice) describes the analyst's daily rhythm: review alerts, pivot to sessions, pivot to full-content captures when sessions are anomalous. The workflow itself carries over to large deployments; performing it by hand does not, because the alert queue overwhelms an analyst working manually. The SIEM's correlation rules automate the first-pass triage, escalating only the events that cross multiple detection layers simultaneously. The four-data-types framework remains the analyst's mental model.
Lab 7 Introduction
Lab 7 deploys a three-node Suricata cluster and a Zeek sensor against the academy NSM corpus. The EVE JSON and Zeek logs are shipped via Filebeat into a Wazuh (or Elasticsearch) deployment. You will write a Wazuh rule that correlates the C2 beaconing pattern (Suricata alert + Zeek conn.log periodic-connection signature) and verify it triggers on the academy c2-beacon.pcap. A dashboard query will show the four-data-types coverage of the corpus across all threat scenarios.
Independent Practice (6 hr)
- Bejtlich Practice of Network Security Monitoring Ch 1, 5, 6 -- foundational reading for Belt-5 NSM
- Suricata documentation: suricata.readthedocs.io -- "Performance" section (AF_PACKET, multi-threading)
- Zeek documentation: docs.zeek.org -- "Cluster Configuration" section
- Lab 7 -- Part A (Suricata + Zeek sensor deployment)
- Supplemental: Security Onion documentation securityonionsolutions.com/documentation -- architecture overview; 30 min