"eBPF does to Linux what JavaScript does to the browser: it makes the kernel programmable without changing kernel source code or loading kernel modules. It is the most significant change to the Linux kernel infrastructure in the last decade." -- Liz Rice, Learning eBPF, O'Reilly, 2023
Lecture (100 min)
5.1 The Line-Rate Problem
Modern datacenter servers have 10G, 25G, or 100G network interfaces. A packet-processing application that receives packets in user space through the conventional socket API can handle roughly 1-2 million packets per second on a single CPU core. At 10G with 64-byte minimum-size frames, the line rate is approximately 14.88 million packets per second. Socket-based user-space processing cannot keep up.
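The line-rate figure follows from simple arithmetic, sketched below. The sketch assumes standard Ethernet framing, where each frame occupies an extra 20 bytes on the wire (7-byte preamble, 1-byte start-of-frame delimiter, 12-byte inter-frame gap). The function names are illustrative and not part of any library.

```c
#include <assert.h>

/* Extra bytes each frame occupies on the wire:
 * 7 preamble + 1 start-of-frame delimiter + 12 inter-frame gap. */
#define ETH_WIRE_OVERHEAD_BYTES 20

/* Packets per second at line rate for a given link speed and frame size. */
static double line_rate_pps(double link_bps, unsigned frame_bytes) {
    assert(frame_bytes >= 64); /* minimum Ethernet frame, including FCS */
    return link_bps / ((frame_bytes + ETH_WIRE_OVERHEAD_BYTES) * 8.0);
}

/* Time available to process one packet, in nanoseconds, at that rate. */
static double ns_per_packet(double link_bps, unsigned frame_bytes) {
    return 1e9 / line_rate_pps(link_bps, frame_bytes);
}
```

For a 10G link and 64-byte frames, `line_rate_pps(10e9, 64)` evaluates to about 14.88 million packets per second and `ns_per_packet(10e9, 64)` to about 67.2 ns, which is roughly 200 cycles on a 3 GHz core. That budget is what the rest of the chapter is fighting for.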
Three architectural solutions have emerged, each making a different trade-off between flexibility and performance:
- eBPF/XDP -- kernel-space programmability with verifier safety; integrated with the kernel networking stack; moderate complexity
- DPDK (Data Plane Development Kit) -- user-space packet processing via kernel bypass; maximum performance; highest complexity and operational cost
- Smart NICs (ASIC / FPGA) -- hardware offload; vendor-specific; out of scope for NET-301
This chapter covers eBPF/XDP in depth and DPDK at a conceptual level sufficient to understand the trade-off.
5.2 eBPF: Architecture and Safety Model
eBPF (extended Berkeley Packet Filter, formerly BPF) is a subsystem of the Linux kernel that allows user-space programs to inject small programs into the kernel. These programs run in response to kernel events: packet arrival, system call entry/exit, kprobes (kernel function entry points), uprobes (user-space function entry points), tracepoints.
The eBPF execution model:
- A developer writes an eBPF program in C (or Rust; front ends such as BCC wrap kernel-side C in a Python script)
- The program is compiled to eBPF bytecode by clang/LLVM (-target bpf)
- The bytecode is loaded into the kernel via the bpf() system call
- The kernel's eBPF verifier checks the program: is it safe? No infinite loops (bounded loops only), no invalid memory access, no kernel data corruption. The verifier rejects programs that fail these checks.
- The JIT compiler compiles the verified bytecode to native machine code (x86-64, ARM64, etc.)
- The program runs in the kernel at the specified attach point
eBPF program types:
| Type | Attach point | Use cases |
|---|---|---|
| XDP | Network driver (NIC RX path) | Packet filtering, DDoS mitigation, load balancing at line rate |
| TC (Traffic Control) | Linux TC ingress/egress | Packet modification, QoS, service routing |
| Socket filter | Socket-level packet filtering | Per-application packet inspection |
| kprobe/kretprobe | Kernel function entry/exit | Performance tracing, security monitoring |
| tracepoint | Kernel tracepoints (stable ABI) | System-call monitoring, scheduler observation |
| uprobe/uretprobe | User-space function entry/exit | Application performance, argument inspection |
| LSM (Linux Security Module) | Security hooks | Policy enforcement; one of the hook types used by Tetragon |
eBPF maps: the primary data structure for sharing state between eBPF programs and user space, or between multiple eBPF programs. Types include hash maps, array maps, ring buffers (for streaming events to user space), and LRU maps (for connection state).
// eBPF map definition (in kernel-space C)
struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __uint(max_entries, 65536);
    __type(key, __u32);   // source IP
    __type(value, __u64); // packet count
} packet_counter SEC(".maps");
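A common pitfall with a map keyed by source IP is byte order. The sketch below shows one way user space might build a key for `packet_counter`. It assumes the kernel-side program stores `ip->saddr` unchanged, which is in network byte order. The helper name `ipv4_to_map_key` is hypothetical.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Convert dotted-quad text to a 32-bit map key in network byte order.
 * Returns 0 on success, -1 if the text is not a valid IPv4 address. */
static int ipv4_to_map_key(const char *text, uint32_t *key) {
    struct in_addr addr;
    if (inet_pton(AF_INET, text, &addr) != 1) return -1;
    /* s_addr is already in network byte order; copy it without swapping. */
    memcpy(key, &addr.s_addr, sizeof(*key));
    return 0;
}
```

A loader would then pass the address of the key to a map operation such as libbpf's `bpf_map_lookup_elem` or `bpf_map_update_elem`. If the kernel-side program converts addresses to host order before using them as keys, user space has to apply the same conversion.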
5.3 XDP: Express Data Path
XDP is an eBPF attach point at the earliest possible point in the network RX path -- before the kernel allocates an sk_buff (socket buffer), before any protocol processing. This is where the performance comes from: XDP programs run before most of the kernel networking stack overhead.
XDP execution modes:
| Mode | Where XDP runs | Performance | Requirements |
|---|---|---|---|
| Native XDP | Inside the NIC driver's receive path, before skb allocation | Highest on the host CPU (runs right after DMA, before skb allocation) | Driver support (mlx5, ice, i40e, virtio_net, etc.) |
| Generic XDP | After skb allocation | Slower (same path as TC) | Any network driver |
| Offloaded XDP | On the NIC hardware itself | Highest; consumes no host CPU cycles | Specialized NICs (Netronome, etc.) |
XDP verdict codes: an XDP program returns one of five verdicts:
| Verdict | Meaning |
|---|---|
| XDP_DROP | Drop the packet; never reaches the networking stack |
| XDP_PASS | Pass the packet to the normal kernel networking stack |
| XDP_TX | Transmit the packet back out the same interface |
| XDP_REDIRECT | Redirect to another interface or CPU/queue |
| XDP_ABORTED | Error; packet dropped and the xdp:xdp_exception tracepoint fires |
XDP_DROP performance: dropping packets in XDP is approximately 10x faster than dropping them with iptables, because iptables runs much later in the packet-processing pipeline (after skb allocation, connection tracking, netfilter traversal). A production DDoS mitigation system using XDP_DROP can discard 14.88 Mpps (10G line rate) on a single core.
5.4 Writing an XDP Program
A minimal XDP program that drops all UDP packets on port 4444:
// xdp_drop_udp_4444.c
#include <linux/bpf.h>
#include <linux/if_ether.h>
#include <linux/in.h>       // IPPROTO_UDP
#include <linux/ip.h>
#include <linux/udp.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_endian.h> // bpf_htons

SEC("xdp")
int xdp_drop_udp_4444(struct xdp_md *ctx) {
    void *data_end = (void *)(long)ctx->data_end;
    void *data = (void *)(long)ctx->data;

    // Ethernet header bounds check
    struct ethhdr *eth = data;
    if ((void *)(eth + 1) > data_end) return XDP_PASS;
    if (eth->h_proto != bpf_htons(ETH_P_IP)) return XDP_PASS;

    // IP header bounds check (minimum 20-byte header)
    struct iphdr *ip = (void *)(eth + 1);
    if ((void *)(ip + 1) > data_end) return XDP_PASS;
    if (ip->protocol != IPPROTO_UDP) return XDP_PASS;

    // The IP header may carry options: use IHL (in 32-bit words), not sizeof
    int ip_hdr_len = ip->ihl * 4;
    if (ip_hdr_len < (int)sizeof(*ip)) return XDP_PASS;

    // UDP header bounds check
    struct udphdr *udp = (void *)ip + ip_hdr_len;
    if ((void *)(udp + 1) > data_end) return XDP_PASS;

    // Drop if destination port is 4444
    if (udp->dest == bpf_htons(4444)) return XDP_DROP;
    return XDP_PASS;
}

char _license[] SEC("license") = "GPL";
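The parsing logic can be exercised off the kernel before it is loaded. The sketch below is a hypothetical user-space twin of the program above, not part of the lab materials. It applies the same order of bounds checks to a plain byte buffer, and it reads the IPv4 IHL field so that headers carrying options are parsed at the right offset. The names `classify_frame` and `build_udp_frame` and the constants are illustrative. The two verdict values mirror `XDP_DROP` and `XDP_PASS` in `<linux/bpf.h>`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local mirror of two values from enum xdp_action in <linux/bpf.h>. */
enum verdict { VERDICT_DROP = 1, VERDICT_PASS = 2 };

#define ETH_HLEN_BYTES  14
#define IPV4_MIN_HLEN   20
#define UDP_HLEN_BYTES  8
#define ETHERTYPE_IPV4  0x0800
#define IP_PROTO_UDP    17

/* Read a big-endian 16-bit field one byte at a time. */
static uint16_t read_be16(const uint8_t *p) {
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Same decision as the XDP program, on a [data, data_end) byte range. */
static enum verdict classify_frame(const uint8_t *data, const uint8_t *data_end,
                                   uint16_t blocked_port) {
    size_t len = (size_t)(data_end - data);

    /* Ethernet header bounds check, then EtherType (bytes 12-13). */
    if (len < ETH_HLEN_BYTES) return VERDICT_PASS;
    if (read_be16(data + 12) != ETHERTYPE_IPV4) return VERDICT_PASS;

    /* Minimum IPv4 header bounds check, then protocol (byte 9). */
    if (len < ETH_HLEN_BYTES + IPV4_MIN_HLEN) return VERDICT_PASS;
    const uint8_t *ip = data + ETH_HLEN_BYTES;
    if (ip[9] != IP_PROTO_UDP) return VERDICT_PASS;

    /* IHL is the low nibble of byte 0, counted in 32-bit words. */
    size_t ip_hdr_len = (size_t)(ip[0] & 0x0F) * 4;
    if (ip_hdr_len < IPV4_MIN_HLEN) return VERDICT_PASS;

    /* UDP header bounds check, then destination port (bytes 2-3). */
    size_t udp_off = ETH_HLEN_BYTES + ip_hdr_len;
    if (udp_off + UDP_HLEN_BYTES > len) return VERDICT_PASS;
    if (read_be16(data + udp_off + 2) == blocked_port) return VERDICT_DROP;
    return VERDICT_PASS;
}

/* Build a zeroed Ethernet/IPv4/UDP frame with opt_words words of IP options.
 * buf must hold at least 42 + 4 * opt_words bytes. Returns the frame length. */
static size_t build_udp_frame(uint8_t *buf, unsigned opt_words, uint16_t dport) {
    assert(opt_words <= 10); /* IHL is 4 bits, so at most 15 words in total */
    size_t ip_len = IPV4_MIN_HLEN + 4 * (size_t)opt_words;
    size_t total = ETH_HLEN_BYTES + ip_len + UDP_HLEN_BYTES;
    for (size_t i = 0; i < total; i++) buf[i] = 0;
    buf[12] = 0x08;                                      /* EtherType IPv4 */
    buf[13] = 0x00;
    buf[ETH_HLEN_BYTES] = (uint8_t)(0x40 | (5 + opt_words)); /* version, IHL */
    buf[ETH_HLEN_BYTES + 9] = IP_PROTO_UDP;
    buf[ETH_HLEN_BYTES + ip_len + 2] = (uint8_t)(dport >> 8);
    buf[ETH_HLEN_BYTES + ip_len + 3] = (uint8_t)(dport & 0xFF);
    return total;
}
```

A test driver can build frames with and without IP options, plus a truncated frame, and check that only well-formed UDP frames to the blocked port are classified as drops. A harness like this checks the parsing logic only. It does not replace the verifier, which still has to accept the kernel-side program.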
Compile and load:
# Compile to BPF bytecode
clang -O2 -target bpf -c xdp_drop_udp_4444.c -o xdp_drop_udp_4444.o
# Load onto interface eth0 in native mode (xdpdrv forces native; plain "xdp" lets the kernel fall back to generic)
ip link set dev eth0 xdpdrv obj xdp_drop_udp_4444.o sec xdp
# Verify it's loaded
bpftool net list dev eth0
# Detach
ip link set dev eth0 xdpdrv off
Measuring drop rate:
# Generate UDP traffic on port 4444
nping --udp -p 4444 --rate 1000 --count 10000 <target_ip>
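# Confirm the program is loaded and note its ID
bpftool prog show
# Observe drops; whether XDP drops appear in /proc/net/dev is driver-dependent
cat /proc/net/dev
ethtool -S eth0 | grep -i xdp # driver-specific XDP counters; names vary by driver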
5.5 Cilium: eBPF in Production at Kubernetes Scale
Cilium is the dominant production deployment of eBPF for Kubernetes networking. It replaces the traditional kube-proxy (which uses iptables) with eBPF programs that provide service load balancing, network policy enforcement, and observability at line rate.
Cilium architecture:
| Component | Role |
|---|---|
| cilium-agent | DaemonSet on each node; manages eBPF program lifecycle |
| Hubble | Observability layer; reads eBPF ring-buffer events; L3/L4/L7 flow logs |
| Hubble UI | Network topology visualization from eBPF-sourced flow data |
| Tetragon | Runtime security; attaches eBPF programs to kernel hooks (kprobes, tracepoints, LSM hooks); detects process/network anomalies |
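CiliumNetworkPolicy: extends Kubernetes NetworkPolicy to L7. A policy can allow HTTP GET to /api/v1/users but deny POST. L3/L4 rules are enforced in eBPF; for L7 rules, eBPF redirects matching traffic to a node-local Envoy proxy managed by Cilium, so no per-pod sidecar proxy is required.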
# CiliumNetworkPolicy: allow only GET /api/v1/*
apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
metadata:
  name: "api-allow-get"
spec:
  endpointSelector:
    matchLabels:
      app: api-server
  ingress:
  - toPorts:
    - ports:
      - port: "8080"
        protocol: TCP
      rules:
        http:
        - method: "GET"
          path: "/api/v1/.*"
5.6 Architecture Comparison Sidebar: eBPF/XDP vs DPDK vs Kernel Bypass
| Property | eBPF/XDP | DPDK | Kernel-bypass (RDMA/RDMA-like) |
|---|---|---|---|
| Where packets are processed | Kernel (before skb) | User space (PMD loop) | NIC hardware / RDMA |
| Kernel involvement | Yes (eBPF verifier + JIT) | No (kernel bypass) | None |
| Safety model | Verifier guarantees termination + memory safety | None; a bug typically crashes the user-space process, not the kernel | Hardware; can be very safe or very unsafe |
| Maximum throughput | ~100 Mpps (line rate for XDP_DROP) | ~200+ Mpps (measured; NIC-dependent) | Near-wire-rate; latency < 1us |
| Flexibility | High (eBPF C programs) | High (C code) | Limited (fixed acceleration functions) |
| Integration with Linux | Full (uses normal network stack for pass-through) | None (exclusive NIC access) | Limited (special drivers) |
| Security observability | Excellent (tracing programs can read kernel state through verifier-checked helpers) | Poor (bypasses kernel entirely) | None |
| Deployment complexity | Low (kernel 5.15+ required; bpftool + iproute2) | High (hugepages, CPU pinning, IOMMU setup) | Very high (specialized hardware + drivers) |
| Primary use cases | DDoS mitigation, service mesh, network security monitoring (NSM), service load balancing | Line-rate packet processing, telco VNF | HPC, financial trading, RDMA storage |
The security practitioner's summary: eBPF/XDP is the right choice for network security workloads because it integrates with the kernel's visibility surface. DPDK is the right choice for maximum throughput in workloads that do not need the kernel's security or observability infrastructure. DPDK's bypass of the kernel networking stack also bypasses any kernel-level security policy -- an important consideration when deploying DPDK-based VNFs in a carrier or datacenter.
Liz Rice / Learning eBPF Weave
Liz Rice's Learning eBPF is the most accessible current reference for eBPF from first principles. The first three chapters cover the verifier, the JIT compiler, and the map abstraction. NET-301 Week 5 follows Rice's framing for the program-type taxonomy and the safety model, then extends it to the security-specific XDP and Cilium/Tetragon use cases. Rice's central claim -- that eBPF makes the kernel programmable without the risk of kernel modules -- is the design property that makes eBPF the default observability and policy enforcement substrate for the modern Linux datacenter.
Lab 5 Introduction
Lab 5 runs on the lab Linux environment (kernel 5.15+). You will write, compile, load, and verify an XDP program that counts packets by protocol, using a BPF hash map to record counts and a user-space reader to display them. You will then extend the program to drop packets matching a configurable blacklist (source IPs specified in a BPF hash map), load the blacklist at runtime from user space without reloading the XDP program, and measure the per-packet latency and throughput versus an iptables-based drop rule implementing the same policy.
Independent Practice (5 hr)
- Rice Learning eBPF Ch 1-4 (BPF, eBPF program types, maps, and the verifier -- ~100 pages)
- Cilium documentation: docs.cilium.io -- read "Getting Started with Kubernetes (Kind)" + "Network Policy" sections
- Lab 5 -- Part A (XDP program compile + load + verify)
- Supplemental: bpftrace one-liners reference (github.com/iovisor/bpftrace/blob/master/docs/reference_guide.md); read §1-§4
- Supplemental: the DPDK "Performance Study of Packet Processing Frameworks" paper (Intel, publicly available; ~30 min); establishes the performance numbers cited in the sidebar