Lab 10: Bufferbloat Measurement, FQ-CoDel, and TCP BBR

Chapter: 10 (Performance)
Duration: 90 minutes
Tools: tc (Linux traffic control), Flent (with netperf/netserver), iperf3, ping, ss
Points: 10


Objectives

  1. Create an artificial bottleneck with a large buffer (simulating bufferbloat)
  2. Measure baseline latency-under-load (RRUL test) and confirm bufferbloat
  3. Apply FQ-CoDel to eliminate the bufferbloat
  4. Switch TCP congestion control from CUBIC to BBR and measure the effect
  5. Generate plots comparing all three configurations

Setup

# Verify tools
tc -h 2>&1 | head -3
iperf3 --version
flent --version
ping -V

# Identify your outbound interface (likely eth0 or ens3)
ip route get 8.8.8.8 | grep dev
IFACE=$(ip route get 8.8.8.8 | grep -oP 'dev \K\S+')
echo "Interface: $IFACE"

# Start iperf3 server in background for the iperf3 load tests
iperf3 -s -D -p 5201
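
The individual version checks above can also be collapsed into a single pass over the tool list. The following is a minimal sketch: the function name missing_tools is illustrative, and the list assumes that netserver (from the netperf package, which Flent drives) and ss are needed in addition to the tools named in the lab header.

```shell
# Print the name of every listed command that is not found on PATH.
# missing_tools is a hypothetical helper, not part of any lab tooling.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Example (an empty result means everything is installed):
#   missing_tools tc iperf3 flent ping netserver ss
```

Anything the function prints should be installed before starting Part 1.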

Part 1: Create Artificial Bottleneck (10 min)

Simulate a 10 Mbps bottleneck whose queue can hold up to 400 ms of traffic, which is enough to cause severe bufferbloat:

# Add Token Bucket Filter (TBF) qdisc: 10Mbps, 400ms buffer
sudo tc qdisc del dev $IFACE root 2>/dev/null   # remove any existing qdisc
sudo tc qdisc add dev $IFACE root handle 1: \
  tbf rate 10mbit burst 32kbit latency 400ms

# Verify
tc qdisc show dev $IFACE
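
To see why a 400 ms latency setting amounts to a large buffer, it helps to convert it to bytes. tc derives the TBF queue limit from roughly rate × latency plus the burst size; the sketch below applies that arithmetic. The function name is illustrative, and the 4096-byte burst is an assumed conversion of the 32kbit value used above.

```shell
# Approximate TBF queue limit in bytes: rate * latency + burst.
# Arguments: rate in Mbit/s, latency in ms, burst in bytes (all integers).
tbf_queue_bytes() {
  rate_mbit=$1; latency_ms=$2; burst_bytes=$3
  echo $(( rate_mbit * 1000000 / 8 * latency_ms / 1000 + burst_bytes ))
}

tbf_queue_bytes 10 400 4096   # values from the tbf command above
```

With these values the result is 504096 bytes, which is room for more than 300 full-size (1500-byte) packets waiting ahead of each new arrival.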

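Test baseline (with bottleneck, before AQM)

# Note: the qdisc shapes only packets leaving through $IFACE. Traffic to
# 127.0.0.1 uses lo, so if RTT does not rise under load, point iperf3 at a
# server that is reached through $IFACE instead.

# Idle RTT to the gateway (no load)
GATEWAY=$(ip route | awk '/^default/ {print $3; exit}')
ping -i 0.2 -c 50 $GATEWAY | tee /tmp/ping_baseline.txt

# RTT to the gateway during a 30-second iperf3 transfer
iperf3 -c 127.0.0.1 -t 30 &
ping -i 0.1 -c 300 $GATEWAY | tee /tmp/ping_bufferbloat.txt
wait   # let iperf3 finish before moving on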

Record:

  1. Baseline RTT to gateway (before starting iperf3): X ms
  2. RTT to gateway during the iperf3 transfer: Y ms
  3. RTT increase during load (Y - X): Z ms
  4. Does this indicate bufferbloat? (A 10x or greater increase indicates severe bufferbloat)
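
The X, Y, and ratio values can be read off the saved ping logs rather than by eye. The sketch below assumes the iputils summary line format ("rtt min/avg/max/mdev = ..."), and that /tmp/ping_baseline.txt holds the idle measurement; both function names are illustrative.

```shell
# Extract the average RTT (ms) from the summary line of a saved ping log.
# Assumes the iputils format: "rtt min/avg/max/mdev = a/b/c/d ms".
ping_avg() {
  awk -F'/' '/^rtt / { print $5 }' "$1"
}

# Ratio of loaded to idle RTT, printed with one decimal place.
rtt_ratio() {
  awk -v idle="$1" -v loaded="$2" 'BEGIN { printf "%.1f\n", loaded / idle }'
}

# Example (file names from the commands above):
#   rtt_ratio "$(ping_avg /tmp/ping_baseline.txt)" \
#             "$(ping_avg /tmp/ping_bufferbloat.txt)"
```

A ratio of 10.0 or more corresponds to the "severe bufferbloat" threshold in item 4.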

Part 2: Flent RRUL Test - Baseline (20 min)

RRUL (Real-time Response Under Load) saturates the link in both directions at once while measuring latency, which makes it a more realistic bufferbloat test than a single one-way transfer.

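# Flent drives netperf, so start a netserver instance (it daemonizes by default)
netserver -p 12865

# Re-create the bufferbloat qdisc so the baseline starts from a known state
sudo tc qdisc del dev $IFACE root 2>/dev/null
sudo tc qdisc add dev $IFACE root handle 1: tbf rate 10mbit burst 32kbit latency 400ms

# Run the Flent RRUL test against the bloated queue.
# -p selects the plot to draw; without it, -o receives a text summary, not an image.
# Flent also writes a .flent.gz data file to the current directory; a text
# summary can be produced from it with: flent -i <file>.flent.gz -f summary
flent rrul \
  -p all_scaled \
  -H 127.0.0.1 \
  -t "Baseline_Bufferbloat" \
  -l 30 \
  -o /tmp/rrul_baseline.png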

echo "Baseline test complete. Plot saved to /tmp/rrul_baseline.png"

Record from the Flent output:

  1. Average download throughput (Mbps)
  2. Average upload throughput (Mbps)
  3. Average ICMP RTT during load (ms) - this should be high (100-400ms) indicating bufferbloat
  4. Minimum ICMP RTT (ms)
  5. Maximum ICMP RTT (ms)

Part 3: Apply FQ-CoDel (20 min)

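# Keep the 10 Mbps TBF shaper and attach FQ-CoDel as its inner queue.
# (fq_codel alone at the root would remove the rate limit, and with it the bottleneck.)
sudo tc qdisc del dev $IFACE root 2>/dev/null
sudo tc qdisc add dev $IFACE root handle 1: tbf rate 10mbit burst 32kbit latency 400ms
sudo tc qdisc add dev $IFACE parent 1:1 handle 10: fq_codel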

# Verify
tc -s qdisc show dev $IFACE

# Run Flent RRUL again (-p selects the plot written to the -o file)
flent rrul \
  -p all_scaled \
  -H 127.0.0.1 \
  -t "FQ-CoDel" \
  -l 30 \
  -o /tmp/rrul_fqcodel.png

echo "FQ-CoDel test complete."

Record:

  1. Average ICMP RTT during load with FQ-CoDel (ms) - expected: much lower
  2. Throughput maintained? (Should be similar to baseline - FQ-CoDel does not reduce bandwidth)
  3. Maximum ICMP RTT with FQ-CoDel (ms)

RTT improvement: (baseline_avg_rtt - fqcodel_avg_rtt) / baseline_avg_rtt × 100%
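
The improvement formula can be evaluated directly in the shell. This is a small sketch with an illustrative function name; the 250 ms and 25 ms inputs are hypothetical placeholders, not expected results.

```shell
# Percentage RTT reduction relative to the baseline average.
# Arguments: baseline average RTT, FQ-CoDel average RTT (same unit).
rtt_improvement() {
  awk -v base="$1" -v new="$2" \
    'BEGIN { printf "%.1f\n", (base - new) / base * 100 }'
}

rtt_improvement 250 25   # hypothetical averages in ms
```

For those placeholder inputs the function prints 90.0, meaning a 90% reduction in average RTT under load.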

Run a manual verification:

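# Confirm interactivity is preserved under load (TBF shaper with FQ-CoDel inside)
sudo tc qdisc del dev $IFACE root 2>/dev/null
sudo tc qdisc add dev $IFACE root handle 1: tbf rate 10mbit burst 32kbit latency 400ms
sudo tc qdisc add dev $IFACE parent 1:1 handle 10: fq_codel

iperf3 -c 127.0.0.1 -P 4 -t 30 &
ping -i 0.1 -c 300 $GATEWAY | tee /tmp/ping_fqcodel.txt
wait   # let iperf3 finish before moving on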

Compare /tmp/ping_bufferbloat.txt vs /tmp/ping_fqcodel.txt.
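
Averages can hide the latency spikes that matter most for interactive traffic, so a tail percentile is a useful second point of comparison between the two logs. The sketch below computes a nearest-rank 95th percentile from the per-packet "time=" fields; the function name is illustrative and the field format is assumed to be that of iputils ping.

```shell
# 95th-percentile RTT (nearest-rank) from the per-packet "time=" fields
# of a saved ping log. Prints nothing if the log contains no replies.
ping_p95() {
  grep -o 'time=[0-9.]*' "$1" | cut -d= -f2 | sort -n |
    awk '{ v[NR] = $1 } END { if (NR) print v[int((NR * 95 + 99) / 100)] }'
}

# Example:
#   ping_p95 /tmp/ping_bufferbloat.txt
#   ping_p95 /tmp/ping_fqcodel.txt
```

If FQ-CoDel is working, the second value should sit far closer to the idle RTT than the first.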


Part 4: TCP BBR vs. CUBIC (20 min)

4.1 Enable BBR

# Check available congestion control algorithms
sysctl net.ipv4.tcp_available_congestion_control

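# Enable BBR (requires kernel 4.9+)
sudo modprobe tcp_bbr 2>/dev/null
# Optional: load the module at every boot (this persists beyond the lab)
echo tcp_bbr | sudo tee -a /etc/modules-load.d/modules.conf >/dev/null
sudo sysctl -w net.ipv4.tcp_congestion_control=bbr
# default_qdisc applies only to qdiscs created afterwards; it does not
# replace the qdisc already set on $IFACE with tc
sudo sysctl -w net.core.default_qdisc=fq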

# Verify
sysctl net.ipv4.tcp_congestion_control
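
Before running the BBR tests it is worth confirming that the kernel actually lists bbr, since the sysctl write fails on kernels without the module. A minimal sketch follows; the function name cc_available is illustrative.

```shell
# Succeeds (exit status 0) if algorithm $1 appears in the space-separated
# list $2, as printed by:
#   sysctl -n net.ipv4.tcp_available_congestion_control
cc_available() {
  case " $2 " in
    *" $1 "*) return 0 ;;
    *)        return 1 ;;
  esac
}

# Example:
#   cc_available bbr "$(sysctl -n net.ipv4.tcp_available_congestion_control)" \
#     || echo "BBR not available; check kernel version and tcp_bbr module"
```

Padding both sides with spaces makes the match whole-word, so a name that is only a prefix of a listed algorithm does not count.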

4.2 Test BBR throughput

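# Run with FQ-CoDel (inside the 10 Mbps TBF shaper) + BBR
sudo tc qdisc del dev $IFACE root 2>/dev/null
sudo tc qdisc add dev $IFACE root handle 1: tbf rate 10mbit burst 32kbit latency 400ms
sudo tc qdisc add dev $IFACE parent 1:1 handle 10: fq_codel

flent rrul \
  -p all_scaled \
  -H 127.0.0.1 \
  -t "FQ-CoDel+BBR" \
  -l 30 \
  -o /tmp/rrul_bbr.png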

4.3 Switch back to CUBIC and compare

sudo sysctl -w net.ipv4.tcp_congestion_control=cubic

flent rrul \
  -p all_scaled \
  -H 127.0.0.1 \
  -t "FQ-CoDel+CUBIC" \
  -l 30 \
  -o /tmp/rrul_cubic.png

4.4 Inspect congestion window with ss

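# Sample ss output every 0.5 s during a 10 s iperf3 run,
# once under each congestion control algorithm (ends on cubic)
for CC in bbr cubic; do
  sudo sysctl -w net.ipv4.tcp_congestion_control=$CC
  iperf3 -c 127.0.0.1 -t 10 &
  for i in $(seq 1 10); do
    ss -tin | grep -A 1 "ESTAB.*:5201" | tail -1
    sleep 0.5
  done
  wait   # let iperf3 finish before switching algorithms
done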

Record:

  1. What cwnd values does CUBIC show during steady-state?
  2. What cwnd values does BBR show?
  3. Does BBR achieve equal or higher throughput vs. CUBIC in this loopback scenario?
  4. What is the pacing_rate field in the BBR ss output?
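
The ss detail lines are long, so it can help to reduce each sample to the two fields the questions ask about. The sketch below assumes the fields appear as "cwnd:<n>" and "pacing_rate <value>bps" in the ss -tin output; the function name is illustrative.

```shell
# Pull cwnd and pacing_rate out of one line of `ss -tin` detail output.
# Assumes the fields appear as "cwnd:<n>" and "pacing_rate <value>bps".
# A field that is absent from the line is reported as "?".
ss_fields() {
  cwnd=$(echo "$1" | grep -o 'cwnd:[0-9]*' | cut -d: -f2)
  rate=$(echo "$1" | grep -o 'pacing_rate [0-9.]*[A-Za-z]*bps' | cut -d' ' -f2)
  echo "cwnd=${cwnd:-?} pacing_rate=${rate:-?}"
}

# Example, applied to one sampled line:
#   ss_fields "$(ss -tin | grep -A 1 'ESTAB.*:5201' | tail -1)"
```

Note that cwnd is reported in segments, not bytes, so multiply by the mss field to compare it against a byte count.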

Part 5: Comparison Table and Plot Interpretation (10 min)

Fill in the comparison table from your measurements:

Configuration                         | Avg RTT under load (ms) | Max RTT (ms) | Avg throughput (Mbps)
Baseline (TBF + CUBIC, large buffer)  |                         |              |
FQ-CoDel + CUBIC                      |                         |              |
FQ-CoDel + BBR                        |                         |              |

Examine the Flent plots (/tmp/rrul_*.png) for the three configurations. Write 3-4 sentences interpreting:

  • Which configuration shows the flattest latency curve during load?
  • Where does the "bloat" appear in the baseline plot (pre-AQM)?
  • Does BBR change the shape of the throughput curve compared to CUBIC?

Lab Report

  1. A network admin says "bigger buffers mean fewer drops, therefore better network performance." Explain why this reasoning is wrong in the context of TCP performance and interactive applications.
  2. FQ-CoDel was designed to solve bufferbloat without sacrificing throughput. Based on your measurements, did it succeed? What mechanism allows it to maintain throughput while reducing latency?
  3. A mobile network engineer observes that TCP CUBIC achieves very low throughput on a satellite link (600ms RTT, 100 Mbps bandwidth) but TCP BBR achieves near-line-rate. Explain why.

Cleanup

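# Remove artificial qdisc, restore defaults
sudo tc qdisc del dev $IFACE root 2>/dev/null
sudo sysctl -w net.ipv4.tcp_congestion_control=cubic
# Part 4 also set net.core.default_qdisc=fq and appended tcp_bbr to
# /etc/modules-load.d/modules.conf; revert both if this is not a throwaway VM
pkill netserver 2>/dev/null
pkill iperf3 2>/dev/null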

Grading (10 points)

Item                                                                          | Points
Bufferbloat confirmed: RTT increase under load documented                     | 2
Flent RRUL baseline: plot saved; avg RTT, throughput recorded                 | 2
FQ-CoDel test: RTT reduction documented; quantified improvement               | 2
BBR vs. CUBIC: cwnd comparison from ss; throughput comparison                 | 2
Lab report: bufferbloat explanation; FQ-CoDel mechanism; BBR satellite case   | 2