NET-301 Week 3 -- Internet-Scale BGP: Route Reflectors, Communities, RPKI, Prefix Hijacking

"BGP is the routing protocol of the Internet. It is also, without exaggeration, one of the most critical and fragile pieces of infrastructure in the global telecommunications system." -- Kurose & Ross, Computer Networking: A Top-Down Approach, 9th ed., §5.4

Lecture (100 min -- first of two BGP weeks)

3.1 NET-201 BGP and What It Left Out

NET-201's BGP module built a working iBGP/eBGP topology in GNS3, observed path-vector advertisements, and sandboxed a prefix hijack. That module intentionally left three topics for NET-301:

Route reflectors -- how iBGP scales to hundreds of routers without a full mesh
BGP communities -- how carriers attach policy metadata to routes
RPKI -- the cryptographic origin-validation overlay that makes BGP prefix origination auditable

Week 3 covers the first two. Week 4 (the second BGP week, which has no separate week file in this outline; its material appears below as §§3.5-3.6) covers RPKI and prefix-hijacking detection. Lab 3 covers RPKI deployment.

3.2 The iBGP Full-Mesh Problem

BGP's split-horizon rule requires that a route learned from an iBGP peer not be re-advertised to another iBGP peer. This prevents loops within an AS, but it also means that in a full iBGP mesh, every router must peer directly with every other router. An AS with N routers requires N*(N-1)/2 iBGP sessions.

At NET-201 scale (5-10 routers): manageable. At carrier scale:

Routers in AS	Full-mesh sessions
10	45
50	1,225
100	4,950
500	124,750

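A major carrier with 500 BGP-speaking routers needs 124,750 iBGP sessions: every router holds 499 sessions, and adding one more router means touching the configuration of all 500 existing ones. This is not operationally feasible.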
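The session counts in the table follow directly from the N*(N-1)/2 formula. A minimal sketch that reproduces them (the function name is illustrative, not taken from any course material):

```python
def full_mesh_sessions(n: int) -> int:
    """Number of iBGP sessions in a full mesh of n routers: n*(n-1)/2."""
    return n * (n - 1) // 2


# Reproduce the rows of the table above.
for routers in (10, 50, 100, 500):
    print(f"{routers:>4} routers -> {full_mesh_sessions(routers):>7,} sessions")
```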

3.3 Route Reflectors (RFC 4456)

Route Reflectors (RRs) solve the iBGP scaling problem by relaxing the split-horizon rule for designated routers. An RR re-advertises routes received from a client to its other clients and to non-client peers; routes received from a non-client are reflected to clients only.

RR terminology:

Term	Definition
RR Client	An iBGP peer that has been configured to "point to" the RR; forms a single session to the RR
Non-client	An iBGP peer that does NOT peer through the RR (maintains full-mesh with the RR and other non-clients)
Cluster	A set of RR + its clients; identified by CLUSTER_ID
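ORIGINATOR_ID	BGP attribute carrying the router ID of the route's originator within the local AS; added by the RR when it first reflects the route. A router that receives a route with its own router ID as ORIGINATOR_ID ignores it
CLUSTER_LIST	BGP attribute carrying the list of CLUSTER_IDs a route has traversed; an RR that finds its own CLUSTER_ID in the list discards the route, which prevents reflection loops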
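The reflection rules and the two loop-prevention attributes can be sketched as a small model. This is a simplified illustration under stated assumptions, not a BGP implementation: the `Route` and `Peer` classes, the `reflect` and `accepts` functions, and the use of strings for router IDs and cluster IDs are all hypothetical, and best-path selection is ignored entirely.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Route:
    prefix: str
    originator_id: Optional[str] = None          # unset until an RR reflects the route
    cluster_list: List[str] = field(default_factory=list)


@dataclass
class Peer:
    router_id: str
    is_client: bool


def reflect(route: Route, learned_from: Peer, peers: List[Peer],
            cluster_id: str) -> Dict[str, Route]:
    """Return {peer router ID: route} that an RR would send (simplified)."""
    # Loop check: the route has already passed through this cluster.
    if cluster_id in route.cluster_list:
        return {}
    reflected = Route(
        prefix=route.prefix,
        # Keep an existing ORIGINATOR_ID; otherwise record the sending peer.
        originator_id=route.originator_id or learned_from.router_id,
        # Prepend our CLUSTER_ID to the CLUSTER_LIST.
        cluster_list=[cluster_id] + route.cluster_list,
    )
    out: Dict[str, Route] = {}
    for peer in peers:
        if peer.router_id == learned_from.router_id:
            continue
        # Client routes go to clients and non-clients;
        # non-client routes go to clients only.
        if learned_from.is_client or peer.is_client:
            out[peer.router_id] = reflected
    return out


def accepts(route: Route, my_router_id: str) -> bool:
    """Receiver-side check: ignore a route that we originated ourselves."""
    return route.originator_id != my_router_id
```

With two clients and one non-client, a route learned from a client is sent to the other client and to the non-client, while a route learned from the non-client reaches the clients only. Feeding a reflected route back into the same RR returns nothing, because its CLUSTER_ID is already in the CLUSTER_LIST.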

RR topology patterns:

Two-level hierarchy (most common in large ASes):

Level 1 RRs: placed at major PoPs; each serves a set of edge routers as its clients
Level 2 RRs: cluster of two or three, peering with all Level 1 RRs in full mesh

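Pair-of-RRs (common in enterprise): two RRs, each with all other routers as clients; the two RRs peer with each other in iBGP. In an AS of N routers this provides redundancy with only N-1 iBGP sessions per RR (N-2 client sessions plus the RR-to-RR session), and two sessions per client.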
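To make the saving concrete, the pair-of-RRs design can be compared with the full mesh for the same number of routers. A minimal sketch, assuming N counts every router in the AS including the two RRs (function names are illustrative):

```python
def full_mesh_sessions(n: int) -> int:
    """iBGP sessions in a full mesh of n routers."""
    return n * (n - 1) // 2


def rr_pair_sessions(n: int) -> int:
    """Total iBGP sessions with two RRs and n-2 clients, each client peering with both RRs."""
    clients = n - 2
    return 2 * clients + 1        # two sessions per client, plus the RR-to-RR session


def sessions_per_rr(n: int) -> int:
    """Sessions terminated on each RR: n-2 clients plus the other RR."""
    return (n - 2) + 1


for n in (10, 100, 500):
    print(n, full_mesh_sessions(n), rr_pair_sessions(n), sessions_per_rr(n))
```

For 500 routers this gives 997 sessions in total instead of 124,750, with 499 sessions on each RR.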

Caveat: route reflectors are a policy-propagation tool, not a traffic-forwarding tool. The RR reflects routing information; traffic still flows on whatever path the route specifies. Hot-potato routing, IGP metric differences, and NEXT_HOP resolution must all be considered when deploying RR hierarchies.

3.4 BGP Communities

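A BGP community (RFC 1997) is a 32-bit value carried in the COMMUNITIES path attribute of a route, used to carry policy information between BGP peers; a single route can carry several communities. By convention the value is written as two 16-bit halves, AS_number:community_value (e.g., 65001:100).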

Well-known communities:

Community	Hex value	Meaning
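NO_EXPORT	0xFFFFFF01	Do not advertise outside the AS (or outside the confederation, if the AS is part of one)
NO_ADVERTISE	0xFFFFFF02	Do not advertise to any BGP peer
NO_EXPORT_SUBCONFED	0xFFFFFF03	Do not advertise to any external peer, including peers in other member ASes of the same confederation (keep within the local member AS)
BLACKHOLE	0xFFFF029A (65535:666, RFC 7999)	Ask the receiving peer to discard traffic destined to the tagged prefix (remotely triggered black hole)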
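The AS_number:community_value notation and the hex values in the table are two views of the same 32-bit number. A short sketch of the conversion (the helper names are illustrative):

```python
def encode_community(asn: int, value: int) -> int:
    """Pack a standard community into its 32-bit form: high 16 bits ASN, low 16 bits value."""
    if not (0 <= asn <= 0xFFFF and 0 <= value <= 0xFFFF):
        raise ValueError("both halves of a standard community must fit in 16 bits")
    return (asn << 16) | value


def decode_community(raw: int) -> str:
    """Render a 32-bit community in AS_number:community_value notation."""
    return f"{raw >> 16}:{raw & 0xFFFF}"


WELL_KNOWN = {
    0xFFFFFF01: "NO_EXPORT",
    0xFFFFFF02: "NO_ADVERTISE",
    0xFFFFFF03: "NO_EXPORT_SUBCONFED",
    0xFFFF029A: "BLACKHOLE",
}

for raw, name in WELL_KNOWN.items():
    print(f"{name:<20} 0x{raw:08X}  {decode_community(raw)}")

print(f"65001:100 -> 0x{encode_community(65001, 100):08X}")
```

The `ValueError` branch is the limitation that motivates large communities: a 4-byte ASN does not fit in the upper 16 bits.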

Large communities (RFC 8092): the 32-bit community format limits the AS-number field to 16 bits, so a 4-byte ASN cannot be encoded in it. Large communities use a 96-bit format: {Global_Administrator}:{Local_Data_1}:{Local_Data_2}, where each of the three fields is 32 bits wide.
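As an illustration of the 96-bit layout, the three fields can be packed as three unsigned 32-bit integers in network byte order. This sketch covers only the 12-byte value, not the surrounding path-attribute header; the ASN 4200000001 is an arbitrary example from the private 4-byte range.

```python
import struct


def pack_large_community(global_admin: int, local_1: int, local_2: int) -> bytes:
    """Pack one large community as three unsigned 32-bit big-endian fields (12 bytes)."""
    return struct.pack("!III", global_admin, local_1, local_2)


def format_large_community(raw: bytes) -> str:
    """Render a 12-byte large community in colon-separated notation."""
    global_admin, local_1, local_2 = struct.unpack("!III", raw)
    return f"{global_admin}:{local_1}:{local_2}"


raw = pack_large_community(4200000001, 100, 2)
print(len(raw) * 8, "bits:", format_large_community(raw))
```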

Operator use cases:

Use case	Community pattern
Route-tagging for origin AS	`65001:origin_code` -- internal policy classification
Prepend control	`65000:prepend1` -- ask peer to prepend AS-path once when advertising to their customers
No-export to specific peer	`65000:nopeer_{peer_ASN}`
RTBH (Remote Triggered Black Hole)	`65535:666` -- advertise a /32 with this community; peer drops all traffic destined to it
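The RTBH row can be read as an import-policy decision on the receiving side. The sketch below is one hypothetical policy, not a standard: it honors the BLACKHOLE community only on IPv4 host routes (/32), as in the table, and rejects any other prefix carrying it. Real deployments also check that the announcing customer is authorized for the address.

```python
import ipaddress
from typing import Set, Tuple

BLACKHOLE: Tuple[int, int] = (65535, 666)


def rtbh_action(prefix: str, communities: Set[Tuple[int, int]]) -> str:
    """Classify a received route as 'normal', 'discard' (blackhole it), or 'reject'."""
    net = ipaddress.ip_network(prefix)
    if BLACKHOLE not in communities:
        return "normal"           # ordinary route, regular policy applies
    if net.version == 4 and net.prefixlen == 32:
        return "discard"          # install with a discard next hop
    return "reject"               # assumed local policy: no blackholing of wider prefixes


print(rtbh_action("203.0.113.7/32", {BLACKHOLE}))
print(rtbh_action("203.0.113.0/24", {BLACKHOLE}))
print(rtbh_action("203.0.113.0/24", {(65001, 100)}))
```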

3.5 RPKI: Resource Public Key Infrastructure

RPKI (RFC 6480) is the cryptographic infrastructure that allows the rightful holder of an IP prefix to publish a signed attestation of which AS is authorized to originate that prefix. This directly addresses the BGP prefix hijack attack class.

The BGP hijack problem:

BGP has no native mechanism to verify that the AS announcing a prefix actually owns it. An AS announcing 8.8.8.0/24 (Google's DNS) will receive traffic destined for Google's addresses, whether it owns the prefix or not. Real-world hijacks include:

Pakistan Telecom's 2008 announcement of YouTube's prefix (brought down YouTube globally for ~2 hours)
Rostelecom's 2020 announcement of routes for Amazon, Cloudflare, Akamai, and others (8,800 prefixes hijacked for approximately 1 hour)
China Telecom's documented pattern of brief, low-volume route announcements consistent with traffic interception

ROA (Route Origin Authorization): an RPKI object signed by the prefix holder using their key material from the Regional Internet Registry (RIR). A ROA specifies:

The prefix
The maximum prefix length that can be announced (to prevent more-specific hijacks)
The authorized origin AS

ROA example:
  Prefix: 8.8.8.0/24
  Max Length: 24
  Origin AS: AS15169 (Google)
  Signed by: Google's ARIN-issued certificate chain

RPKI validation states:

State	Meaning	Action
Valid	The BGP announcement matches a ROA (same prefix, origin AS, and within max-length)	Accept; mark Valid
Invalid	At least one ROA covers the prefix, but none matches both the origin AS and the prefix length (within max-length)	Reject (if policy enforces); high-confidence hijack or misconfiguration indicator
NotFound (Unknown)	No ROA covers this prefix	Accept (conservative) or investigate
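The three states in the table can be expressed as a short origin-validation routine. This is a minimal sketch of the matching logic only: the `ROA` tuple and `validate` function are illustrative names, ROAs are assumed to be already cryptographically verified, and AS 64500 stands in for an unauthorized origin.

```python
import ipaddress
from typing import Iterable, NamedTuple


class ROA(NamedTuple):
    prefix: str
    max_length: int
    origin_as: int


def validate(prefix: str, origin_as: int, roas: Iterable[ROA]) -> str:
    """Return 'Valid', 'Invalid', or 'NotFound' for one announcement."""
    route = ipaddress.ip_network(prefix)
    covered = False
    for roa in roas:
        roa_net = ipaddress.ip_network(roa.prefix)
        # A ROA covers the route if the route is the ROA prefix or a more-specific of it.
        if route.version != roa_net.version or not route.subnet_of(roa_net):
            continue
        covered = True
        if roa.origin_as == origin_as and route.prefixlen <= roa.max_length:
            return "Valid"
    return "Invalid" if covered else "NotFound"


roas = [ROA("8.8.8.0/24", 24, 15169)]
print(validate("8.8.8.0/24", 15169, roas))    # matching origin and length
print(validate("8.8.8.0/24", 64500, roas))    # covered, wrong origin
print(validate("8.8.8.0/25", 15169, roas))    # covered, longer than max length
print(validate("192.0.2.0/24", 64500, roas))  # no covering ROA
```

The third call shows why the max-length field matters: a /25 inside the ROA's /24 is Invalid even when the origin AS is correct.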

RPKI-to-Router (RTR) protocol (RFC 8210): routers do not perform RPKI validation themselves (certificate parsing is expensive). Instead, a validator cache (for example Routinator, FORT, or OctoRPKI) fetches and validates RPKI data from all five RIRs (ARIN, RIPE, APNIC, LACNIC, AFRINIC), then serves the resulting table of validated ROA payloads (VRPs) to routers via the lightweight RTR protocol.

RIR repositories → Validator Cache (Routinator) → RTR → FRR/Cisco/Juniper router
                                                        → Policy: reject Invalid
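The last two stages of the pipeline above can be sketched offline. The JSON below is an assumed export shape (a top-level `roas` list with `asn`, `prefix`, `maxLength`, and `ta` keys), modeled on the JSON output of common validators; check your validator's documentation for the exact field names. The `load_vrps` and `import_policy` names are hypothetical.

```python
import json
from typing import List, Tuple

# Assumed sample of a validator's JSON export; field names are an assumption.
SAMPLE_EXPORT = """
{"roas": [
  {"asn": "AS15169", "prefix": "8.8.8.0/24", "maxLength": 24, "ta": "arin"}
]}
"""


def load_vrps(text: str) -> List[Tuple[str, int, int]]:
    """Parse the export into (prefix, max_length, origin_as) tuples."""
    vrps = []
    for entry in json.loads(text)["roas"]:
        asn = str(entry["asn"])
        if asn.upper().startswith("AS"):
            asn = asn[2:]
        vrps.append((entry["prefix"], int(entry["maxLength"]), int(asn)))
    return vrps


def import_policy(validation_state: str) -> str:
    """The 'reject Invalid' policy from the diagram: drop Invalid, accept everything else."""
    return "reject" if validation_state == "Invalid" else "accept"


print(load_vrps(SAMPLE_EXPORT))
print(import_policy("Invalid"), import_policy("NotFound"), import_policy("Valid"))
```

Accepting NotFound routes is the conservative choice from the validation table: rejecting them would drop every prefix whose holder has not yet published a ROA.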

3.6 Detecting Prefix Hijacks in Production Traffic

Even with RPKI deployed, detection requires active monitoring:

Real-time BGP monitoring services:

RIPE RIS (Routing Information Service): BGP route collector network; API for querying current RIB state
RouteViews: similar collector network operated by University of Oregon
Cloudflare Radar and BGPStream: commercial + open tools for BGP event detection

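MOAS (Multiple Origin AS) detection: if a prefix is simultaneously announced by two different ASes, this may be a hijack or a misconfiguration, although some MOAS cases are legitimate (anycast services, multihoming arrangements). BGP looking glasses and route-analysis sites (RIPEstat, Hurricane Electric's BGP Toolkit) flag these in near-real-time.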
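At its core, MOAS detection groups observed announcements by prefix and looks for more than one origin. A minimal sketch over a list of (prefix, origin AS) observations, such as might be extracted from a route-collector feed (the input format and AS numbers are assumptions for illustration):

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def find_moas(observations: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Return {prefix: sorted origin ASes} for prefixes seen with more than one origin."""
    origins = defaultdict(set)
    for prefix, origin_as in observations:
        origins[prefix].add(origin_as)
    return {p: sorted(o) for p, o in origins.items() if len(o) > 1}


observed = [
    ("203.0.113.0/24", 64500),
    ("203.0.113.0/24", 64666),   # second origin for the same prefix
    ("198.51.100.0/24", 64501),
    ("198.51.100.0/24", 64501),  # same origin seen via another collector peer
]
print(find_moas(observed))
```

A hit is a candidate for investigation rather than proof of a hijack; an allow-list of known anycast or multihomed origins would normally be applied afterwards.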

Prefix more-specific detection: a hijacker often announces a more-specific prefix (/25 vs /24) to attract traffic via the longest-prefix-match rule. Monitoring for unexpected more-specific announcements of your prefixes is a standard carrier practice.
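Monitoring for more-specifics amounts to a containment test between observed prefixes and the prefixes you hold. A short sketch using the standard `ipaddress` module; the function name and the `expected` allow-list for deliberate de-aggregation are illustrative assumptions:

```python
import ipaddress
from typing import Iterable, List


def unexpected_more_specifics(my_prefixes: Iterable[str],
                              observed: Iterable[str],
                              expected: Iterable[str] = ()) -> List[str]:
    """Return observed prefixes that are strictly inside one of ours and not on the allow-list."""
    mine = [ipaddress.ip_network(p) for p in my_prefixes]
    allowed = {ipaddress.ip_network(p) for p in expected}
    alerts = []
    for p in observed:
        net = ipaddress.ip_network(p)
        if net in allowed:
            continue
        for own in mine:
            # Same address family, not the prefix itself, and contained in it.
            if net.version == own.version and net != own and net.subnet_of(own):
                alerts.append(str(net))
                break
    return alerts


print(unexpected_more_specifics(
    ["203.0.113.0/24"],
    ["203.0.113.0/24", "203.0.113.128/25", "198.51.100.0/24"],
))
```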

Time-series analysis: legitimate routes are generally stable. A prefix or origin that appears in the global DFZ (Default-Free Zone) for only a few minutes and then disappears is a common signature of a hijack or a route leak. Tools like bgpmon.net and Kentik alert on these patterns.
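The short-lived-announcement heuristic can be sketched over a time-ordered event stream. This is a simplification under stated assumptions: events are (timestamp in seconds, prefix, origin AS, kind) tuples with kind "A" for announce and "W" for withdraw, withdrawals are assumed to be already matched to an origin, and the 600-second threshold is an arbitrary illustrative value.

```python
from typing import Dict, Iterable, List, Tuple

Event = Tuple[int, str, int, str]   # (timestamp_s, prefix, origin_as, "A" or "W")


def short_lived(events: Iterable[Event], max_seconds: int = 600) -> List[Tuple[str, int, int]]:
    """Return (prefix, origin_as, lifetime_s) for announcements withdrawn within max_seconds."""
    first_seen: Dict[Tuple[str, int], int] = {}
    flagged = []
    for ts, prefix, origin_as, kind in events:
        key = (prefix, origin_as)
        if kind == "A":
            first_seen.setdefault(key, ts)      # keep the earliest announcement time
        elif kind == "W" and key in first_seen:
            lifetime = ts - first_seen.pop(key)
            if lifetime <= max_seconds:
                flagged.append((prefix, origin_as, lifetime))
    return flagged


events = [
    (0, "203.0.113.0/24", 64500, "A"),     # stable route, never withdrawn
    (100, "198.51.100.0/24", 64666, "A"),
    (400, "198.51.100.0/24", 64666, "W"),  # withdrawn after 300 seconds
]
print(short_lived(events))
```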

BGP at Scale: The Kurose-Ross Framing

Kurose-Ross 9e §5.4 covers BGP as "inter-AS routing." At NET-301 depth, the lesson is not just how BGP works mechanically but why its design assumptions make it simultaneously the Internet's most critical infrastructure and its most attackable one. BGP was designed in an era of trusted peers; it has no origin authentication because authentication was assumed to be a social problem (contracts between ISPs), not a cryptographic problem. RPKI retrofits cryptographic origin authentication onto a 1990s trust model. That architectural debt -- the reason RPKI exists -- is worth naming explicitly: a protocol designed for trusted parties, now running the Internet's routing system at a scale its designers never anticipated.

Lab 3 Introduction

Lab 3 deploys Routinator against the live RPKI repositories and integrates it with an FRR router in Containerlab via the RTR protocol. You will query the Routinator REST API to observe ROA validity for several prefixes (including some with Invalid state), configure FRR to drop Invalid routes via a route-map, and simulate a prefix hijack: announce a /24 with a different origin AS and verify the FRR router rejects it as Invalid.

Independent Practice (6 hr)

Kurose-Ross 9e §5.4 -- re-read with focus on iBGP, communities, and RPKI sections
RFC 4456 §§1-4 (Route Reflectors -- architecture sections)
RFC 8092 §§1-3 (Large BGP Communities)
RFC 6480 §§1-3 (RPKI -- architecture overview)
Read the "Pakistan Telecom hijacks YouTube" post-mortem (NANOG archives; 2008; ~20 min)
Lab 3 -- Part A (Routinator setup + ROA validity queries)