"BGP's job is not to find the shortest path. BGP's job is to implement policy." -- Ivan Pepelnjak, BGP in Large Networks (blog, 2009)
Lecture (50 min)
3.1 Why BGP Exists: The Inter-AS Problem
OSPF and IS-IS work beautifully inside a single organization's network (an Autonomous System). They flood topology information freely; every router sees every other router's links. But you cannot run OSPF across the entire Internet: no single organization can trust all others with its internal topology; the LSDB would be enormous; flooding would never converge.
BGP (Border Gateway Protocol, RFC 4271) is the inter-AS routing protocol. It connects Autonomous Systems to each other and to the Internet. BGP carries:
- Which network prefixes each AS can reach
- The AS path (sequence of AS numbers) to reach each prefix
- Policy attributes that influence path selection
BGP's core design principle: policy over optimality. Where OSPF computes the lowest-cost path from link metrics, BGP selects the path that satisfies the network operator's policy preferences (business relationships, traffic engineering, security constraints).
3.2 BGP Terminology
| Term | Meaning |
|---|---|
| AS | Autonomous System; a network under a single administrative domain; identified by an AS Number (ASN) |
| ASN | AS Number; 16-bit (0-65535) or 32-bit (0-4294967295), with a few values reserved; allocated by IANA to the five regional registries (ARIN, RIPE NCC, APNIC, LACNIC, AFRINIC), which assign them to networks |
| NLRI | Network Layer Reachability Information; the prefix + prefix-length a BGP UPDATE advertises |
| eBGP | External BGP; peering between routers in different ASes |
| iBGP | Internal BGP; peering between routers within the same AS |
| RIB | Routing Information Base; here, BGP's table of all routes received from peers before best-path selection (RFC 4271 calls this the Adj-RIBs-In) |
| Loc-RIB | Local RIB; routes that passed best-path selection and are candidates for the routing table |
| BGP speaker | Any router running BGP |
| Peer / neighbor | A BGP session partner; manually configured (BGP does NOT discover neighbors automatically) |
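The ASN ranges above can be turned into a small check, which is handy when reading lab configs (AS 65001 and friends are private). This is a minimal sketch: the function name `classify_asn` and its labels are illustrative, it encodes only the private ranges from RFC 6996 plus a few well-known reserved values, and it ignores other IANA reservations.

```python
def classify_asn(asn: int) -> str:
    """Roughly classify an AS number (illustrative labels, not an official taxonomy)."""
    if not 0 <= asn <= 4294967295:
        raise ValueError(f"ASN out of 32-bit range: {asn}")
    # 0, 65535 and 4294967295 are reserved; 23456 is AS_TRANS (RFC 6793).
    if asn in (0, 23456, 65535, 4294967295):
        return "reserved"
    # Private-use ranges from RFC 6996 (2-byte and 4-byte).
    if 64512 <= asn <= 65534 or 4200000000 <= asn <= 4294967294:
        return "private"
    # Documentation ranges from RFC 5398.
    if 64496 <= asn <= 64511 or 65536 <= asn <= 65551:
        return "documentation"
    # Everything else is treated as potentially public in this sketch.
    return "public-range"
```

For example, `classify_asn(65001)` returns `"private"`, while `classify_asn(13335)` returns `"public-range"`.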
3.3 The BGP State Machine
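BGP runs over a TCP session (port 179) between peers. The state machine has six states. On the normal path a session moves through:
Idle -> Connect -> OpenSent -> OpenConfirm -> Established
Active is the sixth state: a router falls into it from Connect when the TCP attempt fails, then listens for an incoming connection and retries. A session that sits in Active usually points to a reachability or configuration problem.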
Unlike OSPF (which multicast-discovers neighbors), BGP neighbors must be configured explicitly:
router bgp 65001
 neighbor 10.1.1.2 remote-as 65002
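A BGP session is not established until the TCP connection succeeds AND the OPEN message exchange completes. Once in Established state, peers exchange KEEPALIVE messages; with Cisco IOS defaults that is every 60 seconds, with a hold timer of 180 seconds. If no KEEPALIVE or UPDATE arrives before the hold timer expires (three missed keepalives at the defaults), the router sends a NOTIFICATION, tears down the session, and returns to Idle.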
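The state machine and timers can be sketched as a small transition table. This is a simplified illustration, not the full RFC 4271 FSM: the event names (`start`, `tcp_ok`, and so on) are made up for this sketch, and every event not listed is treated as an error that drops the session to Idle. The timer helper assumes the usual convention that peers use the lower of the two advertised hold times and send keepalives at one third of it.

```python
# (state, event) -> next state; event names are hypothetical labels for this sketch.
TRANSITIONS = {
    ("Idle", "start"): "Connect",
    ("Connect", "tcp_ok"): "OpenSent",        # TCP up, OPEN sent
    ("Connect", "tcp_fail"): "Active",        # fall back to listening/retrying
    ("Active", "tcp_ok"): "OpenSent",
    ("Active", "retry_timer"): "Connect",
    ("OpenSent", "tcp_fail"): "Active",
    ("OpenSent", "open_received"): "OpenConfirm",
    ("OpenConfirm", "keepalive_received"): "Established",
    ("Established", "keepalive_received"): "Established",
    ("Established", "update_received"): "Established",
}


def next_state(state: str, event: str) -> str:
    """Return the next state; unlisted events (errors, hold-timer expiry) go to Idle."""
    return TRANSITIONS.get((state, event), "Idle")


def run(events, state: str = "Idle") -> str:
    """Feed a sequence of events through the table and return the final state."""
    for event in events:
        state = next_state(state, event)
    return state


def negotiated_timers(local_hold: int, peer_hold: int) -> tuple:
    """Hold time is the lower of the two offers; keepalive is one third of it."""
    hold = min(local_hold, peer_hold)
    return hold, hold // 3
```

Feeding `["start", "tcp_ok", "open_received", "keepalive_received"]` to `run` ends in `Established`, and `negotiated_timers(180, 90)` gives a 90-second hold time with 30-second keepalives.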
3.4 BGP Attributes and Path Selection
BGP chooses among multiple paths to the same prefix using an ordered decision process. Key attributes:
| Attribute | Type | Description |
|---|---|---|
| Weight | Cisco-local | Highest preferred; local to router only; not sent to peers |
| LOCAL_PREF | Well-known | Preferred within an AS; higher is better; default 100 |
| AS_PATH | Well-known | Sequence of ASes the route has traversed; shorter is preferred; also used for loop prevention (a router rejects routes whose path contains its own ASN) |
| ORIGIN | Well-known | How the route entered BGP (IGP=i, EGP=e, incomplete=?); i preferred |
| MED | Optional | Multi-Exit Discriminator; tells a neighboring AS which entry point into your AS to prefer; lower is preferred |
| eBGP vs iBGP | -- | eBGP-learned routes preferred over iBGP |
| IGP metric | -- | Lowest IGP cost to next-hop preferred (tiebreaker) |
The decision process runs in order: Weight, then LOCAL_PREF, then originated locally, then AS_PATH length, then ORIGIN, then MED, then eBGP>iBGP, then IGP metric. The first criterion that differentiates two paths determines the winner.
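The ordered comparison above maps naturally onto a sort key. The sketch below is a simplified model, not any vendor's implementation: the `Path` class and its field names are invented for illustration, and MED is compared across all paths, whereas real routers by default compare MED only between paths from the same neighboring AS.

```python
from dataclasses import dataclass

# Lower rank is preferred: IGP (i) < EGP (e) < incomplete (?).
ORIGIN_RANK = {"i": 0, "e": 1, "?": 2}


@dataclass(frozen=True)
class Path:
    """One candidate path to a prefix (hypothetical model for this sketch)."""
    name: str
    weight: int = 0
    local_pref: int = 100
    locally_originated: bool = False
    as_path: tuple = ()
    origin: str = "i"
    med: int = 0
    ebgp: bool = True
    igp_metric: int = 0


def preference_key(p: Path) -> tuple:
    """Build a tuple that sorts the most preferred path first."""
    return (
        -p.weight,                  # higher weight wins
        -p.local_pref,              # higher LOCAL_PREF wins
        not p.locally_originated,   # locally originated wins (False sorts first)
        len(p.as_path),             # shorter AS_PATH wins
        ORIGIN_RANK[p.origin],      # i < e < ?
        p.med,                      # lower MED wins
        not p.ebgp,                 # eBGP wins over iBGP
        p.igp_metric,               # lowest IGP cost to next hop wins
    )


def best_path(paths) -> Path:
    """Return the path the first differentiating criterion selects."""
    return min(paths, key=preference_key)
```

With `Path("via-65002", as_path=(65002, 65010))` and `Path("via-65003", local_pref=200, as_path=(65003, 65020, 65010))`, `best_path` picks `via-65003`: the higher LOCAL_PREF decides before AS_PATH length is ever examined.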
Route maps control policy: before accepting or advertising routes, BGP operators apply route maps to set or match attributes:
route-map SET_LOCPREF permit 10
 set local-preference 200
!
router bgp 65001
 neighbor 10.1.1.2 route-map SET_LOCPREF in
3.5 iBGP: The Full Mesh Problem
Inside an AS, BGP routes are propagated between iBGP speakers. A critical rule: iBGP routers do NOT re-advertise routes received from one iBGP peer to other iBGP peers. Because AS_PATH is not modified inside an AS, it cannot detect loops there; this rule is what prevents routing loops among iBGP speakers.
The consequence: in an AS with N iBGP routers, every router must peer with every other router -- a full mesh of N*(N-1)/2 sessions. This scales very poorly. Solutions:
Route Reflectors (RFC 4456): designate one or more routers as Route Reflectors (RRs). Other routers (Clients) peer only with the RRs. An RR reflects routes between its clients, eliminating the full-mesh requirement. Most enterprise and SP networks use RRs.
Confederations (RFC 5065): divide a large AS into Sub-ASes internally; use eBGP-like semantics between Sub-ASes. More complex than RRs; used in some very large networks.
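The scaling difference is easy to see with a little arithmetic. The sketch below compares the full-mesh count N*(N-1)/2 with a simple route-reflector design. The RR formula assumes one cluster in which every client peers with every reflector and the reflectors are fully meshed among themselves; real designs vary.

```python
def full_mesh_sessions(n: int) -> int:
    """iBGP sessions needed for a full mesh of n routers: n*(n-1)/2."""
    return n * (n - 1) // 2


def route_reflector_sessions(n: int, reflectors: int = 1) -> int:
    """Sessions for n routers when `reflectors` of them act as RRs (assumed design)."""
    clients = n - reflectors
    # Each client peers with every reflector; reflectors mesh with each other.
    return clients * reflectors + full_mesh_sessions(reflectors)
```

For 100 routers, `full_mesh_sessions(100)` is 4950, while `route_reflector_sessions(100, reflectors=2)` is 197.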
3.6 BGP and the Internet Routing Table
The global BGP routing table, carried by routers in the "Default-Free Zone" (DFZ: routers that hold a full table and no default route), contains over 950,000 IPv4 prefixes and 200,000 IPv6 prefixes as of 2026. Every DFZ router maintains its own copy. The BGP table is publicly observable:
- RIPEstat (stat.ripe.net): route origin data, AS path analysis, prefix routing history
- Hurricane Electric BGP Toolkit (bgp.he.net): AS graph, prefix queries, peer listing
- Cloudflare Radar (radar.cloudflare.com): real-time BGP routing anomalies and hijacks
These looking-glass tools ground the theoretical BGP content in observable Internet behavior.
Lab Preview
Lab 2 builds a 3-AS BGP topology: AS 65001 (your network), AS 65002 (upstream ISP), AS 65003 (peer). You configure eBGP sessions between them, observe path selection, manipulate LOCAL_PREF to prefer one exit over another, and demonstrate a sandboxed prefix hijack (AS 65003 announces AS 65001's prefix; observe what happens to traffic).
Homework
Reading (45 min): Kurose-Ross 9e Ch 5.4 (Routing Among the ISPs: BGP). Focus on the BGP policy framework, the iBGP/eBGP distinction, and the route-reflector concept.
Hands-on (60 min): Using the RIPEstat BGP looking glass, look up ASN 13335 (Cloudflare). Record:
- How many IPv4 prefixes does AS 13335 originate?
- Which upstream AS numbers appear most frequently in path data for Cloudflare routes?
- Find one documented BGP hijacking incident (from Cloudflare Radar BGP Anomaly Detection) that occurred within the past 12 months. Describe the victim prefix, the hijacking ASN, and the duration.
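For the prefix-count question, the RIPEstat Data API can be queried directly instead of reading the web page. The sketch below uses the `announced-prefixes` data call. The response layout it relies on (a `data.prefixes` list whose entries carry a `prefix` string) is an assumption to verify against the current RIPEstat documentation, and the function names are illustrative.

```python
import ipaddress
import json
import urllib.parse
import urllib.request

RIPESTAT_URL = "https://stat.ripe.net/data/announced-prefixes/data.json"


def fetch_announced_prefixes(asn: str) -> dict:
    """Fetch the announced-prefixes data call for an ASN such as 'AS13335'."""
    query = urllib.parse.urlencode({"resource": asn})
    with urllib.request.urlopen(f"{RIPESTAT_URL}?{query}", timeout=30) as resp:
        return json.load(resp)


def count_by_family(payload: dict) -> dict:
    """Count prefixes per IP version, assuming payload['data']['prefixes'][i]['prefix']."""
    counts = {4: 0, 6: 0}
    for entry in payload["data"]["prefixes"]:
        network = ipaddress.ip_network(entry["prefix"], strict=False)
        counts[network.version] += 1
    return counts
```

Calling `count_by_family(fetch_announced_prefixes("AS13335"))` returns a dictionary keyed by IP version. Record the date of the query alongside the numbers, since announcements change over time.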
Toolchain Diary Entry
First introduced this week: BGP looking-glass tools (RIPEstat, HE BGP Toolkit, Cloudflare Radar)
stat.ripe.net/widget/bgp-state: BGP state widget -- enter an ASN or prefix; observe current routing state and historical data.
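bgp.he.net/AS65001: Hurricane Electric BGP Toolkit page -- AS graph, peer list, prefix list, route server query. AS65001 is the private lab ASN, so substitute a public ASN (for example AS13335) in the URL to see real data.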
radar.cloudflare.com/routing: Cloudflare BGP anomaly detection -- real-time route leak and hijack detection with confidence scores.
Key Terms
- BGP: Border Gateway Protocol; inter-AS routing protocol; path-vector; TCP port 179; policy-driven path selection
- ASN: Autonomous System Number; 2-byte (0-65535) or 4-byte (0-4294967295); public ASNs assigned through the five RIRs (ARIN, RIPE NCC, APNIC, LACNIC, AFRINIC); private range: 64512-65534 (2-byte), 4200000000-4294967294 (4-byte)
- eBGP: External BGP; session between routers in different ASes; how routes are exchanged across AS boundaries
- iBGP: Internal BGP; session between routers in the same AS; distributes eBGP-learned routes within the AS; requires full mesh or route reflectors
- LOCAL_PREF: BGP well-known attribute; shared within an AS to express preference for one exit over another; higher is preferred
- AS_PATH: ordered list of ASNs a route has traversed; primary loop-prevention mechanism (router discards UPDATE with own ASN in path); shorter path preferred by default
- Route Reflector: iBGP extension that allows a router to re-advertise iBGP-learned routes, eliminating the full-mesh requirement
- BGP hijack: when an AS announces prefixes it is not authorized to originate; diverts traffic to a different AS; detectable via looking-glass tools; mitigated by RPKI route origin validation