
NET-201 Week 2 -- Routing I: IS-IS, OSPF Multi-Area, and Convergence Measurement


"IS-IS is the routing protocol of the Internet's backbone carriers. The fact that most engineers have never configured it does not mean it is unimportant." -- Jeff Doyle & Jennifer Carroll, Routing TCP/IP, Vol. 1, Ch. 6 (IS-IS)


Lecture (50 min)

2.1 IS-IS: Intermediate System to Intermediate System

IS-IS is another link-state IGP, originally defined for the OSI network layer (ISO 10589) and later extended to carry IPv4 (RFC 1195) and IPv6 (RFC 5308). It competes with OSPF for backbone IGP dominance and is the preferred choice at many major Internet service providers (ISPs), including Tier 1 carriers.

Why ISPs choose IS-IS over OSPF:

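| Factor | IS-IS | OSPF |
| --- | --- | --- |
| Protocol layer | Runs directly on Layer 2; its PDUs are not encapsulated in IP or CLNP | Runs over IP (IP protocol 89) |
| IPv4/IPv6 dual-stack | Single instance carries both address families in TLVs (RFC 5308); Multi-Topology (RFC 5120) is optional, for when the IPv4 and IPv6 topologies differ | Requires OSPFv2 (IPv4) + OSPFv3 (IPv6) separately, or OSPFv3 address-family extensions (RFC 5838) |
| Scalability | Level 1/Level 2 hierarchy, roughly analogous to OSPF areas; same Dijkstra SPF computation | Area hierarchy around a backbone (Area 0), roughly analogous to IS-IS levels |
| Stability | Less exposed to malformed or spoofed packets: the control plane does not depend on the IP layer, so PDUs cannot be sent from a remote IP host | IP-dependent; crafted IP packets can disrupt OSPF peering |
| Origin | Developed at DEC (Digital) for DECnet Phase V, standardized by ISO, extended for IP by the IETF | Developed by the IETF throughout |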

IS-IS router levels:

  • Level 1 (L1) router: knows only about destinations within its own area and reaches everything else through a default route toward the nearest L1L2 router (like an OSPF totally-stubby area router)
  • Level 2 (L2) router: carries inter-area (backbone) routes; equivalent to OSPF Area 0 backbone router
  • Level 1-2 (L1L2) router: participates in both levels; equivalent to OSPF ABR

IS-IS TLV architecture: where OSPF defines a separate LSA type for each kind of information, IS-IS packs everything into flexible Type-Length-Value (TLV) fields inside its link-state PDUs (LSPs). New protocol extensions (IPv6, traffic engineering, segment routing) add new TLV types without changing the core PDU format, and a router that does not recognize a TLV type can skip over it using the length field. This is why IS-IS has been easier to extend than OSPF over the years.
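To make the TLV idea concrete, here is a minimal Python sketch of a TLV walker. It assumes the 1-byte type and 1-byte length layout that IS-IS TLVs use; the function name `parse_tlvs`, the `KNOWN_TYPES` table, and the sample bytes are illustrative, and type 250 simply stands in for "a type this parser has never heard of". It is not a full LSP decoder.

```python
def parse_tlvs(data: bytes) -> list[tuple[int, bytes]]:
    """Split a byte string into (type, value) pairs.

    Each TLV is laid out as: 1 byte type, 1 byte length, <length> bytes of value.
    """
    tlvs = []
    offset = 0
    while offset < len(data):
        if offset + 2 > len(data):
            raise ValueError("truncated TLV header")
        tlv_type = data[offset]
        length = data[offset + 1]
        value = data[offset + 2 : offset + 2 + length]
        if len(value) != length:
            raise ValueError("truncated TLV value")
        tlvs.append((tlv_type, value))
        # The length field lets us step over the value without understanding it.
        offset += 2 + length
    return tlvs


# A tiny lookup table; only two well-known IS-IS TLV types are listed here.
KNOWN_TYPES = {1: "Area Addresses", 137: "Dynamic Hostname"}

# Hand-built sample: area 49.0001, hostname "r1", and one unrecognized TLV (type 250).
sample = (
    bytes([1, 4, 3, 0x49, 0x00, 0x01])
    + bytes([137, 2]) + b"r1"
    + bytes([250, 1, 0xFF])
)

for tlv_type, value in parse_tlvs(sample):
    name = KNOWN_TYPES.get(tlv_type, "unknown (skipped)")
    print(f"type={tlv_type:3d} len={len(value)} {name}")
```

The unknown TLV does not break parsing of the ones around it, which is the property that lets new extensions ship without touching the PDU format.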

2.2 OSPF Multi-Area: Lab 1 Anatomy

Lab 1's 4-router topology typically looks like:

Area 0 (Backbone):  R1 --- R2
                          |
Area 1:             R3 --- R2 (R2 = ABR)
                          |
Area 2:             R4 --- R2 (R2 = ABR, or separate ABR)

In this topology:

  • R1 and R2 are backbone routers (Area 0)
  • R2 is an ABR connecting Area 1 and Area 2 to Area 0
  • R3 (Area 1) and R4 (Area 2) are internal routers

R2 maintains three separate LSDBs: one each for Area 0, Area 1, and Area 2. When R3 needs to reach R4, its next hop is R2 (the ABR). R2 does not flood one area's Router LSAs into another; instead it originates Type 3 Summary LSAs that advertise each area's prefixes into the other areas.

Verifying inter-area routing:

# On R3 (Area 1): should see a Type 3 Summary LSA for R4's subnet
docker exec -it clab-ospf4-r3 vtysh -c "show ip ospf database summary"

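# Routing table on R3 should show R4's network as an OSPF route.
# FRR marks it with O only; unlike Cisco IOS it does not print "O IA" here.
docker exec -it clab-ospf4-r3 vtysh -c "show ip route ospf"
# OSPF's own route table should list R4's network as N IA (inter-area)
docker exec -it clab-ospf4-r3 vtysh -c "show ip ospf route"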

2.3 Convergence: What Happens When a Link Dies

OSPF convergence after a link failure involves:

  1. Detection: the router detects the link failure (interface goes down, Bidirectional Forwarding Detection [BFD] alarm, or Hello timeout). Default Hello interval = 10 sec; Dead interval = 40 sec. BFD reduces detection to milliseconds.

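  2. LSA generation: the router originates a new instance of its Router LSA reflecting the lost link. The new instance starts with LS age 0 and a sequence number one higher than the previous instance. MaxAge (3600 seconds) is the age at which an LSA is flushed, not its starting value.

  3. Flooding: the new LSA floods through the area in Link State Update packets. Each router that receives it installs it in its LSDB, acknowledges it, and refloods it out its other interfaces, keeping the LSA on a per-neighbor retransmission list until that neighbor acknowledges it.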

  4. SPF re-execution: each router in the area re-runs Dijkstra. SPF scheduling has a configurable initial delay and hold timer to dampen rapid topology changes (avoid SPF thrashing).

  5. RIB update: the new SPF tree updates the routing table (Routing Information Base). FIB (Forwarding Information Base) is updated from the RIB.

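With BFD and tuned SPF timers, modern OSPF implementations complete this cycle in under 1 second. With default timers and a failure that is detected only by Dead interval expiry (for example, a failure behind a switch that leaves the local interface up), convergence can take 40+ seconds.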
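Step 4 can be sketched in a few lines of Python. This is a textbook Dijkstra over a cost-weighted adjacency map, not FRR's implementation; the function name `spf` and the three-router triangle are hypothetical (Lab 1's hub-and-spoke topology has no redundant path, so a triangle is used to show a reroute).

```python
import heapq


def spf(graph: dict[str, dict[str, int]], root: str):
    """Return (cost to each node, first hop toward each node) as seen from root."""
    dist = {root: 0}
    first_hop = {}
    heap = [(0, root, None)]  # (cost so far, node, first hop used to get there)
    while heap:
        cost, node, hop = heapq.heappop(heap)
        if cost > dist.get(node, float("inf")):
            continue  # stale heap entry; a cheaper path was already found
        for neighbor, link_cost in graph.get(node, {}).items():
            new_cost = cost + link_cost
            if new_cost < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_cost
                # Directly attached neighbors are their own first hop.
                next_hop = neighbor if node == root else hop
                first_hop[neighbor] = next_hop
                heapq.heappush(heap, (new_cost, neighbor, next_hop))
    return dist, first_hop


# Hypothetical triangle: the R1-R3 link is slower (cost 30) than going via R2.
graph = {
    "R1": {"R2": 10, "R3": 30},
    "R2": {"R1": 10, "R3": 10},
    "R3": {"R1": 30, "R2": 10},
}
before = spf(graph, "R1")

# Simulate the R2-R3 link failing, then re-run SPF as each router would.
del graph["R2"]["R3"]
del graph["R3"]["R2"]
after = spf(graph, "R1")

print("before failure:", before)
print("after failure: ", after)
```

Before the failure R1 reaches R3 through R2 at cost 20; afterwards the recomputed tree falls back to the direct link at cost 30.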

2.4 Measuring Convergence

Lab 1 has you deliberately force a link failure and measure the time from failure to full convergence. Tools:

# Method 1: watch show ip ospf neighbor -- observe Full -> Down -> Full transition
watch -n 0.2 "docker exec clab-ospf4-r1 vtysh -c 'show ip ospf neighbor'"

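# Method 2: continuous ping across the path that reroutes -- count the missing replies
docker exec -it clab-ospf4-r3 sh -c "ping -i 0.2 -c 200 10.0.3.1"

# Method 3: capture OSPF packets for Wireshark -- timestamp the LSA flood (LS Update / LS Ack).
# SPF completion is not a packet on the wire; read it from "show ip ospf" or the FRR log.
# containerlab links live inside each node's network namespace, so capture from there.
sudo ip netns exec clab-ospf4-r1 tcpdump -i eth1 -w /tmp/ospf_convergence.pcap "ip proto 89"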

Wireshark dissects OSPF packets natively. The OSPF protocol value is 89 in the IP header (display filter: ospf).
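For Method 2, the outage can be estimated from the run of missing sequence numbers multiplied by the ping interval. The Python sketch below assumes iputils-style output ("64 bytes from ...: icmp_seq=N ..."); BusyBox ping prints `seq=N` instead and would need a different pattern. The function name `longest_outage` and the sample text are illustrative.

```python
import re


def longest_outage(ping_output: str, interval_s: float) -> float:
    """Estimate the longest outage, in seconds, from saved ping output.

    Only successful replies ("N bytes from ...") are counted; error lines
    such as "Destination Net Unreachable" also carry icmp_seq and are ignored.
    """
    pattern = re.compile(r"^\d+ bytes from .*\bicmp_seq=(\d+)", re.MULTILINE)
    seqs = sorted(int(s) for s in pattern.findall(ping_output))
    longest_gap = 0
    for prev, cur in zip(seqs, seqs[1:]):
        # cur - prev - 1 is the number of probes that got no reply in between.
        longest_gap = max(longest_gap, cur - prev - 1)
    return longest_gap * interval_s


# Made-up sample: replies 3 through 7 are missing.
sample = """\
64 bytes from 10.0.3.1: icmp_seq=1 ttl=62 time=0.21 ms
64 bytes from 10.0.3.1: icmp_seq=2 ttl=62 time=0.20 ms
From 10.0.1.2 icmp_seq=3 Destination Net Unreachable
64 bytes from 10.0.3.1: icmp_seq=8 ttl=62 time=0.25 ms
"""

print(f"longest outage: {longest_outage(sample, 0.2):.1f} s")
```

The resolution of this method is one ping interval, so with `-i 0.2` the estimate is only good to about 200 ms.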

2.5 Reference Bandwidth and Cost Tuning

The default OSPF reference bandwidth of 100 Mbps means GigE (1 Gbps) and FastEthernet (100 Mbps) both get cost 1. On any modern network, the reference bandwidth should be raised:

router ospf
 auto-cost reference-bandwidth 10000   ! value is in Mbps: 10000 = 10 Gbps

With 10 Gbps reference:

  • 1 Gbps link: cost = 10,000 / 1,000 = 10
  • 10 Gbps link: cost = 10,000 / 10,000 = 1

This allows OSPF to differentiate 1 Gbps from 10 Gbps paths, which the default reference bandwidth cannot do. Set the same value on every router in the domain so that costs stay comparable.
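The arithmetic above can be checked with a short Python helper. It assumes the usual behavior of integer division with the result clamped to the valid interface-cost range of 1 to 65535; the function name `ospf_cost` is illustrative.

```python
def ospf_cost(link_mbps: int, reference_mbps: int = 100) -> int:
    """Interface cost = reference bandwidth / link bandwidth, clamped to 1..65535."""
    return min(65535, max(1, reference_mbps // link_mbps))


for link in (100, 1_000, 10_000):
    print(
        f"{link:>6} Mbps  default ref: cost {ospf_cost(link):>2}   "
        f"10 Gbps ref: cost {ospf_cost(link, 10_000):>3}"
    )
```

With the default reference every link at or above 100 Mbps collapses to cost 1; with a 10 Gbps reference the three speeds get costs 100, 10, and 1.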


Lab Preview

Continue Lab 1 (from Week 1). Focus this week on:

  • Multi-area topology with ABR Type 3 LSA verification
  • Deliberate link-failure + convergence measurement (BFD disabled for default-timer behavior; measure 40-sec dead-interval baseline; then tune)
  • show ip ospf command summary documentation for your Toolchain Diary

Homework

Reading (45 min): Kurose-Ross 9e Ch 5.4 (The SDN control plane; just the section on routing-algorithm refresh to contextualize OSPF vs BGP vs SDN). If you have Doyle-Carroll Vol. 1: read Chapter 6 (IS-IS Overview), focusing on the level hierarchy and TLV architecture difference from OSPF.

Hands-on (60 min): Add BFD support to your two-router OSPF topology and compare convergence time with and without BFD. FRR supports BFD via bfdd. Document the difference in seconds.

# FRR BFD configuration (vtysh)
configure terminal
 bfd
  peer 10.0.1.2 local-address 10.0.1.1
   detect-multiplier 3
   receive-interval 300
   transmit-interval 300
  !
 !
 interface eth1
  ip ospf bfd
 !
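As a rough expectation for the comparison, the BFD detection time in asynchronous mode is the remote detect multiplier times the larger of the local receive interval and the remote transmit interval (RFC 5880). The Python sketch below applies that to the values configured above, assuming both peers use the same settings; the function name `bfd_detection_ms` is illustrative.

```python
def bfd_detection_ms(remote_multiplier: int, local_rx_ms: int, remote_tx_ms: int) -> int:
    """Time without BFD packets after which the session is declared down."""
    return remote_multiplier * max(local_rx_ms, remote_tx_ms)


OSPF_DEAD_INTERVAL_MS = 40_000  # default Dead interval (4 x 10-second Hello)

bfd_ms = bfd_detection_ms(remote_multiplier=3, local_rx_ms=300, remote_tx_ms=300)
print(f"BFD detection:       {bfd_ms} ms")
print(f"Hello-timeout worst: {OSPF_DEAD_INTERVAL_MS} ms")
```

Detection is only one part of convergence, so the measured difference in your lab will also include LSA flooding and SPF scheduling delays.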

Toolchain Diary Entry

Deepen this week: FRR vtysh debugging suite

show ip ospf: OSPF process summary -- Router ID, area count, SPF runs, flood count.

show ip ospf database router: all Type 1 Router LSAs -- see every router's self-reported link costs.

show ip ospf interface eth1: per-interface OSPF state -- DR/BDR election, Hello timer, Dead interval, neighbor count.

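show ip ospf statistics: SPF computation history on Cisco IOS. This command is not part of FRR's vtysh; in FRR, read the "SPF algorithm last executed" line and the per-area "SPF algorithm executed N times" counters in the show ip ospf output instead.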

clear ip ospf process: restart the OSPF process; forces re-establishment of all adjacencies. Warning: causes network disruption; use only in lab.


Key Terms

  • IS-IS: Intermediate System to Intermediate System; link-state IGP; runs directly over Layer 2; dominant at ISP backbone; level 1/2 hierarchy mirrors OSPF area hierarchy
  • BFD: Bidirectional Forwarding Detection; sub-second link-failure detection protocol; decoupled from the routing protocol itself; cuts failure detection from the 40-second Dead interval to under 1 second, which is what allows sub-second OSPF convergence
  • SPF: Shortest Path First; Dijkstra's algorithm as implemented inside OSPF/IS-IS; re-executes after topology changes, subject to the SPF delay and hold timers
  • Reference bandwidth: the numerator in OSPF's default cost formula (cost = reference bandwidth / interface bandwidth; default reference 10^8 bps = 100 Mbps); should be set to at least the speed of the fastest link in the domain, identically on every router, to allow cost differentiation
  • Type 3 Summary LSA: the LSA type generated by an ABR to advertise inter-area routes; enables OSPF route propagation across area boundaries
  • Dead interval: the time an OSPF router waits after the last Hello from a neighbor before declaring the neighbor dead and re-running SPF; default 4x Hello interval (40 sec)