
NET-301 Week 4 -- Network Automation: Ansible, Salt, Nornir, Python Network Engineering


"The network is a large, distributed, heterogeneous system. It has no single control point, no unified API, and was designed by committee across decades. Network automation is the discipline of imposing programmatic control on a system that was not designed for it." -- Practitioner observation; see also Dutt Cloud Native Data Center Networking Ch 9


Lecture (50 min -- abbreviated; follow with Lab 4 immediately)

4.1 Why Network Automation Is at NET-301 Depth

NET-201's manual configuration workflow -- log into each device, enter configuration commands, verify, document -- is the standard practice for networks up to about 50 devices. It does not scale to the carrier and datacenter environments of the preceding weeks. A spine-leaf fabric with 100 switches, a BGP route-reflector hierarchy with 200 routers, or a distributed NSM deployment with 20 Suricata sensors cannot be managed manually without errors and inconsistency, and the operational overhead consumes engineering time better spent on security and design work.

Network automation at NET-301 depth is not "scripting your SSH sessions." It is idempotent configuration management -- the same discipline that infrastructure-as-code applies to servers, applied to network devices that speak Cisco IOS, Junos, Arista EOS, or FRR. The three tools that cover the production space are Ansible Network, Nornir, and Salt with NAPALM.

4.2 Ansible Network Automation

Ansible (Red Hat) uses YAML-based playbooks and an agentless model: it connects to network devices via SSH or vendor APIs, pushes configuration, and reports results without requiring software installation on the device.

Key concepts:

Concept     Description
Inventory   A file (YAML or INI) listing all managed devices with their connection parameters
Playbook    A YAML file defining tasks to execute against an inventory group
Task        A single operation: push a config, run a command, verify an output
Module      The implementation of a task type; cisco.ios.ios_config, arista.eos.eos_command, etc.
Role        A reusable bundle of playbook tasks for a specific function
Fact        Device state gathered by Ansible; used in conditional logic

Network collection structure: vendor collections (cisco.ios, arista.eos, juniper.junos) follow a consistent module taxonomy:

  • *_config modules: push configuration sections
  • *_command modules: run show commands and return the raw command output, which must be parsed to obtain structured data
  • *_facts modules: gather device state as structured facts
  • resource modules (for example ios_vlans, ios_interfaces, ios_bgp_global): idempotent management of specific resource types (BGP peers, VLANs, interfaces)

Idempotency: an idempotent playbook can be run repeatedly without side effects. The module checks current device state before pushing; if the desired state already matches, no configuration is sent. This is the critical property that makes automation safe in production -- re-running a playbook after a partial failure does not double-apply configuration.
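
The check-before-push behaviour described above can be sketched in a few lines of Python. This is a simplified illustration, not how any Ansible module is implemented: FakeDevice, missing_lines, and apply are hypothetical names, and the comparison is a flat line match that ignores configuration hierarchy, which real modules have to account for.

```python
def missing_lines(desired: list[str], running: list[str]) -> list[str]:
    """Return the desired config lines that are not already on the device."""
    present = {line.strip() for line in running}
    return [line for line in desired if line.strip() not in present]


class FakeDevice:
    """Hypothetical in-memory stand-in for a device; not a real library class."""

    def __init__(self) -> None:
        self.running: list[str] = []
        self.pushes = 0  # counts how many times configuration was actually sent

    def push(self, lines: list[str]) -> None:
        self.running.extend(lines)
        self.pushes += 1


def apply(device: FakeDevice, desired: list[str]) -> bool:
    """Push only what is missing; return True if anything changed."""
    delta = missing_lines(desired, device.running)
    if not delta:
        return False  # desired state already present: send nothing
    device.push(delta)
    return True


desired = ["router bgp 65001", "neighbor 10.0.0.2 remote-as 65002"]
device = FakeDevice()
first = apply(device, desired)   # device was empty, so both lines are sent
second = apply(device, desired)  # nothing is missing, so nothing is sent
print(first, second, device.pushes)
```

The second call reports no change and sends nothing, which is the property Lab 4 asks you to confirm by running a playbook twice.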

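# Minimal example: configure a BGP peer on FRR via Ansible.
# Assumes the inventory sets ansible_connection=ansible.netcommon.network_cli
# and ansible_network_os=frr.frr.frr for the spine_routers group.
- name: Configure FRR BGP peer
  hosts: spine_routers
  gather_facts: false
  tasks:
    - name: Set BGP peer
      ansible.netcommon.cli_config:
        # cli_config pushes these lines over the CLI connection
        config: |
          router bgp 65001
            neighbor 10.0.0.2 remote-as 65002
            neighbor 10.0.0.2 description "leaf-01"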

4.3 Nornir: Python-Native Network Automation

Nornir is a Python framework that provides the same inventory + task + runner model as Ansible but implemented in pure Python, with tasks written as Python functions instead of YAML playbooks. It suits engineers who write Python and want full control over task logic, error handling, and integration with custom data sources.

Nornir architecture:

Component          Role
Inventory          Hosts + Groups + Defaults; populated from YAML files or dynamic sources
Runner             Executes tasks; default is threaded (parallel per host)
Task               A plain Python function that takes a Nornir Task object as its first argument and returns a Result
Result             Structured return from a task; one result per host
ConnectionPlugin   Backend for device communication; Netmiko (SSH), NAPALM, Scrapli, httpx

from nornir import InitNornir
from nornir.core.filter import F
from nornir_netmiko import netmiko_send_command

# Load inventory and runner settings from the Nornir config file
nr = InitNornir(config_file="config.yaml")

# Filter to hosts that belong to the "spines" group
spines = nr.filter(F(groups__contains="spines"))

# Run a command on all spines in parallel
result = spines.run(task=netmiko_send_command, command_string="show bgp summary")

# result maps each host name to a list of task results; [0] is the top-level task
for host, r in result.items():
    print(f"{host}: {r[0].result}")
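
The inventory + task + runner model itself can be sketched without Nornir installed. The block below is a stand-in that uses only the standard library. INVENTORY, check_bgp, run_on_all, and HostResult are hypothetical names for this illustration and are not Nornir APIs. The task only formats a string where a real task would open a device connection.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class HostResult:
    host: str
    result: str
    failed: bool = False


# Hypothetical inventory: host name -> attributes (group membership, address)
INVENTORY = {
    "spine-01": {"groups": ["spines"], "address": "10.0.1.1"},
    "spine-02": {"groups": ["spines"], "address": "10.0.1.2"},
    "leaf-01": {"groups": ["leaves"], "address": "10.0.2.1"},
}


def check_bgp(host: str, attrs: dict) -> HostResult:
    """A task: one function, called once per host, returning one result."""
    # A real task would open a connection here; this one only formats a string.
    return HostResult(host=host, result=f"would run 'show bgp summary' on {attrs['address']}")


def run_on_all(task, inventory: dict, group: str, workers: int = 4) -> dict:
    """A runner: filter the inventory, then run the task per host in threads."""
    selected = {h: a for h, a in inventory.items() if group in a["groups"]}
    results = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {h: pool.submit(task, h, a) for h, a in selected.items()}
        for host, future in futures.items():
            try:
                results[host] = future.result()
            except Exception as exc:  # one host failing must not stop the others
                results[host] = HostResult(host=host, result=str(exc), failed=True)
    return results


results = run_on_all(check_bgp, INVENTORY, group="spines")
for host, r in sorted(results.items()):
    print(f"{host}: {r.result}")
```

Keeping one result per host, including failures, is what lets a run across a large fabric report partial success instead of aborting on the first unreachable device.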

When to use Nornir vs Ansible:

Scenario                                          Preferred tool
Multi-team environment; non-Python background     Ansible (readable YAML playbooks)
Complex conditional logic; custom data sources    Nornir (full Python)
Existing Ansible infrastructure                   Ansible
Integration with Python testing framework         Nornir
CI/CD pipeline with Python test harness           Nornir

4.4 NAPALM: Network Abstraction Layer

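NAPALM (Network Automation and Programmability Abstraction Layer with Multivendor support) provides a unified API across network operating systems. A script using NAPALM's get_bgp_neighbors() method works without modification against any platform whose driver implements it. The core drivers cover Cisco IOS, IOS-XR, NX-OS, Arista EOS, and Junos; other platforms, including FRR, depend on community-maintained drivers, so check driver availability before relying on one.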

from napalm import get_network_driver

# Pick the driver for the target platform: "eos", "ios", "junos", "nxos", ...
driver = get_network_driver("eos")

# Lab credentials only; load real credentials from a secret store (see 4.6)
with driver("10.0.1.1", "admin", "password") as device:
    # Result is keyed by VRF name; "global" is the default routing table
    bgp_neighbors = device.get_bgp_neighbors()
    for peer_ip, peer_data in bgp_neighbors["global"]["peers"].items():
        state = "UP" if peer_data["is_up"] else "DOWN"
        print(f"{peer_ip}: {peer_data['description']} -- {state}")

NAPALM is frequently used as the connection plugin within Nornir (nornir-napalm) and as a set of Ansible modules (napalm-ansible).
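
The abstraction-layer idea is that each driver translates platform-specific output into one common shape, so caller code never branches on vendor. The sketch below illustrates this with two hypothetical drivers. TextDriver and JsonDriver are invented for this example, and their input formats are simplified placeholders, not real device output or NAPALM internals.

```python
import json
from abc import ABC, abstractmethod


class BaseDriver(ABC):
    """Common interface; every platform driver returns the same shape."""

    @abstractmethod
    def get_bgp_neighbors(self) -> dict:
        ...


class TextDriver(BaseDriver):
    """Hypothetical platform that reports peers as whitespace-separated text."""

    def __init__(self, raw: str) -> None:
        self.raw = raw

    def get_bgp_neighbors(self) -> dict:
        peers = {}
        for line in self.raw.strip().splitlines():
            ip, remote_as, state = line.split()
            peers[ip] = {"remote_as": int(remote_as), "is_up": state == "Established"}
        return {"global": {"peers": peers}}


class JsonDriver(BaseDriver):
    """Hypothetical platform that reports peers as JSON."""

    def __init__(self, raw: str) -> None:
        self.raw = raw

    def get_bgp_neighbors(self) -> dict:
        peers = {}
        for entry in json.loads(self.raw)["neighbors"]:
            peers[entry["addr"]] = {"remote_as": entry["asn"], "is_up": entry["up"]}
        return {"global": {"peers": peers}}


def down_peers(device: BaseDriver) -> list:
    """Caller code is identical for every driver."""
    peers = device.get_bgp_neighbors()["global"]["peers"]
    return sorted(ip for ip, data in peers.items() if not data["is_up"])


text_dev = TextDriver("10.0.0.2 65002 Established\n10.0.0.3 65003 Idle")
json_dev = JsonDriver('{"neighbors": [{"addr": "10.0.0.2", "asn": 65002, "up": true},'
                      ' {"addr": "10.0.0.3", "asn": 65003, "up": false}]}')
print(down_peers(text_dev), down_peers(json_dev))
```

Both drivers produce the same normalized dictionary, so down_peers works unchanged against either one.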

4.5 Infrastructure as Code and Git-Based Network Configuration

The operational model that enterprise and carrier teams increasingly adopt: network configuration lives in a Git repository. The workflow:

  1. Engineer creates a branch, edits device configuration templates (Jinja2 templates rendering to device-specific config)
  2. CI pipeline validates the rendered config (syntax check, policy check, RPKI ROA query, etc.)
  3. Peer review; merge to main
  4. CD pipeline runs the Ansible playbook or Nornir script against the staging topology; runs automated tests
  5. Change window: run against production; the tool reports diff + result

This is the "GitOps for networking" model. The security benefit: every configuration change is audited in Git history; rollbacks are explicit (git revert); changes require review.
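
Steps 1 and 2 of the workflow above (render from templates, then validate in CI) can be sketched as follows. To keep the example dependency-free, string.Template stands in for Jinja2, and DEVICES stands in for a YAML data file. Both are assumptions of this sketch, as are the function names validate and render_all and the specific checks chosen.

```python
import ipaddress
from string import Template

# string.Template stands in for Jinja2 here so the sketch has no dependencies.
BGP_TEMPLATE = Template(
    "router bgp $local_as\n"
    " neighbor $peer_ip remote-as $peer_as\n"
)

# Hypothetical per-device data, as it might be stored in a YAML file in Git.
DEVICES = {
    "spine-01": {"local_as": 65001, "peer_ip": "10.0.0.2", "peer_as": 65002},
    "leaf-01": {"local_as": 65002, "peer_ip": "10.0.0.1", "peer_as": 65001},
}


def validate(data: dict) -> list:
    """CI-style checks on the input data; returns a list of problems."""
    problems = []
    try:
        ipaddress.ip_address(data["peer_ip"])
    except ValueError:
        problems.append(f"invalid peer address: {data['peer_ip']}")
    for key in ("local_as", "peer_as"):
        # AS numbers must be nonzero and fit in 32 bits
        if not 1 <= data[key] <= 4294967295:
            problems.append(f"{key} out of range: {data[key]}")
    return problems


def render_all(devices: dict) -> dict:
    """Render every device, refusing to render data that fails validation."""
    rendered = {}
    for name, data in devices.items():
        problems = validate(data)
        if problems:
            raise ValueError(f"{name}: {'; '.join(problems)}")
        rendered[name] = BGP_TEMPLATE.substitute(data)
    return rendered


configs = render_all(DEVICES)
print(configs["spine-01"])
```

Failing the pipeline on bad input data, before anything is rendered or pushed, is what makes the peer-review step meaningful: reviewers see only changes that already pass the mechanical checks.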

4.6 Security Considerations in Network Automation

Network automation tools run with privileged access to devices that route production traffic. Security posture requirements:

Credential management: never commit device passwords to Git. Use Ansible Vault, HashiCorp Vault, or Kubernetes Secrets for credential storage. Nornir supports external secret backends via plugins.
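
One simple way to keep credentials out of the repository is to have the CI/CD runner or a vault agent inject them as environment variables at run time. The sketch below assumes that setup. NET_USERNAME and NET_PASSWORD are hypothetical variable names, and load_credentials is an illustrative helper, not part of Ansible or Nornir.

```python
import os


class MissingCredential(RuntimeError):
    """Raised when a required credential is not present in the environment."""


def load_credentials(env=None) -> dict:
    """Read device credentials from environment variables.

    NET_USERNAME and NET_PASSWORD are hypothetical names chosen for this sketch.
    """
    env = os.environ if env is None else env
    creds = {}
    for field, var in (("username", "NET_USERNAME"), ("password", "NET_PASSWORD")):
        value = env.get(var)
        if not value:
            # Fail loudly instead of falling back to a default password.
            raise MissingCredential(f"environment variable {var} is not set")
        creds[field] = value
    return creds


# Demonstration with an explicit mapping, so the sketch runs anywhere.
demo = load_credentials({"NET_USERNAME": "automation", "NET_PASSWORD": "example-only"})
print(demo["username"])
```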

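SSH host key verification: automation tools should verify SSH host keys. In Ansible, keep host_key_checking enabled (the default; it corresponds to OpenSSH's StrictHostKeyChecking=yes), and enable the equivalent setting in Netmiko and NAPALM. A man-in-the-middle on the automation control channel can capture credentials and alter pushed configuration, which amounts to a complete network takeover.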
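
The logic behind host key verification is a comparison between the key a device presents and a fingerprint recorded in advance. The sketch below shows that comparison using the OpenSSH SHA256 fingerprint format. The keys are placeholder bytes rather than real SSH keys, and host_key_trusted and the pinned dictionary are hypothetical; in practice the SSH library performs this check against a known_hosts file.

```python
import base64
import hashlib
import hmac


def sha256_fingerprint(public_key_b64: str) -> str:
    """Fingerprint of a base64 public-key blob, in OpenSSH's SHA256: format."""
    digest = hashlib.sha256(base64.b64decode(public_key_b64)).digest()
    return "SHA256:" + base64.b64encode(digest).decode("ascii").rstrip("=")


def host_key_trusted(host: str, presented_key_b64: str, pinned: dict) -> bool:
    """Accept the connection only if the host is pinned and the key matches."""
    expected = pinned.get(host)
    if expected is None:
        return False  # unknown host: reject rather than trust on first use
    return hmac.compare_digest(expected, sha256_fingerprint(presented_key_b64))


# Placeholder bytes, not real SSH keys; they only exercise the comparison.
good_key = base64.b64encode(b"example-host-key-blob").decode("ascii")
other_key = base64.b64encode(b"attacker-key-blob").decode("ascii")
pinned = {"spine-01": sha256_fingerprint(good_key)}

print(host_key_trusted("spine-01", good_key, pinned))   # key matches the pin
print(host_key_trusted("spine-01", other_key, pinned))  # key differs from the pin
```

Rejecting unknown hosts outright is a deliberate choice here: an automation account that silently accepts new keys gives an attacker exactly the opening this control is meant to close.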

Principle of least privilege: automation service accounts should have configuration privileges only for the specific devices they manage. TACACS+ per-command authorization limits damage if credentials are compromised.

Audit trail: Ansible/Nornir run logs should be shipped to SIEM. A configuration change pushed outside the normal CI/CD pipeline is a detection opportunity.
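
Detecting an out-of-pipeline change can be as simple as comparing a hash of the configuration the pipeline last deployed with a hash of what the device is running now, and emitting the comparison as a structured log line for the SIEM. The sketch below assumes both configurations are already available as strings. The function names and the JSON field names are illustrative, not a standard schema.

```python
import hashlib
import json
from datetime import datetime, timezone


def config_digest(config: str) -> str:
    """Hash a config after trimming trailing whitespace on each line."""
    normalized = "\n".join(line.rstrip() for line in config.strip().splitlines())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def audit_event(host: str, deployed: str, running: str) -> str:
    """Build one JSON log line comparing pipeline-deployed and running config."""
    event = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "host": host,
        "deployed_sha256": config_digest(deployed),
        "running_sha256": config_digest(running),
    }
    event["out_of_band_change"] = event["deployed_sha256"] != event["running_sha256"]
    return json.dumps(event, sort_keys=True)


deployed = "router bgp 65001\n neighbor 10.0.0.2 remote-as 65002\n"
tampered = deployed + " neighbor 10.9.9.9 remote-as 64999\n"

clean = json.loads(audit_event("spine-01", deployed, deployed))
drift = json.loads(audit_event("spine-01", deployed, tampered))
print(clean["out_of_band_change"], drift["out_of_band_change"])
```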


Kurose-Ross / Dutt Weave

Dutt's Ch 9 covers network automation from the cloud-native datacenter perspective. The central argument: automation is not optional at scale. A datacenter operator who manually configures 100+ switches will make errors that an automated system would not. The security implication is the reverse: an automation system that lacks audit controls introduces a single point of compromise that can reconfigure the entire fabric. Network automation is a force multiplier for both good and bad.


Lab 4 Introduction

Lab 4 uses the Containerlab topology from Labs 1-2 as the managed inventory. You will author an Ansible playbook that idempotently configures BGP peer relationships across the spine-leaf fabric, then author the same configuration in Nornir using the Netmiko connection plugin. You will verify idempotency by running the playbooks twice and confirming no configuration is re-sent on the second run. A Jinja2 template for generating per-device FRR configuration from a YAML inventory will demonstrate the IaC model at small scale.


Independent Practice (6 hr)

  1. Ansible Network Getting Started: docs.ansible.com/ansible/latest/network/ -- read through "Getting Started" + "Platform Guides" for linux/frr
  2. Nornir documentation: nornir.readthedocs.io -- read "Getting Started" + "Plugins"
  3. NAPALM documentation: napalm.readthedocs.io -- read "Supported Devices" + "Base Driver Methods"
  4. Lab 4 -- full lab (all three parts; this week has no second lecture block)
  5. Supplemental: read the Salt + NAPALM network-automation blog post from NTT (publicly archived; ~30 min)