Automation: Running napalm-ansible to manage configurations on UNetLab

I’ve been tinkering with napalm, pyntc, and their custom Ansible modules for a while now, and I’ve been using napalm-ansible to manage all the configuration of my devices on UNetLab (eve-ng fork). In short, I leverage UNetLab purely as a backend to run QEMU instances, while the configuration of every device is managed with napalm-ansible. In this post, I’ll show you what my UNetLab infrastructure looks like and exactly how I manage the configurations with napalm-ansible.

Infrastructure

I run the UNetLab virtual machine on an Ubuntu host that uses QEMU/KVM as its main hypervisor, managed through libvirt. Currently, these are the main libvirt-managed networks/bridges on my lab infrastructure that I use with eve-ng:

  • br_mgmt
    • 192.168.101.254/24, used for ssh management
  • br_core
    • 192.168.102.254/24, flat network to segregate the network core based on VLANs
  • br_ext1
    • 192.168.201.254/24, used for external networks (Docker and other applications that I develop, for example)
  • br_ext2
    • 192.168.202.254/24, used for external networks

The br_mgmt network provides external management connectivity, so from my Ubuntu host (which is my laptop) I can reach this network and manage every device. br_core segregates the L3 core with VLANs, so I don’t have to keep wiring virtual network cables in the UNetLab web interface: I wire everything once and build as many logical topologies on top of it as I want. For example, here’s the definition of br_mgmt:

❯ sudo virsh net-dumpxml br_mgmt
<network>
  <name>br_mgmt</name>
  <uuid>38084376-ee42-461c-96a7-4116f2cee543</uuid>
  <forward mode='route'/>
  <bridge name='br_mgmt' stp='off' delay='0'/>
  <mac address='52:54:00:f9:f2:a3'/>
  <ip address='192.168.101.254' netmask='255.255.255.0'>
    <dhcp>
      <range start='192.168.101.33' end='192.168.101.94'/>
      <host mac='00:00:00:00:00:01' ip='192.168.101.101'/>
      <host mac='00:00:00:00:00:02' ip='192.168.101.102'/>
      <host mac='00:00:00:00:00:03' ip='192.168.101.103'/>
      <host mac='00:00:00:00:00:04' ip='192.168.101.104'/>
      <host mac='00:00:00:00:00:05' ip='192.168.101.105'/>
      <host mac='00:00:00:00:00:06' ip='192.168.101.106'/>
      <host mac='00:00:00:00:00:07' ip='192.168.101.107'/>
      <host mac='00:00:00:00:00:08' ip='192.168.101.108'/>
      <host mac='00:00:00:00:00:09' ip='192.168.101.109'/>
      <host mac='00:00:00:00:00:0a' ip='192.168.101.110'/>
      <host mac='00:00:00:00:00:fd' ip='192.168.101.253'/>
    </dhcp>
  </ip>
</network>

libvirt
In libvirt, <forward mode='route'> essentially disables NAT: the network is reachable locally and acts like a bridge, since all guest traffic is forwarded to the local network through the host’s IP routing stack. I also enabled DHCP on this network to simplify addressing. Since I run and access this network locally on my Ubuntu host, this mode suits my use case.
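For reference, such a network can be defined once from an XML file and left to libvirt to manage; a minimal virsh sketch, assuming the definition above is saved as br_mgmt.xml:

❯ sudo virsh net-define br_mgmt.xml   # register the network from its XML definition
❯ sudo virsh net-autostart br_mgmt    # start it automatically with libvirtd
❯ sudo virsh net-start br_mgmt        # bring the bridge up right away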

Now that I have the br_mgmt management network, all I have to do is connect it to every device I want to manage inside UNetLab. I mapped this network to network adapter 1, which shows up as the eth1 interface inside UNetLab. As you may know, external networks are connected to UNetLab through Linux bridges, a.k.a. pnet interfaces, and the pnet1 bridge contains this eth1 interface. As a result, anything connected to pnet1 is bridged to br_mgmt (192.168.101.0/24). Here’s the definition of the eve-ng VM on libvirt:

❯ sudo virsh dumpxml eveng
...
<truncated output>
...
<interface type='network'>
  <mac address='52:54:00:41:2d:35'/>
  <source network='default'/>
  <model type='rtl8139'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
</interface>
<interface type='network'>
  <mac address='00:00:00:00:00:fd'/>
  <source network='br_mgmt'/>
  <model type='rtl8139'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
</interface>
<interface type='network'>
  <mac address='52:54:00:d7:0a:8c'/>
  <source network='br_core'/>
  <model type='rtl8139'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
</interface>
...
<truncated output>
...
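If you want to sanity-check the pnet mapping, you can list the bridge members from inside the UNetLab VM; a quick sketch, with purely illustrative output (bridge IDs and member interfaces will differ on your install):

❯ brctl show pnet1
bridge name     bridge id               STP enabled     interfaces
pnet1           8000.0cad00beef01       no              eth1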

Information
If you run a hypervisor other than QEMU/KVM, you’ll have to figure out how to create a virtual network and bridge this management network to the UNetLab virtual machine. Also, if you have a distributed environment, don’t forget to create the routes needed for remote L3 reachability.
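For instance, if the UNetLab box lived on a remote hypervisor host, a plain static route from my laptop toward that host would be enough; a sketch with a hypothetical next-hop address:

❯ sudo ip route add 192.168.101.0/24 via 192.168.0.10   # 192.168.0.10 stands in for the remote UNetLab host's address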

Topology demo

First of all, you have to decide how many devices your topology will consist of and connect all instances to your external interfaces. For example, six_routers is a topology I often use to test Junos. As you can see below, all instances are connected to both br_mgmt (pnet1) and br_core (pnet2). You also have to enable SSH on every device and configure its management IP address (in my case, the first interface of every device is connected to br_mgmt); see the sketch just after the topology figure.

topology
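Speaking of SSH and management addresses, the device-side bootstrap is tiny. A minimal Junos sketch (the interface name and address below are placeholders for whatever your topology uses); note that napalm’s Junos driver talks NETCONF over SSH, so I enable that service as well:

set system services ssh
set system services netconf ssh
set interfaces ge-0/0/0 unit 0 family inet address 192.168.101.101/24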

For this six_routers topology, for instance, I have several logical configurations, such as:

  • base
    • Basic IP addressing management
  • l3
    • IPv4 and IPv6 core uplinks segregated with VLANs on top of base
  • flat_ospf
    • Flat OSPF backbone on top of l3
  • flat_ospf_ldp
    • Flat LDP MPLS backbone on top of flat_ospf
  • flat_ospf_rsvp
    • Flat RSVP MPLS backbone on top of flat_ospf
  • flat_ospf_ldp_bgp
    • BGP unicast IPv4/IPv6 on top of flat_ospf_ldp
  • l3vpn
    • BGP vpnv4 on top of flat_ospf_ldp
  • l2circuit
    • MPLS L2VPNs on top of flat_ospf_ldp

So, depending on what I want to test, I push the corresponding configuration with napalm-ansible. This lets me switch quickly between logical configurations and start developing or testing whatever I need, and I can branch out a new logical configuration whenever I need one that doesn’t exist yet. This scales nicely, and over the past months it’s been quite convenient. All devices of the six_routers topology are in the Ansible inventory:

[junos]
dev1
dev2
dev3
dev4
dev5
dev6

Napalm playbooks

Before diving into the demo, let me show you the playbooks that are part of this workflow.

  • stage.yml is used for staging configuration. The prefix variable holds the name of a logical configuration, such as l3 or flat_ospf, whose compiled file lives under the compiled directory. The replace variable controls whether to fully replace the configuration or just merge it; if it’s set to false, a merge takes place. The remaining variables are just credentials and the default driver options (Junos) used to access the devices (see the group_vars sketch after this list). Lastly, commit_changes is set to False because I only want to stage the configuration, not commit it yet; all diff files are written to the current directory.
---
- hosts: all
  connection: local
  gather_facts: False
  vars:
    prefix: "temp"
    replace: False
  tasks:
    - name: "Staging {{ prefix }} configuration"
      napalm_install_config:
        hostname: "{{ inventory_hostname }}"
        username: "{{ username }}"
        dev_os: "{{ dev_os }}"
        password: "{{ password }}"
        config_file: "compiled/{{ prefix }}_{{ inventory_hostname }}.cfg"
        commit_changes: False
        diff_file: "{{ prefix }}_{{ inventory_hostname }}.diff"
        replace_config: "{{ replace }}"
  • push.yml, whose major difference from stage.yml is that commit_changes is now true, so the configuration is actually committed to the device’s running configuration. My typical commit workflow is: run stage.yml first; if Ansible reports any changes, review the diff files; and if I’m satisfied with the results, run push.yml to commit.
---
- hosts: all
  connection: local
  gather_facts: False
  vars:
    prefix: "temp"
    replace: False
  tasks:
    - name: "Pushing {{ prefix }} configuration"
      napalm_install_config:
        hostname: "{{ inventory_hostname }}"
        username: "{{ username }}"
        dev_os: "{{ dev_os }}"
        password: "{{ password }}"
        config_file: "compiled/{{ prefix }}_{{ inventory_hostname }}.cfg"
        commit_changes: True
        diff_file: "{{ prefix }}_{{ inventory_hostname }}.diff"
        replace_config: "{{ replace }}"
  • fetch.yml is used for fetching the running config of all devices. It comes in handy when a device’s configuration is not yet fully under version control and I need a starting point, or when generating the configuration from scratch is not an option (lack of APIs):
---
- hosts: all
  connection: local
  gather_facts: False
  vars:
    prefix: "temp"
  tasks:
    - name: "Fetching config prefix {{ prefix }}"
      napalm_fetch_running:
        hostname: "{{ inventory_hostname }}"
        username: "{{ username }}"
        dev_os: "{{ dev_os }}"
        password: "{{ password }}"
        archive_file: "compiled/{{ prefix }}_{{ inventory_hostname }}.cfg"
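For completeness, the username, password, and dev_os variables referenced in these playbooks have to come from somewhere; one common place is the inventory’s group variables. A minimal sketch, with an illustrative file name and made-up credentials:

❯ cat group_vars/junos.yml
---
username: "lab"
password: "lab123"
dev_os: "junos"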

Napalm Ansible demo

Now, let me show you a demo of a typical use case. First, let’s make sure flat_ospf is the current (running) configuration of all devices. I’ll stage flat_ospf; if Ansible reports changes, the running configuration differs from flat_ospf:

Information
When I started this lab, all devices were running the l3 configuration.

~/repos/napalm-ansible develop*
❯ ansible-playbook stage.yml -e "prefix=flat_ospf replace=true"
PLAY [all] ************************************************************************************************************************************************************************************
TASK [Staging flat_ospf configuration] ********************************************************************************************************************************************************
changed: [dev2]
changed: [dev4]
changed: [dev1]
changed: [dev3]
changed: [dev5]
changed: [dev6]
PLAY RECAP ************************************************************************************************************************************************************************************
dev1 : ok=1 changed=1 unreachable=0 failed=0
dev2 : ok=1 changed=1 unreachable=0 failed=0
dev3 : ok=1 changed=1 unreachable=0 failed=0
dev4 : ok=1 changed=1 unreachable=0 failed=0
dev5 : ok=1 changed=1 unreachable=0 failed=0
dev6 : ok=1 changed=1 unreachable=0 failed=0
~/repos/napalm-ansible develop*

Ansible reported changes on all devices. Looking at the diff files of dev1 and dev6, for example:

  • flat_ospf_dev1.diff
[edit]
+   protocols {
+       ospf {
+           area 0.0.0.0 {
+               interface ge-0/0/1.12 {
+                   interface-type p2p;
+               }
+               interface ge-0/0/1.13 {
+                   interface-type p2p;
+               }
+               interface lo0.1 {
+                   passive;
+               }
+           }
+       }
+       ospf3 {
+           area 0.0.0.0 {
+               interface ge-0/0/1.12 {
+                   interface-type p2p;
+               }
+               interface ge-0/0/1.13 {
+                   interface-type p2p;
+               }
+               interface lo0.1 {
+                   passive;
+               }
+           }
+       }
+   }
  • flat_ospf_dev6.diff
[edit]
+   protocols {
+       ospf {
+           area 0.0.0.0 {
+               interface ge-0/0/1.34 {
+                   interface-type p2p;
+               }
+               interface ge-0/0/1.46 {
+                   interface-type p2p;
+               }
+               interface lo0.1 {
+                   passive;
+               }
+           }
+       }
+       ospf3 {
+           area 0.0.0.0 {
+               interface ge-0/0/1.34 {
+                   interface-type p2p;
+               }
+               interface ge-0/0/1.46 {
+                   interface-type p2p;
+               }
+               interface lo0.1 {
+                   passive;
+               }
+           }
+       }
+   }

After analyzing all the diffs, the changes are coherent, so let’s commit them:

~/repos/napalm-ansible develop*
❯ ansible-playbook push.yml -e "prefix=flat_ospf replace=true"
PLAY [all] ************************************************************************************************************************************************************************************
TASK [Pushing flat_ospf configuration] ********************************************************************************************************************************************************
changed: [dev5]
changed: [dev1]
changed: [dev4]
changed: [dev3]
changed: [dev2]
changed: [dev6]
PLAY RECAP ************************************************************************************************************************************************************************************
dev1 : ok=1 changed=1 unreachable=0 failed=0
dev2 : ok=1 changed=1 unreachable=0 failed=0
dev3 : ok=1 changed=1 unreachable=0 failed=0
dev4 : ok=1 changed=1 unreachable=0 failed=0
dev5 : ok=1 changed=1 unreachable=0 failed=0
dev6 : ok=1 changed=1 unreachable=0 failed=0
~/repos/napalm-ansible develop*

Done. No errors, so the configuration was properly pushed. Just out of curiosity, let’s push this configuration again:

~/repos/napalm-ansible develop*
❯ ansible-playbook push.yml -e "prefix=flat_ospf replace=true"
PLAY [all] ************************************************************************************************************************************************************************************
TASK [Pushing flat_ospf configuration] ********************************************************************************************************************************************************
ok: [dev3]
ok: [dev1]
ok: [dev2]
ok: [dev4]
ok: [dev5]
ok: [dev6]
PLAY RECAP ************************************************************************************************************************************************************************************
dev1 : ok=1 changed=0 unreachable=0 failed=0
dev2 : ok=1 changed=0 unreachable=0 failed=0
dev3 : ok=1 changed=0 unreachable=0 failed=0
dev4 : ok=1 changed=0 unreachable=0 failed=0
dev5 : ok=1 changed=0 unreachable=0 failed=0
dev6 : ok=1 changed=0 unreachable=0 failed=0
~/repos/napalm-ansible develop*

As you can see, since the napalm_install_config module is idempotent (it supports check_mode), no changes took place: the running configuration was already identical to the candidate configuration I was trying to push. Neat!

If you have to narrow down your changes to a specific group of devices or a single host, you can use Ansible’s --limit option. For instance, to stage the base configuration on dev1:

~/repos/napalm-ansible develop*
❯ ansible-playbook stage.yml -e "prefix=base replace=true" --limit dev1
PLAY [all] ************************************************************************************************************************************************************************************
TASK [Staging base configuration] *************************************************************************************************************************************************************
changed: [dev1]
PLAY RECAP ************************************************************************************************************************************************************************************
dev1 : ok=1 changed=1 unreachable=0 failed=0

Ansible reported changes. If you take a look at the diff file, you’ll see that everything previously pushed by l3 and flat_ospf would be removed:

~/repos/napalm-ansible develop*
❯ cat base_dev1.diff
[edit interfaces]
-   ge-0/0/1 {
-       vlan-tagging;
-       unit 12 {
-           vlan-id 12;
-           family inet {
-               address 10.1.2.1/24;
-           }
-           family inet6 {
-               address 2001:1:2::1/64;
-           }
-       }
-       unit 13 {
-           vlan-id 13;
-           family inet {
-               address 10.1.3.1/24;
-           }
-           family inet6 {
-               address 2001:1:3::3/64;
-           }
-       }
-   }
-   lo0 {
-       unit 1 {
-           family inet {
-               address 1.1.1.1/32;
-           }
-           family inet6 {
-               address 3001::1/128;
-           }
-       }
-   }
[edit]
-   protocols {
-       ospf {
-           area 0.0.0.0 {
-               interface ge-0/0/1.12 {
-                   interface-type p2p;
-               }
-               interface ge-0/0/1.13 {
-                   interface-type p2p;
-               }
-               interface lo0.1 {
-                   passive;
-               }
-           }
-       }
-       ospf3 {
-           area 0.0.0.0 {
-               interface ge-0/0/1.12 {
-                   interface-type p2p;
-               }
-               interface ge-0/0/1.13 {
-                   interface-type p2p;
-               }
-               interface lo0.1 {
-                   passive;
-               }
-           }
-       }
-   }

Final Thoughts

Although this workflow has been running smoothly for my use case, in the upcoming months I’ll start exploring napalm with YANG models to improve it further, especially for generating network configurations and verifying their operational state with a standard API and common YANG models such as the OpenConfig models.