MPLS TE fast-reroute (FRR) sub-second convergence

In this article, I’ll show how you can achieve sub-second convergence in MPLS networks with MPLS TE (RSVP) fast-reroute (FRR). Since sub-second convergence depends on hardware optimization, I’ll use DATACOM Ethernet Switches 1. In this topology, I’ll simulate a network failure in the RSVP core, first with fast-reroute enabled and then with it disabled, so we have numbers to compare the convergence time in both situations.

The MPLS network topology is illustrated in Figure 1. In summary, this is a classic MPLS core running RSVP, and I’ve configured an L2VPN Martini (VPWS) circuit between DM4001_207 (SW_207) and DM4001_194 (SW_194). The primary path of both LSPs between these two PEs goes through DM4004_196 (SW_196), and fast-reroute protection is requested downstream. Figure 1 also shows how the detour tunnels have been established to protect the primary path. Since there are multiple alternate paths, the primary path is protected entirely in both directions, i.e., SW_207 <=> SW_194.

Info
Basically, in the one-to-one mode used here, fast-reroute relies on pre-computed alternate LSPs (detour tunnels) that protect the downstream next hop (link/node) of the primary path. If you need more technical details, including the alternative facility backup method, check out RFC 4090 2.
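To make the one-to-one idea a bit more concrete, here is a minimal Python sketch of how each point of local repair (PLR) along a primary path could pick a detour towards the egress. It is only an illustration: it uses a plain hop-count breadth-first search instead of the constrained SPF a real RSVP-TE implementation would run, and the topology (including the P_A and P_B nodes) is hypothetical, not the one in Figure 1.

```python
from collections import deque

# Hypothetical topology (assumption): P_A and P_B are made-up core nodes.
TOPOLOGY = {
    "SW_207": {"SW_196", "P_A"},
    "SW_196": {"SW_207", "SW_194", "P_B"},
    "SW_194": {"SW_196", "P_A", "P_B"},
    "P_A": {"SW_207", "SW_194"},
    "P_B": {"SW_196", "SW_194"},
}


def shortest_path(graph, src, dst, banned_nodes=frozenset(), banned_links=frozenset()):
    """Hop-count shortest path via BFS, skipping banned nodes and links."""
    queue = deque([[src]])
    seen = {src}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == dst:
            return path
        for nbr in sorted(graph[node]):
            if nbr in seen or nbr in banned_nodes:
                continue
            if frozenset((node, nbr)) in banned_links:
                continue
            seen.add(nbr)
            queue.append(path + [nbr])
    return None  # no detour available


def detours(graph, primary):
    """For each PLR on the primary path, compute a detour to the egress."""
    egress = primary[-1]
    result = {}
    for i, plr in enumerate(primary[:-1]):
        next_hop = primary[i + 1]
        if next_hop == egress:
            # The egress cannot be avoided, so only protect the link to it.
            path = shortest_path(graph, plr, egress,
                                 banned_links={frozenset((plr, next_hop))})
        else:
            # Node protection: avoid the downstream next-hop node entirely.
            path = shortest_path(graph, plr, egress, banned_nodes={next_hop})
        result[plr] = path
    return result


if __name__ == "__main__":
    for plr, path in detours(TOPOLOGY, ["SW_207", "SW_196", "SW_194"]).items():
        print(plr, "->", path)
```

In this made-up topology, the head-end gets a detour that bypasses SW_196 completely, while SW_196 itself can only protect its link towards the egress.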

Topology

Figure 1

Configuration

I’ll highlight the main details of the DM4001_207 and DM4001_194 configurations related to MPLS TE (RSVP). As you can see below, mpls traffic-eng is enabled under OSPF and all L3/MPLS VLANs are running RSVP. The tunnel mpls traffic-eng fast-reroute one-to-one command is what requests fast-reroute protection. Note that VPWS VPN 1000 is associated with the RSVP LSP by the mplstype te-tunnel 1 option.

DM4001_207

! Board models in this configuration:
! Unit 1: ETH24GX H Series
!
hostname DM4001_207
!
mpls rsvp
signalling refresh interval 2000
rsvp enable
!
router ospf
router-id 10.1.73.207
network 10.1.73.207/32 area 0
network 10.1.73.124/31 area 0
network 10.1.73.116/31 area 0
network 10.1.73.122/31 area 0
log-adjacency-changes
mpls traffic-eng
!
interface vlan 1000
set-member tagged ethernet 1/18
!
interface vlan 4086
ip address 10.1.73.124/31
link-detect
ip ospf authentication message-digest
ip ospf message-digest-key 1 md5 ********
ip ospf cost 10
ip ospf network point-to-point
set-member tagged port-channel 97
rsvp enable
!
interface vlan 4087
ip address 10.1.73.116/31
link-detect
ip ospf authentication message-digest
ip ospf message-digest-key 1 md5 ********
ip ospf cost 10
ip ospf network point-to-point
set-member tagged ethernet 1/14
rsvp enable
!
interface vlan 4089
ip address 10.1.73.123/31
link-detect
ip ospf authentication message-digest
ip ospf message-digest-key 1 md5 ********
ip ospf cost 10
ip ospf network point-to-point
set-member tagged ethernet 1/16
rsvp enable
!
mpls expl-path
explicit-path identifier 1
tsp-hop 1 path-option 1 next-address ipv4 10.1.73.196 loose
!
mpls te
interface te-tunnel 1
tunnel name SW207_TO_SW194
tunnel mpls destination 10.1.73.194
tunnel mpls traffic-eng fast-reroute one-to-one
tunnel mpls traffic-eng path-option 1 explicit-path identifier 1
no shutdown
!
mpls ldp neighbor 10.1.73.194
!
mpls vpws
vpn 1000
xconnect vlan 1000 vc-type vlan
neighbor 10.1.73.194 pwid 1000 mplstype te-tunnel 1
no shutdown
!
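One detail that is easy to misread in the configuration above: VLAN 4089 uses 10.1.73.123/31, while the OSPF network statement says 10.1.73.122/31. Both refer to the same /31 subnet. The small Python check below, using only the standard ipaddress module, illustrates this. The simplifying assumption (mine, not a statement about how the switch matches network statements) is that an interface is considered covered when its subnet equals one of the configured OSPF networks.

```python
import ipaddress

# Values copied from the DM4001_207 configuration above.
OSPF_NETWORKS = [
    "10.1.73.207/32",
    "10.1.73.124/31",
    "10.1.73.116/31",
    "10.1.73.122/31",
]
INTERFACES = {
    "vlan 4086": "10.1.73.124/31",
    "vlan 4087": "10.1.73.116/31",
    "vlan 4089": "10.1.73.123/31",
}


def uncovered_interfaces(interfaces, ospf_networks):
    """Return interface names whose subnet matches no OSPF network statement."""
    nets = [ipaddress.ip_network(n) for n in ospf_networks]
    missing = []
    for name, addr in interfaces.items():
        # ip_interface keeps the host address and derives the enclosing subnet.
        subnet = ipaddress.ip_interface(addr).network
        if subnet not in nets:
            missing.append(name)
    return missing


if __name__ == "__main__":
    print(uncovered_interfaces(INTERFACES, OSPF_NETWORKS))
```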

DM4001_194

! Board models in this configuration:
! Unit 1: ETH24GX+2x10GX H Series
!
hostname DM4001_194
!
mpls rsvp
signalling refresh interval 2000
rsvp enable
!
router ospf
router-id 10.1.73.194
network 10.1.73.194/32 area 0
network 10.1.73.128/31 area 0
network 10.1.73.130/31 area 0
log-adjacency-changes
mpls traffic-eng
!
interface vlan 1000
set-member tagged ethernet 1/18
!
interface vlan 4092
ip address 10.1.73.130/31
link-detect
ip ospf authentication message-digest
ip ospf message-digest-key 1 md5 ********
ip ospf cost 10
ip ospf network point-to-point
set-member tagged ethernet 1/26
rsvp enable
!
interface vlan 4094
ip address 10.1.73.129/31
link-detect
ip ospf authentication message-digest
ip ospf message-digest-key 1 md5 ********
ip ospf cost 10
ip ospf network point-to-point
set-member tagged ethernet 1/25
rsvp enable
!
interface loopback 0
ip address 10.1.73.194/32
mpls enable
!
mpls expl-path
explicit-path identifier 1
tsp-hop 1 path-option 1 next-address ipv4 10.1.73.196 loose
!
mpls te
interface te-tunnel 1
tunnel name SW194_TO_SW207
tunnel mpls destination 10.1.73.207
tunnel mpls traffic-eng fast-reroute one-to-one
tunnel mpls traffic-eng path-option 1 explicit-path identifier 1
no shutdown
!
mpls ldp neighbor 10.1.73.207
!
mpls vpws
vpn 1000
xconnect vlan 1000 vc-type vlan
neighbor 10.1.73.207 pwid 1000 mplstype te-tunnel 1
no shutdown
!

Convergence Testing

I’ll simulate a network failure by shutting down the ethernet 2/26 interface (VLAN 4092) on DM4004_196 and see how many packets are lost between the two PEs, from the perspective of the VPWS transported over these LSPs. I’ll run two tests: the first with fast-reroute enabled in this topology, and the second without fast-reroute protection requested. To generate traffic, I’ll use Spirent TestCenter, which gives reasonable accuracy for measuring the convergence time. The generated traffic is encapsulated in this VPWS.

Info
In TestCenter, the traffic stream rate is 1000 packets per second, so each lost packet represents 1 ms of network outage.
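The conversion from lost packets to outage time can be sketched in a few lines of Python. The function names are my own, and the calculation assumes a constant-rate stream in which every lost packet falls inside a single contiguous outage window.

```python
def outage_ms(lost_packets: int, rate_pps: float) -> float:
    """Estimated outage in milliseconds for a constant-rate traffic stream."""
    if rate_pps <= 0:
        raise ValueError("rate_pps must be positive")
    return lost_packets * 1000.0 / rate_pps


def resolution_ms(rate_pps: float) -> float:
    """Smallest outage the stream can resolve: one inter-packet interval."""
    if rate_pps <= 0:
        raise ValueError("rate_pps must be positive")
    return 1000.0 / rate_pps


if __name__ == "__main__":
    print(outage_ms(1, 1000))    # one lost packet at 1000 pps
    print(resolution_ms(1000))   # measurement granularity at 1000 pps
```

A higher stream rate would give a finer measurement granularity; at 1000 pps the result cannot be more precise than 1 ms.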

Test #1 - RSVP LSPs with fast-reroute

Just to double check, let’s confirm that fast-reroute was successfully established from DM4001_207's perspective:

DM4001_207#show mpls te traffic-eng tunnels role head
Tunnel-Name          Destination      Protect. Up-If     Down-If   Adm/Oper
-------------------- ---------------- -------- --------- --------- ----------
SW207_TO_SW194       10.1.73.194      avail.             4087,31   up/up
DM4001_207#
DM4001_207#show mpls forwarding-table
Number of entries: 2 (2 FTNs)
Action Codes: FWD - Forward, PHP - Penultimate Hop Popping,
POP - Pop and L3 Lookup, PSH - Push Label, SWP - Label Swap,
DIS - Discard
Role Codes: D - Detour Tunnel, LR - Local Repair
Status Codes: A - Active, I - Inactive, P - Pending, S - Stale
--------------------+-----+---------------+---------------+-----------+---------
Prefix, Tunnel ID   | Act | Incoming      | Outgoing      | Outgoing  | Status
or Lookup Table     | ion | Label/        | Label/        | Interface | & Role
                    |     | Protocol      | Protocol      |           |
--------------------+-----+---------------+---------------+-----------+---------
SW207_TO_SW194        PSH   -               18/RSVP         VLAN 4086   I, D
SW207_TO_SW194        PSH   -               31/RSVP         VLAN 4087   A
--------------------+-----+---------------+---------------+-----------+---------
DM4001_207#

From DM4001_194's point of view:

DM4001_194#show mpls te traffic-eng tunnels role head
Tunnel-Name          Destination      Protect. Up-If     Down-If   Adm/Oper
-------------------- ---------------- -------- --------- --------- ----------
SW194_TO_SW207       10.1.73.207      avail.             4092,30   up/up
DM4001_194#
DM4001_194#show mpls forwarding-table
Number of entries: 2 (2 FTNs)
Action Codes: FWD - Forward, PHP - Penultimate Hop Popping,
POP - Pop and L3 Lookup, PSH - Push Label, SWP - Label Swap,
DIS - Discard
Role Codes: D - Detour Tunnel, LR - Local Repair
Status Codes: A - Active, I - Inactive, P - Pending, S - Stale
--------------------+-----+---------------+---------------+-----------+---------
Prefix, Tunnel ID   | Act | Incoming      | Outgoing      | Outgoing  | Status
or Lookup Table     | ion | Label/        | Label/        | Interface | & Role
                    |     | Protocol      | Protocol      |           |
--------------------+-----+---------------+---------------+-----------+---------
SW194_TO_SW207        PSH   -               18/RSVP         VLAN 4094   I, D
SW194_TO_SW207        PSH   -               30/RSVP         VLAN 4092   A
--------------------+-----+---------------+---------------+-----------+---------
DM4001_194#

On the PLR (point of local repair), DM4004_196:

DM4004_196#show mpls forwarding-table
Number of entries: 4 (4 ILMs)
Action Codes: FWD - Forward, PHP - Penultimate Hop Popping,
POP - Pop and L3 Lookup, PSH - Push Label, SWP - Label Swap,
DIS - Discard
Role Codes: D - Detour Tunnel, LR - Local Repair
Status Codes: A - Active, I - Inactive, P - Pending, S - Stale
--------------------+-----+---------------+---------------+-----------+---------
Prefix, Tunnel ID   | Act | Incoming      | Outgoing      | Outgoing  | Status
or Lookup Table     | ion | Label/        | Label/        | Interface | & Role
                    |     | Protocol      | Protocol      |           |
--------------------+-----+---------------+---------------+-----------+---------
SW207_TO_SW194        PHP   31/RSVP         ImpNull/RSVP    VLAN 4092   A
SW207_TO_SW194        SWP   31/RSVP         19/RSVP         VLAN 4084   I, LR
SW194_TO_SW207        PHP   30/RSVP         ImpNull/RSVP    VLAN 4087   A
SW194_TO_SW207        SWP   30/RSVP         16/RSVP         VLAN 4088   I, LR
--------------------+-----+---------------+---------------+-----------+---------
DM4004_196#
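If you’d rather not eyeball the table, the check "every active entry has an inactive local-repair entry next to it" can be scripted. The Python sketch below parses the four data rows shown above. It assumes the whitespace-separated layout exactly as printed in this article (name, action, incoming label, outgoing label, "VLAN n", status and role), so treat it as an illustration rather than a general parser for this CLI.

```python
# Data rows copied from the DM4004_196 forwarding table above.
PLR_ROWS = """\
SW207_TO_SW194 PHP 31/RSVP ImpNull/RSVP VLAN 4092 A
SW207_TO_SW194 SWP 31/RSVP 19/RSVP VLAN 4084 I, LR
SW194_TO_SW207 PHP 30/RSVP ImpNull/RSVP VLAN 4087 A
SW194_TO_SW207 SWP 30/RSVP 16/RSVP VLAN 4088 I, LR
"""


def parse_row(line):
    """Split one forwarding-table row into named fields."""
    tokens = line.split()
    name, action, incoming, outgoing = tokens[:4]
    interface = " ".join(tokens[4:6])                 # e.g. "VLAN 4092"
    flags = [f.strip() for f in " ".join(tokens[6:]).split(",")]
    return {
        "tunnel": name,
        "action": action,
        "in": incoming,
        "out": outgoing,
        "interface": interface,
        "status": flags[0],                           # A, I, P or S
        "roles": flags[1:],                           # D and/or LR, if any
    }


def protection_summary(text):
    """Map each tunnel to its active and backup outgoing interfaces."""
    summary = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        row = parse_row(line)
        entry = summary.setdefault(row["tunnel"], {"active": None, "backup": None})
        if row["status"] == "A":
            entry["active"] = row["interface"]
        elif row["status"] == "I" and ({"LR", "D"} & set(row["roles"])):
            entry["backup"] = row["interface"]
    return summary


if __name__ == "__main__":
    for tunnel, entry in protection_summary(PLR_ROWS).items():
        print(tunnel, entry)
```

A tunnel whose backup is still None after parsing would be one without a pre-installed local-repair entry on this node.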

We’re good to go. Shutting down ethernet 2/26 interface on DM4004_196:

DM4004_196#conf
DM4004_196(config)#interface ethernet 2/26
DM4004_196(config-if-eth-2/26)#shut
DM4004_196(config-if-eth-2/26)#
May 6 09:59:39.813592 : [Un.1-A*] <5> Interface Ethernet 2/26 changed state to down (shutdown)
May 6 09:59:39.830843 : [Un.1-A*] <5> OSPF-ADJCHG - Nbr 10.1.73.194 from FULL to DOWN
May 6 09:59:40.196955 : [Un.1-A*] <4> Unidirectional link detected or link down on port 2/26, blocking
DM4004_196(config-if-eth-2/26)#

Because of this event, both LSPs were rerouted onto their detour tunnels. As you can see in Figure 2, it resulted in 11 packets lost from DM4001_207 to DM4001_194 and 5 packets lost from DM4001_194 to DM4001_207. Averaged over both directions, (5+11)/2 = 8 packets, this represents an outage of 8 ms. Neat!

DM4001_207#
May 6 09:59:40.074704 : [Master] <5> RSVP-ADJCHG - Tunnel (1) to destination 10.1.73.194 is REROUTED
May 6 09:59:40.929968 : [Master] <5> RSVP-ADJCHG - Tunnel (1) to destination 10.1.73.194 is UP
DM4001_207#
DM4001_194#
May 6 10:00:02.899640 : [Master] <4> OAM: Dying Gasp Flag received on port 1/26
May 6 10:00:03.006165 : [Master] <5> Interface Ethernet 1/26 changed state to down
May 6 10:00:03.051333 : [Master] <5> OSPF-ADJCHG - Nbr 10.1.73.196 from FULL to DOWN
May 6 10:00:03.400464 : [Master] <5> RSVP-ADJCHG - Tunnel (1) to destination 10.1.73.207 is REROUTED
May 6 10:00:03.634964 : [Master] <4> Unidirectional link detected or link down on port 1/26, blocking
May 6 10:00:04.085325 : [Master] <5> RSVP-ADJCHG - Tunnel (1) to destination 10.1.73.207 is UP
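The log timestamps above describe control-plane events on the head-ends, not the data-plane outage measured by the traffic generator, so they shouldn’t be read as the convergence time. Note also that the clocks of the three switches don’t appear to be synchronized (DM4004_196 logs the shutdown at 09:59:39, DM4001_194 sees the link go down at 10:00:03), so only timestamps from the same device are comparable. With that caveat, here is a small Python sketch that computes the intervals between the DM4001_194 events. It assumes all lines belong to the same day and only parses the time-of-day field.

```python
from datetime import datetime

# Three log lines copied from DM4001_194 above.
LINK_DOWN = "May 6 10:00:03.006165 : [Master] <5> Interface Ethernet 1/26 changed state to down"
REROUTED = "May 6 10:00:03.400464 : [Master] <5> RSVP-ADJCHG - Tunnel (1) to destination 10.1.73.207 is REROUTED"
TUNNEL_UP = "May 6 10:00:04.085325 : [Master] <5> RSVP-ADJCHG - Tunnel (1) to destination 10.1.73.207 is UP"


def time_of(line):
    """Extract the time-of-day from a log line (assumes same-day events)."""
    stamp = line.split(" : ", 1)[0].split()[-1]   # e.g. "10:00:03.006165"
    return datetime.strptime(stamp, "%H:%M:%S.%f")


def delta_ms(earlier, later):
    """Milliseconds elapsed between two log lines from the same device."""
    return (time_of(later) - time_of(earlier)).total_seconds() * 1000.0


if __name__ == "__main__":
    print(f"link down -> REROUTED: {delta_ms(LINK_DOWN, REROUTED):.3f} ms")
    print(f"REROUTED  -> UP:       {delta_ms(REROUTED, TUNNEL_UP):.3f} ms")
```

Both intervals are in the hundreds of milliseconds, far above the 5 packets lost in that direction, which fits the idea that the switchover to the detour happens before the head-end finishes its own signalling.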

Figure 2

Test #2 - RSVP LSPs without fast-reroute

Now, I’ll disable fast-reroute for these LSPs on both PEs:

DM4001_194(config-mpls-te)#interface te-tunnel 1
DM4001_194(config-mpls-te-if-1)#shut
DM4001_194(config-mpls-te-if-1)#
May 6 10:03:34.508866 : [Master] <5> RSVP-ADJCHG - Tunnel (1) to destination 10.1.73.207 is DOWN
DM4001_194(config-mpls-te-if-1)#no tunnel mpls traffic-eng fast-reroute
DM4001_194(config-mpls-te-if-1)#no shut
DM4001_194(config-mpls-te-if-1)#
May 6 10:03:40.068765 : [Master] <5> RSVP-ADJCHG - Tunnel (1) to destination 10.1.73.207 is UP
DM4001_207(config-mpls-te)#show this
DM4001_207(config-mpls-te)#interface te-tunnel 1
DM4001_207(config-mpls-te-if-1)#shut
DM4001_207(config-mpls-te-if-1)#
May 6 10:02:24.635903 : [Master] <5> RSVP-ADJCHG - Tunnel (1) to destination 10.1.73.194 is DOWN
DM4001_207(config-mpls-te-if-1)#no tunnel mpls traffic-eng fast-reroute
DM4001_207(config-mpls-te-if-1)#
DM4001_207(config-mpls-te-if-1)#no shut
DM4001_207(config-mpls-te-if-1)#
May 6 10:02:54.084216 : [Master] <5> RSVP-ADJCHG - Tunnel (1) to destination 10.1.73.194 is UP

Let’s verify the forwarding table on DM4004_196: there are no detour tunnels, just the PHP (penultimate hop popping) MPLS entries of both primary LSPs:

DM4004_196(config-if-eth-2/26)#show mpls forwarding-table
Number of entries: 2 (2 ILMs)
Action Codes: FWD - Forward, PHP - Penultimate Hop Popping,
POP - Pop and L3 Lookup, PSH - Push Label, SWP - Label Swap,
DIS - Discard
Role Codes: D - Detour Tunnel, LR - Local Repair
Status Codes: A - Active, I - Inactive, P - Pending, S - Stale
--------------------+-----+---------------+---------------+-----------+---------
Prefix, Tunnel ID   | Act | Incoming      | Outgoing      | Outgoing  | Status
or Lookup Table     | ion | Label/        | Label/        | Interface | & Role
                    |     | Protocol      | Protocol      |           |
--------------------+-----+---------------+---------------+-----------+---------
SW207_TO_SW194        PHP   28/RSVP         ImpNull/RSVP    VLAN 4092   A
SW194_TO_SW207        PHP   27/RSVP         ImpNull/RSVP    VLAN 4087   A
--------------------+-----+---------------+---------------+-----------+---------
DM4004_196(config-if-eth-2/26)#

Now, let’s simulate the same network failure again. As shown in Figure 3, this event resulted in 4418 lost packets from DM4001_207 to DM4001_194 and 4417 lost packets from DM4001_194 to DM4001_207. On average, this equates to about 4.42 seconds of network outage.

DM4004_196(config-if-eth-2/26)#shut
DM4004_196(config-if-eth-2/26)#
May 6 10:06:03.984706 : [Un.1-A*] <5> Interface Ethernet 2/26 changed state to down (shutdown)
May 6 10:06:04.003891 : [Un.1-A*] <5> OSPF-ADJCHG - Nbr 10.1.73.194 from FULL to DOWN
DM4004_196(config-if-eth-2/26)#
May 6 10:06:04.833259 : [Un.1-A*] <4> Unidirectional link detected or link down on port 2/26, blocking
DM4004_196(config-if-eth-2/26)#
DM4004_196(config-if-eth-2/26)#

Figure 3
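To put the two tests side by side, the short Python sketch below takes the per-direction packet loss reported in both tests, averages it, and derives the improvement factor. The only inputs are the numbers from this article and the 1000 pps stream rate. The dictionary layout and function name are my own.

```python
RATE_PPS = 1000  # TestCenter stream rate used in both tests

# Lost packets per direction, as reported in Test #1 and Test #2.
LOSSES = {
    "with FRR": {"207->194": 11, "194->207": 5},
    "without FRR": {"207->194": 4418, "194->207": 4417},
}


def mean_outage_ms(losses, rate_pps=RATE_PPS):
    """Average outage across directions, in milliseconds."""
    per_direction = [lost * 1000.0 / rate_pps for lost in losses.values()]
    return sum(per_direction) / len(per_direction)


if __name__ == "__main__":
    with_frr = mean_outage_ms(LOSSES["with FRR"])
    without_frr = mean_outage_ms(LOSSES["without FRR"])
    print(f"with FRR:    {with_frr:.1f} ms")
    print(f"without FRR: {without_frr:.1f} ms")
    print(f"improvement: about {without_frr / with_frr:.0f}x")
```

Keep in mind this is a single run per scenario, so the ratio is indicative rather than statistically meaningful.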

Final Thoughts

In this particular case, fast-reroute kept the outage, from an end-to-end customer traffic perspective, to 8 ms, as opposed to about 4.42 seconds when the LSPs weren’t protected. If you ever need to improve convergence time in your MPLS core, RSVP with fast-reroute could definitely be a great solution. Plus, with fast-reroute you can also take advantage of affinity (link coloring) to get more granular control over how detour LSPs are established. Maybe I’ll address this point in a future post. Stay tuned!

References