MPLS L2VPN Martini AC with OAM LFM on Junos

In this quick article, I’m going to show you how OAM LFM (Ethernet Link Fault Management, IEEE 802.3ah) can be used on MPLS L2VPN ACs (attachment circuits) to detect point-to-point connectivity faults on Junos. OAM LFM is extremely helpful for troubleshooting AC (Layer 2) connectivity when you don’t have back-to-back physical connectivity on the PE-CE Ethernet link. This can happen, for instance, when an underlying transport technology such as DWDM sits between the two devices. To simulate this environment, I’ll run my entire topology (Figure 1) on virtual Ethernet bridges, which have the same limitation: even if you shut down one side of the link, the other endpoint still perceives the Ethernet link as up.

Topology

Figure 1

In this topology, to illustrate the problem, I’ll enable OAM LFM only on PE1’s AC and leave PE2’s AC as it is. We’ll see shortly how this is reflected in the L2VPN Martini VC status between the two PEs.

L2VPN with OAM LFM Configuration

As you can see in the configuration below, both PE1 and CE1 have OAM LFM enabled on their side of the link. In particular, I set the PDU interval to 1 second (1000 ms) and the PDU threshold to 3, so the hold time is three times the interval (3000 ms). As a result, as soon as three consecutive PDUs are missed, the link-adjacency-loss event is triggered, the link is set to down, and a syslog message is sent. So, whenever a connectivity fault occurs on this point-to-point link, the status is correctly reflected on the AC.
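
To make the timer arithmetic concrete, here’s a minimal Python sketch. It assumes the hold time is simply the PDU interval multiplied by the PDU threshold, which agrees with the 3000 ms hold time Junos reports for this configuration later in the article. The function name is my own and not part of any Junos API.

```python
def lfm_hold_time_ms(pdu_interval_ms, pdu_threshold):
    """Time without received OAM PDUs before the adjacency is declared lost.

    Assumption: hold time = PDU interval x PDU threshold.
    """
    if pdu_interval_ms <= 0 or pdu_threshold <= 0:
        raise ValueError("interval and threshold must be positive")
    return pdu_interval_ms * pdu_threshold


# Values from the PE1/CE1 configuration: pdu-interval 1000, pdu-threshold 3.
print(lfm_hold_time_ms(1000, 3))  # 3000
```

With these values, a silent failure on the link should be detected roughly three seconds after the last PDU arrives.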

PE1 and CE1

protocols {
    l2circuit {
        neighbor 2.2.2.2 {
            interface ge-0/0/3.1024 {
                virtual-circuit-id 12;
                pseudowire-status-tlv;
            }
        }
    }
    oam {
        ethernet {
            link-fault-management {
                action-profile SEND_SYSLOG_AND_SET_LINKDOWN {
                    event {
                        link-adjacency-loss;
                    }
                    action {
                        syslog;
                        link-down;
                    }
                }
                interface ge-0/0/3 {
                    apply-action-profile SEND_SYSLOG_AND_SET_LINKDOWN;
                    pdu-interval 1000;
                    link-discovery active;
                    pdu-threshold 3;
                    negotiation-options {
                        allow-remote-loopback;
                    }
                }
            }
        }
    }
}

protocols {
    oam {
        ethernet {
            link-fault-management {
                action-profile SEND_SYSLOG_AND_SET_LINKDOWN {
                    event {
                        link-adjacency-loss;
                    }
                    action {
                        syslog;
                        link-down;
                    }
                }
                interface ge-0/0/1 {
                    apply-action-profile SEND_SYSLOG_AND_SET_LINKDOWN;
                    pdu-interval 1000;
                    link-discovery active;
                    pdu-threshold 3;
                    negotiation-options {
                        allow-remote-loopback;
                    }
                }
            }
        }
    }
}
logical-systems {
    CE1 {
        interfaces {
            ge-0/0/1 {
                unit 1024 {
                    vlan-id 1024;
                    family inet {
                        address 192.168.12.1/24;
                    }
                }
            }
        }
    }
}

PE2 and CE2

protocols {
    l2circuit {
        neighbor 1.1.1.1 {
            interface ge-0/0/3.1024 {
                virtual-circuit-id 12;
                pseudowire-status-tlv;
            }
        }
    }
}

logical-systems {
    CE2 {
        interfaces {
            ge-0/0/2 {
                unit 1024 {
                    vlan-id 1024;
                    family inet {
                        address 192.168.12.2/24;
                    }
                }
            }
        }
    }
}

Verification

First, let’s look at things from PE1’s perspective and make sure we can ping from CE1 to CE2. The L2VPN circuit is operational (Up), and OAM LFM is also OK: you can tell by the discovery state Send Any and the learned peer MAC address. Plus, the ping from CE1 to CE2 succeeds.

root@PE1# run show l2circuit connections extensive
Layer-2 Circuit Connections:
Legend for connection status (St)
EI -- encapsulation invalid NP -- interface h/w not present
MM -- mtu mismatch Dn -- down
EM -- encapsulation mismatch VC-Dn -- Virtual circuit Down
CM -- control-word mismatch Up -- operational
VM -- vlan id mismatch CF -- Call admission control failure
OL -- no outgoing label IB -- TDM incompatible bitrate
NC -- intf encaps not CCC/TCC TM -- TDM misconfiguration
BK -- Backup Connection ST -- Standby Connection
CB -- rcvd cell-bundle size bad SP -- Static Pseudowire
LD -- local site signaled down RS -- remote site standby
RD -- remote site signaled down HS -- Hot-standby Connection
XX -- unknown
Legend for interface status
Up -- operational
Dn -- down
Neighbor: 2.2.2.2
Interface Type St Time last up # Up trans
ge-0/0/3.1024(vc 12) rmt Up Jun 11 20:42:30 2016 1
Remote PE: 2.2.2.2, Negotiated control-word: Yes (Null)
Incoming label: 299856, Outgoing label: 299776
Negotiated PW status TLV: Yes
local PW status code: 0x00000000, Neighbor PW status code: 0x00000000
Local interface: ge-0/0/3.1024, Status: Up, Encapsulation: VLAN
Flow Label Transmit: No, Flow Label Receive: No
Connection History:
Jun 11 20:42:30 2016 PE route changed
Jun 11 20:42:30 2016 Out lbl Update 299776
Jun 11 20:42:30 2016 In lbl Update 299856
Jun 11 20:42:30 2016 loc intf up ge-0/0/3.1024
[edit]
root@PE1# run show oam ethernet link-fault-management
Interface: ge-0/0/3
Status: Running, Discovery state: Send Any
Transmit interval: 1000ms, PDU threshold: 3 frames, Hold time: 3000ms
Peer address: 00:05:86:71:d1:01
Flags:Remote-Stable Remote-State-Valid Local-Stable 0x50
Remote entity information:
Remote MUX action: forwarding, Remote parser action: forwarding
Discovery mode: active, Unidirectional mode: unsupported
Remote loopback mode: supported, Link events: supported
Variable requests: unsupported
Application profile statistics:
Profile Name Invoked Executed
SEND_SYSLOG_AND_SET_LINKDOWN 1 1
[edit]
root@CEs:CE1> ping 192.168.12.2 rapid
PING 192.168.12.2 (192.168.12.2): 56 data bytes
!!!!!
--- 192.168.12.2 ping statistics ---
5 packets transmitted, 5 packets received, 0% packet loss
round-trip min/avg/max/stddev = 4.473/4.789/5.644/0.444 ms
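
If you want to script this check, the two fields mentioned above are enough. The sketch below is a hypothetical helper (not a Junos or PyEZ API) that assumes you have already captured the text output of show oam ethernet link-fault-management. It treats the adjacency as healthy when the discovery state is Send Any and the peer address is not all zeros.

```python
import re


def lfm_adjacency_up(output):
    """Return True if the LFM output shows a discovered, healthy peer."""
    state = re.search(r"Discovery state:\s*(.+)", output)
    peer = re.search(r"Peer address:\s*([0-9a-fA-F:]{17})", output)
    if state is None or peer is None:
        return False
    return (state.group(1).strip() == "Send Any"
            and peer.group(1) != "00:00:00:00:00:00")


# Excerpts taken from the PE1 outputs in this article.
HEALTHY = """\
Interface: ge-0/0/3
Status: Running, Discovery state: Send Any
Peer address: 00:05:86:71:d1:01
"""
FAULTY = """\
Interface: ge-0/0/3
Status: Running, Discovery state: Active Send Local
Peer address: 00:00:00:00:00:00
"""

print(lfm_adjacency_up(HEALTHY))  # True
print(lfm_adjacency_up(FAULTY))   # False
```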

Now, from PE2’s point of view the L2VPN circuit is also operational.

root@PE2# run show l2circuit connections extensive
Layer-2 Circuit Connections:
Legend for connection status (St)
EI -- encapsulation invalid NP -- interface h/w not present
MM -- mtu mismatch Dn -- down
EM -- encapsulation mismatch VC-Dn -- Virtual circuit Down
CM -- control-word mismatch Up -- operational
VM -- vlan id mismatch CF -- Call admission control failure
OL -- no outgoing label IB -- TDM incompatible bitrate
NC -- intf encaps not CCC/TCC TM -- TDM misconfiguration
BK -- Backup Connection ST -- Standby Connection
CB -- rcvd cell-bundle size bad SP -- Static Pseudowire
LD -- local site signaled down RS -- remote site standby
RD -- remote site signaled down HS -- Hot-standby Connection
XX -- unknown
Legend for interface status
Up -- operational
Dn -- down
Neighbor: 1.1.1.1
Interface Type St Time last up # Up trans
ge-0/0/3.1024(vc 12) rmt Up Jun 11 20:42:30 2016 1
Remote PE: 1.1.1.1, Negotiated control-word: Yes (Null)
Incoming label: 299776, Outgoing label: 299856
Negotiated PW status TLV: Yes
local PW status code: 0x00000000, Neighbor PW status code: 0x00000000
Local interface: ge-0/0/3.1024, Status: Up, Encapsulation: VLAN
Flow Label Transmit: No, Flow Label Receive: No
Connection History:
Jun 11 20:42:30 2016 status update timer
Jun 11 20:42:30 2016 PE route changed
Jun 11 20:42:30 2016 Out lbl Update 299856
Jun 11 20:42:30 2016 In lbl Update 299776
Jun 11 20:42:30 2016 loc intf up ge-0/0/3.1024
[edit]
root@PE2#

So far so good, everything is working as expected. Let’s simulate some network failures to analyze how OAM LFM improves the troubleshooting and the signaling status of the AC.

Simulating network failures on CEs

Shutting down CE2’s interface ge-0/0/2

Remember that we don’t have OAM LFM on the PE2-CE2 Ethernet link. Consequently, if I shut down CE2’s interface ge-0/0/2, PE2 won’t notice the event, because all of the Ethernet links in this topology are virtual. As a result, the AC still appears to be up. From the control plane’s point of view everything seems OK, as we’ll see, but the data plane doesn’t work, which makes troubleshooting more difficult.

root@CEs# set interfaces ge-0/0/2 disable
[edit]
root@CEs# show | compare
[edit interfaces ge-0/0/2]
+ disable;
[edit]
root@CEs# commit
commit complete
[edit]
root@CEs#
root@CEs:CE1> ping 192.168.12.2 rapid
PING 192.168.12.2 (192.168.12.2): 56 data bytes
.....
--- 192.168.12.2 ping statistics ---
5 packets transmitted, 0 packets received, 100% packet loss
root@CEs:CE1>

As expected, the ping fails. Let’s check the control plane status. Both PE1 and PE2 are still signaling the L2 circuit as operational, which they shouldn’t, since the AC between PE2 and CE2 is down. However, as pointed out earlier, PE2 has no way to detect this, which is exactly the gap OAM LFM fills.

root@PE1# run show l2circuit connections extensive
Layer-2 Circuit Connections:
Legend for connection status (St)
EI -- encapsulation invalid NP -- interface h/w not present
MM -- mtu mismatch Dn -- down
EM -- encapsulation mismatch VC-Dn -- Virtual circuit Down
CM -- control-word mismatch Up -- operational
VM -- vlan id mismatch CF -- Call admission control failure
OL -- no outgoing label IB -- TDM incompatible bitrate
NC -- intf encaps not CCC/TCC TM -- TDM misconfiguration
BK -- Backup Connection ST -- Standby Connection
CB -- rcvd cell-bundle size bad SP -- Static Pseudowire
LD -- local site signaled down RS -- remote site standby
RD -- remote site signaled down HS -- Hot-standby Connection
XX -- unknown
Legend for interface status
Up -- operational
Dn -- down
Neighbor: 2.2.2.2
Interface Type St Time last up # Up trans
ge-0/0/3.1024(vc 12) rmt Up Jun 11 20:42:30 2016 1
Remote PE: 2.2.2.2, Negotiated control-word: Yes (Null)
Incoming label: 299856, Outgoing label: 299776
Negotiated PW status TLV: Yes
local PW status code: 0x00000000, Neighbor PW status code: 0x00000000
Local interface: ge-0/0/3.1024, Status: Up, Encapsulation: VLAN
Flow Label Transmit: No, Flow Label Receive: No
Connection History:
Jun 11 20:42:30 2016 PE route changed
Jun 11 20:42:30 2016 Out lbl Update 299776
Jun 11 20:42:30 2016 In lbl Update 299856
Jun 11 20:42:30 2016 loc intf up ge-0/0/3.1024
[edit]
root@PE1#
root@PE2# run show l2circuit connections extensive
Layer-2 Circuit Connections:
Legend for connection status (St)
EI -- encapsulation invalid NP -- interface h/w not present
MM -- mtu mismatch Dn -- down
EM -- encapsulation mismatch VC-Dn -- Virtual circuit Down
CM -- control-word mismatch Up -- operational
VM -- vlan id mismatch CF -- Call admission control failure
OL -- no outgoing label IB -- TDM incompatible bitrate
NC -- intf encaps not CCC/TCC TM -- TDM misconfiguration
BK -- Backup Connection ST -- Standby Connection
CB -- rcvd cell-bundle size bad SP -- Static Pseudowire
LD -- local site signaled down RS -- remote site standby
RD -- remote site signaled down HS -- Hot-standby Connection
XX -- unknown
Legend for interface status
Up -- operational
Dn -- down
Neighbor: 1.1.1.1
Interface Type St Time last up # Up trans
ge-0/0/3.1024(vc 12) rmt Up Jun 11 20:42:30 2016 1
Remote PE: 1.1.1.1, Negotiated control-word: Yes (Null)
Incoming label: 299776, Outgoing label: 299856
Negotiated PW status TLV: Yes
local PW status code: 0x00000000, Neighbor PW status code: 0x00000000
Local interface: ge-0/0/3.1024, Status: Up, Encapsulation: VLAN
Flow Label Transmit: No, Flow Label Receive: No
Connection History:
Jun 11 20:42:30 2016 status update timer
Jun 11 20:42:30 2016 PE route changed
Jun 11 20:42:30 2016 Out lbl Update 299856
Jun 11 20:42:30 2016 In lbl Update 299776
Jun 11 20:42:30 2016 loc intf up ge-0/0/3.1024
[edit]
root@PE2#
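
The mismatch shown above (circuit Up on both PEs, 100% ping loss between the CEs) is the signature of this kind of silent AC fault. As a rough sketch, assuming the connection line from show l2circuit connections and the ping summary line are already available as text, you could flag it like this. All function names here are hypothetical.

```python
import re


def l2circuit_status(connection_line):
    """Extract the St column from a line such as
    'ge-0/0/3.1024(vc 12) rmt Up Jun 11 20:42:30 2016 1'."""
    m = re.match(r"\S+\(vc \d+\)\s+\S+\s+(\S+)", connection_line)
    return m.group(1) if m else None


def ping_loss_percent(summary_line):
    """Extract the packet loss percentage from a ping summary line."""
    m = re.search(r"(\d+(?:\.\d+)?)% packet loss", summary_line)
    return float(m.group(1)) if m else None


def silent_ac_fault(connection_line, summary_line):
    """True when the control plane says Up but no ping gets through."""
    return (l2circuit_status(connection_line) == "Up"
            and ping_loss_percent(summary_line) == 100.0)


print(silent_ac_fault(
    "ge-0/0/3.1024(vc 12) rmt Up Jun 11 20:42:30 2016 1",
    "5 packets transmitted, 0 packets received, 100% packet loss",
))  # True
```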

Shutting down CE1’s interface ge-0/0/1

Before I shut down CE1’s interface ge-0/0/1, I’ll roll back the previous change so that we have a working configuration once again.

root@CEs# rollback 1
load complete
[edit]
root@CEs# show | compare
[edit interfaces ge-0/0/2]
- disable;
[edit]
root@CEs# commit
commit complete
[edit]
root@CEs#
root@CEs:CE1> ping 192.168.12.2 rapid
PING 192.168.12.2 (192.168.12.2): 56 data bytes
!!!!!
--- 192.168.12.2 ping statistics ---
5 packets transmitted, 5 packets received, 0% packet loss
round-trip min/avg/max/stddev = 4.473/4.789/5.644/0.444 ms
root@CEs:CE1>

Now let’s shut down CE1’s interface and check how PE1 and PE2 perceive the status of the L2 circuit. As you can see in the snippet below, OAM LFM now correctly reflects the down state of the AC: the adjacency is lost, the peer address is cleared, and the action profile has been executed a second time. As a result, PE1 signals the L2 circuit as LD (local site signaled down) and, because of this, PE2 perceives it as RD (remote site signaled down). Cool! Thanks to OAM LFM, the troubleshooting couldn’t get any easier in this case.

root@CEs# set interfaces ge-0/0/1 disable
[edit]
root@CEs# show | compare
[edit interfaces ge-0/0/1]
+ disable;
[edit]
root@CEs# commit
commit complete
root@PE1# run show l2circuit connections extensive
Layer-2 Circuit Connections:
Legend for connection status (St)
EI -- encapsulation invalid NP -- interface h/w not present
MM -- mtu mismatch Dn -- down
EM -- encapsulation mismatch VC-Dn -- Virtual circuit Down
CM -- control-word mismatch Up -- operational
VM -- vlan id mismatch CF -- Call admission control failure
OL -- no outgoing label IB -- TDM incompatible bitrate
NC -- intf encaps not CCC/TCC TM -- TDM misconfiguration
BK -- Backup Connection ST -- Standby Connection
CB -- rcvd cell-bundle size bad SP -- Static Pseudowire
LD -- local site signaled down RS -- remote site standby
RD -- remote site signaled down HS -- Hot-standby Connection
XX -- unknown
Legend for interface status
Up -- operational
Dn -- down
Neighbor: 2.2.2.2
Interface Type St Time last up # Up trans
ge-0/0/3.1024(vc 12) rmt LD
local PW status code: 0x00000001, Neighbor PW status code: 0x00000000
[edit]
root@PE1# run show oam ethernet link-fault-management
Interface: ge-0/0/3
Status: Running, Discovery state: Active Send Local
Transmit interval: 1000ms, PDU threshold: 3 frames, Hold time: 0ms
Peer address: 00:00:00:00:00:00
Flags:0x8
Application profile statistics:
Profile Name Invoked Executed
SEND_SYSLOG_AND_SET_LINKDOWN 2 2
[edit]
root@PE1# run show log messages | match lfm
Jun 11 23:57:55 PE1 lfmd[2318]: Action Syslog on ge-0/0/3 [SEND_SYSLOG_AND_SET_LINKDOWN]: :Adjacency Lost
Jun 11 23:57:55 PE1 lfmd[2318]: LFMD_3AH_LINKDOWN: (ge-0/0/3): 802.3ah link-fault status changed to fault with reason [Adjacency lost]
Jun 11 23:58:41 PE1 mgd[3397]: UI_CMDLINE_READ_LINE: User 'root', command 'run show log messages | match lfm '
[edit]
root@PE1#
root@PE2# run show l2circuit connections extensive
Layer-2 Circuit Connections:
Legend for connection status (St)
EI -- encapsulation invalid NP -- interface h/w not present
MM -- mtu mismatch Dn -- down
EM -- encapsulation mismatch VC-Dn -- Virtual circuit Down
CM -- control-word mismatch Up -- operational
VM -- vlan id mismatch CF -- Call admission control failure
OL -- no outgoing label IB -- TDM incompatible bitrate
NC -- intf encaps not CCC/TCC TM -- TDM misconfiguration
BK -- Backup Connection ST -- Standby Connection
CB -- rcvd cell-bundle size bad SP -- Static Pseudowire
LD -- local site signaled down RS -- remote site standby
RD -- remote site signaled down HS -- Hot-standby Connection
XX -- unknown
Legend for interface status
Up -- operational
Dn -- down
Neighbor: 1.1.1.1
Interface Type St Time last up # Up trans
ge-0/0/3.1024(vc 12) rmt RD
local PW status code: 0x00000000, Neighbor PW status code: 0x00000001
[edit]
root@PE2#
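
The PW status codes in these outputs are bit fields, so a small decoder helps when reading them. This is a minimal sketch that assumes the bit assignments from the IANA pseudowire status code registry (RFC 4446); the function name is hypothetical and not something Junos provides.

```python
# Pseudowire status bits as defined in RFC 4446.
PW_STATUS_BITS = {
    0x00000001: "Pseudowire Not Forwarding",
    0x00000002: "Local Attachment Circuit (ingress) Receive Fault",
    0x00000004: "Local Attachment Circuit (egress) Transmit Fault",
    0x00000008: "Local PSN-facing PW (ingress) Receive Fault",
    0x00000010: "Local PSN-facing PW (egress) Transmit Fault",
}


def decode_pw_status(code):
    """Return the list of fault names set in a PW status code."""
    if code == 0:
        return ["Pseudowire Forwarding (no faults)"]
    return [name for bit, name in PW_STATUS_BITS.items() if code & bit]


# Local PW status code on PE1 after CE1's interface was disabled.
print(decode_pw_status(0x00000001))  # ['Pseudowire Not Forwarding']
```

In this test, PE1 reports 0x00000001 as its local code and PE2 sees the same value as the neighbor code, which lines up with the LD and RD states shown above.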

Summary

Troubleshooting can be challenging if you happen to have an underlying transport technology that masks the actual status of the AC’s Ethernet link. Fortunately, you can use OAM LFM to correctly signal the L2 connectivity status of point-to-point Ethernet links and make this process easier.

References