BGP policy routing: DM4050 as an L3 multihomed CPE

DATACOM [2] is about to release a new Ethernet switch platform, the DM4050, as an FOA (first office application), which will ship with DmOS as its operating system. In this post, I’d like to show you guys some of its use cases as an L3 CPE multihomed with BGP [1]. Specifically, we’ll look at some of the features available for inbound and outbound BGP policy routing on DmOS 2.0.

Topology

In this topology, SW6_DM4050 provides L3 access for a specific carrier network. The DM4050 sits in a private autonomous system (AS), AS 65000, and is multihomed to an upstream ISP, AS 2, which has two routers: CSR2 running IOS-XE and MX1 running Junos. Inside AS 65000, a DM4100 acts as a route reflector, and the rest of the L3 topology is simply abstracted as an L3 cloud, which could be a flat OSPF backbone or a multi-area design.

Figure 1

Metro Ethernet L2
Typically, on carrier networks, DM4050 will probably have some metro Ethernet 10GE aggregation rings, running either EAPS or STP, since this platform has 6 x 10GE interfaces, but I won’t focus on this L2 feature set in this post.

BGP Routing Policy

There are several BGP attributes that can be used to influence how prefixes are routed, both inbound and outbound [1]. In the following sections, I’ll show some that are commonly used. Let’s assume our local AS has four publicly routable /24 prefixes, which together make up 15.0.0.0/22, and we’d like the incoming traffic for the first two prefixes to arrive through one upstream router, MX1, and the incoming traffic for the last two prefixes to arrive through the other upstream router, CSR2. Plus, let’s say the upstream AS 2 announces the prefix 2.2.2.2/32 and we’d like the outbound traffic from our local AS to this prefix to go through MX1 preferably. These flows are illustrated in Figure 2; just check out the gray and brown arrows for a visual representation.

Figure 2
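To make the prefix plan concrete, here is a minimal Python sketch that splits the 15.0.0.0/22 aggregate into its four /24 prefixes and records which upstream router each one should preferably come in through. The variable names are mine, used only for illustration; the prefixes and router names are the ones from this post.

```python
import ipaddress

# The aggregate owned by the local AS in this post.
aggregate = ipaddress.ip_network("15.0.0.0/22")

# Split it into the four /24 prefixes that are actually announced.
prefixes = list(aggregate.subnets(new_prefix=24))

# Goal stated above: first two prefixes inbound via MX1, last two via CSR2.
preferred_entry = {
    prefix: ("MX1" if index < 2 else "CSR2")
    for index, prefix in enumerate(prefixes)
}

for prefix, router in preferred_entry.items():
    print(f"{prefix} -> inbound via {router}")
```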

Let’s review some of BGP’s attributes first to understand how we’re going to set the routing policies. Afterwards, I’ll show some configuration snippets just in case you need them.

Outbound Routing Policy

  • LOCAL_PREF

You can use LOCAL_PREF to control how your AS routes traffic outbound. The higher this value, the more preferred that L3 switch/router is as the exit point from your autonomous system. This attribute is propagated inside your AS, so all iBGP peers will choose the best route based on the highest LOCAL_PREF. Since you can set this attribute per set of prefixes, you can even have multiple routers, each of them being the preferred gateway for a subset of these prefixes. Typically, you weight the LOCAL_PREF value according to the preference of your gateways based on some matching criteria.

In this case, to make sure that the prefix 2.2.2.2/32 received from AS 2 is preferably routed through MX1 (192.168.52.1) and then CSR2 (192.168.53.2) as a second gateway, we’ll set the local preference as 1000 when receiving this prefix from MX1, and local preference as 500 from CSR2.
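As a rough illustration of the rule "highest LOCAL_PREF wins", the sketch below models the two paths for 2.2.2.2/32 with the values chosen above. The `Path` class and `best_by_local_pref` function are hypothetical helpers of mine, not part of any router software, and the comparison deliberately ignores every other BGP tie-breaker.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    prefix: str
    next_hop: str
    local_pref: int


def best_by_local_pref(paths):
    """Return the path with the highest LOCAL_PREF (higher is preferred)."""
    return max(paths, key=lambda p: p.local_pref)


paths = [
    Path("2.2.2.2/32", "192.168.53.2", 500),   # learned from CSR2
    Path("2.2.2.2/32", "192.168.52.1", 1000),  # learned from MX1
]

best = best_by_local_pref(paths)
print(best.next_hop)

# If the MX1 session goes away, the CSR2 path is what remains.
fallback = best_by_local_pref([p for p in paths if p.next_hop != "192.168.52.1"])
print(fallback.next_hop)
```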

Inbound Routing Policy

Info
Keep in mind that the upstream provider’s own outbound policies take precedence over your inbound policies: the attributes you send are only hints, and a policy on their side (for example, a LOCAL_PREF they set, which is evaluated before AS_PATH length and MED) can overrule them. For your inbound policies to have the expected outcome, you have to make sure that the upstream AS doesn’t have such policies in place. Usually, upstream ISPs provide looking glass routers for you to check this out.

  • AS_PATH

It’s possible to prepend the local AS to the AS_PATH in order to try to influence how the remote AS routes towards our local AS 65000. The longer the AS_PATH, the less preferred the prefix is.

In this use case, we have four /24 prefixes, 15.0.0.0/22, and we’d like the incoming traffic for the last two prefixes, 15.0.2.0/24 and 15.0.3.0/24, to come through CSR2 preferably. To accomplish this, I’ll prepend the local AS three times when announcing these last two prefixes to CSR2 and five times when announcing them to MX1.
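The effect of the two prepend values can be sketched in a few lines of Python. The `advertised_path` helper is hypothetical; it models the AS_PATH as containing the local AS three or five times in total, which is what the upstream routers display in the verification output of this post. Only path length is compared here.

```python
LOCAL_AS = 65000


def advertised_path(local_as, count):
    """AS_PATH as seen by the upstream: the local AS repeated `count` times.

    Assumption: `count` is the total number of occurrences, matching the
    'show' outputs in this post.
    """
    return [local_as] * count


# AS_PATH announced for 15.0.2.0/24 and 15.0.3.0/24 on each eBGP session.
announcements = {
    "CSR2": advertised_path(LOCAL_AS, 3),
    "MX1": advertised_path(LOCAL_AS, 5),
}

# The upstream AS prefers the entry point with the shortest AS_PATH.
preferred_entry = min(announcements, key=lambda router: len(announcements[router]))
print(preferred_entry, announcements[preferred_entry])
```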

  • MED (aka MULTI_EXIT_DISC)

MED is used when you have multiple BGP sessions to the same remote AS and you want to influence which of them that AS uses to reach you. In short, MED is less powerful than AS_PATH because AS_PATH length is evaluated before MED in the best-path selection algorithm. Plus, MED is non-transitive: an AS that receives it does not pass it on to other ASes. The lower the MED value, the better. In this example, let’s use MED to tell AS 2 that we’d like the incoming traffic for the first two prefixes, 15.0.0.0/24 and 15.0.1.0/24, to come through MX1 preferably. We’ll set MED to 20000 for these two prefixes when exporting them to CSR2 and to 15000 when exporting them to MX1.
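Here is a small sketch of the MED comparison from AS 2’s point of view, using the values above. The dictionary layout and the `best_by_med` function are illustrative assumptions of mine; the check on the neighboring AS reflects the common default that MED is only compared between paths received from the same AS.

```python
def best_by_med(paths):
    """Return the path with the lowest MED (lower is preferred).

    Refuses to compare paths from different neighboring ASes, since MED is
    normally only meaningful between paths from the same AS.
    """
    neighbor_ases = {p["as_path"][0] for p in paths}
    if len(neighbor_ases) != 1:
        raise ValueError("MED is only compared between paths from the same AS")
    return min(paths, key=lambda p: p["med"])


# Paths for 15.0.0.0/24 as AS 2 receives them on its two sessions.
paths = [
    {"entry_router": "CSR2", "as_path": [65000], "med": 20000},
    {"entry_router": "MX1", "as_path": [65000], "med": 15000},
]

best = best_by_med(paths)
print(best["entry_router"])
```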

Configuration

The following configuration snippets show the configuration that is relevant to BGP.

DM4050

SW6-DM4050(config-bgp-65000)# show
router bgp 65000
router-id 192.6.6.6
prefix-list FIRST_TWO_NETWORKS
seq 5
address-family ipv4 unicast
address 15.0.0.0/23
le 24
exit-address-family
!
!
!
prefix-list LAST_TWO_NETWORKS
seq 10
address-family ipv4 unicast
address 15.0.2.0/23
le 24
exit-address-family
!
!
!
prefix-list UPSTREAM_PREFIX
seq 5
address-family ipv4 unicast
address 2.2.2.2/32
exit-address-family
!
!
!
route-map MX_EXPORT 10
set-prepend-local-as 5
match-ip nlri prefix-list LAST_TWO_NETWORKS
!
route-map MX_EXPORT 20
set-med 15000
match-ip nlri prefix-list FIRST_TWO_NETWORKS
!
route-map MX_IMPORT 10
set-local-preference 1000
match-ip nlri prefix-list UPSTREAM_PREFIX
!
route-map CSR_EXPORT 10
set-prepend-local-as 3
match-ip nlri prefix-list LAST_TWO_NETWORKS
!
route-map CSR_EXPORT 20
set-med 20000
match-ip nlri prefix-list FIRST_TWO_NETWORKS
!
route-map CSR_IMPORT 10
set-local-preference 500
match-ip nlri prefix-list UPSTREAM_PREFIX
!
route-policy NEIGHBOR_MX
import-route-map MX_IMPORT
export-route-map MX_EXPORT
!
route-policy NEIGHBOR_CSR
import-route-map CSR_IMPORT
export-route-map CSR_EXPORT
!
neighbor 192.168.52.1
route-policy NEIGHBOR_MX
remote-as 2
password hls:4070488623:33GXlDMO&<=P
ebgp-multihop 1
!
neighbor 192.168.53.2
route-policy NEIGHBOR_CSR
remote-as 2
password hls:4070488623:33GXlDMO&<=P
ebgp-multihop 1
!
neighbor 192.4.4.4
update-source-address 192.6.6.6
remote-as 65000
ebgp-multihop 255
next-hop-self
!
!
SW6-DM4050(config-bgp-65000)#
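The prefix lists above use a /23 address with "le 24" to catch both /24 prefixes in each half of the /22. The sketch below shows that matching logic in Python, assuming DmOS follows the usual prefix-list semantics (the route must fall inside the configured address and its length must not exceed the "le" value). The `matches` function is a hypothetical helper, not a DmOS API.

```python
import ipaddress


def matches(route, entry, le=None):
    """Check a route against one prefix-list entry.

    Without `le`, only an exact-length match is accepted; with `le`, any
    prefix inside `entry` with length up to `le` matches. This is an
    assumption about how the platform evaluates the entry.
    """
    route_net = ipaddress.ip_network(route)
    entry_net = ipaddress.ip_network(entry)
    max_len = entry_net.prefixlen if le is None else le
    return route_net.subnet_of(entry_net) and route_net.prefixlen <= max_len


# FIRST_TWO_NETWORKS: address 15.0.0.0/23, le 24
print(matches("15.0.0.0/24", "15.0.0.0/23", le=24))  # inside the first /23
print(matches("15.0.2.0/24", "15.0.0.0/23", le=24))  # belongs to the second /23

# UPSTREAM_PREFIX: address 2.2.2.2/32, no le
print(matches("2.2.2.2/32", "2.2.2.2/32"))
```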

MX1

root@MX1# show protocols bgp
group EXT_AS65000 {
authentication-key "$9$zu95FnCtpBhyKP5F/tOcSNdb"; ## SECRET-DATA
export EXPORT_BGP;
peer-as 65000;
neighbor 192.168.52.6;
}
group INT_AS2 {
export NEXT_HOP_SELF;
neighbor 1.2.100.2 {
peer-as 2;
}
}
root@MX1# show routing-options
autonomous-system 2;
root@MX1# show policy-options
prefix-list LOCAL {
2.2.2.2/32;
}
policy-statement EXPORT_BGP {
from {
prefix-list LOCAL;
}
then accept;
}
policy-statement NEXT_HOP_SELF {
then {
next-hop self;
}
}

CSR2

router bgp 2
bgp log-neighbor-changes
network 2.2.2.2 mask 255.255.255.255
neighbor 1.2.100.1 remote-as 2
neighbor 1.2.100.1 next-hop-self
neighbor 192.168.53.6 remote-as 65000
neighbor 192.168.53.6 password 7 110D181116110401
CSR2#

Verification

Outbound Routing Policy

The prefix 2.2.2.2/32 was learned from both MX1 (192.168.52.1) and CSR2 (192.168.53.2), and the route entry selected as best is the one from MX1, since LOCAL_PREF 1000 > 500. As a result, traffic leaving from our local AS will route towards MX1.

SW6-DM4050# show ip bgp prefixes
Status codes: s suppressed; d damped; h history; * valid; > best; i - internal;
S Stale;
Origin codes: i - IGP; e - EGP; ? - incomplete;
Network Next Hop Metric LocPrf Weight Learned from Path
*>i 15.0.0.0/24 192.168.51.50 0 10 0 192.4.4.4 i
*>i 15.0.1.0/24 192.168.51.50 0 10 0 192.4.4.4 i
*>i 15.0.2.0/24 192.168.51.50 0 10 0 192.4.4.4 i
*>i 15.0.3.0/24 192.168.51.50 0 10 0 192.4.4.4 i
* 2.2.2.2/32 192.168.53.2 0 500 0 192.168.53.2 2 i
*> 2.2.2.2/32 192.168.52.1 0 1000 0 192.168.52.1 2 i
SW6-DM4050#
SW6-DM4050# show ip route | i ^B
B 2.2.2.2/32 192.168.52.1 00:12:25 20 0 l3-vlan 4052
B 15.0.0.0/24 4.6.9.4 00:12:11 200 0 l3-vlan 4009
B 15.0.1.0/24 4.6.9.4 00:12:11 200 0 l3-vlan 4009
B 15.0.2.0/24 4.6.9.4 00:12:11 200 0 l3-vlan 4009
B 15.0.3.0/24 4.6.9.4 00:12:11 200 0 l3-vlan 4009
SW6-DM4050#

Verifying on SW4_DM4100, from an iBGP perspective:

SW4_DM4100#show ip bgp
BGP Routing with Router ID: 192.4.4.4
Admin status: up, Operational status: up
Local AS number: 65000
Displaying IPv4 routes:
Status codes: s suppressed, d damped, h history, * valid, > best
Origin codes: i IGP, e EGP, ? incomplete
Network Next Hop Metric LocPrf Weight Path
------------------ --------------- ---------- ---------- ---------- ----------
*> 2.2.2.2/32 192.6.6.6 0 1000 0 2 i
*> 15.0.0.0/24 192.168.51.50 0 10 0 i
*> 15.0.1.0/24 192.168.51.50 0 10 0 i
*> 15.0.2.0/24 192.168.51.50 0 10 0 i
*> 15.0.3.0/24 192.168.51.50 0 10 0 i
Total number of prefixes 5
SW4_DM4100#

Inbound Routing Policy

From CSR2's point of view, we can confirm that 15.0.0.0/24 and 15.0.1.0/24 are routed inside AS 2 through MX1, since MED 15000 < 20000. The next hop 1.2.100.1 is reachable through an internal network inside AS 2. In addition, both 15.0.2.0/24 and 15.0.3.0/24 are routed through CSR2, since the AS_PATH 65000 x 3 is shorter than 65000 x 5.

CSR2#show ip bgp
BGP table version is 28, local router ID is 192.100.100.100
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
r RIB-failure, S Stale, m multipath, b backup-path, f RT-Filter,
x best-external, a additional-path, c RIB-compressed,
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
Network Next Hop Metric LocPrf Weight Path
*> 2.2.2.2/32 0.0.0.0 0 32768 i
*>i 15.0.0.0/24 1.2.100.1 15000 100 0 65000 i
* 192.168.53.6 20000 0 65000 i
*>i 15.0.1.0/24 1.2.100.1 15000 100 0 65000 i
* 192.168.53.6 20000 0 65000 i
*> 15.0.2.0/24 192.168.53.6 0 65000 65000 65000 i
*> 15.0.3.0/24 192.168.53.6 0 65000 65000 65000 i
CSR2#
CSR2#show ip route bgp
Codes: L - local, C - connected, S - static, R - RIP, M - mobile, B - BGP
D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
E1 - OSPF external type 1, E2 - OSPF external type 2
i - IS-IS, su - IS-IS summary, L1 - IS-IS level-1, L2 - IS-IS level-2
ia - IS-IS inter area, * - candidate default, U - per-user static route
o - ODR, P - periodic downloaded static route, H - NHRP, l - LISP
a - application route
+ - replicated route, % - next hop override, p - overrides from PfR
Gateway of last resort is not set
15.0.0.0/24 is subnetted, 4 subnets
B 15.0.0.0 [200/15000] via 1.2.100.1, 00:16:54
B 15.0.1.0 [200/15000] via 1.2.100.1, 00:16:51
B 15.0.2.0 [20/0] via 192.168.53.6, 00:16:34
B 15.0.3.0 [20/0] via 192.168.53.6, 00:16:34
CSR2#

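From MX1's perspective, just for the sake of completeness, we can double check what we’ve seen on CSR2. Note that on the path received directly from DM4050 (192.168.52.6), the last two prefixes carry the local AS five times, so MX1 prefers the iBGP path through CSR2 (1.2.100.2), whose AS_PATH has only three entries.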

root@MX1# run show route protocol bgp
inet.0: 9 destinations, 12 routes (9 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both
2.2.2.2/32 [BGP/170] 00:27:34, MED 0, localpref 100
AS path: I, validation-state: unverified
> to 1.2.100.2 via ge-0/0/0.100
15.0.0.0/24 *[BGP/170] 00:17:55, MED 15000, localpref 100
AS path: 65000 I, validation-state: unverified
> to 192.168.52.6 via ge-0/0/0.4052
15.0.1.0/24 *[BGP/170] 00:17:52, MED 15000, localpref 100
AS path: 65000 I, validation-state: unverified
> to 192.168.52.6 via ge-0/0/0.4052
15.0.2.0/24 *[BGP/170] 00:17:35, MED 0, localpref 100
AS path: 65000 65000 65000 I, validation-state: unverified
> to 1.2.100.2 via ge-0/0/0.100
[BGP/170] 00:18:07, localpref 100
AS path: 65000 65000 65000 65000 65000 I, validation-state: unverified
> to 192.168.52.6 via ge-0/0/0.4052
15.0.3.0/24 *[BGP/170] 00:17:35, MED 0, localpref 100
AS path: 65000 65000 65000 I, validation-state: unverified
> to 1.2.100.2 via ge-0/0/0.100
[BGP/170] 00:18:07, localpref 100
AS path: 65000 65000 65000 65000 65000 I, validation-state: unverified
> to 192.168.52.6 via ge-0/0/0.4052
root@MX1# run show route 15.0.3.0/24 extensive
inet.0: 9 destinations, 12 routes (9 active, 0 holddown, 0 hidden)
15.0.3.0/24 (2 entries, 1 announced)
TSI:
KRT in-kernel 15.0.3.0/24 -> {indirect(1048574)}
*BGP Preference: 170/-101
Next hop type: Indirect
Address: 0x95d4568
Next-hop reference count: 7
Source: 1.2.100.2
Next hop type: Router, Next hop index: 562
Next hop: 1.2.100.2 via ge-0/0/0.100, selected
Session Id: 0x142
Protocol next hop: 1.2.100.2
Indirect next hop: 0x9660110 1048574 INH Session ID: 0x145
State: <Active Int Ext>
Local AS: 2 Peer AS: 2
Age: 21:48 Metric: 0 Metric2: 0
Validation State: unverified
Task: BGP_2.1.2.100.2+14169
Announcement bits (2): 0-KRT 2-Resolve tree 1
AS path: 65000 65000 65000 I
Accepted
Localpref: 100
Router ID: 192.100.100.100
Indirect next hops: 1
Protocol next hop: 1.2.100.2
Indirect next hop: 0x9660110 1048574 INH Session ID: 0x145
Indirect path forwarding next hops: 1
Next hop type: Router
Next hop: 1.2.100.2 via ge-0/0/0.100
Session Id: 0x142
1.2.100.0/24 Originating RIB: inet.0
Node path count: 1
Forwarding nexthops: 1
Next hop type: Interface
Nexthop: via ge-0/0/0.100
BGP Preference: 170/-101
Next hop type: Router, Next hop index: 552
Address: 0x95d43a0
Next-hop reference count: 8
Source: 192.168.52.6
Next hop: 192.168.52.6 via ge-0/0/0.4052, selected
Session Id: 0x141
State: <Ext>
Inactive reason: AS path
Local AS: 2 Peer AS: 65000
Age: 22:20
Validation State: unverified
Task: BGP_65000.192.168.52.6+179
AS path: 65000 65000 65000 65000 65000 I
Communities: 15:15
Accepted
Localpref: 100
Router ID: 192.6.6.6
[edit]
root@MX1#
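The selections seen in the CSR2 and MX1 outputs above can be reproduced with a heavily simplified comparison: highest LOCAL_PREF first, then shortest AS_PATH, then lowest MED. This is only a sketch of those three steps; real BGP implementations evaluate more criteria (weight, origin, eBGP vs. iBGP, and so on). The `Path` type and `best` function are hypothetical, and where the outputs show no value I assume a LOCAL_PREF of 100 and a MED of 0.

```python
from typing import NamedTuple, Tuple


class Path(NamedTuple):
    next_hop: str
    local_pref: int
    as_path: Tuple[int, ...]
    med: int


def preference_key(path):
    """Smaller key = more preferred: high LOCAL_PREF, short AS_PATH, low MED."""
    return (-path.local_pref, len(path.as_path), path.med)


def best(paths):
    return min(paths, key=preference_key)


# 15.0.0.0/24 as seen on CSR2: same LOCAL_PREF and AS_PATH length, MED decides.
csr2_paths = [
    Path("1.2.100.1", 100, (65000,), 15000),     # iBGP, via MX1
    Path("192.168.53.6", 100, (65000,), 20000),  # eBGP, LOCAL_PREF assumed 100
]

# 15.0.3.0/24 as seen on MX1: AS_PATH length decides (3 entries vs. 5).
mx1_paths = [
    Path("1.2.100.2", 100, (65000,) * 3, 0),     # iBGP, via CSR2
    Path("192.168.52.6", 100, (65000,) * 5, 0),  # eBGP, MED assumed 0
]

print(best(csr2_paths).next_hop)
print(best(mx1_paths).next_hop)
```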

References