Preferring MPLS VPN BGP Path with IGP Backup
Over the last few years, MPLS VPN services have gained popularity as an alternative to legacy TDM networks for network connectivity. One of the most common challenges in an MPLS VPN design is the Layer 3 routing interaction between the customer network and the service provider. A common scenario has a primary BGP path over the MPLS VPN network and a redundant path over a non-MPLS VPN network: the customer router has an eBGP peering session with the MPLS VPN provider and learns routes to remote locations over it, but also has a backup path to those same locations learned through an IGP such as EIGRP or OSPF.
This TAC Tip describes how to configure routing so that the preferred path is selected both when the primary path fails and when it recovers. With the default configuration, failover to the backup IGP path typically works. The problem appears when the primary path recovers.
When an IGP route (OSPF in this example) is redistributed into BGP, BGP considers it locally generated and assigns it a weight of 32768. By default, all routes received from a BGP peer are assigned a local weight of 0. Weight is the first attribute compared during BGP path comparison. Therefore, when two paths to the same prefix are compared, the locally originated path with the higher weight wins the BGP best path selection process. Let's first walk through an example of how the problem surfaces.
Take this simple network example:
R1 is the end customer (CPE) router that has two parallel paths to reach the remote 192.168.1.0/24 subnet. One path is an OSPF learned path and the other is an eBGP learned route from the MPLS PE router over the MPLS VPN network.
When the MPLS VPN network is up, the eBGP route is installed in the routing table because it has the lower administrative distance (20 for eBGP versus 110 for OSPF).
R1#show ip bgp 192.168.1.0 255.255.255.0
BGP routing table entry for 192.168.1.0/24, version 1562
Paths: (1 available, best #1, table default)
Flag: 0x820
Not advertised to any peer
65000
172.16.56.6 from 172.16.56.6 (192.168.8.1)
Origin IGP, metric 0, localpref 100, valid, external, best
R1#show ip route 192.168.1.0
Routing entry for 192.168.1.0/24
Known via "bgp 65001", distance 20, metric 0
Tag 65000, type external
Last update from 172.16.56.6 00:08:14 ago
Routing Descriptor Blocks:
* 172.16.56.6, from 172.16.56.6, 00:08:14 ago
Route metric is 0, traffic share count is 1
AS Hops 1
Route tag 65000
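The administrative distance comparison described above can be sketched in a few lines of Python. This is only an illustrative model, not how IOS is implemented: the function name pick_rib_route and the dictionary are hypothetical, and only the two distances quoted in this example (20 for eBGP, 110 for OSPF) are included.

```python
# Default administrative distances for the two route sources in this example.
ADMIN_DISTANCE = {"ebgp": 20, "ospf": 110}


def pick_rib_route(candidates):
    """Return the route source with the lowest administrative distance."""
    return min(candidates, key=lambda source: ADMIN_DISTANCE[source])


# Both protocols offer 192.168.1.0/24 while the MPLS VPN path is up.
print(pick_rib_route(["ospf", "ebgp"]))  # ebgp
# Only OSPF offers the prefix once the eBGP route is gone.
print(pick_rib_route(["ospf"]))  # ospf
```

The lowest distance wins, which is why the eBGP route is the one shown in the routing table while the primary path is healthy.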
The OSPF learned route to 192.168.1.0/24 is there as a candidate path in the OSPF database.
R1#show ip ospf data router 3.3.3.3
OSPF Router with ID (5.5.5.5) (Process ID 1)
Router Link States (Area 0)
LS age: 225
Options: (No TOS-capability, DC)
LS Type: Router Links
Link State ID: 3.3.3.3
Advertising Router: 3.3.3.3
LS Seq Number: 8000138E
Checksum: 0x3AF3
Length: 48
Number of Links: 2
Link connected to: a Transit Network
(Link ID) Designated Router address: 172.16.35.5
(Link Data) Router Interface address: 172.16.35.3
Number of MTID metrics: 0
TOS 0 Metrics: 10
Link connected to: a Stub Network
(Link ID) Network/subnet number: 192.168.1.0
(Link Data) Network Mask: 255.255.255.0
Number of MTID metrics: 0
TOS 0 Metrics: 1
Now assume the link to the MPLS VPN network fails and we lose the eBGP route. Under this condition the OSPF backup route will be installed in the routing table. Here is the routing table debug showing the backup OSPF route going in the routing table.
RT: del 192.168.1.0 via 172.16.56.6, bgp metric [20/0]
RT: delete network route to 192.168.1.0/24
RT: updating ospf 192.168.1.0/24 (0x0) via 172.16.35.3 Et1/0
RT: add 192.168.1.0/24 via 172.16.35.3, ospf metric [110/11]
R1#show ip route 192.168.1.0
Routing entry for 192.168.1.0/24
Known via "ospf 1", distance 110, metric 11, type intra area
Redistributing via bgp 65001
Advertised by bgp 65001 match internal external 1 & 2
Last update from 172.16.35.3 on Ethernet1/0, 00:00:09 ago
Routing Descriptor Blocks:
* 172.16.35.3, from 3.3.3.3, 00:00:09 ago, via Ethernet1/0
Route metric is 11, traffic share count is 1
At this stage the routing has reconverged to the IGP backup path and traffic is flowing again. However, notice that the output above shows the route being redistributed into BGP. This is because the router redistributes OSPF into BGP so that the locally learned OSPF routes can be advertised over the MPLS VPN network.
Here is the entry in the BGP table showing it is now locally sourced with a weight of 32768.
R1#show ip bgp 192.168.1.0 255.255.255.0
BGP routing table entry for 192.168.1.0/24, version 1564
Paths: (1 available, best #1, table default)
Flag: 0x820
Not advertised to any peer
Local
172.16.35.3 from 0.0.0.0 (5.5.5.5)
Origin incomplete, metric 11, localpref 100, weight 32768, valid, sourced, best
Now let's say that the primary link to the MPLS VPN router comes back up and the eBGP session recovers such that we learn the 192.168.1.0/24 network over the eBGP session again.
BGP(0): 172.16.56.6 rcvd UPDATE w/ attr: nexthop 172.16.56.6, origin i, metric 0, path 65000
BGP(0): 172.16.56.6 rcvd 192.168.1.0/24
R1#show ip bgp 192.168.1.0 255.255.255.0
BGP routing table entry for 192.168.1.0/24, version 1564
Paths: (2 available, best #2, table default)
Flag: 0x820
Advertised to update-groups:
1
65000
172.16.56.6 from 172.16.56.6 (192.168.8.1)
Origin IGP, metric 0, localpref 100, valid, external
Local
172.16.35.3 from 0.0.0.0 (5.5.5.5)
Origin incomplete, metric 11, localpref 100, weight 32768, valid, sourced, best
Even though the AD of the eBGP path (20) is lower than that of the OSPF path (110), the eBGP learned route is not installed in the routing table. Because this prefix is in the routing table via OSPF and is being redistributed into BGP, the BGP table holds both paths and must run the Best Path Selection Algorithm before anything is offered to the routing table. Routes redistributed into BGP are considered locally originated and get a default weight of 32768. The BGP learned prefix is assigned a weight of 0 by default. Since weight is the first BGP attribute that we compare on Cisco routers, the locally sourced path with the higher weight is considered the best, and the eBGP path never gets the chance to compete on administrative distance.
R1#show ip route 192.168.1.0
Routing entry for 192.168.1.0/24
Known via "ospf 1", distance 110, metric 11, type intra area
Redistributing via bgp 65001
Advertised by bgp 65001 match internal external 1 & 2
Last update from 172.16.35.3 on Ethernet1/0, 00:03:05 ago
Routing Descriptor Blocks:
* 172.16.35.3, from 3.3.3.3, 00:03:05 ago, via Ethernet1/0
Route metric is 11, traffic share count is 1
Now the problem is that, even though the BGP link is back up and we are learning prefixes, traffic is still routing over the backup path via OSPF. To resolve this, we need to force the eBGP path to be preferred.
One common way to resolve this issue is to set the weight on routes learned from the eBGP peer higher than 32768. When the paths are compared by BGP, the path with the highest weight will be preferred and installed in the routing table.
router bgp 65001
bgp log-neighbor-changes
neighbor 172.16.56.6 remote-as 65000
!
address-family ipv4
no synchronization
redistribute ospf 1 match internal external 1 external 2
neighbor 172.16.56.6 activate
neighbor 172.16.56.6 weight 32769
no auto-summary
exit-address-family
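To see why a neighbor weight of 32769 changes the outcome, the weight step of the best path comparison can be modelled as below. This is a minimal sketch under stated assumptions: the Path class and the functions best_path and paths_for_prefix are hypothetical names, and every tie-breaker after weight is left out because weight alone decides this case.

```python
from dataclasses import dataclass

LOCAL_WEIGHT = 32768  # default weight for locally originated (redistributed) paths
PEER_WEIGHT = 0       # default weight for paths learned from a BGP peer


@dataclass
class Path:
    next_hop: str
    source: str  # "peer" or "local"
    weight: int


def best_path(paths):
    """Compare on weight only; the highest weight wins."""
    return max(paths, key=lambda p: p.weight)


def paths_for_prefix(neighbor_weight=PEER_WEIGHT):
    """The two paths R1 holds for 192.168.1.0/24 in this example."""
    return [
        Path("172.16.56.6", "peer", neighbor_weight),  # eBGP path via the PE
        Path("172.16.35.3", "local", LOCAL_WEIGHT),    # redistributed OSPF path
    ]


# Default weights: the locally sourced path wins.
print(best_path(paths_for_prefix()).next_hop)  # 172.16.35.3
# With "neighbor 172.16.56.6 weight 32769": the eBGP path wins.
print(best_path(paths_for_prefix(32769)).next_hop)  # 172.16.56.6
```

Any neighbor weight above 32768 produces the same result; 32769 is simply the smallest value that does.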
To update the weight on the received update, we must force the peer to send the update again so that we can apply the change inbound.
R1#clear ip bgp 172.16.56.6 soft in
*Feb 10 00:32:01.279: BGP(0): 172.16.56.6 rcvd UPDATE w/ attr: nexthop 172.16.56.6, origin i, metric 0, path 65000
*Feb 10 00:32:01.279: BGP(0): 172.16.56.6 rcvd 192.168.1.0/24
*Feb 10 00:32:01.291: RT: closer admin distance for 192.168.1.0, flushing 1 routes
*Feb 10 00:32:01.291: RT: add 192.168.1.0/24 via 172.16.56.6, bgp metric [20/0]
R1#show ip route 192.168.1.0
Routing entry for 192.168.1.0/24
Known via "bgp 65001", distance 20, metric 0
Tag 65000, type external
Last update from 172.16.56.6 00:01:06 ago
Routing Descriptor Blocks:
* 172.16.56.6, from 172.16.56.6, 00:01:06 ago
Route metric is 0, traffic share count is 1
AS Hops 1
Route tag 65000
R1#show ip bgp 192.168.1.0
BGP routing table entry for 192.168.1.0/24, version 1565
Paths: (1 available, best #1, table default)
Flag: 0x820
Not advertised to any peer
65000
172.16.56.6 from 172.16.56.6 (192.168.8.1)
Origin IGP, metric 0, localpref 100, weight 32769, valid, external, best
To demonstrate how this works, let's assume that the BGP route has been lost and the OSPF route is installed in the routing table.
R1#show ip route 192.168.1.0
Routing entry for 192.168.1.0/24
Known via "ospf 1", distance 110, metric 11, type intra area
Redistributing via bgp 65001
Advertised by bgp 65001 match internal external 1 & 2
Last update from 172.16.35.3 on Ethernet1/0, 00:00:08 ago
Routing Descriptor Blocks:
* 172.16.35.3, from 3.3.3.3, 00:00:08 ago, via Ethernet1/0
Route metric is 11, traffic share count is 1
R1#show ip bgp 192.168.1.0
BGP routing table entry for 192.168.1.0/24, version 1567
Paths: (1 available, best #1, table default)
Flag: 0x820
Not advertised to any peer
Local
172.16.35.3 from 0.0.0.0 (5.5.5.5)
Origin incomplete, metric 11, localpref 100, weight 32768, valid, sourced, best
Once the eBGP peer comes back up, we learn 192.168.1.0/24 over the eBGP session again. This time the eBGP path is immediately selected as the best path and installed in the routing table.
*Feb 10 00:37:33.259: BGP(0): 172.16.56.6 rcvd UPDATE w/ attr: nexthop 172.16.56.6, origin i, metric 0, path 65000
*Feb 10 00:37:33.259: BGP(0): 172.16.56.6 rcvd 192.168.1.0/24
*Feb 10 00:37:33.271: BGP(0): Revise route installing 1 of 1 routes for 192.168.1.0/24 -> 172.16.56.6(global) to main IP table
*Feb 10 00:37:33.271: RT: updating bgp 192.168.1.0/24 (0x0) via 172.16.56.6
*Feb 10 00:37:33.271: RT: closer admin distance for 192.168.1.0, flushing 1 routes
*Feb 10 00:37:33.271: RT: add 192.168.1.0/24 via 172.16.56.6, bgp metric [20/0]
R1#show ip route 192.168.1.0
Routing entry for 192.168.1.0/24
Known via "bgp 65001", distance 20, metric 0
Tag 65000, type external
Last update from 172.16.56.6 00:00:11 ago
Routing Descriptor Blocks:
* 172.16.56.6, from 172.16.56.6, 00:00:11 ago
Route metric is 0, traffic share count is 1
AS Hops 1
Route tag 65000
R1#show ip bgp 192.168.1.0
BGP routing table entry for 192.168.1.0/24, version 1568
Paths: (1 available, best #1, table default)
Flag: 0x820
Not advertised to any peer
65000
172.16.56.6 from 172.16.56.6 (192.168.8.1)
Origin IGP, metric 0, localpref 100, weight 32769, valid, external, best
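If you do not want to apply the weight to all updates received from the neighbor, you can use a route-map to change the weight for only certain updates from the peer, as in the configuration example below. Note that a route-map ends with an implicit deny, so with only the single permit clause shown, prefixes from the peer that do not match access-list 1 are filtered. Add a further permit clause (for example an empty route-map set_weight permit 20) if those prefixes should still be accepted with the default weight.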
router bgp 65001
bgp log-neighbor-changes
neighbor 172.16.56.6 remote-as 65000
!
address-family ipv4
no synchronization
redistribute ospf 1 match internal external 1 external 2
neighbor 172.16.56.6 activate
neighbor 172.16.56.6 route-map set_weight in
no auto-summary
exit-address-family
route-map set_weight permit 10
match ip address 1
set weight 32769
access-list 1 permit 192.168.1.0 0.0.0.255
When the update is received, you can check the ACL for matches.
R1#show access-list 1
Standard IP access list 1
10 permit 192.168.1.0, wildcard bits 0.0.0.255 (2 matches)
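The way the route-map selects prefixes can be sketched as follows. This is a simplified model, not IOS behavior in full: acl_permits and apply_set_weight are hypothetical names, the ACL is reduced to its single permit entry, and a return value of None stands for a prefix dropped by the implicit deny at the end of the route-map.

```python
import ipaddress


def acl_permits(address, base, wildcard):
    """Standard ACL match: bits set in the wildcard are ignored."""
    addr = int(ipaddress.IPv4Address(address))
    base_int = int(ipaddress.IPv4Address(base))
    care_bits = ~int(ipaddress.IPv4Address(wildcard)) & 0xFFFFFFFF
    return (addr & care_bits) == (base_int & care_bits)


def apply_set_weight(network):
    """Model of route-map set_weight with only the permit 10 clause."""
    if acl_permits(network, "192.168.1.0", "0.0.0.255"):
        return 32769  # set weight 32769
    return None       # no clause matched: implicit deny


print(apply_set_weight("192.168.1.0"))  # 32769
print(apply_set_weight("192.168.2.0"))  # None
```

Only the 192.168.1.0 network picks up the higher weight; any other network from the peer falls through to the end of the route-map in this model.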