Segment Routing L3VPN and TE

In my previous post, Segment Routing, I described the basic concepts of SR and how it works from a data-plane and control-plane perspective. In this article I am going to focus on how SR interacts with L3VPN and MPLS TE. To do so I am going to use the network topology below with Cisco IOS-XE 16.10.

L3VPN SR

The L3VPN configuration with SR is no different from a traditional MPLS L3VPN deployment, except that LDP is not required. CSR3 is configured as the VPNv4 RR, and a basic VRF configuration is applied on each PE router.

This is the configuration on CSR1: nothing fancy, just a simple VPNv4 configuration for BGP and redistribution of connected routes into the VRF.

vrf definition A
 rd 100:100
 !
 address-family ipv4
  route-target export 100:100
  route-target import 100:100
 exit-address-family

router bgp 100
 bgp router-id 150.1.1.1
 bgp log-neighbor-changes
 no bgp default ipv4-unicast
 neighbor 150.1.1.3 remote-as 100
 neighbor 150.1.1.3 update-source Loopback0
 !
 address-family ipv4
 exit-address-family
 !
 address-family vpnv4
  neighbor 150.1.1.3 activate
  neighbor 150.1.1.3 send-community extended
  neighbor 150.1.1.3 route-reflector-client
 exit-address-family
 !        
 address-family ipv4 vrf A
  redistribute connected
 exit-address-family

interface GigabitEthernet3
 vrf forwarding A
 ip address 10.100.1.1 255.255.255.0

VRF A on CSR1 learns one subnet, 10.100.5.0/24, from CSR5 over MP-BGP.

CSR1#show ip route vrf A

Routing Table: A
Codes: L - local, C - connected, S - static, R - RIP, M - mobile, B - BGP
       D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area 
       N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
       E1 - OSPF external type 1, E2 - OSPF external type 2, m - OMP
       n - NAT, Ni - NAT inside, No - NAT outside, Nd - NAT DIA
       i - IS-IS, su - IS-IS summary, L1 - IS-IS level-1, L2 - IS-IS level-2
       ia - IS-IS inter area, * - candidate default, U - per-user static route
       H - NHRP, G - NHRP registered, g - NHRP registration summary
       o - ODR, P - periodic downloaded static route, l - LISP
       a - application route
       + - replicated route, % - next hop override, p - overrides from PfR

Gateway of last resort is not set

      10.0.0.0/8 is variably subnetted, 3 subnets, 2 masks
C        10.100.1.0/24 is directly connected, GigabitEthernet3
L        10.100.1.1/32 is directly connected, GigabitEthernet3
B        10.100.5.0/24 [200/0] via 150.1.1.5, 02:57:10

Let’s look at the MPLS forwarding table on CSR1. The SR Adj-SIDs are marked with the letter A next to the prefix: in this example 10.1.12.2 and 10.1.13.3, with labels 16 and 18 respectively. In addition, there is one VPN prefix for the local VRF, 10.100.1.0/24, with a local label of 17.

CSR1#sh mpls forwarding-table 
Local      Outgoing   Prefix           Bytes Label   Outgoing   Next Hop    
Label      Label      or Tunnel Id     Switched      interface              
16         Pop Label  10.1.12.2-A      0             Gi1        10.1.12.2   
17         No Label   10.100.1.0/24[V] 0             aggregate/A 
18         Pop Label  10.1.13.3-A      0             Gi2        10.1.13.3   
16201      Pop Label  150.1.1.2/32     0             Gi1        10.1.12.2   
16301      Pop Label  150.1.1.3/32     0             Gi2        10.1.13.3   
16401      16401      150.1.1.4/32     0             Gi1        10.1.12.2   
16501      16501      150.1.1.5/32     0             Gi2        10.1.13.3   

A  - Adjacency SID

The output above contains no information about the remote prefix 10.100.5.0/24, because show mpls forwarding-table lists only the labels that this router has allocated locally, and the VPN label for a remote prefix is allocated by the remote PE. To find out which label was assigned to this remote VPN prefix, the following commands can be used:

  • show ip cef vrf A 10.100.5.0/24 detail
  • show bgp vpnv4 unicast vrf A labels

Which of these commands to use depends on the information you are looking for. I prefer the first one, as it shows both the remote BGP VPN label and the local label for the next hop.

CSR1#sh ip cef vrf A 10.100.5.0/24 detail 
10.100.5.0/24, epoch 0, flags [rib defined all labels]
  recursive via 150.1.1.5 label 17
    nexthop 10.1.13.3 GigabitEthernet2 label 16501-(local:16501)

As you can see, there are two labels. The outer label is 16501, which is reachable via 10.1.13.3, and the inner label is 17, which was assigned by the BGP VPN process on the remote box 150.1.1.5 (CSR5).
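If you want to pull these two labels out of the CLI output programmatically, a minimal sketch might look like the following. It assumes the output has exactly the shape shown above; the function name parse_cef_labels and the regular expressions are my own illustration, not part of any Cisco tooling.

```python
import re

# Output of "show ip cef vrf A 10.100.5.0/24 detail" copied from above
CEF_OUTPUT = """\
10.100.5.0/24, epoch 0, flags [rib defined all labels]
  recursive via 150.1.1.5 label 17
    nexthop 10.1.13.3 GigabitEthernet2 label 16501-(local:16501)
"""


def parse_cef_labels(text):
    """Extract the VPN label and the transport label from CEF detail text.

    Hypothetical helper: it only understands the two-line layout shown above.
    """
    vpn = re.search(r"recursive via \S+ label (\d+)", text)
    hop = re.search(r"nexthop (\S+) (\S+) label (\d+)", text)
    if vpn is None or hop is None:
        raise ValueError("unexpected CEF output format")
    return {
        "vpn_label": int(vpn.group(1)),
        "transport_label": int(hop.group(3)),
        "next_hop": hop.group(1),
        "interface": hop.group(2),
    }


info = parse_cef_labels(CEF_OUTPUT)
# Outer (transport) label first, inner (VPN) label last
label_stack = [info["transport_label"], info["vpn_label"]]
print(label_stack)  # [16501, 17]
```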

This is a packet capture taken on the CSR1 link towards the MPLS cloud. You can see two labels here: the outer label 16501 and the inner label 17, which is also marked as the BoS (Bottom of Stack) label, indicating that it is the last label in the stack. The packet capture matches the CEF information above. Every router in the network has exactly the same label for the BGP next-hop address 150.1.1.5 and knows how to forward such a packet to this destination. That label value is derived from the OSPF database, as described in my previous post.
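To make the BoS bit more concrete, here is a small sketch of how a label stack is laid out on the wire. Each MPLS label stack entry is 4 bytes: a 20-bit label, 3 traffic-class bits, 1 bottom-of-stack bit and an 8-bit TTL. The function names and the TTL of 255 are assumptions for illustration only; a real router derives the TTL from the packet.

```python
import struct


def encode_label_stack(labels, ttl=255, tc=0):
    """Encode labels (top of stack first) as 4-byte MPLS label stack entries."""
    out = b""
    for position, label in enumerate(labels):
        if not 0 <= label < 2**20:
            raise ValueError(f"label {label} does not fit in 20 bits")
        # Only the last entry in the stack carries the BoS bit
        bos = 1 if position == len(labels) - 1 else 0
        entry = (label << 12) | (tc << 9) | (bos << 8) | ttl
        out += struct.pack("!I", entry)
    return out


def decode_label_stack(data):
    """Decode entries until the one marked bottom-of-stack."""
    entries = []
    for offset in range(0, len(data), 4):
        (entry,) = struct.unpack("!I", data[offset:offset + 4])
        entries.append({
            "label": entry >> 12,
            "tc": (entry >> 9) & 0x7,
            "bos": (entry >> 8) & 0x1,
            "ttl": entry & 0xFF,
        })
        if entries[-1]["bos"]:
            break
    return entries


wire = encode_label_stack([16501, 17])  # transport label, then VPN label
for entry in decode_label_stack(wire):
    print(entry)
```

Decoding the result shows label 16501 with bos 0 followed by label 17 with bos 1, which is the same structure the capture displays.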

L3VPN with SR is not really different from a traditional MPLS deployment with LDP as the control plane. The main difference is that OSPF, not LDP, is responsible for propagating all label information. With this approach we can reduce control-plane complexity, as LDP is not required, and troubleshooting should be fairly simple as well because there are only two protocols to worry about: MP-BGP and OSPF.

Now let’s look at the SR-TE configuration and how traffic engineering works without RSVP. I am going to use the same topology as above; the drawing below shows how the routers are physically connected and how the TE tunnel is set up.

SR-TE

By default, a trace between PC1 and PC2 uses the shortest path between the two PE routers, based on the underlying OSPF cost. The trace goes PC1-CSR1-CSR3-CSR5-PC2, as shown below.

PC1> trace 10.100.5.2  
trace to 10.100.5.2, 8 hops max, press Ctrl+C to stop
 1   10.100.1.1   2.347 ms  3.880 ms  7.776 ms
 2   10.1.13.3   4.168 ms  7.447 ms  3.365 ms
 3   10.100.5.1   3.807 ms  4.874 ms  6.448 ms
 4   *10.100.5.2   8.048 ms (ICMP type:3, code:3, Destination port unreachable)

Below is the MPLS traffic-engineering configuration I am going to add to CSR1, which enables traffic engineering and creates Tunnel 100. A few things need to be mentioned here. First, I will create an explicit path to force traffic along the path PC1-CSR1-CSR2-CSR4-CSR5-PC2. Second, as you can see, there are no RSVP commands at all. In addition, I am going to use N-SIDs (next-label) instead of next-address under the explicit-path configuration, and autoroute announce to dynamically install the appropriate routes into the routing table and point them to the tunnel interface. MPLS TE will also rely on the IGP metric.

You can mix the next-address and next-label options in an explicit path; however, there is one caveat to be aware of: you cannot use next-address after a next-label option.
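The ordering caveat can be expressed as a simple pre-check that you could run against a planned explicit path before pasting it into a router. This is only a sketch of the rule as stated above; the function validate_explicit_path and the tuple format are hypothetical and have nothing to do with how IOS-XE validates the path internally.

```python
def validate_explicit_path(hops):
    """Check the rule: no next-address hop may follow a next-label hop.

    hops is a list of (kind, value) tuples in index order, where kind is
    "next-address" or "next-label".
    """
    seen_label = False
    for index, (kind, value) in enumerate(hops, start=1):
        if kind == "next-label":
            seen_label = True
        elif kind == "next-address":
            if seen_label:
                raise ValueError(
                    f"index {index}: next-address {value} follows a next-label hop"
                )
        else:
            raise ValueError(f"index {index}: unknown hop type {kind!r}")
    return True


# The LONGER path used in this post consists of next-label hops only
longer = [("next-label", 16201), ("next-label", 16401), ("next-label", 16501)]
print(validate_explicit_path(longer))  # True
```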

mpls traffic-eng tunnels

interface range GigabitEthernet1 - 2
 mpls traffic-eng tunnels

router ospf 1
 mpls traffic-eng router-id Loopback0
 mpls traffic-eng area 0

ip explicit-path name LONGER enable
 index 1 next-label 16201
 index 2 next-label 16401
 index 3 next-label 16501

interface Tunnel100
 ip unnumbered Loopback0
 tunnel mode mpls traffic-eng
 tunnel destination 150.1.1.5
 tunnel mpls traffic-eng autoroute announce
 tunnel mpls traffic-eng priority 6 6
 tunnel mpls traffic-eng path-option 10 explicit name LONGER segment-routing
 tunnel mpls traffic-eng path-selection metric igp

I am not going to configure a second tunnel on the tail-end router CSR5; it is not required because each TE tunnel is unidirectional. Without a second tunnel on CSR5, routing will be asymmetric, which is not a concern for this example. Return traffic from CSR5 will follow the OSPF shortest path between the two PE routers rather than the SR-TE path.

In addition, the following configuration will be applied to the other routers, CSR2 through CSR5:

mpls traffic-eng tunnels

interface range GigabitEthernet1 - 2
 mpls traffic-eng tunnels

router ospf 1
 mpls traffic-eng router-id Loopback0
 mpls traffic-eng area 0

As soon as TE was enabled, interface Tunnel100 came up and was properly signaled. In the output below you can see that path option 10 is in use, that it is set up by SR, and that the path type is explicit. You can also see the ordered list of N-SIDs that a packet must traverse when routed over this interface, along with the IGP ID of each of those hops.

CSR1#sh mpls traffic-eng tunnels 

P2P TUNNELS/LSPs:

Name: CSR1_t100                           (Tunnel100) Destination: 150.1.1.5
  Status:
    Admin: up         Oper: up     Path: valid       Signalling: connected
    path option 10, (SEGMENT-ROUTING) type explicit LONGER (Basis for Setup)

  Config Parameters:
    Bandwidth: 0        kbps (Global)  Priority: 6  6   Affinity: 0x0/0xFFFF
    Metric Type: IGP (interface)
    Path Selection:
     Protection: any (default)
    Path-selection Tiebreaker:
      Global: not set   Tunnel Specific: not set   Effective: min-fill (default)
    Hop Limit: disabled [ignore: Explicit Path Option with all Strict Hops]
    Cost Limit: disabled
    Path-invalidation timeout: 10000 msec (default), Action: Tear
    AutoRoute: enabled  LockDown: disabled Loadshare: 0 [0] bw-based
    auto-bw: disabled
    Fault-OAM: disabled, Wrap-Protection: disabled, Wrap-Capable: No
  Active Path Option Parameters:
    State: explicit path option 10 is active
    BandwidthOverride: disabled  LockDown: disabled  Verbatim: disabled

  History:
    Tunnel:
      Time since created: 19 minutes, 22 seconds
      Time since path change: 1 minutes, 17 seconds
      Number of LSP IDs (Tun_Instances) used: 15
    Current LSP: [ID: 15]
      Uptime: 1 minutes, 17 seconds
    Prior LSP: [ID: 14]
      ID: path option unknown
      Removal Trigger: signalling shutdown
  Tun_Instance: 15
  Segment-Routing Path Info (ospf 1  area 0)
    Segment0[Node]: 150.1.1.2, Label: 16201
    Segment1[Node]: 150.1.1.4, Label: 16401
    Segment2[Node]: 150.1.1.5, Label: 16501

I took a packet capture on the Gi1 interface of CSR1 when traffic engineering was enabled; the output below shows some of the details encoded in the type 10 LSAs for TE. The first highlighted LSA describes CSR1 itself; the second LSA is also generated by CSR1 but describes the link between CSR1 and CSR3.

The same information shown in the packet capture can be displayed with the command sh mpls traffic-eng topology <router-id>. It shows the directly connected neighbors, their router IDs, the IP addresses configured on the interfaces, bandwidth reservation, and so on.

CSR1#sh mpl traffic-eng topology 150.1.1.1

IGP Id: 150.1.1.1, MPLS TE Id:150.1.1.1 Router Node  (ospf 1  area 0) id 10
      link[0]: Point-to-Point, Nbr IGP Id: 150.1.1.3, nbr_node_id:7, gen:30, nbr_p:80007F7F74866208
      frag_id: 2, Intf Address: 10.1.13.1, Nbr Intf Address: 10.1.13.3
      TE metric: 1, IGP metric: 1, attribute flags: 0x0
      SRLGs: None 
      physical_bw: 1000000 (kbps), max_reservable_bw_global: 0 (kbps)
      max_reservable_bw_sub: 0 (kbps)

                             Global Pool       Sub Pool
           Total Allocated   Reservable        Reservable
           BW (kbps)         BW (kbps)         BW (kbps)
           ---------------   -----------       ----------
    bw[0]:            0                0                0
    bw[1]:            0                0                0
    bw[2]:            0                0                0
    bw[3]:            0                0                0
    bw[4]:            0                0                0
    bw[5]:            0                0                0
    bw[6]:            0                0                0
    bw[7]:            0                0                0

      link[1]: Point-to-Point, Nbr IGP Id: 150.1.1.2, nbr_node_id:6, gen:30, nbr_p:80007F7F74866550
      frag_id: 1, Intf Address: 10.1.12.1, Nbr Intf Address: 10.1.12.2
      TE metric: 1, IGP metric: 1, attribute flags: 0x0
      SRLGs: None 
      physical_bw: 1000000 (kbps), max_reservable_bw_global: 0 (kbps)
      max_reservable_bw_sub: 0 (kbps)

                             Global Pool       Sub Pool
           Total Allocated   Reservable        Reservable
           BW (kbps)         BW (kbps)         BW (kbps)
           ---------------   -----------       ----------
    bw[0]:            0                0                0
    bw[1]:            0                0                0
    bw[2]:            0                0                0
    bw[3]:            0                0                0
    bw[4]:            0                0                0
    bw[5]:            0                0                0
    bw[6]:            0                0                0
    bw[7]:            0                0                0

Another useful command is sh mpls traffic-eng topology segment-routing <router-id>, which shows similar information to the command above but also displays the N-SID and the label allocated to a specific node, as well as the Adj-SID for each link. Based on this information, the head-end router imposes the corresponding MPLS label stack on outgoing packets carried over the tunnel. Each transit node along the SR-TE LSP uses the incoming top label to select the next hop, performs a pop or swap operation on the label, and forwards the packet to the next node with the remainder of the label stack. This process continues until the packet reaches its final destination. The MPLS traffic-engineering topology is populated from the OSPF database, mainly from the type 10 LSAs.

CSR1#show mpls traffic-eng topology segment-routing 150.1.1.5

IGP Id: 150.1.1.5, MPLS TE Id:150.1.1.5 Router Node  (ospf 1  area 0) id 5
Segment-Routing:
 Node-SID: 501, (16501)
 SRGB[0] - Start: 16000, Size: 8000

      link[0]: Point-to-Point, Nbr IGP Id: 150.1.1.3, nbr_node_id:3, gen:10, nbr_p:80007FA64D5F01C0
      frag_id: 2, Intf Address: 10.1.35.5, Nbr Intf Address: 10.1.35.3
      Segment-Routing Adjacency-SIDs: 1
      Adjacency-SID[0]: 16, Flags: V, L to Nbr:: IGP Id: 150.1.1.3, MPLS TE Id: 150.1.1.3
      TE metric: 1, IGP metric: 1, attribute flags: 0x0
      SRLGs: None 
      physical_bw: 1000000 (kbps), max_reservable_bw_global: 0 (kbps)
      max_reservable_bw_sub: 0 (kbps)

                             Global Pool       Sub Pool
           Total Allocated   Reservable        Reservable
           BW (kbps)         BW (kbps)         BW (kbps)
           ---------------   -----------       ----------
    bw[0]:            0                0                0
    bw[1]:            0                0                0
    bw[2]:            0                0                0
    bw[3]:            0                0                0
    bw[4]:            0                0                0
    bw[5]:            0                0                0
    bw[6]:            0                0                0
    bw[7]:            0                0                0

      link[1]: Point-to-Point, Nbr IGP Id: 150.1.1.4, nbr_node_id:4, gen:10, nbr_p:80007FA64D5F0508
      frag_id: 1, Intf Address: 10.1.45.5, Nbr Intf Address: 10.1.45.4
      Segment-Routing Adjacency-SIDs: 1
      Adjacency-SID[0]: 17, Flags: V, L to Nbr:: IGP Id: 150.1.1.4, MPLS TE Id: 150.1.1.4
      TE metric: 1, IGP metric: 1, attribute flags: 0x0
      SRLGs: None 
      physical_bw: 1000000 (kbps), max_reservable_bw_global: 0 (kbps)
      max_reservable_bw_sub: 0 (kbps)

                             Global Pool       Sub Pool
           Total Allocated   Reservable        Reservable
           BW (kbps)         BW (kbps)         BW (kbps)
           ---------------   -----------       ----------
    bw[0]:            0                0                0
    bw[1]:            0                0                0
    bw[2]:            0                0                0
    bw[3]:            0                0                0
    bw[4]:            0                0                0
    bw[5]:            0                0                0
    bw[6]:            0                0                0
    bw[7]:            0                0                0

OSPF provides TE with the topology and SR-related information, which includes the SRGB, prefix and Adj-SID information, and the links that have SR enabled in the network.

Let’s run a trace from PC1 to PC2 and see how the packet is switched.
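The relationship between the Node-SID index and the label in the output above (Node-SID: 501, (16501), with SRGB start 16000 and size 8000) is simple arithmetic: label = SRGB start + SID index. Here is a minimal sketch of that calculation; the function name is mine, and the indexes 201 and 401 for CSR2 and CSR4 are inferred from the labels 16201 and 16401, assuming every router uses the same SRGB.

```python
def sid_index_to_label(srgb_start, srgb_size, sid_index):
    """Map a prefix-SID index to an MPLS label inside the SRGB."""
    if not 0 <= sid_index < srgb_size:
        raise ValueError(f"SID index {sid_index} is outside the SRGB")
    return srgb_start + sid_index


SRGB_START, SRGB_SIZE = 16000, 8000  # values from the CSR5 output above

# Index 501 comes from the output; 201 and 401 are inferred (assumption)
for router, index in (("CSR2", 201), ("CSR4", 401), ("CSR5", 501)):
    print(router, sid_index_to_label(SRGB_START, SRGB_SIZE, index))
```

For CSR5 this gives 16000 + 501 = 16501, the label seen throughout this post.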

PC1> trace 10.100.5.2
trace to 10.100.5.2, 8 hops max, press Ctrl+C to stop
 1   10.100.1.1   2.044 ms  1.204 ms  1.773 ms
 2   10.1.12.2   13.764 ms  6.927 ms  4.736 ms
 3   10.1.24.4   2.860 ms  8.226 ms  2.980 ms
 4   10.100.5.1   3.295 ms  2.898 ms  6.953 ms
 5   *10.100.5.2   16.924 ms (ICMP type:3, code:3, Destination port unreachable)

The packet was forwarded as expected per the explicit-path configuration, PC1-CSR1-CSR2-CSR4-CSR5-PC2, even though a shorter path is available via CSR3. Before we look at the packet capture to see how the label stack is encoded in the packet, let’s check what the data plane for prefix 10.100.5.0/24 looks like on CSR1.

First, let’s check which VPN label CSR5 advertises for that prefix.

CSR1#sh bgp vpnv4 unicast vrf A labels 
   Network          Next Hop      In label/Out label
Route Distinguisher: 100:100 (A)
   10.100.1.0/24    0.0.0.0         17/nolabel(A)
   10.100.5.0/24    150.1.1.5       nolabel/18

Label 18 was assigned by CSR5 for the prefix 10.100.5.0/24, with next hop 150.1.1.5. Now let’s find out which label was assigned to the next-hop address 150.1.1.5.

CSR1# sh mpls forwarding-table 
Local      Outgoing   Prefix           Bytes Label   Outgoing   Next Hop    
Label      Label      or Tunnel Id     Switched      interface              
16    [T]  Pop Label  100/1[TE-Bind]   0             Tu100      point2point 
17         No Label   10.100.1.0/24[V] 2000          aggregate/A 
18         Pop Label  10.1.12.2-A      0             Gi1        10.1.12.2   
19         Pop Label  10.1.13.3-A      0             Gi2        10.1.13.3   
16201      Pop Label  150.1.1.2/32     0             Gi1        10.1.12.2   
16301      Pop Label  150.1.1.3/32     0             Gi2        10.1.13.3   
16401      16401      150.1.1.4/32     0             Gi1        10.1.12.2   
16501      16501      0-150.1.1.5/32-1 0             Gi2        10.1.13.3   

In the above output from CSR1 you can see that label 16501 was assigned to the remote PE next-hop address 150.1.1.5. The outgoing interface for that entry points towards CSR3; however, the trace shows that the packet was not forwarded through CSR3 at all. Let’s see what the routing table shows for 150.1.1.5 and why the packet was not forwarded to CSR3. As you can see, the routing table indicates that the outgoing interface for the next-hop address 150.1.1.5 is actually Tunnel100.

CSR1#sh ip route | in 150.1.1.5
 O        10.1.45.0/24 [110/3] via 150.1.1.5, 00:51:55, Tunnel100
 O        150.1.1.5 [110/3] via 150.1.1.5, 00:51:55, Tunnel100

Now we can check the label imposition for that interface with the command sh mpls forwarding-table labels 16 detail, where 16 is the local label for the TE tunnel interface in the MPLS forwarding table above. The output shows that traffic sent through Tunnel100 is imposed with labels 16401 and 16501, in that order, and leaves via Gi1. That is why the trace went to CSR2 rather than CSR3.

CSR1# sh mpls forwarding-table labels 16 detail 
Local      Outgoing   Prefix           Bytes Label   Outgoing   Next Hop    
Label      Label      or Tunnel Id     Switched      interface              
16         Pop Label  100/1[TE-Bind]   0             Tu100      point2point 
        MAC/Encaps=14/22, MRU=1496, Label Stack{16401 16501}, via Gi1
        5000000100005000000300008847 0401100004075000
        No output feature configured

The explicit-path configuration uses the following labels.

ip explicit-path name LONGER enable
 index 1 next-label 16201
 index 2 next-label 16401
 index 3 next-label 16501

Label 16201 is not added to the label stack because the outgoing operation for that label on CSR1 is a pop, as shown in the show mpls forwarding-table output: CSR2 is directly connected, so CSR1 performs penultimate-hop popping.

In the packet capture taken on CSR2, on the interface between CSR1 and CSR2, you can see that the incoming ICMP packet carries only three labels. The first two are for CSR4 and CSR5, with labels 16401 and 16501 respectively. The last one in the stack identifies the VPN on CSR5. As mentioned earlier, there is no point in adding label 16201 to the MPLS stack, since that label is popped on CSR1.
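The head-end behaviour can be modelled in a few lines. This is a simplified sketch, not how IOS-XE is implemented: it looks up the first segment in CSR1's forwarding table (entries copied from the output above), drops that label if the outgoing action is Pop Label, and appends the VPN label 18 learned from CSR5. The names CSR1_LFIB and impose_stack are hypothetical.

```python
POP = "Pop Label"

# CSR1 entries from the output above:
# local label -> (outgoing label, outgoing interface, next hop)
CSR1_LFIB = {
    16201: (POP, "Gi1", "10.1.12.2"),
    16301: (POP, "Gi2", "10.1.13.3"),
    16401: (16401, "Gi1", "10.1.12.2"),
    16501: (16501, "Gi2", "10.1.13.3"),
}


def impose_stack(lfib, segment_labels, service_label=None):
    """Return (label stack, interface, next hop) as sent by the head-end."""
    first, rest = segment_labels[0], list(segment_labels[1:])
    outgoing, interface, next_hop = lfib[first]
    # If the first segment resolves to a pop, its label never hits the wire
    stack = rest if outgoing == POP else [outgoing] + rest
    if service_label is not None:
        stack = stack + [service_label]  # VPN label sits at the bottom
    return stack, interface, next_hop


explicit_path = [16201, 16401, 16501]  # ip explicit-path name LONGER
stack, interface, next_hop = impose_stack(CSR1_LFIB, explicit_path, service_label=18)
print(stack, interface, next_hop)  # [16401, 16501, 18] Gi1 10.1.12.2
```

The transport part of the result matches the Label Stack{16401 16501}, via Gi1 shown for the tunnel.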

When CSR2 receives the packet, it looks in its MPLS forwarding table to find out what to do with it. The top label is 16401, so according to the forwarding table listed below the label is removed and the packet is forwarded to the next hop, 10.1.24.4 (CSR4), with the remaining label stack, whose top label is now 16501. This process repeats across the transit network until the packet reaches its final destination.

CSR2#sh mpls forwarding-table 
Local      Outgoing   Prefix           Bytes Label   Outgoing   Next Hop    
Label      Label      or Tunnel Id     Switched      interface              
16         Pop Label  10.1.12.1-A      0             Gi1        10.1.12.1   
17         Pop Label  10.1.24.4-A      0             Gi2        10.1.24.4   
16101      Pop Label  150.1.1.1/32     0             Gi1        10.1.12.1   
16301      16301      150.1.1.3/32     0             Gi1        10.1.12.1   
16401      Pop Label  150.1.1.4/32     2510          Gi2        10.1.24.4   
16501      16501      150.1.1.5/32     0             Gi2        10.1.24.4   

A  - Adjacency SID
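The lookup that CSR2 performs can be sketched the same way. The table below contains only the node-SID entries from the CSR2 output above; the forward function is a hypothetical, simplified model of a single transit hop that handles just the pop and swap cases.

```python
POP = "Pop Label"

# CSR2 entries from the output above:
# local label -> (outgoing label, outgoing interface, next hop)
CSR2_LFIB = {
    16101: (POP, "Gi1", "10.1.12.1"),
    16301: (16301, "Gi1", "10.1.12.1"),
    16401: (POP, "Gi2", "10.1.24.4"),
    16501: (16501, "Gi2", "10.1.24.4"),
}


def forward(lfib, stack):
    """Apply one transit hop: look up the top label, then pop or swap it."""
    top, rest = stack[0], list(stack[1:])
    outgoing, interface, next_hop = lfib[top]
    new_stack = rest if outgoing == POP else [outgoing] + rest
    return new_stack, interface, next_hop


incoming = [16401, 16501, 18]  # stack received from CSR1
print(forward(CSR2_LFIB, incoming))  # ([16501, 18], 'Gi2', '10.1.24.4')
```

The packet leaves CSR2 towards CSR4 with 16501 on top and the VPN label still at the bottom of the stack.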

SR-TE seems relatively straightforward to configure, at least for the basic concepts, and there is no need to worry about RSVP: how it interacts with MPLS and how labels are allocated through the PATH and RESV messages exchanged between the head-end and tail-end routers.

In the next post I will focus on some more advanced topics, such as link and node protection and auto-tunnels with SR-TE.
