Cisco LISP – Multihoming

This is part of a series of posts – The Lisp Papers.

Overview

One of the key functions of the LISP implementation is baked-in multihoming and traffic engineering. EID spaces can be connected to the RLOC environment via several ETRs. The ETR responding to the querying ITR can offer a list of ETRs to be used. The LISP designers had two objectives when they were determining how this should function.

Firstly, the design should offer a preference of some ETRs over others. This allows some systems to act as primary ETRs and others as backups. This is implemented using the Priority field, with ETRs that have a lower Priority value being (counterintuitively) preferred over those with a higher value.

Secondly, the design should allow traffic distribution to be controlled. Once the priority comparison has taken place, and assuming that several systems share the lowest priority value, the weight is considered. Traffic will be distributed between those ETRs in the ratio of their weights. For example, ETR A might have a weight of 80 and ETR B might have a weight of 20. Traffic will be distributed between them in a ratio of 4:1.
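The two-step selection (priority first, then weight) can be sketched in a few lines of Python. This is an illustrative model only, not Cisco's implementation: the Rloc class and the best_rlocs and traffic_shares functions are hypothetical names, and the sketch assumes every candidate is reachable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rloc:
    address: str
    priority: int  # lower value is preferred
    weight: int    # relative share among equal-priority RLOCs


def best_rlocs(rlocs):
    """Return the RLOCs that share the lowest (most preferred) priority value."""
    if not rlocs:
        return []
    lowest = min(r.priority for r in rlocs)
    return [r for r in rlocs if r.priority == lowest]


def traffic_shares(rlocs):
    """Map each preferred RLOC address to its fraction of the traffic."""
    preferred = best_rlocs(rlocs)
    if not preferred:
        return {}
    total = sum(r.weight for r in preferred)
    if total == 0:
        # Assumption: treat all-zero weights as an equal split.
        return {r.address: 1 / len(preferred) for r in preferred}
    return {r.address: r.weight / total for r in preferred}


candidates = [
    Rloc("ETR-A", priority=1, weight=80),
    Rloc("ETR-B", priority=1, weight=20),
    Rloc("ETR-C", priority=2, weight=100),  # backup: ignored while A or B is present
]
print(traffic_shares(candidates))  # {'ETR-A': 0.8, 'ETR-B': 0.2}
```

ETR-C never receives traffic here, whatever its weight, because weights are only compared among the RLOCs that tie on the lowest priority value.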

Cisco LISP Network

In order to fully explore the design, the topology used in the previous post has been modified as follows:

In this case, the “MF” device combines Mapping Server and Mapping Resolver functionality (MF = Multi-Function). In addition, it peers with xTR3.1 and xTR3.2, which act as dual-homing xTRs for Customer C. In this topology, xTR1 and xTR2 register with MS, and xTR3.1 and xTR3.2 register with MF, for Mapping Server functionality. xTR1 and xTR2 use MR, and xTR3.1 and xTR3.2 use MF, to resolve EID to RLOC addresses. MF has GRE tunnels and BGP peers to both MS and MR. This allows the EID prefixes to be disseminated throughout the network and Mapping Requests to be delivered to the correct RLOC address.

Design thought

This is a far more realistic topology than the one considered in the previous post for several reasons. The dual xTRs offer a resilient connection to Customer C. There remains a single point of failure in MF, which could be resolved by deploying a second MF and attaching it to a separate part of the core network. The MS and MR functionality are combined, which makes sense at a superficial level. The practical reason is that separating the functions and providing resilience would require 4 systems (resilient MS and resilient MR), which would probably be overkill for all but the largest environments. Perhaps there are scaling, security or topological reasons to separate MS and MR functionality.

The Mapping Request is routed over the ALT network, as described in the previous post, to one of the RLOCs registered for a specific EID space. The xTR that this corresponds to will respond with a list of RLOCs with associated priorities and weights.

MF’s LISP configuration looks as follows:

router lisp
 site CustomerC
  authentication-key cisco
  eid-prefix 10.1.0.0/24 accept-more-specifics
  exit
 !
 ipv4 map-server
 ipv4 map-resolver
 ipv4 alt-vrf LISP
 exit

xTR3.1 and xTR3.2 have matching configurations:

router lisp
!database mapping for xTR3.1
 database-mapping 10.1.0.0/24 10.255.255.6 priority 2 weight 50
!database mapping for xTR3.2
 database-mapping 10.1.0.0/24 10.255.255.7 priority 2 weight 50
!Configuring MF's loopback as Mapping Resolver
 ipv4 itr map-resolver 10.255.255.5
 ipv4 itr
!Configuring MF's loopback as Mapping Server
 ipv4 etr map-server 10.255.255.5 key cisco
 ipv4 etr
 exit

Pitfall alert

ETRs must be configured with the database mappings for all the EID address space for which they are authoritative. In the above example, xTR3.1 is configured with database mappings for both xTR3.1 and xTR3.2. This makes sense when you consider that any ETR could respond to a mapping request and will need to respond with the address, priority and weight of every ETR RLOC associated with the requested EID address space. If you only configure xTR3.1 with the database mappings for xTR3.1, it will only respond with its own RLOC address and no load-balancing or resilience will be delivered.
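A quick way to catch this pitfall before deployment is to compare the locator sets that each ETR would advertise. The following is a minimal sketch, assuming the database-mapping lines have already been parsed into a dictionary by hand; the data layout and the inconsistent_prefixes function are hypothetical and are not part of any Cisco tooling.

```python
def inconsistent_prefixes(configs):
    """Return EID prefixes whose locator sets differ between ETRs.

    configs maps ETR name -> {EID prefix -> {RLOC: (priority, weight)}}.
    """
    seen = {}
    bad = set()
    for mappings in configs.values():
        for prefix, locators in mappings.items():
            if prefix in seen and seen[prefix] != locators:
                bad.add(prefix)
            seen.setdefault(prefix, locators)
    return sorted(bad)


both = {"10.255.255.6": (2, 50), "10.255.255.7": (2, 50)}

# Matching configurations, as in the example above.
good = {
    "xTR3.1": {"10.1.0.0/24": dict(both)},
    "xTR3.2": {"10.1.0.0/24": dict(both)},
}

# The pitfall: xTR3.1 only knows about its own RLOC.
broken = {
    "xTR3.1": {"10.1.0.0/24": {"10.255.255.6": (2, 50)}},
    "xTR3.2": {"10.1.0.0/24": dict(both)},
}

print(inconsistent_prefixes(good))    # []
print(inconsistent_prefixes(broken))  # ['10.1.0.0/24']
```

The check only compares ETRs that list the same prefix; an ETR that omits the prefix entirely would need a separate test.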

Examining the LISP EID database table on MF shows that both xTRs have registered with it:

MF#sh lisp site detail 
LISP Site Registration Information

Site name: CustomerC
Allowed configured locators: any
Allowed EID-prefixes:
  EID-prefix: 10.1.0.0/24 
    First registered:     00:47:03
    Routing table tag:    0x0
    Origin:               Configuration, accepting more specifics
    Registration errors:  
      Authentication failures:   0
      Allowed locators mismatch: 0
    ETR 10.1.3.26, last registered 00:00:41, no proxy-reply, no map-notify
                   TTL 1d00h
      Locator       Local  State      Pri/Wgt
      10.255.255.6  no     up           2/50 
      10.255.255.7  yes    up           2/50 
    ETR 10.1.3.22, last registered 00:00:53, no proxy-reply, no map-notify
                   TTL 1d00h
      Locator       Local  State      Pri/Wgt
      10.255.255.6  yes    up           2/50 
      10.255.255.7  no     up           2/50 

If Customer A wishes to communicate with Customer C, xTR1 will send a Mapping Request to MR. MR will forward this over the ALT network, via the GRE tunnel between itself and MF, to the Customer C address, as detailed in the following packet trace:

Frame 1: 126 bytes on wire (1008 bits), 126 bytes captured (1008 bits)
Ethernet II, Src: ca:00:16:79:00:8c (ca:00:16:79:00:8c), Dst: ca:03:16:79:00:1c (ca:03:16:79:00:1c)
Internet Protocol, Src: 10.255.255.0 (10.255.255.0), Dst: 10.255.255.5 (10.255.255.5)
Generic Routing Encapsulation (IP)
Internet Protocol, Src: 10.1.1.1 (10.1.1.1), Dst: 10.1.0.255 (10.1.0.255)
User Datagram Protocol, Src Port: lisp-control (4342), Dst Port: lisp-control (4342)
Locator/ID Separation Protocol
    0001 .... .... .... .... .... = Type: Map-Request (1)
    .... 0... .... .... .... .... = A bit (Authoritative): Not set
    .... .1.. .... .... .... .... = M bit (Map-Reply present): Set
    .... ..0. .... .... .... .... = P bit (Probe): Not set
    .... ...0 .... .... .... .... = S bit (Solicit-Map-Request): Not set
    .... .... 0... .... .... .... = p bit (Proxy ITR): Not set
    .... .... .0.. .... .... .... = s bit (SMR-invoked): Not set
    .... .... ..00 0000 000. .... = Reserved bits: 0x000000
    .... .... .... .... ...0 0000 = ITR-RLOC Count: 0
    Record Count: 1
    Nonce: 0x0f5c25e246fd29d7
    Source EID AFI: 1
    Source EID: 10.1.1.2 (10.1.1.2)
    ITR-RLOC 1: 10.255.255.1
        ITR-RLOC-AFI: 1
        ITR-RLOC Address: 10.255.255.1 (10.255.255.1)
    Record 1: 10.1.0.255/32
        Reserved bits: 0x00
        Prefix length: 32
        Prefix AFI: 1
        Prefix: 10.1.0.255
    Map-Reply record
        EID prefix: 10.1.1.0/24, TTL: 1440, Authoritative, No-Action
            0000 .... .... .... = Reserved: 0x0000
            .... 0000 0000 0000 = Mapping Version: 0
            Local RLOC: 10.255.255.1, Reachable, Priority/Weight: 2/50, Multicast Priority/Weight: 255/0

You will notice that the above packet trace contains a GRE header sourced by MR and destined to MF. This is because the packet capture occurred between Core and MF. The response below from xTR3.1 is not sent via the ALT (therefore no GRE header) and includes two RLOCs, a local RLOC (for xTR3.1) and an additional RLOC (xTR3.2):

Frame 2: 94 bytes on wire (752 bits), 94 bytes captured (752 bits)
Ethernet II, Src: ca:03:16:79:00:1c (ca:03:16:79:00:1c), Dst: ca:00:16:79:00:8c (ca:00:16:79:00:8c)
Internet Protocol, Src: 10.1.3.22 (10.1.3.22), Dst: 10.255.255.1 (10.255.255.1)
User Datagram Protocol, Src Port: lisp-control (4342), Dst Port: lisp-control (4342)
Locator/ID Separation Protocol
    0010 .... .... .... .... .... = Type: Map-Reply (2)
    .... 0... .... .... .... .... = P bit (Probe): Not set
    .... .0.. .... .... .... .... = E bit (Echo-Nonce locator reachability algorithm enabled): Not set
    .... ..00 0000 0000 0000 0000 = Reserved bits: 0x000000
    Record Count: 1
    Nonce: 0x0f5c25e246fd29d7
    EID prefix: 10.1.0.0/24, TTL: 1440, Authoritative, No-Action
        0000 .... .... .... = Reserved: 0x0000
        .... 0000 0000 0000 = Mapping Version: 0
        Local RLOC: 10.255.255.6, Reachable, Priority/Weight: 2/50, Multicast Priority/Weight: 255/0
        RLOC: 10.255.255.7, Reachable, Priority/Weight: 2/50, Multicast Priority/Weight: 255/0

Use the following table to decipher the traces, or for more detail, refer to the detailed diagram and configurations:

System      Type      Address
Customer A  EID       10.1.1.2
xTR1        EID       10.1.1.1
xTR1        loopback  10.255.255.1
MR          loopback  10.255.255.0
MF          loopback  10.255.255.5
xTR3.1      loopback  10.255.255.6
xTR3.1      RLOC      10.1.3.22
xTR3.2      loopback  10.255.255.7
xTR3.2      RLOC      10.1.3.26
Customer C  EID       10.1.0.255

Examining the FIB on xTR1 shows that both RLOCs are used:

xTR1#sh ip cef 10.1.0.255
10.1.0.0/24
  nexthop 10.1.0.0 Null0
xTR1#sh ip cef 10.1.0.255 det
10.1.0.0/24, epoch 0, flags subtree context, check lisp eligibility
  SC owned: LISP remote EID - locator status bits 0x00000003
  LISP remote EID: 4 packets 400 bytes fwd action encap
  LISP source path list
    nexthop 10.255.255.6 LISP0
    nexthop 10.255.255.7 LISP0
  2 IPL sources [active source]
   Dependent covered prefix type inherit cover 0.0.0.0/0
  recursive via 0.0.0.0/0
    attached to Null0

You might have observed that the interface LISP0 on xTR1 has an extra line of config.

interface LISP0
 ip load-sharing per-packet

This will share packets in a round-robin fashion between xTR3.1 and xTR3.2. This is probably not desirable, as using different paths within a stream could cause out-of-sequence packets and excessive retransmissions, which is why the default is per-destination.
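The difference between the two modes can be modelled in Python. This is a rough sketch rather than a description of how CEF actually hashes: the per_packet and per_destination functions are hypothetical, and SHA-256 is simply a stand-in for whatever hash the forwarding plane uses.

```python
import hashlib
from itertools import cycle

RLOCS = ["10.255.255.6", "10.255.255.7"]


def per_packet(rlocs):
    """Round-robin: each successive packet goes to the next RLOC."""
    return cycle(rlocs)


def per_destination(src, dst, rlocs):
    """Hash the source/destination pair so a given flow always uses one RLOC."""
    digest = hashlib.sha256(f"{src}->{dst}".encode()).digest()
    return rlocs[digest[0] % len(rlocs)]


# Per-packet: consecutive packets of the same stream alternate between xTRs.
chooser = per_packet(RLOCS)
print([next(chooser) for _ in range(4)])

# Per-destination: every packet between the same pair takes the same xTR.
print({per_destination("10.1.1.2", "10.1.0.255", RLOCS) for _ in range(4)})
```

The first line of output alternates between the two RLOCs, while the second is a set containing a single RLOC, which is what keeps a stream in order.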

Summary

In the networks of today, load-balancing and resilience are predominantly delivered by the crude tools of deaggregation and AS Path prepending. LISP offers a much more subtle and manageable toolkit in a way that doesn’t beggar thy neighbour.


LISP Papers

What was meant to be a brief investigation into LISP has turned into a rather large piece of work. The following is a summary presenting the LISP papers in the correct reading order:

Setting the scene

  1. The other big network challenge
  2. Underlying causes for an expanding Internet Routing Table
  3. Internet Routing Table – the good
  4. Internet Routing Table – the bad
  5. Internet Routing Table – the ugly
  6. Greatness and failure
  7. Identity / Location

LISP

  1. LISP
  2. LISP Data Plane
  3. Routing Tables and Economics
  4. Practical LISP – Basic Control Plane
  5. Practical LISP – Monitoring LISP
  6. Practical LISP – Separate MS / MR
  7. Cisco LISP – Multihoming across xTRs
  8. MTU, security and other practical aspects (upcoming)

Routing Tables and Economics

This is part of a series of posts – The Lisp Papers.

A comment on the previous post made some pretty astute observations about the potential problems that might affect the uptake of LISP. I will be covering off MTU, Convergence and Latency in a future blog post. Suffice to say that the smart guys at Cisco have considered them.

In spite of any number of clever technologies, the underlying message of the comment is right. LISP is clumsy. Indirection in general is clumsy. It is going to be a source of more complexity, administration and failure scenarios. It will be more difficult to explain, understand and troubleshoot.

It would be much better for everyone to be a good net citizen and maintain a small routing table. In previous posts I discussed what the Internet routing table looks like and the technical reasons for the expanding routing table size. I glossed over the underlying social / structural reason for increased routing table size.

The economics

I can advertise my entire /16 address space as /24s at a very low cost (low private cost) and everyone else has to pay (high social cost). Economics calls this a negative externality. Like other forms of pollution, routing table pollution is caused by this simple imbalance. By way of example, consider the following. If I jump into my car and clog up London’s roads, all I have to pay for is my time and the running costs of my vehicle. Every other user of the road has to pay in lost time due to the congestion. The personal cost is lower than the social cost. There are solutions for this. The London Congestion Charge was a way of imposing a Pigovian tax to bring the private cost up to a level representative of the social cost. Carbon tax is another example of making polluters pay for the social cost.
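To put a number on the /16 example: deaggregating a single /16 into /24s turns one routing table entry into 256. A short check using Python's standard ipaddress module (the 10.0.0.0/16 block is just an arbitrary illustration):

```python
import ipaddress

block = ipaddress.ip_network("10.0.0.0/16")

# Every /24 that could be advertised out of the /16.
more_specifics = list(block.subnets(new_prefix=24))

print(len(more_specifics))                    # 256
print(more_specifics[0], more_specifics[-1])  # 10.0.0.0/24 10.0.255.0/24
```

One line of configuration for the advertiser becomes 256 entries for everyone else to carry.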

However, taxing people for the number of prefixes they advertise is not very likely. Other solutions, such as regulation (over and above peering rules such as prefixes no longer than /24, etc), are unlikely. If so, the EID space is destined to grow.

In general computing, we have become used to Moore’s law rushing to the rescue. Bloated OS and apps? No problem, throw a multicore CPU and gigabytes of RAM at the problem. So, why is our domain any different? Forwarding-plane memory is not subject to Moore’s law. The cost/performance ratio of SRAM and TCAM is not advancing at anything like the rate of commodity memory.

What will work?

BGP uses a push system. Everyone knows everything (in the DFZ). LISP is more pull-like. Does everyone need to know about the 300 /24s being advertised out of Albania?

If we break the RLOC / EID overlap, we can isolate the EID explosion to the edge. The cost will therefore be borne by the “polluters”. Want to dual-home? Don’t punch a /24 size hole in the DFZ but rather contract with your LISP-capable service providers to deliver this service. We can then limit knowledge of these to specialised systems that can scale at Moore’s law, and then pull the requisite elements down as and when needed. LISP seems like a pretty good way of doing that.



Identity / Location

This is part of a series of posts – The Lisp Papers.

The study of meaning is known as semantics. One of the most fundamental concepts in networking theory is the Layer 3 address. In modern networking, this is predominantly the IPv4 address. And what is the semantics of an IP address? Asked another way, what does an IP address mean?

It has two different meanings. The first is that a device’s IP address is its identity on the network – Google’s famous public DNS server is identified by the address 8.8.8.8. You might say that the identity is a DNS name:

Name:    google-public-dns-a.google.com
Address:  8.8.8.8

but actually, it isn’t. DNS names aren’t used by the network layer, and there really isn’t anything between the application and the network to create name to address indirection. Therefore, what we have is an application resolving the DNS name to an IP address, and then establishing a connection to the IP address, and not the DNS name. This is standard (broken?) TCP sockets operation as discussed here.

The second semantic for an IP address is location. If we interrogate our AT&T route server:

route-server>sh ip bgp 8.0.0.0/8 longer-prefixes | i 8.8.8

*  8.8.8.0/24       12.123.29.249                          0 7018 15169 i

we can see that 8.8.8.0/24 is also a location, available through 12.123.29.249 and located in AS 15169. If this server moved its location, it would likely need to be re-addressed. If I connect to the Internet with one IP address at work, another at home and a third when using my mobile broadband service, my network identity (that is, IP address) changes in each case. A real-world equivalent would be if I were known as Terry at work, John at home and Simon in my car.

So, we have two different meanings for an IP address, one that is location based and another that is identity based. This effectively means that we have two separate semantics attached to an IP address. This is important because it affects the way that machines connect to the Internet. If my location changes, then, by extension, my identity has to change (as IP address is used for both). Taken a step further, we cannot have two simultaneous locations, which is exactly the requirement needed for multihoming.

How does the current Internet structure get around this? In the multihoming example, a third location is created, and this attaches to the network as an independent entity. This increases the size and complexity of the Internet’s view of the edge and greatly adds to the size of the Internet Routing Table (the extent of which is explored in Internet Routing Table – the good).

If, however, we could break the link between the Location and Identity, we could multihome (for example) without needing to advertise the same address in a third domain with independent connections. This would mean that we could create a hierarchy of information, where location is used within our core, and identity at either end.

To return to the earlier analogy, to route a packet to me, you would need to know my identity. You would send a packet to Terry. The network looks to see if Terry is available in the local location, e.g. Home. If not, the packet is routed to the edge of the core, where a location lookup is executed. Once my location, e.g. Work, has been determined, this is added to my packet, which then becomes Work:Terry. The rest of the core is not interested in the Terry part, and routes it based on the location, towards Work. Only once it reaches the edge of the core where Work connects does the focus change. Now the packet is sent to Terry within the Work domain. This is illustrated in the diagram below.
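The same walk-through can be written as a toy model in Python. This is only a sketch of the analogy, not of the LISP packet format: the MAPPINGS table and the ingress_edge and egress_edge functions are hypothetical names invented for illustration.

```python
# Hypothetical mapping database: identity -> current location.
MAPPINGS = {"Terry": "Work"}


def ingress_edge(identity, local_members, payload):
    """Deliver locally if possible; otherwise look up the location and prepend it."""
    if identity in local_members:
        return {"deliver_to": identity, "payload": payload}
    location = MAPPINGS[identity]  # the location lookup at the edge of the core
    return {"outer": location, "inner": identity, "payload": payload}


def core_next_hop(packet):
    """The core routes on the location only and never inspects the identity."""
    return packet["outer"]


def egress_edge(packet):
    """At the far edge, strip the location and deliver on identity alone."""
    return {"deliver_to": packet["inner"], "payload": packet["payload"]}


# Terry is not at Home, so the packet becomes Work:Terry for its trip across the core.
packet = ingress_edge("Terry", local_members={"John"}, payload="hello")
print(packet["outer"], packet["inner"])  # Work Terry
print(core_next_hop(packet))             # Work
print(egress_edge(packet))               # {'deliver_to': 'Terry', 'payload': 'hello'}
```

If Terry moves, only the entry in the mapping table changes; the identity that the sender uses stays the same.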


Sounds great – but wouldn’t we need to make dramatic changes to the network to deliver this? Yes, and this will be explored in the next post – the Locator/ID Separation Protocol, or LISP.

This may be a really important technology going forward, because the unaggregatable IP address growth I explored in recent posts becomes even more important once the IPv6 addressing approach (Classless Interdomain Routing and Provider Independent Address Allocations) creates a “swamp” (unaggregatable address space) that can be many orders of magnitude larger than what we faced with IPv4 – RFC4984.

As an aside – we do have an identification only address deployed in the network today. Can anyone guess? A clue – it uses a flat namespace.


The MAC address space is identity-only and doesn’t have any location-specific semantics.


Greatness and failure

Failure (and Greatness)

This is part of a series of posts – The Lisp Papers.

So, we’ve pretty much run out of IPv4 addressing, right? And we’re anticipating a new release of the (Capital I) Internet – which will definitely be the greatest upheaval in the networking world. Why is this going to happen? A very simple and flippant answer is that we don’t have enough addresses. This has created all sorts of problems and will continue to have significant impact for between 10 and 20 years. And why don’t we have enough addresses?

The question is answered by Vint Cerf, arguably the most important man in the history and development of the Internet. It’s all his fault. As he explains, a temporary fix becomes long-term infrastructure. I will definitely be using this example the next time a “tactical” fix is proposed.

Of course, this doesn’t stop him being a legend. As I like to point out, importance is directly related to how many people your mistakes affect. Any time you affect 1.6 billion people (and counting), you are important.

Greatness (and Failure?)

This week I’m going to continue on my theme of the expanding Internet Routing Table that I delved into last week (the good, bad and ugly), particularly looking at some suggested solutions and cool technologies. Wednesday’s post is going to be around the concept of Location and Identification semantic overloading (what?) and the impact that has on the routing table. I really like this, as it’s a subtle concept with a large impact. Then, Friday and next week, I’ll be diving into the terribly named Locator/ID Separation Protocol. LISP is already an acronym in widespread usage – as someone put it, we have a namespace conflict.

When I first learnt about MPLS Layer 3 VPNs, I remember thinking, “Wow, this is incredible.” It’s still early days but LISP might just be in that category. Of course, it might not, but Cisco has already released working code and I’m going to attempt to configure a number of network topologies with it.


Copyleft(ↄ) 2014. Text is available under the Creative Commons Attribution-ShareAlike license

Pattinson Consulting Limited