LISP – Separate MS / MR

This is part of a series of posts – The Lisp Papers.

In the previous LISP post, Practical LISP – Basic Control Plane, the network topology had a single device that delivered both MS and MR functions. While this demonstrated a simple LISP network, it is not a practical deployment. In reality, any number of MS and MR devices will communicate over an Alternative Logical Topology (ALT).

An ALT is nothing more than a tunnelled overlay network that runs BGP to disseminate EID addresses. It also acts as the environment that delivers Mapping Requests or data probes. The Mapping Request / Reply action has been largely covered under the previous post. Data Probes weren’t.

Data Probes

Data Probes are used to ensure that the first frames of a flow aren’t dropped while a Mapping Request is sent and the Mapping Response is awaited. This delay is similar to a DNS lookup, with one proviso. Because DNS resolution occurs within the client, no frames are dropped: the network layer isn’t engaged until the destination address has been resolved. The LISP mapping lookup, however, doesn’t occur in the client stack, and an ITR will not buffer frames until the destination ETR(s) have been located. As such, a few packets will generally be dropped until the Mapping Response has been received.

To get around this, Data Probes can be used. This simply means using the initial packets received by the ITR as the Mapping Request. The data packet is encapsulated with a LISP header and forwarded to the MR. The LISP packet is addressed to the EID and encapsulated within a GRE header, whose destination address is the next hop specified by the BGP routing used over the ALT. The GRE packet will potentially be decapsulated and re-encapsulated multiple times as it traverses the ALT. The ALT uses BGP routing to forward the packet to the source of the EID prefix, which is the MS. Once at the MS, the LISP packet is forwarded on to the ETR, where the data packet is decapsulated and forwarded on to the destination EID.

So we have 3 different levels of source / destination addresses, which can best be understood using the following diagram:
Locator / ID Separation Protocol Data Probe
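
The three levels of addressing can also be sketched in a few lines of Python. This is only an illustration, not something taken from the captures: the `Packet` class and the `build_data_probe` and `describe` helpers are hypothetical names, and the choice of which lab address appears at each level (xTR1's RLOC as the LISP source, the MR and MS loopbacks as the GRE tunnel endpoints) is my assumption based on the address table and tunnel configuration later in this post.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Packet:
    """One level of addressing: a source, a destination and a payload."""
    layer: str
    src: str
    dst: str
    payload: Union["Packet", str]


def build_data_probe(src_eid, dst_eid, itr_rloc, tunnel_src, tunnel_dst, data):
    # Level 1: the original data packet, EID to EID.
    inner = Packet("data", src_eid, dst_eid, data)
    # Level 2: the LISP header. For a Data Probe the destination is the EID itself.
    lisp = Packet("lisp", itr_rloc, dst_eid, inner)
    # Level 3: the GRE header used on one ALT hop (rebuilt at every hop).
    return Packet("gre", tunnel_src, tunnel_dst, lisp)


def describe(packet):
    """Return one 'layer: src -> dst' string per level, outermost first."""
    lines = []
    while isinstance(packet, Packet):
        lines.append(f"{packet.layer}: {packet.src} -> {packet.dst}")
        packet = packet.payload
    return lines


# Assumed addresses: CustomerA EID, CustomerB EID, xTR1 RLOC, MR and MS loopbacks.
probe = build_data_probe("10.1.1.2", "10.1.2.255", "10.1.3.2",
                         "10.255.255.0", "10.255.255.4", "application data")
for line in describe(probe):
    print(line)
```

Only the outermost (GRE) level changes from hop to hop across the ALT; the LISP and data levels stay the same until the ETR strips the LISP header.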

Data Probes seem to be a good idea, but the ALT draft does draw attention to some concerns:

It is worth noting that there has been a great deal of discussion and controversy about whether Data Probes are a good idea. On the one hand, using them offers a method of avoiding the “first packet drop” problem when an ITR does not have a mapping for a particular EID-prefix. On the other hand, forwarding data packets on the ALT would require that it either be engineered to support relatively high traffic rates, which is not generally feasible for a tunneled network, or that it be carefully designed to aggressively rate-limit traffic to avoid congestion or DoS attacks. There may also be issues caused by different latency or other performance characteristics between the ALT path taken by an initial Data Probe and the “Internet” path taken by subsequent packets on the same flow once a mapping is in place on an ITR. For these reasons, the use of Data Probes is not recommended at this time; they should only be originated an ITR when explicitly configured to do so and such configuration should only be enabled when performing experiments intended to test the viability of using Data Probes.

Cisco provides support for data probes on the Nexus platform using the command:
ip lisp itr send-data-probe
No support for data probes seems to be available in IOS (15.1(4)XB4 or 15.1.4M).

The ALT

The ALT does not directly communicate the RLOC to EID mapping. It simply provides a low-capacity environment to deliver the initial Mapping Requests, addressed to the EID required. The Mapping Response will traverse the standard RLOC environment, with the responding ETR as the source and the requesting ITR as the destination.

To explore the configuration required to support a separated MS and MR function connecting over the ALT, the following topology will be used:
Locator / ID Separation Protocol MS MR over ALT
The configurations use the new LISP CLI format.
MR Configuration
The MR configuration now includes a GRE tunnel to the MS and a BGP peering, in the IPv4 address family of the LISP VRF, with the remote end of that tunnel. LISP is redistributed into BGP within this address family.

vrf definition LISP
 rd 1:1
 !
 address-family ipv4
 exit-address-family
!
interface Loopback0
 ip address 10.255.255.0 255.255.255.255
!
interface Tunnel0
 vrf forwarding LISP
 ip address 10.0.0.1 255.255.255.252
 tunnel source Loopback0
 tunnel destination 10.255.255.4
!
router lisp
 ipv4 map-resolver
 ipv4 alt-vrf LISP
 exit
!
router bgp 65001
 bgp log-neighbor-changes
 !
 address-family ipv4 vrf LISP
  redistribute lisp
  neighbor 10.0.0.2 remote-as 65002
  neighbor 10.0.0.2 activate
 exit-address-family

MS Configuration
The reciprocal configuration is specified on the MS.

vrf definition LISP
 rd 1:1
 !
 address-family ipv4
 exit-address-family
!
interface Loopback0
 ip address 10.255.255.4 255.255.255.255
!
interface Tunnel0
 vrf forwarding LISP
 ip address 10.0.0.2 255.255.255.252
 tunnel source Loopback0
 tunnel destination 10.255.255.0
!
router lisp
 site CustomerA
  authentication-key cisco
  eid-prefix 10.1.1.0/24 accept-more-specifics
  exit
 !
 site CustomerB
  authentication-key cisco
  eid-prefix 10.1.2.0/24 accept-more-specifics
  exit
 !
 ipv4 map-server
 ipv4 alt-vrf LISP
 exit
!
router bgp 65002
 bgp log-neighbor-changes
 !
 address-family ipv4 vrf LISP
  redistribute lisp
  neighbor 10.0.0.1 remote-as 65001
  neighbor 10.0.0.1 activate
 exit-address-family

xTRs modified configuration
Previously, the xTRs pointed to a single address for both the MS and MR addresses. Now we point the MS and MR elements in the xTR configuration to the separate addresses specified above:
xTR1

router lisp
 database-mapping 10.1.1.0/24 10.255.255.1 priority 2 weight 50
 ipv4 itr map-resolver 10.255.255.0
 ipv4 itr
 ipv4 etr map-server 10.255.255.4 key cisco
 ipv4 etr
 exit

xTR2

router lisp
 database-mapping 10.1.2.0/24 10.255.255.2 priority 2 weight 50
 ipv4 itr map-resolver 10.255.255.0
 ipv4 itr
 ipv4 etr map-server 10.255.255.4 key cisco
 ipv4 etr
 exit

This topology peers the MS directly with the MR. However, it is reasonable to assume that xTR1 would use a MS / MR combination and xTR2 would use another. Further, direct peering is not likely. Instead, the MR would likely peer with a chain of ALT routers before reaching the MS, similar to how the Internet operates with BGP peering from source AS to destination AS.

As BGP is used to interconnect MS and MR devices over the ALT, the standard BGP prefix aggregation mechanisms and best practices could be followed.
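
As a rough illustration of what aggregation would mean for the two site prefixes in this lab, the sketch below uses Python's standard `ipaddress` module. The `covering_prefix` helper is a hypothetical name of my own; nothing here is taken from the router configuration beyond the two prefixes.

```python
import ipaddress

# The two EID prefixes registered on the MS in this lab.
site_prefixes = [ipaddress.ip_network(p) for p in ("10.1.1.0/24", "10.1.2.0/24")]


def covering_prefix(networks):
    """Return the smallest single prefix that contains every network given."""
    candidate = networks[0]
    while not all(net.subnet_of(candidate) for net in networks):
        candidate = candidate.supernet()  # widen by one bit and try again
    return candidate


# The two /24s are not an aligned pair, so they cannot be merged exactly...
print(list(ipaddress.collapse_addresses(site_prefixes)))
# ...but a single covering prefix could be advertised into the ALT instead.
print(covering_prefix(site_prefixes))
```

Note the trade-off: the covering prefix here is a /22, which also spans 10.1.0.0/24 and 10.1.3.0/24. Whether advertising it is acceptable depends on whether the MS is actually authoritative for that whole range, which is an assumption this lab does not settle.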

ALT in Action

To really get to grips with the Mapping Request, it is important to examine the packets. The following Wireshark frame captures were taken (made available here with the excellent services of Cloudshark):

xTR1 to MR
MR to MS
MS to xTR2

The Mapping Response is presented below. This bypasses the ALT and is sent directly from xTR2 to xTR1.
xTR2 to xTR1

Use the following table to decipher the traces, or for more detail, refer to the detailed diagram and configurations:

System     Type             Address
CustomerA  EID              10.1.1.2
CustomerB  EID              10.1.2.255
xTR1       EID              10.1.1.1
xTR1       RLOC             10.1.3.2
xTR1       loopback         10.255.255.1
MR         tunnel endpoint  10.0.0.1
MS         tunnel endpoint  10.0.0.2
MR         loopback         10.255.255.0
MS         loopback         10.255.255.4
MS         RLOC             10.1.3.10
xTR2       RLOC             10.1.3.6
xTR2       loopback         10.255.255.2

Monitoring the ALT Network
The MR now contains a couple of BGP prefixes within the LISP VRF:

MR#sh ip bgp vpnv4 vrf LISP
BGP table version is 3, local router ID is 10.255.255.0
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale, m multipath, b backup-path, x best-external, f RT-Filter
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
Route Distinguisher: 1:1 (default for vrf LISP)
*> 10.1.1.0/24      10.0.0.2                 1             0 65002 ?
*> 10.1.2.0/24      10.0.0.2                 1             0 65002 ?

The networks 10.1.1.0/24 and 10.1.2.0/24 are the EID address spaces for Customer A and Customer B, respectively. 10.0.0.2 refers to the MS tunnel endpoint, which is used to establish the BGP peering with the MR.
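
To make the role of this table concrete, here is a minimal sketch of the lookup the MR effectively performs when a Mapping Request addressed to an EID arrives: a longest-prefix match against the routes learned over the ALT. The `ALT_ROUTES` dictionary simply copies the two entries shown above, and `alt_next_hop` is a hypothetical helper name; a real router does this in its forwarding table, not in Python.

```python
import ipaddress

# The two prefixes in the MR's LISP VRF, with their BGP next hop.
ALT_ROUTES = {
    ipaddress.ip_network("10.1.1.0/24"): "10.0.0.2",
    ipaddress.ip_network("10.1.2.0/24"): "10.0.0.2",
}


def alt_next_hop(eid, routes=ALT_ROUTES):
    """Return the next hop for an EID by longest-prefix match, or None."""
    address = ipaddress.ip_address(eid)
    matches = [net for net in routes if address in net]
    if not matches:
        return None  # no route over the ALT for this EID
    best = max(matches, key=lambda net: net.prefixlen)
    return routes[best]


print(alt_next_hop("10.1.2.255"))  # an EID inside Customer B's prefix
print(alt_next_hop("192.0.2.1"))   # an address with no EID prefix in the ALT
```

With only one MS in the lab every match resolves to 10.0.0.2; in a larger ALT the same lookup would select between several peers.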


LISP Papers

What was meant to be a brief investigation into LISP has turned into a rather large piece of work. The following is a summary presenting the LISP papers in the correct reading order:

Setting the scene

  1. The other big network challenge
  2. Underlying causes for an expanding Internet Routing Table
  3. Internet Routing Table – the good
  4. Internet Routing Table – the bad
  5. Internet Routing Table – the ugly
  6. Greatness and failure
  7. Identity / Location

LISP

  1. LISP
  2. LISP Data Plane
  3. Routing Tables and Economics
  4. Practical LISP – Basic Control Plane
  5. Practical LISP – Monitoring LISP
  6. Practical LISP – Separate MS / MR
  7. Cisco LISP – Multihoming across xTRs
  8. MTU, security and other practical aspects (upcoming)

Routing Tables and Economics

This is part of a series of posts – The Lisp Papers.

A comment on the previous post made some pretty astute observations about the potential problems that might affect the uptake of LISP. I will be covering off MTU, Convergence and Latency in a future blog post. Suffice to say that the smart guys at Cisco have considered them.

In spite of any number of clever technologies, the underlying message of the comment is right. LISP is clumsy. Indirection in general is clumsy. It is going to be a source for more complexity, administration and failure scenarios. It will be more difficult to explain, understand and troubleshoot.

It would be much better for everyone to be a good net citizen and maintain a small routing table. In previous posts I discussed what the Internet routing table looks like and the technical reasons for the expanding routing table size. I glossed over the underlying social / structural reason for increased routing table size.

The economics

I can advertise my entire /16 address space as /24s at a very low cost (low private cost) and everyone else has to pay (high social cost). Economics calls this a negative externality. Like other forms of pollution, routing table pollution is caused by this simple imbalance. By way of example, consider the following. If I jump into my car and clog up London’s roads, all I have to pay is my time and the running costs of my vehicle. Every other user of the road has to pay in lost time due to the congestion. The personal cost is lower than the social cost. There are solutions for this. The London Congestion Charge was a way of imposing a Pigovian tax to bring the private cost up to a level representative of the social cost. Carbon tax is another example of making polluters pay for the social cost.
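
To put a number on the /16 example, the short sketch below counts how many routes a single /16 becomes when it is advertised as /24s. The prefix 172.16.0.0/16 is an arbitrary, hypothetical allocation chosen purely for illustration.

```python
import ipaddress

# A hypothetical /16 allocation; any /16 gives the same count.
allocation = ipaddress.ip_network("172.16.0.0/16")

# Deaggregate it into every /24 it contains.
deaggregated = list(allocation.subnets(new_prefix=24))

print(len(deaggregated))                   # routes everyone else must now carry
print(deaggregated[0], deaggregated[-1])   # first and last /24 in the block
```

One configuration decision at the edge turns a single routing table entry into 256 entries in every router that carries the full table.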

However, taxing people for the number of prefixes they advertise is not very likely. Other solutions, such as regulation (over and above peering rules such as prefixes no longer than /24, etc), are unlikely. If so, the EID space is destined to grow.

In general computing, we have become used to Moore’s law rushing to the rescue. Bloated OS and apps? No problem, throw a multicore CPU and gigabytes of RAM at the problem. So, why is our domain any different? Forwarding-plane memory is not subject to Moore’s law. The cost / performance ratio of SRAM and TCAM is not advancing at anything like the rate of commodity memory.

What will work?

BGP uses a push system. Everyone knows everything (in the DFZ). LISP is more pull-like. Does everyone need to know about the 300 /24s being advertised out of Albania?

If we break the RLOC / EID overlap, we can isolate the EID explosion to the edge. The cost will therefore be borne by the “polluters”. Want to dual-home? Don’t punch a /24 size hole in the DFZ but rather contract with your LISP-capable service providers to deliver this service. We can then limit knowledge of these to specialised systems that can scale at Moore’s law, and then pull the requisite elements down as and when needed. LISP seems like a pretty good way of doing that.


Identity / Location

This is part of a series of posts – The Lisp Papers.

The study of meaning is known as semantics. One of the most fundamental concepts in networking theory is the Layer 3 address. In modern networking, this is predominantly the IPv4 address. And what is the semantics of an IP address? Asked another way, what does an IP address mean?

It has two different meanings. The first is that a device’s IP address is its identity on the network – Google’s famous public DNS server is identified by the address 8.8.8.8. You might say that the identity is a DNS name:

Name:    google-public-dns-a.google.com
Address:  8.8.8.8

but actually, it isn’t. DNS names aren’t used by the network layer, and there really isn’t anything between the application and the network to create name to address indirection. Therefore, what we have is an application resolving the DNS name to an IP address, and then establishing a connection to the IP address, and not the DNS name. This is standard (broken?) TCP sockets operation as discussed here.

The second semantic for an IP address is location. If we interrogate our AT&T route server:

route-server>sh ip bgp 8.0.0.0/8 longer-prefixes | i 8.8.8

*  8.8.8.0/24       12.123.29.249                          0 7018 15169 i

we can see that 8.8.8.0/24 is also a location, available through 12.123.29.249 and located in AS 15169. If this server moved its location, it would likely need to be re-addressed. If I connect to the Internet with one IP address at work, another at home and a third when using my mobile broadband service, my network identity (that is, IP address) changes in each case. A real-world example of this would be if I was known as Terry at work, John at home and Simon in my car.

So, we have two different meanings for an IP address, one that is location based and another that is identity based. This effectively means that we have two separate semantics attached to an IP address. This is important because it affects the way that machines connect to the Internet. If my location changes, then, by extension, my identity has to change (as IP address is used for both). Taken a step further, we cannot have two simultaneous locations, which is exactly the requirement needed for multihoming.

How does the current Internet structure get around this? In the multihoming example, a third location is created, and this attaches to the network as an independent entity. This increases the size and complexity of the Internet’s view of the edge and greatly adds to the size of the Internet Routing Table (the extent of which is explored in Internet Routing Table – the good).

If, however, we could break the link between the Location and Identity, we could multihome (for example) without needing to advertise the same address in a third domain with independent connections. This would mean that we could create a hierarchy of information, where location is used within our core, and identity at either end.

To return to the earlier analogy, to route a packet to me, you would need to know my identity. You would send a packet to Terry. The network looks to see if Terry is available in the local location, e.g. Home. If not, it is routed to the edge of the core, where a location lookup would be executed. Once my location, e.g. Work, has been determined, this is added to my packet – which then becomes Work:Terry. The rest of the core is not interested in the Terry part, and routes it based on the location, towards Work. Only once it reaches the edge of the core where Work connects, does the focus change. Now the packet is sent to Terry within the work domain. This is illustrated in the diagram below.

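
The Work:Terry walk-through can also be written as a toy sketch. This models the analogy only, not any real protocol: the function names, the `mapping_system` dictionary and the `Location:Identity` string label are all hypothetical and chosen to mirror the wording of the paragraph above.

```python
def ingress_edge(identity, local_members, mapping_system):
    """Deliver locally if possible, otherwise prepend the looked-up location."""
    if identity in local_members:
        return identity  # destination is in the local location, no lookup needed
    location = mapping_system[identity]  # the location lookup at the edge of the core
    return f"{location}:{identity}"


def core_next_hop(label):
    """The core routes on the location part only and ignores the identity."""
    return label.split(":", 1)[0]


def egress_edge(label):
    """At the destination edge the location is dropped and identity takes over."""
    return label.split(":", 1)[1]


mapping_system = {"Terry": "Work"}  # identity -> current location
home_members = {"Alice"}            # who is reachable at the sender's location

label = ingress_edge("Terry", home_members, mapping_system)
print(label)                 # the packet as seen inside the core
print(core_next_hop(label))  # what the core routes on
print(egress_edge(label))    # what the destination edge delivers to
```

If Terry moves, only the entry in `mapping_system` changes; the identity that senders use stays the same.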

Sounds great – but wouldn’t we need to make dramatic changes to the network to deliver this? Yes, and this will be explored in the next post – the Locator / ID Separation Protocol, or LISP.

This may be a really important technology going forward. The reason is that some of the ideas I explored in recent posts around unaggregatable IP address growth are even more important once the IPv6 addressing approach (Classless Interdomain Routing and Provider Independent Address Allocations) creates a “swamp” (unaggregatable address space) that can be many orders of magnitude larger than what we faced with IPv4 (RFC 4984).

As an aside – we do have an identification only address deployed in the network today. Can anyone guess? A clue – it uses a flat namespace.


The MAC address space is identity-only and doesn’t have any location-specific semantics.


Internet Routing Table – the ugly

So, is it getting worse?

Over the past few posts (the good and the bad), I have considered the various causes for a non-optimal Internet Routing Table. And now on to the question of whether it is getting worse.

Enterprise deaggregation

The graph above (source) shows the deaggregation by Enterprise ASes. Enterprises represent the vast majority of the edge of the Internet. As such, they are by far the largest source of deaggregation. According to this, the level of deaggregation has not increased significantly.

However, the following graph (source) shows a significant increase in the deaggregation factor.

Deaggregation by RIR

The discrepancy appears to be a result of definition. The first graph is based on allocations (maximum aggregation is bound by RIR allocation size), while the second is based on best-case aggregation (idealised aggregation to the maximum possible per AS). These are very different. By the second measure, it is getting worse: the deaggregation factor has increased from 1.27 in 1999 to 2.18 today. This represents an increase of over 70% in deaggregation.
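
For anyone who wants to check the percentage, the arithmetic is a simple relative change between the two deaggregation factors quoted above. The variable names are mine.

```python
factor_1999 = 1.27   # deaggregation factor in 1999, from the graph
factor_today = 2.18  # deaggregation factor at the time of writing

# Relative increase: (new - old) / old
increase = (factor_today - factor_1999) / factor_1999

print(f"{increase:.1%}")
```

The result is a little under 72%, which is where the "over 70%" figure comes from.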

What solutions are there? Well – a whole whack have been developed. However, how many are viable? This humorous post actually captures a stack of the issues.

Further Reading

BGP Aggregation & The Deaggregation Report

Evolution of Internet Address Space Deaggregation: Myths and Reality

CIDR REPORT

This is part of a series of posts – The Lisp Papers.


Copyleft(ↄ) 2014. Text is available under the Creative Commons Attribution-ShareAlike license

Pattinson Consulting Limited