Ooh! Ooh! Ooh! BGP, MPLS, Segment Routing and PCE!
I won’t lie to you – I’m a little giddy with excitement right now, so I’m going to pause for a minute befo… Oh I can’t take it any longer: We are demoing BGP EVPNs and MPLS segment routing with PCE! That’s huge! All you avid readers of my posts, which I know you all are (I know you’re out there – I can hear you snoring), will remember that I’m a bit of a BGP bigot, an MPLS addict, a PCE devotee and a segment routing supporter. If you don’t, have a quick skim of this article and this one from late 2015.
Having composed myself a little, let me explain: At MPLS + SDN + NFV World Congress 2016, Metaswitch teamed up with 15 other vendors, as vendors do in these days of ecosystem kumbaya and partnership brotherly love, to participate in the EANTC MPLS and Ethernet Transport Interoperability Event. We were working with the guys at EANTC before they were famous -- but even though they now ignore us at social gatherings, their expertise in the field and dedication to the industry, plus the comprehensiveness of their testing scenarios and reports, keep us coming back.(1) This interoperability event specifically covers 41 detailed tests in three distinct areas: MPLS and Ethernet transport, software-defined networking and phase/time clock synchronization. Employing our portable networking stacks, Metaswitch is delivering a complete data plane network node for a good chunk of the tests relating to MPLS segment routing and EVPN -- a newly ratified method for deploying virtual private networks.
Emerging from the now defunct (or concluded) IETF l2vpn Working Group,(2) Ethernet VPNs (EVPNs) represent a significant evolution over existing multipoint-to-multipoint Layer 2 over Layer 3 virtual private networking techniques. With the initial draft released in February 2012, the RFC was ratified three years later as RFC 7432: BGP MPLS-Based Ethernet VPN. EVPN is equally at home in both new data center interconnect (DCI) applications and replacing the type of Carrier Ethernet LAN services commonly deployed today using IEEE 802.1ah provider backbone bridging (PBB) or MAC-in-MAC techniques, which are employed to provide a complete separation of customer and service provider domains. Based on a full mesh of MPLS pseudowires, per IETF PWE3 specifications, these virtual private LAN services (VPLS) suffer from some fundamental problems -- not least of which is that underlying PWE3 mesh, which facilitates the rather crude Ethernet bridging-like “flood and learn” (aka MAC learning) over the required wide area and between the numerous interconnected sites. While this is less than optimal for carrier Ethernet services, VPLS issues are compounded when it comes to the demands of virtualized data center interconnect applications, where dual-homed load balancing requirements, a high number of VLANs and MAC addresses and the need for fast convergence times are the norm. Simply put, VPLS was not up to the task of delivering on today’s cloud-centric buzzwords such as “virtual machine mobility” and (my personal favorite) “cloud bursting.” VPLS had to go.
RFC 7432 follows today’s modern internetworking design patterns in a couple of key ways: First, it employs a separated signaling and data plane architecture. Second, it uses the almighty Border Gateway Protocol -- BGP. Now those of you who remember RFC 2547/bis (circa 1999, now obsoleted by RFC 4364, circa 2006) may well be rolling your eyes a little: “BGP/MPLS VPNs? Been there – done that. And didn’t they suffer from BGP overload issues, anyway?” To which I would have to ask you, politely, to please not roll your eyes at me and then I’d remind you that those were IP VPNs and that we’ve come a long, long way in raw route processing capabilities and our ability to handle multiple forwarding tables since then. We now know that, with perhaps a little help from classic scaling techniques like route reflection and more recently complete control plane separation, BGP can scale. That moderately successful hobbyist project we call ‘The Internet’ has proven that conclusively. Plus, really, what are you doing remembering RFCs from 1999? Seriously?!
The EVPN standard adds a new Multiprotocol (extensions for) BGP (MBGP) Network Layer Reachability Information (NLRI) element, somewhat predictably called the EVPN NLRI. The BGP control plane uses NLRI to distribute EVPN routes among peers, adding a layer of intelligent route discovery that eliminates the need for VXLANs to operate in their default (and gnarly) “flood-and-learn” mode. That’s all fine and dandy, but where does MPLS come into all this? And where did the phrase “fine and dandy” come from, anyway?(3) Fortunately, RFC 7432 answers at least one of these questions. (Spoiler alert: It’s not the one about being “dandy.”) EVPN defines MPLS as the (default) data plane, in its decomposed architecture, with the MPLS labels associated with the Ethernet segments signaled through a new BGP Ethernet Segment Identifier (ESI) MPLS Label extended community. This eliminates the need for dedicated point-to-point fixed pseudowires meshing each remote endpoint “just in case,” thereby dramatically simplifying deployment and troubleshooting while elevating the scalability of these Layer 2 VPNs to the level of an IP VPN.
For those network engineers not quite as enamoured with MPLS as I am, the separated nature of the EVPN control plane opens up the option for other data plane encapsulations and overlays. The most interesting of these is the Virtual eXtensible Local Area Network (VXLAN), which is being proposed in the BGP Enabled Services (bess) working group draft “evpn-overlay-02” (current revision).
Aided by a nudge from its lead author (VMware) and therefore, not surprisingly, its incorporation into the standard Open vSwitch (OVS) distribution, VXLAN is increasingly becoming the de facto standard in virtualized datacenters. Defined within informational RFC 7348 (A Framework for Overlaying Virtualized Layer 2 Networks over Layer 3 Networks), VXLAN trumps 802.1Q’s paltry 4,096 VLAN address space (Q-in-Q notwithstanding), enabling a mighty 16 million, fully isolated, Layer 2 networks to coexist in a single, common Layer 3 infrastructure.
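For anyone who wants to check my sums, here is a quick back-of-the-envelope sketch of where those two numbers come from. It assumes only the identifier field widths: 12 bits for the 802.1Q VLAN ID and 24 bits for the VXLAN Network Identifier (VNI). The variable names are mine, purely for illustration.

```python
# Identifier field widths: 12 bits for the 802.1Q VLAN ID,
# 24 bits for the VXLAN Network Identifier (VNI).
VLAN_ID_BITS = 12
VNI_BITS = 24

vlan_ids = 2 ** VLAN_ID_BITS  # size of the 802.1Q ID space
vnis = 2 ** VNI_BITS          # size of the VXLAN ID space

print(f"802.1Q VLAN IDs: {vlan_ids:,}")
print(f"VXLAN VNIs:      {vnis:,}")
print(f"Ratio:           {vnis // vlan_ids:,}x")
```

So “16 million” is really 16,777,216, but who’s counting.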
The overlay draft allows BGP to take on the role of the control plane for VXLAN, employing the BGP Tunnel Encapsulation Attribute (type 8) to identify and advertise this fact. While there is an MPLS encapsulation type (10), if there is no attribute present, MPLS is used as a default. The last (but by no means least!) option for the EVPN data plane is Provider Backbone Bridging (PBB). RFC 7623 (Provider Backbone Bridging Combined with Ethernet VPN or PBB-EVPN for short) outlines how the MAC tunneling technique defined in IEEE 802.1ah can be used as the EVPN data plane.
RFC 7432 EVPN features control and data plane separation, with 3 options for the latter.
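The “no attribute means MPLS” rule is easier to see in code than in prose. What follows is a minimal sketch, not a real BGP implementation: it only uses the two tunnel type values quoted above (8 for VXLAN, 10 for MPLS), and the function name `select_data_plane` and the use of `None` for “attribute absent” are my own assumptions.

```python
from typing import Optional

# Tunnel type code points quoted in the text above.
TUNNEL_TYPE_VXLAN = 8
TUNNEL_TYPE_MPLS = 10

ENCAP_NAMES = {TUNNEL_TYPE_VXLAN: "VXLAN", TUNNEL_TYPE_MPLS: "MPLS"}


def select_data_plane(tunnel_type: Optional[int]) -> str:
    """Pick the data plane implied by a route's encapsulation type.

    Hypothetical helper: `tunnel_type` is None when the route carries no
    encapsulation attribute at all, in which case MPLS is the default.
    """
    if tunnel_type is None:
        return "MPLS"
    try:
        return ENCAP_NAMES[tunnel_type]
    except KeyError:
        # This sketch only knows the two types discussed in the post.
        raise ValueError(f"unsupported tunnel type {tunnel_type}") from None


print(select_data_plane(None))               # no attribute present
print(select_data_plane(TUNNEL_TYPE_VXLAN))  # VXLAN explicitly advertised
```

PBB-EVPN is deliberately left out of the lookup, since the text above does not describe it as being selected this way.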
So RFC 7432 is for wide area network and data center infrastructure, right? Well, if you listen very carefully, you can hear the scoffing sounds from our Project Calico team. They may be BGP bigots, like me, but they will oust these overlays before our inter-host packets have had the chance to be encapsulated, decapsulated, encapsulated again and decapsulated once more. That’s more on-ramps and off-ramps than the 405 -- which is a reference only a Los Angeles resident like me will understand -- but let me just be impartial for a minute and say that in the same way we here in La La Land sometimes just need a hug, network architects sometimes just need a Layer 2 connection.
Metaswitch is specifically focused on verifying the correct operation of EVPN in single-home scenarios with VXLAN data plane encapsulation. In the old days, we used to refer to these as “stub” VPNs and the endpoint switch/router as “one-armed,” owing to its solitary network interface -- though I fear younger readers of this post will now view me as networking’s Archie Bunker or Alf Garnett for using such terms. As the most prolific type of VPN (think small/medium enterprise vs. large businesses that would have two redundant links per site -- multi-homed -- into their service provider), we set out to prove critical VXLAN functionality like ARP Proxy, Proxy Neighbor Discovery and MAC mobility. The proxies eliminate unnecessary flooding while MAC mobility enables the kind of rapid migration of addresses between segments required for providing resilient virtual compute environments.
Test 1.16 setup: Single Homed EVPN with ARP Proxy & Proxy ND
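To make the proxy and mobility behavior a little more concrete, here is a toy sketch of the table a provider edge node might keep. It is emphatically not our stack: the class and field names are invented, and it assumes only the general idea that BGP-learned MAC/IP bindings let the node answer ARP locally, and that a MAC move is advertised with a higher sequence number than the entry it replaces. Ties simply keep the existing entry here, which is a simplification.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class MacIpEntry:
    mac: str
    ip: str
    next_hop: str  # the node that advertised this binding
    seq: int = 0   # mobility sequence number; a higher value is a newer move


class EvpnTable:
    """Hypothetical MAC/IP table populated from BGP rather than by flooding."""

    def __init__(self) -> None:
        self.by_mac: Dict[str, MacIpEntry] = {}
        self.by_ip: Dict[str, MacIpEntry] = {}

    def learn(self, entry: MacIpEntry) -> bool:
        """Install an advertised binding; return False if it is stale."""
        current = self.by_mac.get(entry.mac)
        if current is not None:
            if entry.seq <= current.seq:
                return False  # older or equal advertisement: ignore it
            self.by_ip.pop(current.ip, None)
        self.by_mac[entry.mac] = entry
        self.by_ip[entry.ip] = entry
        return True

    def proxy_arp(self, target_ip: str) -> Optional[str]:
        """Answer an ARP request locally, or return None if it must be flooded."""
        entry = self.by_ip.get(target_ip)
        return entry.mac if entry else None


table = EvpnTable()
table.learn(MacIpEntry("00:00:5e:00:53:01", "192.0.2.10", "pe1"))
# The host moves behind pe2, which advertises the same MAC with a higher sequence.
table.learn(MacIpEntry("00:00:5e:00:53:01", "192.0.2.10", "pe2", seq=1))
print(table.by_mac["00:00:5e:00:53:01"].next_hop)
print(table.proxy_arp("192.0.2.10"))
```

The point of the exercise: the ARP request for 192.0.2.10 never leaves the local node, and traffic follows the host to its new home as soon as the newer advertisement arrives.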
While EVPN is the opening act at the EANTC Interoperability event, the headliner (in my mind at least) is segment routing. I’m not ashamed of my admiration for segment routing, though my wife certainly is, given her apparent need to lead me gently away by the elbow from large gatherings, where I’ve felt the need to describe the latest in source routing techniques in detail. Fortunately, I don’t feel I need to cover either the path computation element or segment routing in any detail, having just recently done so in the aforementioned post.
There is something of a plot twist, however, in that while we have a superior PCE in our portfolio of portable software stacks, we are actually testing our path computation client (PCC) at this event. OK – maybe that’s not exactly a twist per se, given the fact I mentioned at the beginning of this post that we were focused on the data plane functions, but then what do you expect? I’m not exactly Chuck Palahniuk and this isn’t Fight Club you’re reading.(4)
Test 2.7 setup: Segment Routing with Path Computation Element (PCE) Server
The purpose of this test is to demonstrate the ability to initiate and terminate an end-to-end segment routing path, as determined by the stateful PCE controller, without employing any classic in-band hop-by-hop Label Distribution Protocol (LDP) or Resource Reservation Protocol (RSVP) signaling. The PCE leverages a synchronized Traffic Engineering Database (TED), populated by an OSPF or IS-IS interior gateway routing protocol (IGP), and an LSP database to determine the best path. Naturally, we are showing off the recent PCEP extensions for segment routing(5) we have made to our Path Computation Element Protocol stack. Technically, we are performing a number of segment routing test scenarios that do not include a PCE, but they are boring, so I’m ignoring those.(6)
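If you want a feel for what the PCE is doing on the PCC’s behalf, the following is a deliberately tiny sketch. It assumes the TED is just a dictionary of link metrics, runs a plain Dijkstra shortest-path search over it, and then turns the result into a list of node segment labels for the head end to push. The topology, the metrics, the label values and both function names are made up for illustration; a real stateful PCE would also consult its LSP database and apply traffic engineering constraints.

```python
import heapq
from typing import Dict, List

# Hypothetical TED: node -> {neighbor: IGP metric}
Ted = Dict[str, Dict[str, int]]


def compute_path(ted: Ted, src: str, dst: str) -> List[str]:
    """Dijkstra shortest path over the TED; returns the list of nodes."""
    dist = {src: 0}
    prev: Dict[str, str] = {}
    queue = [(0, src)]
    visited = set()
    while queue:
        cost, node = heapq.heappop(queue)
        if node in visited:
            continue
        visited.add(node)
        if node == dst:
            break
        for neighbor, metric in ted.get(node, {}).items():
            new_cost = cost + metric
            if new_cost < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_cost
                prev[neighbor] = node
                heapq.heappush(queue, (new_cost, neighbor))
    if dst not in dist:
        raise ValueError(f"no path from {src} to {dst}")
    # Walk the predecessor chain back from the destination.
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    path.reverse()
    return path


def to_segment_list(path: List[str], node_sids: Dict[str, int]) -> List[int]:
    """Express the path as one node segment per hop after the head end."""
    return [node_sids[node] for node in path[1:]]


# Made-up four-node topology and label values.
TED: Ted = {"A": {"B": 10, "C": 5}, "B": {"D": 10}, "C": {"B": 3, "D": 20}, "D": {}}
NODE_SIDS = {"A": 16001, "B": 16002, "C": 16003, "D": 16004}

path = compute_path(TED, "A", "D")
print(path, to_segment_list(path, NODE_SIDS))
```

Note what is missing: no per-hop signaling at all. The head end receives the segment list from the controller, imposes it on the packet, and the rest of the network just follows the labels.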
As you probably were not able to see the testing live, you can download EANTC’s detailed report right here. Naturally, if you want to know more about our portable protocol stacks, you can drop us a line -- just excuse my hysteria.
(1) On the downside, they do have morals, which I discovered when I tried to negotiate a discount with their CEO off the back of these glowing public remarks.
(2) l2vpn was superseded primarily by the newly formed Pseudowire And LDP-enabled Services (pals) working group and, to a lesser extent, by the BGP Enabled Services (bess) charter -- more than likely simply because of their more amusing acronyms.
(3) Having spent minutes scouring the Internet, I’m afraid to report I came up shy of anything remotely resembling a sensible answer as to the origin of this phrase. Please don’t email me.
(4) Even if it were, I wouldn’t talk about it.
(5) As of this post, an IETF draft: https://tools.ietf.org/html/draft-sivabalan-pce-segment-routing-03
(6) Note: We also demonstrated our IP Fast Reroute capabilities but that was boring as well, though our sales team would love you to buy some licences, so don’t tell them I said so.