Intent-Based Composable Networking
There is absolutely no question that B2B technology marketing remains a most revered profession. Certainly, since I first donned my marcom blazer and walked the hallowed corridors of corporate communications, I have felt the warm admiration of colleagues and counterparts throughout this great industry of ours flowing over me like molten solder.
Indeed, so admired is this occupation that everyone appears to want in. Certainly, when it comes to infrastructure and architecture evolutions, there is apparently no shortage of individuals across all disciplines who are willing and ready to help fly the hype flag. Ultimately, I feel both flattered and a little surplus to requirements, seeing as waving the hype flag is my job. Somewhat ironically, it will also likely be how my manager feels about my position after reading this post.
Recent reports around intent-based networking (IBN) are a great example of a business hell-bent on becoming its own parody, to the point where highly regarded industry pundits are writing off the movement as nothing more than noise. That’s a shame because if we look beyond the “everything’s broken but IBN will fix it… and here’s an infographic to prove it!” rhetoric, intent-based networking offers not only a logical approach to evolving network design, implementation and management, but one that’s been sorely needed for decades.
Although our beloved market segment has been extolling the virtues of open software centricity over the last few years, nothing could be further from the truth. In reality, the deployment and configuration of networking platforms still overwhelmingly harks back to a pre-open compute era, long before the evolution of lightweight software development methodologies.
As purveyors of the finest portable routing and control plane protocol stacks available, Metaswitch continues to be an integral component in many of these internetworking devices. Owing to this recognized lineage, we have also been playing a critical role in the evolution of this industry. That evolution is often referred to as network disaggregation, although that tells only half the story.
The evolution of network infrastructure
Unbelievable though it is, in this day and age, the majority of networking is still developed and delivered on proprietary hardware using custom application-specific integrated circuits (ASICs). These boxes employ tightly coupled software that includes everything required to run the box and have it perform its job, including advertising network reachability, setting up connections, moving packets around and managing the entire process -- all in one amorphous lump of code. Now, I’m not knocking that model as it is OK in some instances (plus it still forms the basis for our protocol stack business!), but when we consider where we are in other technology arenas, this approach is a little dated to say the least: It’s not like people are building data centers with proprietary server hardware, operating systems and application software from a single vendor.
Using data centers as an example is a little unfair as those architects were the first to recognize that using open compute platforms and closed switching infrastructures made little sense -- the white box industry and network disaggregation were born from this very fact. With the best of intentions, however, network disaggregation 1.0 (aka hardware/software disaggregation) has largely failed to deliver any discernible value over classic, single-vendor approaches. This is primarily because there remains a tight relationship between the hardware and software, which dramatically limits the ability for an end user to select the two independently. Ultimately, a hardware vendor would dictate the software or a software supplier would have a limited number of hardware options they would support.
Larger data center operators – those in the region of hyperscale – mitigated this, in a sense, by designing, building and supporting their own hardware and software platforms using merchant switching silicon, x86 processors and open source software foundations. In the process, they also recognized that hardware and software disaggregation alone didn’t go nearly far enough: It made logical sense to develop and deploy switches and routers with software architectures more in line with the servers they supported. Current network operating system (NOS) offerings were still monolithic in nature, implementing the operating system, routing and control plane protocol applications (often open source, but sometimes from suppliers such as Metaswitch) and management interfaces on a single code base.
This meant that, much like the legacy approach, you still had to “take” all the protocols, regardless of whether you needed or wanted them, and you were bound to the stack supplier or source chosen by the NOS vendor. This could mean that your router has 450 protocol stacks buried in its codebase while you probably need only five. This is not just an issue of code bloat or limiting choice. The main problem is that network operators still must keep track of those unwanted and unused protocols -- almost as if they were employed in a production environment. Are engineers playing with a protocol implementation they found resident on a switch? Are there security vulnerabilities in unused protocols making their infrastructures prone to attack? Many protocols are also extremely fluid in nature, updated with new features or functionality or patched on a regular basis. With a monolithic NOS implementation, even the smallest protocol bug fix must be delivered in the form of a completely new NOS release, and when you update the NOS, you must reboot the switch. So, a security update for a protocol you might not even use results in the need to reboot all your switches as soon as possible.
Complete software disaggregation forms the basis of composable networking, resulting in a switch/router architecture that reflects modern open compute methodologies, rather than looking like legacy mainframes. While the hyperscale data center operators have the hyperscale resources required to define their own models, deliver their own solutions and support the whole thing, the mid-to-large scale players who recognize they could benefit from composable networking simply don’t. In order to pursue software disaggregation, those players have come together in community-driven initiatives such as OpenSwitch and DANOS – both examples of which are managed under the auspices of the Linux Foundation.
Sometimes referred to as (or originally called, in some cases!) disaggregated network operating systems (DNOS), these models -- somewhat confusingly but very accurately -- define routing and control plane protocol stacks as applications. Accurate because in a disaggregated software architecture, the protocols are instantiated as such. Confusing because when people think apps, they think Candy Crush Saga and not an EVPN control plane.
DANOS (prev. dNOS) functional layers and components (Source: AT&T White Paper)
This hierarchical model affords numerous advantages -- not least of which being that a switch can now be built and managed in the same way as compute platforms: Exactly as we build and buy anything from low-end personal laptops to high-end corporate servers today. In operational terms, this means that critical elements, like a routing protocol, can be downloaded on an individual switch only as and when it is required. It can also be upgraded autonomously, without requiring a reboot of the entire operating system or switching platform. Indeed, an active routing stack can be updated in an entirely hitless manner.
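To make that operational principle concrete, here is a deliberately toy sketch (all class and variable names are hypothetical) of the separation that makes hitless upgrades possible: the protocol app’s runtime state lives apart from the protocol code, so the implementation can be swapped while the state it has accumulated survives.

```python
# Toy illustration (hypothetical names throughout): in a disaggregated NOS,
# a protocol app's runtime state is held apart from its code, so the code
# can be replaced without losing state or rebooting the platform.

class RouteStore:
    """Persistent state that survives a protocol-app upgrade."""
    def __init__(self):
        self.routes = {}

class BgpAppV1:
    version = "1.0"
    def __init__(self, store):
        self.store = store
    def learn(self, prefix, next_hop):
        self.store.routes[prefix] = next_hop

class BgpAppV2(BgpAppV1):
    version = "2.0"  # e.g. a security patch to the same protocol logic

store = RouteStore()
app = BgpAppV1(store)
app.learn("10.0.0.0/8", "192.0.2.1")

# "Hitless" upgrade: swap the implementation, keep the learned state.
app = BgpAppV2(store)
assert app.store.routes["10.0.0.0/8"] == "192.0.2.1"
assert app.version == "2.0"
```

A real NOS does this with process supervision and state checkpointing rather than object swapping, but the design principle is the same: state and code are decoupled.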
Disaggregated network operating systems are the epitome of composable networking, enabling complete freedom to select the best control plane and management protocol offerings while also providing a pluggable abstraction layer for controlling the hardware, affording (for the first time) true separation of the box from the code. A network operator need no longer be beholden to the release schedules of a single vendor or settle for inferior functionality in one area to get the desired features in another.
Composable networking is fundamental to embracing continuous delivery and integration philosophies -- or DevOps -- in the networking world. But how that code is developed is also of equal importance. There’s no point in a customer base having the perfect foundation for CD/CI when their supplier has no foundation for agile continuous development. In this modern compute era, that foundation should come in the form of microservices-based development methodologies, with the applications built in the same way they are deployed -- not as one lump of code, but from atomized, reusable, elements.
Being highly disaggregated, applications built using microservices approaches must employ a common framework. This framework provides a development toolchain that dramatically simplifies the software engineering process. A microservices framework abstracts the complexity of all underlying common services and lifecycle management systems, such as storage and orchestration, while providing a lightweight inter-process communication mechanism. Which brings me to a favorite theme of my posts where I get to say “and that concept isn’t exactly a new one” because this and other attributes are like those outlined within service-oriented architectures (SOA), although microservices architectures are recognized for being more lightweight in nature. In practice, this means forgoing elaborate message-oriented middleware (MOM) for a simpler messaging system that is typically built on HTTP.
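As a rough illustration of that simpler, HTTP-based messaging style (the service name and payload are invented for the example), here is a toy “RIB” microservice answering JSON queries over plain HTTP using only the Python standard library -- no broker or middleware in sight:

```python
# A minimal sketch of HTTP-based inter-process communication between two
# "services" -- the service name, route data and URL scheme are hypothetical.
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class RibService(BaseHTTPRequestHandler):
    """A toy 'RIB' microservice answering JSON queries over HTTP."""
    ROUTES = {"10.0.0.0/8": "192.0.2.1"}

    def do_GET(self):
        # Encode the prefix in the path with "_" standing in for "/".
        prefix = self.path.lstrip("/").replace("_", "/")
        body = json.dumps({"next_hop": self.ROUTES.get(prefix)}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # silence per-request logging
        pass

# Bind an ephemeral port and serve in the background.
server = HTTPServer(("127.0.0.1", 0), RibService)
threading.Thread(target=server.serve_forever, daemon=True).start()

# A second "service" queries the RIB over plain HTTP.
port = server.server_address[1]
with urllib.request.urlopen(f"http://127.0.0.1:{port}/10.0.0.0_8") as resp:
    answer = json.load(resp)
server.shutdown()
assert answer == {"next_hop": "192.0.2.1"}
```

The point is the shape of the interaction: two processes exchanging small JSON messages over HTTP, which is exactly the lightweight alternative to MOM that microservices architectures favor.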
So, your composable network protocols, management interfaces and hardware abstraction layer applications are built on modern microservices methodologies. End users can individually select which of your world-leading applications (your BGP stack, for example) they want to employ, based on each specific application and they can instantiate it, on only the platforms that require it, without any downtime. Now what? Well, they need to configure it and that’s where we take a step back in time, once more, to the dark ages of computing. But we don’t have to.
The problem I have with many intent-based networking pitches is that they imply that the intent engines operating in network infrastructures today -- namely the border gateway protocol (BGP), if they name it at all -- are fundamentally broken. They then launch into arm-wavy AI and ML pitches as if the solution is something completely new at the switch/router level. The issue, however, lies not in the protocols, access control list rules or algorithms; it’s how the devices are configured. The goal of IBN is not to define new intent engines but to ensure the ones we have can be provisioned globally to conform with the operator’s intent. The fact is, there is usually a big disparity between the intent and reality.
While there’s been a lot of talk about automation of late, automation alone is not the solution, as it doesn’t guarantee accuracy. Guaranteeing an operator’s intent has been correctly configured at a network level, with the appropriate checks and balances, requires entirely new methodologies and toolsets that demand the same degree of evolution in the core processes we are finally witnessing in the networking software arena with the introduction of composable networking philosophies.
In essence, the goal of intent-based networking is to ensure the policy specification makes sense and that the individual devices are configured accurately. We must then ensure that the dynamic state of the network, i.e. the routing and forwarding information bases (RIBs/FIBs), is correct and that the network is operating -- and continues to operate -- as it should, since the real runtime behavior of a network is constantly evolving as devices and links are added or fail. We need to know if a packet will make it from A to B in the event of a link outage and we need to be assured that a packet cannot breach a defined security boundary. Having network management tell us something has gone wrong, once the incident has occurred, is too little and way too late.
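The “will a packet make it from A to B if a link fails” question can be asked of a topology model before anything goes live. A minimal sketch, using a hypothetical four-node topology:

```python
# A toy topology model (node names hypothetical) for asking, ahead of
# time, whether a packet can still get from A to B when a link fails.
from collections import deque

TOPOLOGY = {
    "A": {"B", "C"},
    "B": {"A", "D"},
    "C": {"A", "D"},
    "D": {"B", "C"},
}

def reachable(src, dst, failed_link=None):
    """Breadth-first search that ignores the failed link in both directions."""
    down = {failed_link, tuple(reversed(failed_link))} if failed_link else set()
    seen, queue = {src}, deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return True
        for nbr in TOPOLOGY[node]:
            if (node, nbr) not in down and nbr not in seen:
                seen.add(nbr)
                queue.append(nbr)
    return False

# In this diamond topology, A can still reach D after any single link failure.
assert reachable("A", "D")
assert all(reachable("A", "D", failed_link=link)
           for link in [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")])
```

Real verification tools model the RIBs and FIBs the protocols would actually compute rather than raw connectivity, but the question being asked is the same.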
So, far from being anything new, intent-based networking is simply an approach to delivering what we always…err... intended. It’s a cohesive objective to build a universal environment in which device configuration can be written, checked for errors and continuously verified as operating correctly in runtime.
In the same way we finally recognized how little networking hardware and software philosophies have evolved over the last 20-25 years, the same can be said for network configuration and validation. The degree to which we’ve attempted to tame the complexities of network implementation and operation is absolutely nothing compared to how quickly we are attempting to -- or are forced to -- evolve network infrastructures. Applications (the Candy Crush ones, not the EVPN variety) are now cloud native and highly distributed, demanding access to more and more real-time data pulled dynamically from disparate sources. This requires us to deploy increasingly granular compute instances in more places. This has resulted in an explosive growth in switches, which has, in turn, dramatically increased the switch-to-network engineer ratio at a time when software defined networking has led companies to assume they can cut back on manpower, not hire more.
Network disaggregation and composable networking (CN) don’t help matters either, to be fair. Let’s face it, it would be easier if every restaurant was a Taco Bell, but in the same way choice and competition are healthy, we must evolve -- breaking monopolies, accelerating the velocity of innovation, reducing expenses and evolving architectures to support new models. CN is not the catalyst for IBN and you don’t need IBN for CN. They are complementary and compatible but remain mutually independent. Simply put, there’s CN, IBN and IBCN, and that’s just fine.
All that is best explained with exactly the type of fluffy diagram I vehemently berated at the beginning of this post. (No, I don’t have a conscience. Yes, I sleep well at night, save the increasingly frequent trips to the bathroom. Hmmm -- maybe that’s karma? Naw -- just old age. Moving on.)
Intent-Based Composable Networking (IBCN) presented in a totally non-fluffy way
We do have standardized tools for configuring network intent and validating network state and runtime operation. They are called vi and Emacs, ping and traceroute -- which proves that nothing has changed in 25 years. I’m being a little melodramatic as, again, there are automation tools and techniques being employed, like custom Python scripts or more standardized Ansible playbooks, that categorize endpoints into well-defined roles, but you get the point: It’s quite convoluted, highly error-prone and still not broadly adopted. Again, automation does not equal IBN and even if network engineers have adopted automation in one form or another, a recent survey reported that over 90 percent of them still use CLI for ad-hoc configuration changes, which means a network is always one keystroke away from unmitigated disaster.
NetDevOps Survey (Fall 2016)
The goal of IBN is to deliver a workflow that includes new toolsets that can debug and substantiate each step of the process and handoff between the policy specification and device configuration while simulating device state and runtime operation, preferably against numerous network state scenarios, prior to the configuration going live.
In our BGP example (being arguably all the intent you need!), we can look to Propane as a good example of a modern toolset targeted at reducing the disparity between operator intent and policy specification while guaranteeing the accuracy of the device configuration. Some may view Propane as an entirely academic exercise, but it has some big-name backers and is open source, while also spawning the almost obligatory academia-led start-up spin-off seeking to productize and support it. Propane introduces a new programming language, with which the operator expresses their intent as a highly abstracted, network-wide (i.e. device-agnostic) set of policies. Indeed, the Propane programming language can reduce tens of thousands of lines of individual device configuration into 50 lines of code, or less. The Propane compiler then automatically generates the BGP configurations and, unlike manual operator configuration, compilers generally don’t make mistakes, so the accuracy of the individual device configuration is practically assured.
The Propane compiler is perhaps the most interesting element here as it must take the policy and execute it. Doing so involves a three-step process: As with any compiler, first comes the intermediate representation (IR), where the regular expressions employed by the Propane language are summarized. Employing tree automata, Propane reverses and defines paths so that the initial states become the final states -- we work back from the destination to the source, rather than from the source to the destination, in order to define the intended path. In that respect, this is a classic nondeterministic automaton, where (unlike a deterministic automaton) a given state and input may allow several possible transitions. Lastly, the reversed automata are combined with a topological map to create a product graph that includes all policy-compliant paths while taking into account all possible failures in the network. This is then translated into BGP filters that can be pushed to the individual devices. Once again, it’s easier to show, diagrammatically, how Propane takes a high-level intent specification and dynamically generates device-level configurations.
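To give a feel for the output stage of such a pipeline, here is a toy compile from a single network-wide preference into per-device, BGP-style import filters. This mimics only the shape of what a compiler like Propane emits, not its actual algorithm; all names and values are hypothetical.

```python
# Toy intent-to-config compile (hypothetical names and values): one
# network-wide preference -- "reach 10.0.0.0/8 via R2, fall back to R3" --
# becomes per-neighbor BGP-style filters. In BGP, a higher local-preference
# value wins, so the best choice gets the highest value.

INTENT = {"prefix": "10.0.0.0/8", "preferred": ["R2", "R3"]}  # best first

def compile_filters(intent):
    """Rank neighbors best-first by descending local-preference."""
    filters = {}
    base = 100
    for rank, neighbor in enumerate(intent["preferred"]):
        filters[neighbor] = {
            "match_prefix": intent["prefix"],
            "set_local_pref": base - rank * 10,
        }
    return filters

device_config = compile_filters(INTENT)
assert device_config["R2"]["set_local_pref"] > device_config["R3"]["set_local_pref"]
```

The real compiler does vastly more -- it reasons over all paths and failure cases via the product graph -- but this is the essential translation: one abstract preference in, many concrete device-level filters out.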
The Propane Compiler pipeline (via various Propane-related sources)
As an integral part of IBN, the same group of academics that developed Propane has also tackled the other intent-based issue of state and runtime validation. The oddly named (though no stranger than any other technical initiative, I guess) Batfish project is targeted at providing a general approach to network configuration analysis and verification. Since its inception, Batfish has spawned other initiatives, such as Minesweeper. This also suggests a nautical foundation for the nomenclature, which is a very important matter for me to clear up. There is also complementary (but far less nautical) work on BGP validation coming out of academia, namely Bagpipe.
Batfish performs control plane analysis, taking a network configuration and a specific environment scenario (e.g. one that includes link failures) and analyzing the resulting data plane. The problem with Batfish is that it can only examine one data plane on each run. Minesweeper, on the other hand, can verify all possible data planes that might emerge from the control plane, including interior gateway routing topologies and protocols (OSPF and IS-IS) plus access control lists (ACLs). While I don’t want to get in the middle of a professorial bun fight, Minesweeper claims superiority over Bagpipe, in this regard, as Bagpipe models only one Autonomous System (AS), considering only networks where BGP border routers are connected in a full mesh. Even if that’s not correct, the fact that I now keep thinking of Mike Myers in So I Married an Axe Murderer means that I’m going to insist that it is. Minesweeper wins.
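The one-scenario-versus-all-scenarios distinction is easy to sketch. In this toy (and entirely hypothetical) topology, a “guest” segment must never reach a “secure” segment; rather than checking a single failure case per run, we sweep every single-link-failure environment for violations of that isolation intent:

```python
# Toy sweep of every single-link-failure environment (hypothetical
# topology): "guest" must never be able to reach "secure", whatever fails.
# Checking one scenario per run is the Batfish-style approach; verifying
# the property across every resulting data plane is the Minesweeper idea.
LINKS = {("guest", "core"), ("core", "internet"),
         ("secure", "core2"), ("core2", "dmz")}

def reachable(src, dst, links):
    """Flood outward from src over the given (bidirectional) links."""
    seen, frontier = {src}, {src}
    while frontier:
        nxt = set()
        for a, b in links:
            for u, v in ((a, b), (b, a)):
                if u in frontier and v not in seen:
                    seen.add(v)
                    nxt.add(v)
        frontier = nxt
    return dst in seen

# Environments: the intact network plus each possible single-link failure.
scenarios = [LINKS] + [LINKS - {link} for link in LINKS]
violations = [s for s in scenarios if reachable("guest", "secure", s)]
assert violations == []  # the isolation intent holds in every data plane
```

The real tools encode routing behavior and ACLs symbolically rather than enumerating graphs, but the question they answer is this one: does the intent hold in every data plane the control plane could produce?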
Control plane analysis tools must still depend on data plane analysis as part of their verification processes, which means data plane telemetry must also evolve to support intent-based networking. Classic solutions, such as BGP route servers, only have limited information, such as the routes advertised. There’s no indication of why they were ultimately chosen. This level of data has typically been obtained through regular polling using SNMP or CLI scraping, which is hugely detrimental to a router's performance. You will bring the network to its knees and still not get even close to the level of real-time, granular BGP information required to support these verification engines.
Defined within RFC 7854, the BGP monitoring protocol (BMP) overcomes these limitations by dynamically streaming every significant BGP event, allowing the validation engine to get a complete, real-time view (and maintain a historical record) of BGP state in every router without any performance degradation. While other protocols, namely NETCONF, can stream BGP RIB information, BMP is the preferred mechanism, which is why the BGP protocol stack you ultimately select must support it.
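The BMP common header itself is refreshingly simple: per RFC 7854 (section 4.1), it is a 1-byte version (currently 3), a 4-byte message length and a 1-byte message type. A minimal parser:

```python
# Parse the 6-byte BMP common header defined in RFC 7854, section 4.1:
# Version (1 byte, currently 3), Message Length (4 bytes, network byte
# order, covering headers and body), Message Type (1 byte).
import struct

BMP_MSG_TYPES = {
    0: "Route Monitoring",
    1: "Statistics Report",
    2: "Peer Down Notification",
    3: "Peer Up Notification",
    4: "Initiation",
    5: "Termination",
    6: "Route Mirroring",
}

def parse_bmp_common_header(data):
    """Return the length and type of the BMP message starting at data[0]."""
    version, length, msg_type = struct.unpack("!BIB", data[:6])
    if version != 3:
        raise ValueError(f"unsupported BMP version {version}")
    return {"length": length, "type": BMP_MSG_TYPES.get(msg_type, "Unknown")}

# Example: a header announcing a 32-byte Peer Up Notification message.
header = struct.pack("!BIB", 3, 32, 3)
parsed = parse_bmp_common_header(header)
assert parsed == {"length": 32, "type": "Peer Up Notification"}
```

A validation engine sits on the receiving end of a stream of such messages, using the length field to delimit each one before dissecting the per-peer headers and BGP PDUs that follow.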
Thanks to composable networking, you have that choice: You can pick the most robust BGP stack for your application and within the most suitable operating system on the best hardware platform, which is one reason why CN is also intrinsically tied to IBN rather than exclusively the other way around. With composable networking, we can parlay the hype around intent-based networking, providing the network infrastructure required to support this new provisioning and operational model, and with intent-based networking, we can help realize the goals of composable networking. So, I’m flying the IBCN flag, but please don’t rush to help. Marketing’s got this.