Does 5G mean we can finally update the mobile data plane protocol?
For reasons I can’t even begin to explain, I’ve never particularly liked the General Packet Radio Service Tunnelling protocol (GTP). It’s certainly not because I don’t like nested acronyms because nothing could be further from the truth. I love them! Maybe it’s because I subconsciously believe data plane protocols should be the domain of the IETF, rather than the 3GPP. Childish, perhaps, but people who know me will recognize childishness as a distinguishing quality of mine. Fortunately, recent interactions between the two aforementioned factions would suggest I’m not alone.
The advancement of mobile network technologies and infrastructure towards their fifth generation comes replete with oft-repeated industry rhetoric around high bandwidths and low latency. The higher bandwidths can be implemented, in a limited manner, purely with the introduction of 5G New Radio (5GNR) in the Radio Frequency (RF) domain while employing what are essentially the same Distributed RAN (D-RAN) architectures used today. The ease with which individuals can run speed tests within pockets of New Radio, occupying existing bands of spectrum within tightly controlled areas and environments, has the effect of misleading the general public (and people who should frankly know better) into believing that a broad rollout of 5G is something that can be achieved with relative ease.
We can ignore the fact that these early pioneers are not witnessing any low-latency qualities, not simply because the commensurate infrastructure is not in place but because this attribute was never intended to be a consumer-level feature. Even still, when the few 5G forerunners are joined by millions of individuals and billions of IoT endpoints, the complete evolution of the global mobile architecture must be implemented to support them. While there will be a migration period from 4G to 5G, given the insatiable demand for mobile bandwidth, it will be swift and transitory. Plus, in the eyes of some industry pundits and vendors, the shift has already happened, and we are running headlong toward 6G. Meanwhile, the smart money remains laser-focused on the 5G end goal, while we remain cognizant that work by the relevant standards bodies in the area of user plane protocols and services in the 5G Core (5GC) -- an area of significant importance -- has, actually, barely begun.
Dissecting the Dataplane
The evolution of the 5G user plane can be divided into two distinct areas: the Radio Access Network (RAN) and the multi-layered access, aggregation, edge and core. The need to support reliable high-bandwidth mobility and low latencies is resulting in an architectural revolution in the RF domain. Millimeter wavelengths from highly distributed, dense antennas featuring massive multiple-input, multiple-output (MIMO) and beamforming/tracking necessitate a re-architecture of the backhaul RAN infrastructure. The D-RAN will be replaced by a disaggregated Cloud RAN (aka a Centralized RAN/C-RAN), which separates the higher layers of the radio functions from the lower layers.
This allows previously co-located gNodeB (gNB) functions to be divided and deployed at different points in the RAN, itself split into fronthaul, midhaul and then backhaul realms, depending on the granularity of the RAN’s decoupling. In its most disaggregated configuration, the antennas can be deployed on lampposts with the Radio Unit (RU) close by, reducing RF signal loss. The digital baseband function’s Distributed Unit (DU) and Centralized Unit (CU) can then be deployed either together in the fronthaul or split across the fronthaul and midhaul before the traffic is ultimately backhauled to the mobile gateways. In 5G, these are the User Plane Function (UPF) for data and the Access and Mobility Management Function (AMF) for control plane information, which reside at the network’s Multi-Access Edge Compute (MEC) layer and in the 5G Core (5GC), respectively.
A transport network architecture for Independent CU and DU deployment (Source: ITU-T GSTR-TN5G)
While there are eight defined options for functionally splitting the C-RAN, the most likely option -- a Low-Level Split (LLS) somewhere between Option 7 and Option 8 (aka 5G(c)) -- is a trade-off between throughput, latency and cost-effectiveness. This LLS requires PHY-layer and MAC-layer encapsulation and transport of RAN protocols -- namely Common Public Radio Interface (CPRI) -- in the fronthaul and midhaul, which would be supported by DWDM and Ethernet. As one might expect, industry consortiums are being formed around this matter (specifically xHaul), but with the GPRS Tunnelling Protocol (GTP), as currently defined, originating at the MEC-located CU, the focus is purely optical.
Optional split points in the 5G signal processing chain (Source: ITU-T GSTR-TN5G)
The mobile gateways first pick up the user plane protocol (GTP) over the 5G N3 reference interface. Beyond the interface designation, the major difference from earlier generations is that the User Plane Function (UPF) of the separated control and user plane architecture would likely be co-located with the CU in the Multi-Access Edge Compute (MEC) layer. This then marks the demarcation between the switched optical and packet switching domains, requiring the cloud native UPF to implement the broad range of packet handling functionality defined within 3GPP’s Technical Specification (TS) 23.501.
With the revolution occurring in the RAN, it’s only natural that the relevant industry standard bodies have taken a step back to examine the current user plane protocol and have begun reflecting on whether it continues to be the ideal solution for meeting the objectives of 5G. If every other element of how mobile networks are architected is being rethought, why not GTP?
The default decision to adopt this IP overlay for 5G stems from its deeply embedded deployment base and previous success in the scaling of isolated mobile session states. 3GPP’s Technical Report (TR) on Core Network and Terminals (TR 29.891) captured the user plane requirements for 5G, concluding that GTP user plane (GTP-U) version 1 was the protocol to employ, going on to suggest how to transport 5G-specific Quality of Service (QoS) information between the UPF and the network infrastructure. Far from being a detailed analysis, the reasoning and its conclusion occupy fewer than 250 (somewhat repeated) words in Sections 5.2 and 11.2 of that document.
Follow the Standards
Focused on the complexities of mobile communications, the 3GPP would likely be the first to admit that it rarely concerns itself with the underlying network transport and control plane protocols. The Internet Engineering Task Force’s (IETF’s) remit, however, is a complete contrast. Touting a charter that includes extending the IPv6 protocol to support mobility anchoring in flattened network architectures, it’s no surprise that the Distributed Mobility Management (DMM) working group began investigating IPv6 as an alternative to GTP as a user plane protocol.
The 3GPP was made aware of this during a coordination meeting at IETF#100 in November 2017. Subsequently, 3GPP working group CT4 (Core and Terminal) initiated a study item on user plane protocols in 5GC for Release-16 (5G phase 2) and the relationship was formalized with a liaison request from the 3GPP in January 2018. In their initial justification, working group CT4 noted that the “…growth of IPv6 adoption as a user packet data protocol has been observed” and that “it is worth investigating the potential limits of the existing user plane solution and potential benefits of alternative user plane solutions.”
The Study on User Plane Protocol in 5GC commenced in July 2018 as 3GPP TR 29.892. At the same time, the IETF DMM Working Group started documenting its investigation into the topic within the IETF (Informational) draft-hmm-dmm-5g-uplane-analysis: User Plane Protocol and Architectural Analysis on 3GPP 5G System. Both working groups are focused on two reference interfaces: N3, which runs from the RAN to either the Protocol Data Unit (PDU) Session Anchor UPF (PSA-UPF) or an Intermediate UPF (I-UPF), and N9, which runs from the I-UPF to the PSA-UPF. N9 is the primary area of interest.
User Plane Protocol Requirements
The architectural requirements for any user plane protocol employed in mobile network infrastructures can be derived from two 3GPP technical specifications: TS 23.501 and TS 22.261. Essentially, these documents describe the UP protocol as a stateful tunnel representing a single protocol data unit (PDU) stream. In today’s implementations, IP packets from the user (identified by way of a Tunnel Endpoint ID/TEID) are encapsulated into GTP frames, which are then carried over UDP within IP packets. The term PDU in 3GPP parlance includes the IP header, rather than simply being the payload, as the IETF defines it.
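To make the encapsulation concrete, here is a minimal sketch of the mandatory 8-byte GTP-U header wrapped around a user packet. The function names are mine, not from any specification, and the sketch assumes a plain G-PDU with no optional fields or extension headers (a 5G deployment would add an extension header to carry QoS information). The outer UDP/IP layers are left to the socket that sends the frame.

```python
import struct
from typing import Tuple

GTPU_PORT = 2152       # registered UDP port for GTP-U
GTPU_FLAGS = 0x30      # version 1, protocol type GTP, no optional fields
GTPU_MSG_GPDU = 0xFF   # G-PDU: the message carries a user packet


def gtpu_encapsulate(teid: int, user_packet: bytes) -> bytes:
    """Prepend the mandatory GTP-U header (hypothetical helper name)."""
    # Fields: flags (1 byte), message type (1), payload length (2), TEID (4).
    header = struct.pack("!BBHI", GTPU_FLAGS, GTPU_MSG_GPDU,
                         len(user_packet), teid)
    return header + user_packet


def gtpu_decapsulate(frame: bytes) -> Tuple[int, bytes]:
    """Return (teid, user_packet) from a plain G-PDU."""
    flags, msg_type, length, teid = struct.unpack("!BBHI", frame[:8])
    if flags != GTPU_FLAGS or msg_type != GTPU_MSG_GPDU:
        raise ValueError("not a plain GTP-U G-PDU")
    return teid, frame[8:8 + length]
```

The TEID is the only thing tying the frame to a session, which is why both tunnel endpoints have to hold matching per-session state.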
In reviewing these 3GPP specifications, the IETF DMM WG identified eight architectural requirements for any user plane protocol that is employed within mobile infrastructures -- up to and including 5G. It must support:
- IPv4, IPv6, Ethernet and Unstructured PDUs.
- IP connectivity over the N3, N6 and N9 reference interfaces.
- The deployment of multiple UPFs as anchors for a single PDU session.
- Flexible UPF selection for each PDU session.
- No limitation for the number of UPFs in a data path.
- The aggregation of multiple QoS flows (identified by the QFI) into one PDU session.
- Network slicing.
- End Marker support, indicating the end of a payload stream on any given tunnel.
These architectural requirements form the basis of the IETF’s research into 5G Release 16 candidate user plane protocols. Employing the IETF’s IPv6 specification (RFC 8200 / STD 86), the 3GPP’s TR 29.892 v1.1.0 (April 2019) dives into an analysis of IPv6’s ability to support the requirements of GTP in the 5GC. It then considers an alternative to GTP: Segment Routing with IPv6 (SRv6). With the main body of the analysis largely complete, 3GPP members are currently debating what conclusion to reach from this body of work.
Following the 3GPP’s CT4 meeting (#91) in May 2019, it was clear that there are two distinct camps with opposing opinions on this matter. Large incumbent vendors with a significant installed base (Ericsson, Nokia, Huawei) joined with their large operator associates to protect their current infrastructure and recommend that, with some updates, GTP remain the user plane protocol employed in 3GPP Rel-16. As a relative upstart in this arena, Cisco partnered with more agile operators to propose that TR 29.892 conclude that SRv6 should eventually replace GTP in the user plane.
With over 70% market share of the equipment which will underpin 5G infrastructure, Cisco is making a play to extend emerging network technologies into the mobile user plane, eliminating the need for disparate overlays and onerous overheads. Employing an IETF hammer, they are using the DMM working group to help defend their position with a standards-track draft “Segment Routing IPv6 for Mobile User Plane.” This is sponsored by a similar group to the one making the case for SRv6 in the 3GPP but with the addition of Huawei, in the form of its Futurewei alter ego, whose unofficial remit is to participate in as many standards activities as possible -- regardless of their conflicting positions.
Not surprisingly, SRv6 is itself the result of activities within the IETF. After several birds of a feather (BoF) gatherings, the Source Packet Routing in Networking (SPRING) working group formed in 2013 with a remit of defining “procedures that allow a node to steer a packet through an SR Policy instantiated as an ordered list of instructions called segments.” This would be “without the need for per-path state information to be held at transit nodes” and while avoiding “modification to existing data planes that would make them incompatible with existing deployments.” The initial output of the SPRING WG was an approach to segment routing employing Multiprotocol Label Switching (MPLS). Both SR-MPLS and SRv6 deployment scenarios were ratified in 2018 as RFC 8402 “Segment Routing Architecture.”
MPLS segment routing differs from a classic MPLS implementation in that there is no Label Distribution Protocol (LDP) or Resource Reservation Protocol (RSVP) employed and therefore the network does not need large label databases or countless traffic engineering LSPs. Segments are defined and pre-established in the network, providing paths between adjacent nodes (Adjacency Segments) and non-adjacent nodes (Node Segments). With predefined and well-advertised paths to the network service functions, an SDN controller -- complete with a Path Computation Element (PCE) and using the PCE Protocol (PCEP) -- calculates the most appropriate route and tells the classifier, a label edge router (LER), to apply a stack of MPLS labels to the traffic flow. At the end of the chain, the last label is popped, and the traffic goes on its way using whatever transport technique the network operator desires.
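The label-stack idea can be sketched in a few lines. The example below encodes a list of segment labels as standard 32-bit MPLS label stack entries, as a label edge router might after receiving a path from the controller, and pops them one at a time. The helper names and the label values in the usage note are illustrative assumptions, and the sketch glosses over details such as a node segment label being carried unchanged across several hops before it is popped.

```python
import struct
from typing import List, Tuple


def encode_label_stack(labels: List[int], ttl: int = 64) -> bytes:
    """Encode labels (outermost first) as 32-bit MPLS label stack entries."""
    out = b""
    for i, label in enumerate(labels):
        if not 0 <= label < 2 ** 20:
            raise ValueError("MPLS labels are 20 bits wide")
        bottom = 1 if i == len(labels) - 1 else 0
        # Layout: label (20 bits) | traffic class (3) | bottom-of-stack (1) | TTL (8)
        entry = (label << 12) | (bottom << 8) | ttl
        out += struct.pack("!I", entry)
    return out


def pop_label(stack: bytes) -> Tuple[int, bool, bytes]:
    """Pop the outermost entry; return (label, is_bottom, remaining stack)."""
    (entry,) = struct.unpack("!I", stack[:4])
    return entry >> 12, bool((entry >> 8) & 1), stack[4:]
```

Calling `encode_label_stack([16005, 24001, 16009])` yields twelve bytes, one entry per segment. The whole path lives in the packet itself, so transit nodes hold no per-flow state.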
Around the same time as MPLS segment routing was on the IETF draft circuit, a multiprotocol Network Service Header (NSH) was defined in order to deliver on the Service Function Chaining (SFC) component being defined within ETSI’s Network Functions Virtualization (NFV) initiative. In 2017, SPRING made a course correction and moved to circle the wagons around an implementation of segment routing built exclusively on IPv6, which was originally conceived of within the Network Working Group around the same time SPRING was being formed. A primary reason for this was the fact that SRv6 could deliver on SFC without the need for any additional overlays, such as the NSH. With a common IPv6 transport layer and sole SRv6 overlay, potentially complex virtual private networks (VPNs) and service function chains could be dramatically simplified and made far more scalable.
SRv6 Service Function Chains (SFCs) within Network Slices (Source: IETF99 WG presentation)
Defined within draft-ietf-6man-segment-routing-header, segment routing is applied to an IPv6 data plane by encoding IPv6 segments into a new routing extension header, the Segment Routing Header (SRH). The concept of IPv6 extension headers is itself outlined within RFC 8200, with the SRH being appointed a new routing type by the Internet Assigned Numbers Authority (IANA). The SRH is variable in length, comprising eight bytes of standardized fields followed by a list of 128-bit segment IDs (SIDs), which are broken into three fields: Locator, Function and Arguments. These represent the forwarding information (Locator) and any actions to be performed at that destination (Function), plus any information required by the individual SID (Arguments). The Arguments field of the SID could carry the QoS Flow Identifier (QFI), for example. Each SID is formatted as an IPv6 address, and its most significant bits (the Locator) are routable in an IPv6 network. Consequently, the active SID, selected by the Segments Left field (which is decremented at every SRv6 hop), is also copied to the IPv6 Destination Address field, allowing standard routing practices to be applied to SRv6 packets when traversing non-SRv6 capable network elements.
SRv6 SIDs making up a segment list inside a segment routing header within an IPv6 payload
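For readers who think better in bytes, the following sketch builds an SRH without TLVs and performs the step an SRv6 endpoint takes to move to the next segment. The 64/16/48-bit Locator/Function/Arguments split is only an assumption for illustration (the split is configurable in practice), as are the helper names and the idea of placing a bare QFI in the Arguments.

```python
import ipaddress
import struct
from typing import List, Tuple

SRH_ROUTING_TYPE = 4  # Routing Type assigned by IANA to the SRH


def make_sid(locator: int, function: int, args: int) -> ipaddress.IPv6Address:
    """Compose a SID from an assumed 64/16/48-bit field split."""
    return ipaddress.IPv6Address((locator << 64) | (function << 48) | args)


def build_srh(path: List[ipaddress.IPv6Address], next_header: int) -> bytes:
    """Encode an SRH (no TLVs) for a path given in travel order."""
    n = len(path)
    fixed = struct.pack(
        "!BBBBBBH",
        next_header,        # type of the header following the SRH
        2 * n,              # Hdr Ext Len: 8-byte units after the first 8 bytes
        SRH_ROUTING_TYPE,
        n - 1,              # Segments Left: index of the active SID
        n - 1,              # Last Entry: index of the last list element
        0,                  # Flags
        0,                  # Tag
    )
    # The list is stored in reverse order: entry 0 is the final segment.
    return fixed + b"".join(sid.packed for sid in reversed(path))


def advance(srh: bytes) -> Tuple[bytes, ipaddress.IPv6Address]:
    """Decrement Segments Left; return (updated SRH, new destination address)."""
    segments_left = srh[3]
    if segments_left == 0:
        raise ValueError("already at the final segment")
    segments_left -= 1
    offset = 8 + 16 * segments_left
    new_destination = ipaddress.IPv6Address(srh[offset:offset + 16])
    return srh[:3] + bytes([segments_left]) + srh[4:], new_destination
```

A two-segment path could be built with `make_sid(0x20010DB800000001, 0x0001, 0)` followed by `make_sid(0x20010DB800000002, 0x0002, 9)`, where the trailing 9 stands in for a QFI, and passed to `build_srh(path, next_header=41)` to encapsulate an inner IPv6 packet. Nodes that do not understand the SRH simply forward on the destination address.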
Finally, the type-length-value (TLV) header, prior to the IP payload, enables the SRH to carry metadata pertaining to the user, security, or a service function chain. With SPRING running full steam toward an end goal of segment routing using IPv6, the working group also made a play in defining “SRv6 Mobility Use Cases,” this time backed by larger players like AT&T.
SRv6 vs. GTP
GTP’s legacy is its primary advantage, which is why its continued use has the backing of some of the larger network operators and most of the incumbent equipment vendors. Employing GTP as a user plane protocol will serve to lock in existing infrastructure and could effectively stifle innovation. Coupling the mobile infrastructure too closely to the underlying transport is a universal threat to the status quo, potentially turning the tides toward a single-vendor full-stack mobile network implementation.
In reality, GTP comes with the type of baggage you might expect from a legacy overlay. Stateful tunnel set-up and teardown requires complex additional control plane signaling and there are excessive packet overheads that affect packet processing throughput. This is amplified as the underlying transport infrastructure is almost inevitably comprised of tunnels itself. Adopting SRv6 in the mobile infrastructure has the added advantage of employing the very same Layer 3 VPN techniques that will ultimately be used for managed enterprise services, particularly valuable in 5G fixed mobile access (FMA) and fixed mobile convergence (FMC).
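A back-of-the-envelope calculation shows how the stacking adds up. The sketch below uses the fixed header sizes (IPv6 40 bytes, UDP 8, the mandatory GTP-U header 8, the SRH 8 plus 16 per SID); the segment counts in the usage note are illustrative assumptions of mine, not figures from any deployment, and GTP-U extension headers would add to the GTP side.

```python
IPV6_HEADER = 40  # fixed IPv6 header, in bytes
UDP_HEADER = 8
GTPU_HEADER = 8   # mandatory GTP-U header only; extension headers add more
SRH_FIXED = 8     # fixed part of the segment routing header
SID_SIZE = 16     # one 128-bit SID


def gtpu_overhead() -> int:
    """Bytes added by GTP-U over UDP over IPv6."""
    return IPV6_HEADER + UDP_HEADER + GTPU_HEADER


def srv6_overhead(sids_in_srh: int) -> int:
    """Bytes added by an SRv6 encapsulation with the given SRH length."""
    if sids_in_srh == 0:
        return IPV6_HEADER  # single segment carried in the destination address
    return IPV6_HEADER + SRH_FIXED + SID_SIZE * sids_in_srh
```

Under these assumptions, GTP-U carried across an SRv6 transport tunnel with two SIDs costs `gtpu_overhead() + srv6_overhead(2)`, or 136 bytes per packet, while a single SRv6 encapsulation with three SIDs costs `srv6_overhead(3)`, or 96 bytes. Long segment lists erode that advantage, so the numbers depend heavily on how many SIDs a path really needs.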
While GTP does indeed meet all the architectural requirements of a mobile user plane, employing SRv6 will serve to eradicate an unnecessary overlay, reducing protocol overheads while eliminating the need for the User Plane Function (UPF) to maintain state. This is particularly important in cloud native implementations of network functions, such as the UPF, that demand stateless operation to effectively scale. Simplifying and flattening the network exposes mobile user plane flows to the transport network, enabling efficient routing and service differentiation when operating a network slicing environment, or not. Although the advantages of an exclusive implementation are obvious, GTP can also run over SRv6, allowing a smooth migration path. Ideally, the encapsulation would occur at the N9 reference interface between the I-UPF and the PSA-UPF, eliminating the need to upgrade the gNB N3 interface. This is less critical as N3 likely operates over a non-switch infrastructure, anyway.
Because a packet processing engine must continually dig deep into an IPv6 header, SRv6 will also prove problematic for legacy suppliers to adopt. Indeed, ignoring the reality that existing application specific integrated circuits will be unable to support it, SRv6 will likely demand a programmable software data plane in order to be effectively implemented. This could still be hardware-centric switch silicon, within the underlying transport network, using vector packet processing (VPP) and P4, a language for Programming Protocol-independent Packet Processors. The UPF, however, must be a fully virtualized network function, running on standard x86 compute server hardware within a containerized environment. Only this allows operators to truly achieve the 5GC objectives. In this challenging environment, highly specialized data plane acceleration techniques will be required to meet the deep packet inspection demands of SRv6.
SRv6 requires the UPF to participate in route discovery which, at a minimum, would mean running an Interior Gateway Protocol (IGP). With the advent of IPv6, many network architects are moving to IS-IS because it runs directly over Layer 2 and doesn’t use IP to transport routing information. With Version 2 only supporting IPv4, the OSPF protocol was rewritten in v3 to support IPv6, but IS-IS remains neutral regarding the type of network addresses it can route and therefore natively supports IPv4 or IPv6. The exterior Border Gateway Protocol (BGP) would be required for more advanced interworking requirements, such as peering over the N6 reference interface to the data network. It would also be necessary to support Layer 3 VPNs, as outlined within the IETF draft defining SRv6 BGP based overlay services (dawra-bess-srv6-services). Increasingly critical in this content-driven world, SRv6 also allows efficient SLA-enabled multicast content injection using the standard unicast core through a method called “Spray”. This technique has been outlined within informational draft IETF documents.
A New User Plane for a Next Generation Network
The continued use of GTP in the mobile user plane serves to prolong the dominance of legacy equipment vendors, and their embedded base of network functions limits the ability of network operators to truly realize the benefits of a dynamic, highly automated and granularly orchestrated 5G core network. Aligning with the underlying infrastructure may require market positioning that involves not only the mobile architects but also the transport network engineers. However, I believe they will favor a 5G core infrastructure which is better integrated, over one that continues to employ an archaic overlay. And if you don’t agree, I’m taking my ball and going home.