Full disclosure: I work for a company that sells software that enables network operators to deliver services based on Rich Communication Suite (RCS) specifications. If that’s not completely obvious from even the shortest breeze through (the rest of) our website, let me know and I’ll inform our marcom team. Quite frankly, I’ve heard countless people say they are utterly useless and it’s about time someone told them to their faces. I’m also going to make another admission here in the interest of openness[1] and out of respect to long-time Metaswitch followers who might think that our newfound love of RCS is a tad hypocritical: Back in the early part of the decade, we bought an RCS application server company and then sold it a few years later.
Unlike in those unfathomable real estate reality shows, this was not a deliberate fix and flip to make a quick buck, nor was it because we lost faith in RCS -- quite the opposite, in fact. The problem was one of having a proverbial tiger by the tail. We are a solutions company, but RCS was dependent on an IMS core to operate effectively and there were simply no cost-effective options to provide this critical infrastructure. The market was monopolized by big vendors that managed to lock in this open infrastructure and were charging a pretty penny for each provisioned subscriber. Moreover, in order to justify this cost, they were holding on to a hardware-centric deployment model, for a solution that should have been 100 percent software-based. Yes -- we tried fronting the RCS application server with a software shim to “fool” it into thinking it was connected through an IMS core, but it simply would not scale to the degree that such a communications offering should. With that, it was hard to integrate the product into our portfolio, so we let it go to a specialist company, where it thrives to this day.
Some might say that this was our catalyst for building Clearwater -- a revolutionary cloud-native IMS core built from scratch using web design methodologies and released to the open source community, where it turned those existing IMS pricing and deployment models on their heads. That’s what I would say, at least, but that might be just me and my penchant for the dramatic.
Now, you could accuse me of having a bit of a love affair with IMS and RCS -- and you are probably right -- but in all fairness, it’s more eHarmony than Tinder. This relationship has been a long-term one, going all the way back to 2001. That’s when the Internet Engineering Task Force’s SIMPLE Working Group Charter was approved. In classic IETF acronym form, SIMPLE defined a suite of standards (RFCs) for easily employing the Session Initiation Protocol (SIP) for Instant Messaging and Presence Leveraging Extensions. Formed in 2002, in an attempt to consolidate the efforts of fragmented and overlapping standards and specifications bodies (NFV ecosystems, take note), the Open Mobile Alliance (OMA) picked up the baton in March 2005.
In a series of early candidate (1.0) specifications, IETF SIMPLE was, itself, extended with other standards, such as XML configuration and document management (XCAP/XDM), for communicating and storing presence policies and buddy lists. In 2007, those specifications were combined to establish the OMA’s Converged IP Messaging (CPM) initiative, which started to take into account existing services like SMS and MMS. The GSMA took CPM as its foundation and added elements that it excelled in, namely video calling. This was introduced to the world in December 2008 as RCS 1.0.
The GSMA’s RCS Timeline
Like all network operator standardization initiatives, it’s been a long road -- though not as long as IMS itself, which traces its roots back to before 1999.[2] Design by committee; endless corner cases; backwards compatibility -- this is simply the nature of the beast. That doesn’t mean the resulting solution, a decade later, is dated or otherwise undeployable. There are many examples, in my posts, where I’m diving into some “new technology” and can trace its roots back many decades. In fact, with time, a solution often becomes more deployable. SIP, itself, is a good example of this. Those lauding the binary, lightweight nature of H.323, circa 2000, will remember bashing SIP for its high overhead and packet parsing demands. Fast-forward a few years and the processing capacity of even the smallest consumer communications devices,[3] combined with the abundance of mobile bandwidth, made it a moot point.
The introduction of joyn was an admission by the GSMA that RCS was bloating and that non-interoperable islands were forming, due to differing interpretations of the comprehensive RCS standards or differing regional requirements. With the goal of broad interoperability,[4] joyn defined a deployable product based on a subset of RCS features with clear implementation specifications. However, as is always the way in a nascent, highly fragmented global market, still not everyone wanted the feature sets defined in joyn. Some wanted more. Some wanted less. Enter joyn Blackbird and later joyn Crane, which was developed for just five (admittedly large) network operators.[5] Right about now, the naysayers are probably collectively throwing their hands up in the air in exultation, while proclaiming “Exactly! That’s my point!” I get that. However, these product definition documents did provide flexibility in the early days, while still serving to turn those islands of interoperability into continents. The introduction of service definitions attempted to simplify the underlying infrastructure in the same way. It is through a combination of these product definitions and service descriptions (and because of the experience gained through them) that we now have the Universal Profile Service Definition, which is, according to the GSMA, “an industry-agreed common set of features and technical enablers for Advanced Communications.”[6] And the GSMA has never lied to me yet. Well, if I ignore absolutely everything to do with Mobile World Congress, that is, because that’s just what you get when a technical body hires sales and marketing people.
Inside the RCS Universal Profile
The Universal Profile’s service definition document breaks down each feature related to the standardization of RCS deployments into two areas: User Experience (user stories and feature requirements) and Technical Information. The user stories make my stomach turn a little, if I’m being honest (and I usually am), but it’s probably just the peppy nature of the statements that is both grating and brings out my rebellious side. “As a user, I want to enjoy the benefits of enriched calling whenever I make / receive the call over active RCS SIM.” See what I mean? Even if I did, which I do, I’m now going to argue that I don’t.
RCS Universal Profile high-level feature descriptions
Most of the features are bread-and-butter stuff, which is a good thing in my opinion: an appropriate alternative and upgrade to existing offerings, employing the updated technologies and techniques we have to hand, rather than an attempt to introduce weird and not-so-wonderful communications concepts to replace basic call and messaging services. There are a few notable exceptions, however, the most prominent being (you guessed it) “Enriched Voice Calling,” which also holds the honor of being the longest section, so it’s probably not just me who thinks it is important. The “stories” here tell tales of people employing pre-call, in-call and post-call services like file sharing, location pushes and live sketching over maps and images. The RCS UP specification also keeps the RCS-e live video sharing option for cases where an end-to-end IR.94/IR.51 video call is not available because the calling or called handset (or both) is in a circuit-switched (CS) domain.
These features are, of course, widely available today but typically from disparate applications or confined to device/OS/service provider-specific stovepipes. Plus, the experience is not integrated into a generic call, in that you have to jump from application to application, and you can’t interact with every other (mobile) caller in the same way.
Now, once again, Metaswitch is no stranger to such call enrichment. Some might say we pioneered delivering such services, way back in 2011, with our cloud-native, EC2-hosted Thrutu service/app, which delivered a similar experience and features to those described above. To prove it, here’s me in a pink shirt, which -- trust me -- was fashionable back then, demonstrating Thrutu at CommunicAsia 2011 in Singapore for China Central Television (CCTV) News.[7] [0:46-1:22]
I demonstrate the Thrutu in-call enrichment app in Singapore (2011) while wearing a pink shirt.
Back to the Thruture
So, six years on, why are we back here again? It’s certainly not only so I could use that brilliant subtitle. Thrutu was great and helped us understand what it takes to build cloud-native communications applications -- something other vendors still only do in their advertising copy[8] -- but there were commercial issues that proved impossible to overcome. To put it in GSMA terms, Thrutu could not deliver on the “Green Button promise.” Even though the app had an awesome in-call dialer overlay,[9] it only really worked well on Android-based handsets because Apple, quite frankly, didn’t really want us doing that sort of thing at the time. iOS support improved a little over time, and the interface got better, but now Apple is really opening up with the likes of CallKit[10] -- a sign, perhaps, that the company is finally confident enough in its position as the premier platform provider to let go of the services reins a little.
But the UX disparity between devices was only one of the problems. While it came complete with open development APIs, Thrutu was still proprietary in nature. Even if a major mobile operator chose to pre-load it, it would still require a user to enable the application and would only work if they were calling another subscriber who had gone through the same process. It was the epitome of Metcalfe's (n-squared) law. Our early experience with Thrutu, back in 2011, also demonstrated that mobile endpoints still lacked the oomph and users lacked ubiquitous, cost-effective bandwidth.
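To put rough numbers on that network effect, here is a minimal sketch in Python. It assumes, purely for illustration, that subscribers enable the app independently of one another and that calls are placed between random pairs; the function names and the sample adoption figures are hypothetical, not Thrutu data.

```python
def potential_connections(n: int) -> int:
    """Distinct subscriber pairs among n users: n(n-1)/2, which grows roughly as n squared."""
    return n * (n - 1) // 2


def enrichable_call_fraction(adoption: float) -> float:
    """Fraction of calls in which BOTH ends have enabled the app.

    Illustrative assumption: each party has enabled it independently
    with probability `adoption`.
    """
    if not 0.0 <= adoption <= 1.0:
        raise ValueError("adoption must be between 0 and 1")
    return adoption * adoption


if __name__ == "__main__":
    # Hypothetical adoption levels, chosen only to show the shape of the curve.
    for adoption in (0.05, 0.25, 0.50, 1.00):
        share = enrichable_call_fraction(adoption)
        print(f"{adoption:.0%} adoption -> {share:.2%} of calls enrichable")
```

Under those assumptions, even 25 percent adoption leaves only about 6 percent of calls enrichable at both ends, which is why an opt-in, proprietary app struggles where a universally enabled client would not.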
The last issue we struggled with was not technology or infrastructure but user behavior. Our (brilliant) marketing presentations highlighted an opinion that the phone call was moving from a heads-up to a heads-down experience, like all our other mobile phone activities, with a permanently inserted earbud or two. While I would never admit to my wife that I was wrong, I will concede to you, here, that not only was this view a little shortsighted in that it suggested that all phone calls would be enriched in some way (even if just by video), it also implied that we were incapable of saying “hold on -- I’m looking” and removing the device from our ear for a short time or temporarily turning on speakerphone. That’s what I frequently do today and it’s an approach that works and scales just fine. Duh.
The case for the RCS UP
I’m not an emotional person. OK -- that’s not, strictly speaking, true. I’ve been known to get a little angry, at times, but never over having to text from one application or persona, video chat from another and call from a third. In the same vein, I don’t go kissing the ring at 1 Infinite Loop either, but I do experience a pang of empathy when I see a green text bubble appear in my iPhone text window. Maybe that’s why I’m so motivated to level the playing field. One bubble for all.
What about WebRTC? Well, we love it, here at Metaswitch. I was actually demonstrating browser-based WebRTC calls across an EC2-hosted IMS core back in 2012, employing Project Clearwater and its integrated SIP-over-WebSockets gateway functionality, which is now a feature of our world-leading and all-around exceptional session border controller: Perimeta.[11] Fortunately, I don’t have a video of that, else I’d make you sit and watch that one as well. WebRTC and RCS are not mutually exclusive. One does not negate the other. Indeed, lacking a signaling mechanism of its own, one might argue that WebRTC needs SIP and, together with RCS, the two can live in relative harmony off a common IMS infrastructure, extending telephony and other enriched service offerings to practically any endpoint.
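As a rough illustration of the signaling gap that SIP fills for WebRTC, the Python sketch below builds the kind of SIP REGISTER request a browser client might send over a WebSocket connection, following the conventions of RFC 7118 (WS or WSS as the Via transport, a `.invalid` host because the browser does not know its own address). The helper name, user and domain are hypothetical placeholders; this is not Clearwater or Perimeta code.

```python
import uuid


def build_register(user: str, domain: str, secure: bool = True) -> str:
    """Return a minimal SIP REGISTER request formatted for WebSocket transport."""
    transport = "WSS" if secure else "WS"
    # Random placeholder host: a browser cannot know its own address (RFC 7118).
    host = uuid.uuid4().hex[:12] + ".invalid"
    aor = f"sip:{user}@{domain}"  # address of record being registered
    lines = [
        f"REGISTER sip:{domain} SIP/2.0",
        # The branch parameter must start with the magic cookie z9hG4bK (RFC 3261).
        f"Via: SIP/2.0/{transport} {host};branch=z9hG4bK{uuid.uuid4().hex[:10]}",
        "Max-Forwards: 70",
        f"From: <{aor}>;tag={uuid.uuid4().hex[:8]}",
        f"To: <{aor}>",
        f"Call-ID: {uuid.uuid4().hex}",
        "CSeq: 1 REGISTER",
        f"Contact: <sip:{user}@{host};transport=ws>",
        "Expires: 600",
        "Content-Length: 0",
    ]
    # SIP headers are CRLF-terminated, with a blank line ending the header block.
    return "\r\n".join(lines) + "\r\n\r\n"


if __name__ == "__main__":
    print(build_register("alice", "example.com"))
```

In a real client, this text would travel as a WebSocket message on a connection opened with the `sip` subprotocol, and a gateway at the network edge would relay it into the IMS core over conventional SIP transports.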
RCS and WebRTC Harmony
While RCS remains a contentious topic, most agree that the communications environment has changed significantly in the last five years. We no longer have an anti-competitive IMS environment dominated by large, incumbent vendors pushing tin in an attempt to shore up their legacy business. Instead, we have open, cloud-native virtualized network functions that can be dynamically instantiated and elastically scaled in highly orchestrated public and private compute infrastructures with granular subscription and usage-based pricing models. The dynamics of delivering the critical core IMS architecture to support RCS have fundamentally changed, as has the way we deploy the all-important application servers.[12] There is copious processing capacity in even the lowest-end handsets and there is an abundance of easily accessible bandwidth. There are also many more ways to communicate and that’s a good thing. We message, Skype, Snapchat, but sometimes we just want to talk and that should be a pleasant, enriching experience.
In Part II, I will delve into another option for enriching the Internet of Talk for the mass consumer market: mobile UC. In the meantime, please help me bolster the campaign figures while, of course, learning more about the IoTalk, by visiting www.metaswitch.com/IoTalk
1. And with a nod to the freedom afforded to me by my esteemed management line… or an acknowledgement that they generally ignore what I write. Whatever.
3. Note: The case for a binary CoSIP for constrained IoT endpoints per http://www.metaswitch.com/the-switch/iot-the-internet-of-turkey-bacon notwithstanding.
4. Thankfully, joyn was more than just a marketing initiative.
5. Deutsche Telekom, KPN, Orange, Telefonica and Vodafone
7. Yes - this is a thinly veiled attempt to pump up my views. Please help. 200. That’s got to be an achievable goal. Anything less is just embarrassing.
9. Technology we employ today in our Accession-based mobile UC propositions, which we will introduce in Part II.
12. Do I get any credit for not mentioning Google Jibe here? Meh. No, probably not. How about the fact that this article doesn’t make a single mention of MaaP? That has to be worth something, no?! Or does this footnote count as a mention? Damn!