Hello Planet,
This post is about how Ring ( http://www.ring.cx ) and its KDE client (named, umm, Ring-KDE, formerly known as SFLPhone-KDE) work. Ring is a communication platform built on open standards, aiming for maximum compatibility with existing software and hardware infrastructure while providing a secure and distributed architecture. First of all, a little background information and terminology:
* Centralized service: A service where all requests pass through a single entity. Such services include Skype, Facebook and Google Hangouts. While it is possible to implement client-to-client encryption with Diffie-Hellman on top of the service, the vast majority don’t, and instead use their terms of service to notify users about how they decide to use their personal information.
* Federated service: A service where different providers can set up their own nodes and communicate with each other. This includes DNS, HTTP, Jabber/XMPP, Diaspora and email. They still require some “fixed” components and infrastructure to locate and communicate with each other. The data is often seen unencrypted by all the nodes involved if no additional security layer is implemented on top of the base protocol. You have to trust all the nodes involved, something that is, in practice, impossible given the potentially unbounded number of nodes.
* Decentralized: Each client is part of a cloud of “equal” clients. No centralized node is required. Order can be created momentarily out of the relative chaos, and each client is responsible for being a “good citizen”. The most common “protocol” for this is called DHT (distributed hash table). The most well-known examples are recent BitTorrent clients, many scientific cluster software packages and some malware. Privacy and security are left in the hands of each client and handled locally. Correctly implemented, only basic network metadata is “leaked”. Even then, it is only available to your ISP.
Ring is able to work in all three of these modes. Its predecessor, SFLphone, could only work with a centralized SIP or IAX2 server, with several of them at once using built-in media mixing, or in direct IP-to-IP mode, which required open ports on each side. Ring adds a new type of account implementing secure SIP on top of a DHT cloud. Given the project’s early state, there is little information about how everything works together and how Ring is different from emerging alternatives such as Tox or Bittorrent-Blink.
SIP (Session Initiation Protocol, RFC3261) was created in the 90’s by the IETF, the same body responsible for most Internet standards. It shares a lot of similarities with HTTP and email. It is more or less the stateful equivalent of HTTP. SIP is a very large specification composed of a master RFC (request for comments, the name used by the IETF for its standards) and multiple sub-RFCs that address individual use cases, such as file transfer (RFC5547), text messaging (RFC3428) or security. It also integrates with other IETF specifications such as TLS (RFC5246), ICE (RFC5245), UPnP (RFC6970), MIME (RFC2045), STUN (RFC5389), TURN (RFC5766), RTP (RFC3550), SDES (RFC4568) and many more. These days, even though the trend of using gigantic and complex protocols for stateful data synchronization has faded (SOAP, UML, CORBA anyone?), SIP has survived. It is the industry standard and is used by most VoIP products shipped today. It is also used in niche industrial use cases to synchronize/negotiate streams and in some distributed networking products (like Ring-DHT!).
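To make the comparison with HTTP concrete, here is a minimal Python sketch that builds and parses a bare-bones SIP INVITE. The header set follows the mandatory fields of RFC3261, but the addresses, host name, tag and the function names `build_invite` and `parse_message` are made up for illustration. This is not how the Ring daemon assembles its messages.

```python
def build_invite(caller: str, callee: str, call_id: str, branch: str) -> str:
    """Build a minimal SIP INVITE: a request line followed by headers, like HTTP."""
    lines = [
        f"INVITE sip:{callee} SIP/2.0",                          # method, URI, version
        f"Via: SIP/2.0/UDP client.example.org;branch={branch}",  # hypothetical host
        "Max-Forwards: 70",
        f"To: <sip:{callee}>",
        f"From: <sip:{caller}>;tag=1928301774",                  # arbitrary example tag
        f"Call-ID: {call_id}",
        "CSeq: 1 INVITE",
        f"Contact: <sip:{caller}>",
        "Content-Length: 0",                                     # no SDP body in this sketch
    ]
    return "\r\n".join(lines) + "\r\n\r\n"


def parse_message(raw: str):
    """Split a SIP message into its start line and a dict of headers."""
    head, _, _body = raw.partition("\r\n\r\n")
    start_line, *header_lines = head.split("\r\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return start_line, headers


raw = build_invite("alice@example.org", "bob@example.org",
                   "a84b4c76e66710", "z9hG4bK776asdhds")
start_line, headers = parse_message(raw)
print(start_line)
print(headers["CSeq"])
```

The stateful part shows up in headers such as Call-ID, CSeq and the From tag, which tie each message to an ongoing dialog instead of treating every request as independent.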
What is also notable is the sheer amount of hardware that can interact with SIP. Any landline phone can be bridged to work with SIP using a cheap adapter or an online service. Most office hardware phones with an RJ45 port or Wi-Fi networking actually use SIP, so Ring can be a drop-in replacement for these devices while being integrated into your desktop and existing applications. For example, it can pause your music in Amarok when an incoming call arrives, or you can click on phone numbers in KAddressBook or Firefox to place a call. With a good headset with an integrated microphone, you don’t even have to move to handle phone calls. But there is more: remote communication is not all about phone calls anymore. Skype anyone? SFLphone was all about voice (and a little about video); Ring is all about communication, and the media are up to you!
But this introduces a new kind of issue: how do you connect all the users? Historically, SIP was mostly used in client/server scenarios. While the protocol is very generic and very vast (for better and for worse), this was by far the most common mechanism. For one thing, this is how the Internet/Web usually works, and the landline phone/telegraph network before it. Then, for most “large” use cases, such as corporate and business VoIP, it makes sense. The large and mighty IT departments of the pre-BYOD era also wanted a lot of control when it came to VoIP. While those cases are still supported, Ring wants something else: connecting people together.
While it can be done using a centralized server, like most competing products do, we also want it to be secure and distributed. This is why we are working on our own layer under SIP to find and connect people. For this, we use Distributed Hash Table (DHT) theory. You might not have heard about it, but you may already use it. This is what powers the BitTorrent magnet link technology and helps find peers. The “protocol” is quite simple. You first have to find a bootstrap node. This can come from a hardcoded list, a cache from previous sessions, a web server, or from your local network using mDNS or UPnP discovery. For now, Ring uses a cache and a hardcoded DNS entry. A UPnP discoverability feature is partially implemented. You can then get a value out of the table. For this, you query the bootstrap node, which queries more nodes, and so on until the value is found or a timeout is reached. You can also put a value, in which case it is sent to 8 nodes. Finally, you can listen for insertions on a specific key and be notified asynchronously. The implementation is still a work in progress, but it is already quite mature and has proven reliable in medium-sized deployments.
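The get, put and listen operations described above can be illustrated with a toy, in-memory Python sketch. It is only a model and makes several assumptions that do not come from Ring itself: node ids are SHA-1 hashes, “closeness” is the XOR distance used by Kademlia-style tables, and a query budget (`max_queries`) stands in for the timeout. The names `Node`, `walk`, `get`, `put` and `listen` are hypothetical.

```python
import hashlib

REPLICAS = 8  # a put is sent to 8 nodes, as described in the post


def key_id(key: str) -> int:
    """Hash a key (or node name) into one shared integer id space."""
    return int.from_bytes(hashlib.sha1(key.encode()).digest(), "big")


class Node:
    def __init__(self, name: str):
        self.id = key_id(name)
        self.peers = []      # other nodes this node knows about
        self.store = {}      # key id -> value
        self.listeners = {}  # key id -> list of callbacks


def walk(bootstrap: Node, target: int, max_queries: int = 64):
    """Query nodes starting at the bootstrap node, closest to target first."""
    seen, todo = set(), [bootstrap]
    while todo and len(seen) < max_queries:
        todo.sort(key=lambda n: n.id ^ target)
        node = todo.pop(0)
        if node.id in seen:
            continue
        seen.add(node.id)
        yield node
        todo.extend(node.peers)  # the queried node tells us about more nodes


def closest_nodes(bootstrap: Node, target: int):
    """The REPLICAS reachable nodes whose ids are nearest to target."""
    return sorted(walk(bootstrap, target), key=lambda n: n.id ^ target)[:REPLICAS]


def get(bootstrap: Node, key: str):
    """Return the value for key, or None if the query budget runs out."""
    target = key_id(key)
    for node in walk(bootstrap, target):
        if target in node.store:
            return node.store[target]
    return None


def listen(bootstrap: Node, key: str, callback) -> None:
    """Ask the nodes responsible for key to report future insertions."""
    target = key_id(key)
    for node in closest_nodes(bootstrap, target):
        node.listeners.setdefault(target, []).append(callback)


def put(bootstrap: Node, key: str, value) -> None:
    """Store value on the closest nodes and notify each listener once."""
    target = key_id(key)
    callbacks = []
    for node in closest_nodes(bootstrap, target):
        node.store[target] = value
        for cb in node.listeners.get(target, []):
            if cb not in callbacks:
                callbacks.append(cb)
    for cb in callbacks:
        cb(value)


# A small network of 20 nodes where each node knows the next three.
nodes = [Node(f"node-{i}") for i in range(20)]
for i, n in enumerate(nodes):
    n.peers = [nodes[(i + k) % 20] for k in (1, 2, 3)]

listen(nodes[5], "alice", lambda v: print("alice is now at", v))
put(nodes[0], "alice", "192.0.2.7:5060")
print(get(nodes[10], "alice"))
```

Note that the three calls start from three different nodes: any node can serve as the entry point, which is the point of not having a central server. A real implementation works over the network, bounds lookups by time, and has to cope with nodes joining, leaving and misbehaving.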
On top of the DHT layer, we add a security layer. This layer adds public key infrastructure signing using TLS (eventually, with TLS 1.3, this will provide forward secrecy too). Each “account” is in fact a pair of public/private keys created locally by the user, just like the keys you use to authenticate yourself when doing a git push. This is used to ensure that a client reads a value that really comes from the right source, not just from someone hijacking the DHT key. To know if the source is right, Ring uses a mix of public key infrastructure (signed certificates), certificate pinning (avoiding a man-in-the-middle attack in subsequent communications with a peer using a self-signed certificate) and other means of distribution, like inserting the certificate key in contact backends such as Akonadi[1], GMail, Evolution Data Server, a vCard folder, Apple Contacts or Windows Contacts.
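Certificate pinning is the easiest of these mechanisms to illustrate. The Python sketch below shows the general “trust on first use” idea: remember the fingerprint of the certificate a peer presented the first time, and flag any later change. The `PinStore` class, its method names and the use of a SHA-256 fingerprint are assumptions made for the example, not a description of Ring’s implementation.

```python
import hashlib


class PinStore:
    """Remember the first certificate seen for each peer and detect changes."""

    def __init__(self):
        self._pins = {}  # peer name -> hex fingerprint of the pinned certificate

    @staticmethod
    def fingerprint(cert_bytes: bytes) -> str:
        """SHA-256 digest of the raw certificate bytes (assumed to be DER)."""
        return hashlib.sha256(cert_bytes).hexdigest()

    def check(self, peer: str, cert_bytes: bytes) -> str:
        """Return 'pinned' on first contact, then 'ok' or 'mismatch'."""
        fp = self.fingerprint(cert_bytes)
        pinned = self._pins.get(peer)
        if pinned is None:
            self._pins[peer] = fp  # first contact: trust and remember
            return "pinned"
        return "ok" if pinned == fp else "mismatch"


pins = PinStore()
print(pins.check("bob", b"bob-self-signed-cert"))    # first contact
print(pins.check("bob", b"bob-self-signed-cert"))    # same certificate again
print(pins.check("bob", b"someone-else-in-between")) # certificate changed
```

Pinning alone cannot tell whether the very first certificate was genuine, which is why it is combined with signed certificates and with keys distributed through contact backends.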
When a user turns Ring on (with a DHT account enabled), it starts to listen for inserted keys that correspond to its certificate’s public key. When it receives such information, it matches it to a known (or new) certificate. Depending on the settings, it can tell the peer its IP address and start a peer-to-peer socket negotiation using ICE and UPnP. Obviously, this is like telling the world your IP address and opening ports: not the best idea. To solve the “potential” security and privacy issues, Ring can be used in private mode. In this mode, it will only automatically answer requests from known/allowed certificates. For new persons/certificates to be able to call you, they first have to perform a “trust request”, the same as in other products where someone has to ask first to be in your friend list. This trust request comes with a vCard containing the person’s profile. This can be used to insert this person into your contact backend, such as Akonadi. This vCard can also be signed by a certificate authority to validate it. Currently, while in alpha, Ring is not using the private mode by default, but it can be enabled. We first want to test the DHT calling ability before enabling the full security stack. There are also some missing elements in the pipeline, making it impractical for day-to-day usage for now.
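The private mode rules can be summed up in a few lines of Python. This is a sketch of the decision logic as described in the paragraph above, not Ring’s code: the `Account` class, its method names and the idea of identifying a certificate by a fingerprint string are all assumptions for the example.

```python
class Account:
    """Decide which incoming requests may be answered automatically."""

    def __init__(self, private_mode: bool = True):
        self.private_mode = private_mode
        self.allowed = set()  # fingerprints of known/allowed certificates
        self.pending = {}     # fingerprint -> vCard text awaiting a decision

    def on_call_request(self, fingerprint: str) -> bool:
        """True if the request can be answered without asking the user."""
        return (not self.private_mode) or fingerprint in self.allowed

    def on_trust_request(self, fingerprint: str, vcard: str) -> None:
        """An unknown peer asks to be allowed and sends its profile."""
        if fingerprint not in self.allowed:
            self.pending[fingerprint] = vcard

    def accept_trust_request(self, fingerprint: str) -> str:
        """Allow the peer and return its vCard for the contact backend."""
        vcard = self.pending.pop(fingerprint)
        self.allowed.add(fingerprint)
        return vcard


account = Account(private_mode=True)
print(account.on_call_request("ab12"))  # unknown certificate, not answered
account.on_trust_request("ab12", "BEGIN:VCARD\nFN:Bob\nEND:VCARD")
account.accept_trust_request("ab12")
print(account.on_call_request("ab12"))  # now allowed
```

With `private_mode=False`, every request is answered, which matches the current alpha default described above.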
Once you have called someone or multiple people, you can create conferences, including between participants on landline phones, Ring-DHT users or a centralized SIP server. This works because the mixing is done client side. It uses your local CPU power instead of sending all the confidential information to a third-party service. While not the most energy- or bandwidth-efficient way, this prevents your data from being sent into a black box. This is in line with the new User Data Manifesto 2.0, recently created with KDE as a founding signing organization.
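As an illustration of what client-side mixing means, the Python sketch below mixes one frame of audio for a conference: every participant receives the sum of everybody else’s samples, clipped to the 16-bit range. The function names and the choice of signed 16-bit PCM samples in equal-length frames are assumptions for the example. The Ring daemon’s actual mixer is not shown here.

```python
from typing import Dict, List

INT16_MIN, INT16_MAX = -32768, 32767


def clip(sample: int) -> int:
    """Keep a mixed sample inside the signed 16-bit range."""
    return max(INT16_MIN, min(INT16_MAX, sample))


def mix_for_each(frames: Dict[str, List[int]]) -> Dict[str, List[int]]:
    """For each participant, mix the frames of all the other participants.

    Assumes every frame has the same number of samples.
    """
    if not frames:
        return {}
    length = len(next(iter(frames.values())))
    # Sum everything once, then subtract each participant's own samples
    # so that nobody hears their own voice echoed back.
    total = [sum(frame[i] for frame in frames.values()) for i in range(length)]
    return {
        name: [clip(total[i] - own[i]) for i in range(length)]
        for name, own in frames.items()
    }


frames = {
    "alice": [1000, -2000],
    "bob": [30000, 5],
    "carol": [10000, -5],
}
for name, mixed in mix_for_each(frames).items():
    print(name, mixed)
```

The cost is visible in the structure: the hosting client decodes every incoming stream and produces one outgoing mix per participant, which is why this approach trades CPU and bandwidth for not involving a third party.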
Another interesting architectural point is that Ring is a collection of multiple projects. The lowest level is the Ring daemon, a D-Bus service that manages communications and connections. On top of that is LibRingClient, a Qt library I originally wrote for the KDE client and which is now shared by all Ring clients. It holds all the analytics and the notion of “people” added on top of the daemon. The daemon itself has no notion of a person or people; it only handles the protocols and hardware. LibRingClient has recently moved from the KDE infrastructure to the Ring one to reflect this change and to avoid conflicts of interest or culture shock for new contributors coming from a non-KDE background. On top of that are native “thin” clients for Gnome, OS X and KDE, and a Qt-based client for Windows. The Gnome and OS X clients bind more or less directly to QObject, QAbstractItemModel, signals and slots and so on. The very interesting fact about this is that it actually works and didn’t require a large effort to implement. Because of the new Qt5 C++11 features, Qt is now mostly compatible with these alien GUI toolkits! The reason different clients are used is to allow much deeper platform integration, such as the native contact abstraction, global keyboard shortcuts, accessibility and so on. While requiring a much larger effort to implement, they also provide a better user experience.
Ring is currently still in the alpha stage. Before the first official stable release, the DHT negotiation protocol might still break in incompatible ways, bugs will be fixed and incomplete features, including security ones, will be completed. I hope this blog post helps you understand what Ring is about and why you (hopefully) should use it.
One last note. Starting tomorrow, many KDE developers will join forces in Randa, Switzerland to work toward a touch-friendly future. While there is already a lot of touch-friendly Free Software, I don’t think there is anything as well integrated as what you can find in native desktop platforms such as KDE. This is true when it comes to organization. The major mobile platforms lack a major community working on a greater scope than a single application or a couple of them. What made and makes KDE different in the past, present and future is that we are one community working toward the creation of a complete and coherent software suite. Moving to mobile devices is, in my opinion, crucial to the fulfilment of a fully Free Software based phone ecosystem. Please consider donating by clicking the banner below to make coding sprints like this one possible.
[1] Akonadi support has only recently been introduced to KF5. I have yet to re-enable it in Ring, and I will as soon as some major distros actually ship with it. A vCard bridge is used to synchronize with the KDE4-based KAddressBook for now.