Jan Krutisch - Decentralize ALL THE THINGS!

This text is a modified version of the manuscript for my talk “Decentralize ALL THE THINGS”, which I gave at Eurucamp 2014. I sometimes write manuscripts for mostly non-technical talks. In this case, it made a lot of sense, because it was also a good way to reason about the talk with my great mentor, Frank Webber. After Eurucamp I thought this was a good way to kick off my long overdue article series about decentralization.

There’s also the accompanying slide deck you could take a look at.

So, how was your year?

Of course I didn’t mean personally. Professionally, if you’re not totally ignoring the rest of the world, 2013 and 2014 seem to be quite shitty years for working in IT. After all the Snowden revelations, the shitty, inappropriate government reactions, then all the security holes in critical pieces of infrastructure (most of you will at least remember Heartbleed, of course) and the countless account breaches (remember the Adobe breach?), I definitely considered leaving our field completely, with a strong urge to do something as simple as gardening (and thus completely ignoring the devastating problems gardening is currently facing, with a fast and steady decline of bee populations, the general issues of pollution and the strong bias of regulators towards multinational corporations).

So, this is why I’ve turned a rather large part of my attention to this idea called decentralization. Because I believe that we need to change our thinking on what we’re doing here. All of us. First of all me. So this is a work in progress. I’m new to this, so please, if you’ve been preaching decentralization since, like, ever, bear with me for a few paragraphs. I’m hopeful that I can actually shed light on this from some interesting, uncommon angles.

History lessons

So, let’s start with a little look back at history. The internet, as we know it, started around 1983 (I was five back then!) with the deployment of TCP/IP, a brand new network protocol. Now, if you’ve grown up with the internet of today and you looked at the network topology back in 1983, you would notice a big difference: There was not really a difference between clients and servers. They were simply called “hosts” and were, for the most part, large multi-user systems that people used via terminals. Now, a few years later, the home computer revolution kicked off, and even though home computers had some sort of network access through mailboxes (bulletin board systems), it wasn’t until the late 90s that “The Internet” really had its breakthrough.

I’m telling you this because, with today’s network topology, it’s easy to forget that the Internet started off as a network of equal hosts. Every computer connected to the network acted as a server and a client, depending on what you were doing. This is important, because it also follows that the early protocols were designed with this “equality” in mind. As we’ll see later, this has changed. Just one example: ADSL, the technology most of us probably use to get our home internet, in which, as you all of course know, the A stands for Arbitrage. (It actually stands for Asymmetric: the line gives you far more download than upload bandwidth, which assumes that you consume rather than serve.)

The Meaning of Decentralization

So why is decentralization important? My answer is a single word: “Resilience”. That’s a big word, but as you’ll see, this is intentional.

Resilience in Nature

I like this image that turned up when I was searching for “resilience” on Google’s image search. Because Nature is really, really resilient. More resilient, at least in sum, than we (humankind) will ever be.

You have all probably at some point heard the saying that the internet (or rather its predecessor, the ARPANET) was designed to withstand a nuclear war. Now, as far as my research goes, this is an urban myth, and yet, in 2014, this may actually be true. The Internet’s fabric is decentralized. The basic routing protocols and our directory services (DNS) are able to route around temporary or permanent failure.

Imagine for a moment an internet that was based on a centralized system. Somewhere in an unbelievably huge Texan data center, a company named AOL would host “The Internet”. Exactly. It’s not really imaginable. And if it were, how long until a group of hackers (state sponsored or not) would take over that data center? No, thank you.

But on top of this wonderful architecture, we’ve built systems that are indeed centralized. Take Twitter. I love Twitter, don’t get me wrong. But let’s assume you’re sitting in a rather undemocratic country and you’re part of the opposition. You’ve got hold of some important information you want to share with your people. Most of us would use Twitter, and that’s great, because information, according to some unnamed scientists, travels faster than light on Twitter. But, as a centralized service, it’s more prone to attacks, as we’ve seen in countless examples throughout the last years. People have suggested decentralized alternatives (or even built them, like rstat.us and the likes), but they didn’t really catch on.

So, if you really want to spread unwanted information, you probably should resort to other means. For example, only spreading addresses of web pages that contain the wanted information. Publish it to as many websites as you can. Distribute it. Decentralize it. Twitter and Facebook are really hard to block completely unless you’re switching off all networks in your country, which, if only for economic reasons, is close to infeasible in most countries today.

Google's DNS address painted on a wall in Turkey

But suppressing decentralized, distributed information is almost impossible. Not only the net but also the people route around failure, if they can. And blocking or censoring stuff is treated as a failure, both by the net and by the people.

So resilience can be seen as something technical and thus with technical solutions, but also as something that transcends technology.

Technology

Coming back to the internet of the 80s, most protocols designed in the 80s and 90s are also decentralized in nature. That is true of both SMTP for email and NNTP for group messaging. There’s no central “mail hub”, at least if we ignore Gmail for a while. In theory, this is also true for HTTP, at least when combined with HTML and its hypertext properties. In practice, it takes careful design of the applications built on top of HTTP to make them truly distributed.

The most important part, of course, is to open source ALL THE THINGS. Decentralization means that we want people to run their own stuff, and that mostly means open source. Of course, the “open source it” part is easy. The hard part is the actual software and the usability of it. So we need to care about UX as well. On every level, so not only should the software itself be easy to use, but also Setup/Installation and Maintenance should be easy. This is a field where we need to learn and improve a lot. If you’ve ever tried to install your own mailserver, you know what I mean.

I’ve set up my own git server with GitLab CE, a Rails-based, GitHub-style web interface and git hosting system. It took me about an hour to set it up, because Digital Ocean has these neat preinstalled containers. Now, it wasn’t perfect and I still had to do some command line stuff to get it running the way I wanted, but it was way better than anything I’ve ever encountered before.

Now imagine someone without a lot of technical expertise could set up their own mail server this quickly. A mail server that brings a good web mailer, spam filtering and best practices regarding server setup, and that is configured with a few simple questions. I don’t think it’s impossible to do this. It’s probably hard to maintain, with so many moving parts in one package, but it’s not impossible.

Interoperability

Speaking of Email. The reasons we all still use Email: Simplicity and interoperability. The protocols are (with the exception of IMAP, of course) relatively simple and interoperability is key in the design.

Take a look at XMPP for realtime chat. Nothing about XMPP, except for the original idea, is simple anymore. Which is why networks that are proprietary, but more convenient, are winning (okay, actually, the advent of the smartphone and the problems XMPP, especially with stuff like OTR, has with ephemeral connections are probably also a big reason).

Interoperability does not mean that we have to reinvent all kinds of protocols, though. Take Calendars for example. Now, CalDAV is a horrible spec, but still, even without CalDAV, you can send appointments to people and back, because iCal and others cleverly piggybacked on the most interoperable protocol out there: Email.
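To make the piggybacking a bit more concrete, here is a minimal sketch in Python of a calendar invitation travelling as an ordinary email. It assumes the usual iCalendar convention of a `text/calendar` part with a `method` parameter next to a plain text fallback. The addresses, the UID, the date and the helper names (`build_invite`, `_ical_time`) are made up for illustration, and real calendar clients may expect more fields than shown here.

```python
from datetime import datetime, timedelta, timezone
from email.message import EmailMessage


def _ical_time(moment):
    """Format an aware datetime as an iCalendar UTC timestamp."""
    return moment.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")


def build_invite(organizer, attendee, summary, start, end, uid):
    """Build an email that carries a calendar invitation (hypothetical helper)."""
    lines = [
        "BEGIN:VCALENDAR",
        "VERSION:2.0",
        "PRODID:-//example//invite-sketch//EN",  # made-up product id
        "METHOD:REQUEST",
        "BEGIN:VEVENT",
        f"UID:{uid}",
        f"DTSTAMP:{_ical_time(datetime.now(timezone.utc))}",
        f"DTSTART:{_ical_time(start)}",
        f"DTEND:{_ical_time(end)}",
        f"SUMMARY:{summary}",
        f"ORGANIZER:mailto:{organizer}",
        f"ATTENDEE;RSVP=TRUE:mailto:{attendee}",
        "END:VEVENT",
        "END:VCALENDAR",
    ]
    ics = "\r\n".join(lines) + "\r\n"

    msg = EmailMessage()
    msg["From"] = organizer
    msg["To"] = attendee
    msg["Subject"] = f"Invitation: {summary}"
    # Plain text fallback for mail clients that know nothing about calendars.
    msg.set_content(f"You are invited to: {summary}")
    # The calendar data rides along as an alternative part of the same mail.
    msg.add_alternative(ics, subtype="calendar", params={"method": "REQUEST"})
    return msg


# Example values, all hypothetical.
start = datetime(2014, 8, 2, 14, 0, tzinfo=timezone.utc)
invite = build_invite(
    "alice@example.org",
    "bob@example.com",
    "Decentralization chat",
    start,
    start + timedelta(hours=1),
    uid="20140802-1@example.org",
)
```

The point of the design is that nothing in the delivery path has to understand calendars: any mail server can relay the message, and a client that does not know `text/calendar` still shows the plain text part.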

Nevertheless, we need more, not fewer, open protocols. CalDAV and XMPP have their drawbacks and many people say they’re both broken beyond repair. We should see these as opportunities. Of course it would or will be hard to come up with a new calendar syncing protocol, implement it properly and then somehow establish it as a new standard. But if we don’t even try, we won’t make things better.

Discovery

One thing that’s important for distributed apps that I wanted to mention is service discovery and that’s closely related to identity and authentication. And actually, we’ve made a lot of progress here, with open standards like Webfinger that allow you to publish a list of services on your domain.
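As a rough illustration of what such discovery looks like, the following Python sketch builds a WebFinger query URL for an account and picks links out of the JSON document a server would answer with. It only assumes the well-known path and the `resource` parameter of the WebFinger standard; the account, the sample response and the helper names (`webfinger_url`, `find_links`) are invented for this example, and no network request is made.

```python
import json
from urllib.parse import urlencode


def webfinger_url(account):
    """Return the WebFinger query URL for an account like 'alice@example.org'."""
    _, _, host = account.rpartition("@")
    query = urlencode({"resource": f"acct:{account}"})
    return f"https://{host}/.well-known/webfinger?{query}"


def find_links(jrd_text, rel):
    """Return all link targets with the given relation type from a response."""
    doc = json.loads(jrd_text)
    return [
        link["href"]
        for link in doc.get("links", [])
        if link.get("rel") == rel and "href" in link
    ]


# A made-up response, shaped like the JSON a WebFinger server returns.
SAMPLE_RESPONSE = """
{
  "subject": "acct:alice@example.org",
  "links": [
    {"rel": "http://webfinger.net/rel/profile-page",
     "href": "https://example.org/~alice"},
    {"rel": "http://webfinger.net/rel/avatar",
     "href": "https://example.org/~alice/avatar.png"}
  ]
}
"""

url = webfinger_url("alice@example.org")
profile_pages = find_links(SAMPLE_RESPONSE, "http://webfinger.net/rel/profile-page")
```

The nice property here is that the lookup starts from the domain in the identifier itself, so there is no central directory that every application has to ask.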

Identity

The identity part is, unfortunately, a little murky. We still don’t have a good answer to what a good token for identity would look like. OpenID tried the URL, and I still kind of like that concept, but for most people, URLs mean nothing and so it didn’t really take off. Mozilla Persona, a piece of technology I also liked very much, used email, which is a very good token (and also, nowadays, heavily used for more conventional login systems), but is also more or less dead, unfortunately. This leaves things like OpenID Connect on the table, which looks good on paper, but I haven’t spent time with it. Let’s see how it goes.

Security

A large part of the technology puzzle obviously needs to be security. Security in IT is more than encryption and cryptography, but that’s certainly the most visible part. Getting cryptography right is hard. The OpenSSL disaster shows that we as an industry and as “the open source scene”, whatever that means, need to get better at this and that we need to take this more seriously. I’m in no way qualified to talk about that, though, so let’s keep it at that and talk about applied cryptography instead. If we want to keep our users’ data safe, we need to talk about encryption at various levels. Again, I think email is a good example. On the transport layer, it’s absolutely vital that any communication over the net is encrypted. SMTP servers should only talk via TLS to each other, and the fact that in 2013, various large German email providers made a PR campaign out of the fact that they NOW do this is ridiculous.
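What “only talk via TLS” means in practice can be sketched from the sending side. The Python snippet below is a minimal illustration, not a mail server: it takes an object that behaves like a connected `smtplib.SMTP` instance and refuses to hand over the message unless the other side offers STARTTLS. The function name `send_only_over_tls` and the `TlsRequired` exception are my own inventions for this example.

```python
import ssl
from email.message import EmailMessage


class TlsRequired(Exception):
    """Raised when the remote server does not offer STARTTLS."""


def send_only_over_tls(smtp, message, context=None):
    """Send `message` via `smtp`, never falling back to plain text.

    `smtp` is expected to behave like a connected smtplib.SMTP object.
    """
    context = context or ssl.create_default_context()
    smtp.ehlo()
    if not smtp.has_extn("starttls"):
        # The tempting "fix" is to send unencrypted anyway. Don't.
        raise TlsRequired("server does not offer STARTTLS; not sending")
    smtp.starttls(context=context)
    smtp.ehlo()  # say hello again so capabilities are read over the TLS channel
    smtp.send_message(message)


def build_message(sender, recipient, subject, body):
    """Small helper to create a message to send (addresses are up to the caller)."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg
```

With a real connection this would be called as `send_only_over_tls(smtplib.SMTP(host, 25), msg)`, where `host` is whatever server you are talking to. The default SSL context also verifies the server certificate, which is stricter than what many mail servers do among themselves.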

Encrypt ALL THE THINGS!

But transport encryption only goes so far. On each email server, to do proper routing, emails need to be passed around in plain text form and can be hoovered up if only one link in the delivery chain is compromised or unencrypted. Is this fixable? For routing, I would somehow guess (and I have no idea what I’m talking about here) that the answer to this is a cryptographic, theoretical yes. Tor, the onion router, a set of tools to enable people to anonymously browse the internet, seems to do this in a way. Is it fixable for every other application that currently needs to inspect metadata to do its job? I have no idea.

On top of this, content encryption is desirable. That’s what people often mean when they talk about end-to-end encryption, especially when we talk about email. PGP and S/MIME are great tools if you absolutely have to send a password to someone via email, but for many people, they only solve a tiny part of the problem, as we’ve seen with the metadata vs. content discussion. Getting true end-to-end encryption right is hard, and PGP with email hardly fits the bill.

With web applications, E2E is even more complicated, as true E2E would mean that content needs to be decrypted on the client, which has very difficult and hard-to-get-right security characteristics, as the web platform simply wasn’t built with that in mind. The Web Crypto API probably fixes a large part of this, which is great. If we don’t do true E2E, we need to be able to decrypt user data on the server (or store it, as most of us currently probably do, unencrypted) and then we’re in the realm of symmetric encryption, which is undesirable, because

One of the reasons complete encryption is desirable is that this way we can build systems that require a lot less trust. I’ll come back to that later, so keep that in mind. I could go on and on forever. Let me say it again: Security is hard.

Organisations and Businesses

Of course, for many of you, many of the things I’ve said so far aren’t especially new, so let’s talk about non-technical stuff for a bit. Because decentralization, when done right, also means the decentralization of organisations and businesses. Let’s quickly talk about resilience one more time. Do you remember that “too big to fail” argument from the banking crisis? The notion that we need to save the big banks no matter what, because otherwise they would take down the whole economy with them? Does that sound like a resilient system? Imagine, for some reason, within a few months, Google has to shut down its whole operation. Unlikely, I know. Impossible? What exactly is impossible these days? So, let’s imagine that. How many of you in this room would be totally, royally screwed if you could not access Google Mail from one day to the next? This is not resilience. Facebook, Google, Apple and maybe Twitter currently probably account for 80-90% of electronic human interaction. I’ve made that number up, but it’s a guess. Oh, I forgot Microsoft with Skype. The bottom line: Big monolithic companies are probably as bad as big monolithic systems.

So I think we need to change that. We need to realize that, if we take this seriously, HUGE is actually BAD. Let’s go back to that David Weinberger book, what’s the title: “Small pieces, loosely joined”. From that follows that we also need “Small organisations, loosely joined”. And, back in technical terms: “Small deployments, loosely joined”. And I don’t think that this necessarily means “self hosted”. Self hosting is great, but like I said before, we’re not quite there yet, and a certain bunch of people would probably never waste their time hosting their own email. So we need to build small, self-sustaining organisations that do this for you. Which brings me to the business part: If HUGE is BAD, it also means that the usual “World domination” as a goal of a startup is BAD. It’s the wrong goal. Which means two things. The first: This is probably completely at odds with the VC model of startup funding. Now, a lot of reasonable things are at odds with the VC model, but this is, I think, a real dealbreaker. The second thing: It’s also probably at odds with the economies of scale. For example: GitHub is able to host all open source of this planet for free, because they have enough paying customers. How well does that scale down? I don’t know. I do know that I run my own GitLab instance on a VPS and I probably don’t want to host the canonical Rails repo there.

Small businesses, loosely joined

So what we need in the end is “small businesses, loosely joined”. You might want to ask what “small” means. I’m not sure, because I didn’t really try to do the math, but of course we need a size of operation that allows people to work on both ops and, more importantly, building the software that we need. We won’t be able to do this with volunteers, that much is clear.

One thing that I’d like to throw into the discussion at this point is the “coop”. In German it’s “Genossenschaft” and it’s a form of organisation that, in my eyes, has been largely undervalued (and, also, deprived of many of its original rights in the last few decades). Creating a coop in Germany is a lengthy, costly process, and so there aren’t a lot of examples out there. One of them is the Hamburg-based “Hostsharing e.G.”, which does hosting of email, web services and stuff. Another example, and I’m really happy about this, is Nössefiber, a coop to create a fiber-optic network in a village in Sweden where I happen to own a summer house. In Sweden, this has become one of the standard ways of building high-speed networks in rural areas where it’s (according to them, of course) not economically viable for the big carriers to build them.

Here are two other examples that might at least surprise the German crowd: DENIC and DATEV.

Coops have a few interesting properties: They largely work like an incorporated business, with one notable exception: To be a customer of a coop, you usually need to be a member. Also, coops have interesting ways to limit liabilities, which is important. I personally think that they’re a really good fit for anything infrastructure-related.

Summary

To sum things up: Distributed, decentralized systems, organisations and businesses are the best way to create a resilient communications infrastructure that will hopefully be harder to surveil, harder to bring down and harder to control. This is exactly the internet we need and the internet that we should build. (If you don’t think we’ll need this, you haven’t been paying attention to what’s politically and structurally happening.)

This might be a lot of work. We need more developers spending time on this. It might fail, but in the long run, this is the only way I see at the moment to at least realize some of the utopian potential the web has shown to have on so many occasions. We as developers, designers and ops people have a unique position right now in that all of us are pretty much in high demand. This gives us leverage that we should use to make things better.

To be honest, it took me a while to get here. And so I would love to close this with a thank you to all the people who have been saying all of this for years and, instead of frustratedly shaking their heads, just started to build things. I hope I can play my part now by giving talks like this one and also starting to build, or at least advocate and use, stuff that makes the web more decentralized, like it should be.

Image credits

Post header: Derived from the image found here (Public Domain): http://en.wikipedia.org/wiki/ARPANET
Resilience in Nature: Graham Horn [CC-BY-SA-2.0 (http://creativecommons.org/licenses/by-sa/2.0)], via Wikimedia Commons
DNS graffiti: @kaansezyum on Twitter
Background in “Small Businesses Loosely Joined”: 2bgr8stock on deviantart.com