Friday, September 21, 2012

Hackday project: tarpio.us

I saw this tweet from @kellan that got me to thinking:

It would be useful if Tent.io could describe itself in the world of protocols I know, particularly PuSH, XMPP, and SMTP.

I think the notion of a distributed social network is an interesting one, and I've been pondering how it might work. I think I've got a scheme for one, which I'll outline in this post, and I think it might not take that long to throw it together with off-the-shelf open source and a dedicated hack day.

General Principles

Use existing, open standards. This increases the likelihood of finding good open source code to use when building software, ensures that a distributed marketplace of (interoperable!) clients and servers could arise, and increases the likelihood of avoiding platform specificity. Plus, giants have pretty high shoulders. I have limited hack time; I'm not into wheel reinvention.

Central service provider not required. This is pretty fundamental; means there's no central organization that owns/has access to everything ("creepy alert!"), but also that there's no central organization that can be subjected to lawsuits (e.g. Napster[1]).

Users have control over sharing. At least for the initial communication, users need to be able to say exactly who gets to see a piece of content. Of course, once you've shared something with someone else, there's no stopping that other person from sharing it further, but that's what social trust is all about.

Protocol Sketch

TL;DR. S/MIME, public key crypto, Atom.

To expand: everyone runs their own AtomPub server somewhere, or contracts with a service provider to do it on their behalf. Multiple PaaS and IaaS providers have a free tier of service, which is almost certainly more than enough for most users. I get whole tens of visits to my blog every day. And I am not a prolific sharer: looks like I've averaged 1.1 tweets per day since starting to use Twitter. "Roflscale" is probably enough, to be honest. Easy recoverability probably more important than high availability for a single-user server/hosted PaaS app, so I think this could be done for less than $10 a year (most of which is probably just buying a domain name). Given the relatively low traffic, actual friends could pretty easily share infrastructure.

Now, I'm the only one who posts to that Atom feed, and anyone can poll it at any time. The trick is that the content is encrypted, and you specify who gets access to the decryption key. Basically, you generate a random (symmetric) encryption key for the content, then public-key encrypt copies of the content key for all your recipients. Put it all together in one multipart package, upload to, say, S3, and then post a link to it on your Atom feed. Of interest is that neither S3 (the storage) nor the AtomPub server care one whit about the encryption/decryption stuff.

Interestingly, following gets somewhat inverted. You can post content intended for someone who never subscribes to your feed (gee, I hope he notices me!). You can also subscribe to someone's feed who never encrypts stuff for you. But you can see anyone's public broadcasts, of course. Actual "friending" is just GPG key exchange and an addition of the key to appropriate keyrings to represent friend groups/circles.

On the client side, then, it's mostly about subscribing to people's Atom feeds, polling them periodically (for which HTTP caching will really help a lot), cracking open the content encryption keys with your own private key, then interleaving and displaying the content.

That's about it. Definitely possible to layer PubSubHubbub on top to get more realtime exchange; no reason your free tier server couldn't host one of those for you too. Or perhaps just a feed aggregation proxy that would interleave all your subscriptions for you (without needing to understand who any of the recipients were, or even if you were one of the recipients!).

Hack Day Proposal

Roughly, get a simple AtomPub server stood up somewhere. Apache Abdera has a tutorial for doing exactly that, if you can implement the storage interface (MySql/Postgresql/SimpleDB/S3 I think would all work; looks to be just a Map). Front with Apache and some form of authentication module. Alternatively, grab Abdera and retrofit into Java Google AppEngine and use Google federated OpenID for login.

Client side, cobble together a UN*X command-line client out of bash/Python/Ruby/whatever, depending on what open source is out there. It'd be kind of neat to perhaps do commands interleaved on the command line, in MH style. Everybody's got an HTTP client library that can be made to convince the server you're you. AtomPub clients are available in Python (1,2), Java (3,4), Ruby (5,6). GnuPG for the public key implementation. Bouncycastle has a Java S/MIME implementation. Lots of S3 clients out there (7,8).

I think it's entirely possible to have a simple client and server up in a day with a smallish group of folks (5-6). If there are more folks we can also try multiple clients and servers.

Tentative Projectname

Since @kellan started this off by mentioning tent.io, and this is an attempt to slap something together with off-the-shelf stuff, it felt more like a lean-to made out of a tarp than a true tent, so I've snarfed the domain name tarpio.us (which is the closest I could come to something that meant "like a tarp" that lined up with a cheap TLD). Cool part, of course, is that there's no reason tent.io and tarpio.us couldn't interoperate with protocol bridges someday.

If you're interested, hit me up on Twitter at @jon_moore or follow along on GitHub. I'll see when I can free up for a hackday soon!

Footnotes

  1. I'm not advocating violating copyrights here, merely noting that Napster was a P2P system with a single point of failure (an organization that could be sued).

1 comments:

Jon Moore said...

Update: @steveklabnik pointed me to OStatus and Salmon as similar existing specs; perhaps we implement those too/instead? (haven't grokked how much overlap there is yet)