Today I've been reading about XMPP PubSub and now have a general idea in my head of how it works. I'm still not entirely convinced that it's the best way to shift large volumes of entry data between the big sites, but I do think it'd be worthwhile to implement anyway. The nice thing about XMPP PubSub is that we can implement it in two parts:
- A companion to synsuck which can consume PubSub content (where the payload is an Atom feed) into type Y journals. This would presumably take the form of a special Jabber client daemon which handles the subscribing and then makes sure that the received items end up in the right journals on the site.
- A component which publishes posted journal content from all journals. I'm not sure of the best approach for this yet, but whatever happens it should only be called on to publish entries for which there are active subscriptions (or else it'll fall really far behind on LiveJournal.com). I'm currently a little confused about who is responsible for distributing the message to the various servers where there are subscribed clients, though. This component would serve a similar purpose to the /data/atom output, but without all the polling.
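To make the consumer half a bit more concrete, here's a rough Python sketch of the "make sure received items end up in the right journals" step. It assumes each PubSub item carries a single Atom entry as its payload, and that notifications arrive in the `http://jabber.org/protocol/pubsub#event` namespace. The `NODE_TO_JOURNAL` table, the journal name, and the `post` callback are all made up for illustration; a real daemon would look these up in the site's database and handle the XMPP connection itself.

```python
import xml.etree.ElementTree as ET

EVENT_NS = "http://jabber.org/protocol/pubsub#event"
ATOM_NS = "http://www.w3.org/2005/Atom"

# Hypothetical mapping from a PubSub node name to a local type Y journal.
NODE_TO_JOURNAL = {"example_feed": "example_feed_journal"}


def entries_from_message(stanza_xml):
    """Yield (node, item_id, title, content) for each Atom entry in a notification."""
    message = ET.fromstring(stanza_xml)
    items = message.find(f"{{{EVENT_NS}}}event/{{{EVENT_NS}}}items")
    if items is None:
        return  # not a PubSub item notification
    node = items.get("node")
    for item in items.findall(f"{{{EVENT_NS}}}item"):
        entry = item.find(f"{{{ATOM_NS}}}entry")
        if entry is None:
            continue  # skip items without an Atom payload
        title = entry.findtext(f"{{{ATOM_NS}}}title", default="")
        content = entry.findtext(f"{{{ATOM_NS}}}content", default="")
        yield node, item.get("id"), title, content


def route(stanza_xml, post):
    """Hand each entry to post(journal, title, content); return how many were posted."""
    posted = 0
    for node, _item_id, title, content in entries_from_message(stanza_xml):
        journal = NODE_TO_JOURNAL.get(node)
        if journal is None:
            continue  # no local journal is fed by this node
        post(journal, title, content)
        posted += 1
    return posted
```

Keeping the parsing separate from the posting means the routing logic can be tested with canned stanzas, without needing a live Jabber server.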
The PubSub consumer seems like a good place to start, assuming there are actually data sources we can subscribe to for testing. It looks like the PubSub.com feeds aren't suitable, because a client has to connect directly to their server to subscribe, and they only serve up their own feeds. If I'm understanding correctly, the PubSub consumer daemon will need to have an account on a Jabber server through which it sends and receives messages. LiveJournal.com has the resources to run its own Jabber server quite easily, and others who don't want to run their own could presumably just use an account on jabber.org or whatever, so this doesn't exclude anyone.
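For reference, the subscription request the daemon would send from its Jabber account is, as far as I can tell, just an IQ stanza with a `subscribe` element in the `http://jabber.org/protocol/pubsub` namespace. Here's a small sketch that builds one in Python; the JIDs, node name, and stanza id in the usage below are placeholders, and actually sending the stanza over an XMPP connection is left out.

```python
import xml.etree.ElementTree as ET

PUBSUB_NS = "http://jabber.org/protocol/pubsub"


def build_subscribe_iq(subscriber_jid, service_jid, node, stanza_id):
    """Return the XML for a PubSub subscription request as a string."""
    iq = ET.Element(
        "iq",
        {"type": "set", "from": subscriber_jid, "to": service_jid, "id": stanza_id},
    )
    # Writing xmlns as a plain attribute makes it the default namespace
    # for <pubsub> and its children when the stanza is serialized.
    pubsub = ET.SubElement(iq, "pubsub", {"xmlns": PUBSUB_NS})
    ET.SubElement(pubsub, "subscribe", {"node": node, "jid": subscriber_jid})
    return ET.tostring(iq, encoding="unicode")


# Placeholder values: a consumer account subscribing to one node.
stanza = build_subscribe_iq(
    "consumer@example.com", "pubsub.example.org", "example_feed", "sub1"
)
```

The account the daemon logs in with is the one that appears in both `from` and `jid`, which is why it doesn't matter whether that account lives on the site's own server or somewhere like jabber.org.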
If anyone wants to add anything or correct me, please go right ahead…