Beyond a server-based content distribution network
[Jan. 12th, 2005|04:09 pm]
mart has raised some very interesting ideas about how to integrate the social networks of journaling sites. The models he has proposed are sensible, logical extensions of the way Usenet was built; however, they introduce new challenges in terms of server performance. Specifically, a substantial amount of inter-server traffic must be created and managed so that content is replicated between sites. This creates a number of challenges:
Sites lose control over who receives served content. If my friends-only entries get copied into a cache at DeadJournal, then I must trust DeadJournal to maintain my security as well as trusting LiveJournal. Ideally, the number of trust relationships anyone has to enter into will be minimised.
Sites serve content from other sites. Proper payment for hosting can become problematic if someone establishes a primary relationship with one site, but most of their content gets served by another site. The most likely consequence is that big sites like LJ will be paying for even more bandwidth and disk, subsidising smaller sites. This then requires the establishment of chargeback mechanisms or other cost-recovery devices, and the difficult politics imposed by the trust issue get even worse.
A fairer system would be one in which sites serve their own content and nobody else's, and yet are able to provide a unified friends page. This can be achieved by separating the aggregation function from the content-delivery function, and using client-side includes.
The system might work like this:
I send a request to LiveJournal for my friends page. LiveJournal sees that my friends page is 50 entries long, and finds the ID and timestamp of 50 entries that it might serve to me. It then sends requests to TypePad, DeadJournal and GreatestJournal containing the lists of friends that I have at those places, and a timestamp range. TypePad, DeadJournal and GreatestJournal send URLs (but no content) back to LiveJournal for any entries that meet those search criteria.
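The aggregation step above could be sketched roughly as follows. This is a minimal illustration, not a real protocol: the per-site fetch is stubbed out as plain lists of (timestamp, URL) pairs, and only the merge into a single reverse-chronological friends page is shown.

```python
import heapq

def merge_friends_page(per_site_results, page_size=50):
    """Merge per-site lists of (timestamp, url) pairs into one
    reverse-chronological friends page of at most page_size entries.
    per_site_results stands in for the replies from LiveJournal,
    TypePad, DeadJournal and GreatestJournal."""
    all_entries = [e for site in per_site_results for e in site]
    # Newest first; nlargest avoids fully sorting every site's list.
    return heapq.nlargest(page_size, all_entries, key=lambda e: e[0])

# Illustrative data only: one entry from each of two sites.
lj = [(1105544000, "https://livejournal.com/users/alice/123.html")]
dj = [(1105545000, "https://deadjournal.com/users/bob/77.html")]
page = merge_friends_page([lj, dj], page_size=50)
```

Because the partner sites return URLs and timestamps but no entry bodies, this merge is cheap: the aggregator is ordering pointers, not content.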
LiveJournal then generates a friends page that contains client-side includes for the entries that are assembled into my friends page, and my web browser is then responsible for fetching the entries back from the various sites where they live. (Would this method increase the ability of my web browser to cache LiveJournal entries locally and reduce the load on LJ when I refresh my friends page? That's a fun possibility.)
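The page LiveJournal emits might look something like this sketch, where each entry becomes an include stub (an iframe here, purely for concreteness) so the reader's browser fetches every entry body directly from the site that owns it.

```python
def render_friends_page(entry_urls):
    """Build a friends-page shell containing only client-side includes.
    No entry content is served by the aggregating site itself."""
    includes = "\n".join(
        f'  <iframe class="entry" src="{url}"></iframe>'
        for url in entry_urls
    )
    return (
        "<html><body>\n"
        "<h1>Friends page</h1>\n"
        f"{includes}\n"
        "</body></html>"
    )

# Illustrative URL only.
page_html = render_friends_page(["https://deadjournal.com/users/bob/77.html"])
```

A nice side effect of this shape is the caching possibility mentioned above: each included entry is an independent URL, so the browser can cache entries individually and a friends-page refresh only refetches the shell.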
In version 1 of the protocol, it might be best to fetch only public entries from aggregation partners. Version 1.1 could incorporate an assertion of my identity to those sites, allowing more secure entries to be distributed to me.
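One way the version-1.1 identity assertion might work, assuming the two sites share a secret key; the HMAC scheme and field names here are illustrative stand-ins for whatever the real protocol would pick.

```python
import hashlib
import hmac
import time

def make_assertion(user, secret, now=None):
    """Sign a (user, timestamp) pair so a partner site can check that the
    aggregating site really vouches for this user right now."""
    ts = int(now if now is not None else time.time())
    msg = f"{user}:{ts}".encode()
    sig = hmac.new(secret, msg, hashlib.sha256).hexdigest()
    return {"user": user, "ts": ts, "sig": sig}

def verify_assertion(assertion, secret, max_age=300, now=None):
    """Reject stale assertions, then check the signature in constant time."""
    ts = int(now if now is not None else time.time())
    if ts - assertion["ts"] > max_age:
        return False
    msg = f"{assertion['user']}:{assertion['ts']}".encode()
    expected = hmac.new(secret, msg, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, assertion["sig"])

# Illustrative use: LiveJournal vouches for a user to a partner site.
assertion = make_assertion("mart", b"shared-key", now=1000)
```

With something like this in place, a partner site can return URLs for protected entries only when the assertion names a user its owner has friended.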
The weakness of this method is the delay introduced by the dynamic query to other sites to fetch the list of content, the possibility of timeouts and so forth. Ideally, this could be managed with a highly intelligent client. Alternatively, it simply becomes a marketing opportunity for our Evil Robot Overlords (who have been nice to support people already - I like that a LOT), who get bragging rights over the fast integration between TypePad and LiveJournal but can say "Integrating with sites outside our Evil Corporate Alliance may delay processing of your friends page."
If the version 1 (non-secure) version of the protocol is implemented, it should be possible to cache data pretty effectively.
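Since version 1 involves only public entries, no access control applies to the index data, and a partner's reply can be reused across users for a short window. A small TTL cache sketch (all names illustrative):

```python
import time

class IndexCache:
    """Cache a partner site's (timestamp, url) index reply for ttl seconds,
    so repeated friends-page loads don't re-query the partner every time."""

    def __init__(self, ttl=60):
        self.ttl = ttl
        self.store = {}

    def get(self, key, now=None):
        now = now if now is not None else time.time()
        hit = self.store.get(key)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]
        return None  # missing or expired

    def put(self, key, value, now=None):
        now = now if now is not None else time.time()
        self.store[key] = (now, value)

# Illustrative use: cache one partner's index reply for a minute.
cache = IndexCache(ttl=60)
cache.put("deadjournal", [(1105545000, "https://deadjournal.com/users/bob/77.html")], now=0)
```

The secure (1.1) variant is harder to cache, since each reply depends on who is asking; that is another reason to ship the public-only protocol first.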
Perhaps LJ and TypePad can market an ultra-clever browsing client that does a lot of this integration at the client end to match their ultra-clever posting software?