An idea hit me while I was reading a journal today where the writer posted everything twice: once in Russian, and once in English. We should support multiple versions of single journal posts.
Brad was talking a long time ago about making the news journal and other similar journals translatable via the translation system. That works on a limited basis, but it'd be much nicer to simply support multiple logtext and logsubject rows per entry (with some kind of limit, or it'd get silly) and then record the characteristics of each version. Right now the only characteristic would be language, but others could follow in the future, such as versions in different formats.
We're already going to start storing the language of journals, entries and possibly, later, comments, so that with S2 the journal output can optionally include the relevant language specifications. This proposal takes that to its logical extreme, and works a bit like the MIME multipart/alternative format, where several versions of the same content are given, with the characteristics of each described in its headers.
A very rough DB schema design would go something like this: Add an ‘alternate’ byte field to both logsubject2 and logtext2 which defaults to 0, then have a logformat (better name please!) table which looks like this:
Now, the formatkey field would probably be an integer index into a list of valid formatkeys, just like how props work, but including the literal text will do for the example. When deciding what to retrieve, the system should retrieve the data for all of the alternates and use it to decide which version is best to present, based on a list which is the result of aggregating the Accept-Language header, $remote's language preference userprop (not the one for ML support, but one which can contain multiple options) and a query string field giving language preferences for the purposes of one-off overrides and embedding specific versions. It could also take into account a 'quality' specification where the user can rate how good each version is relative to the others. So if my language spec says I prefer English but will also accept Spanish, and the Spanish version is rated higher because the writer doesn't consider their English expressive enough, the Spanish version can ‘win’. There would have to be a way to make English always win if I wanted it to, though. The precise nature of this decision algorithm can be figured out later.
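To make the selection idea a bit more concrete, here's a minimal sketch in Python. It is only one possible reading, not a settled algorithm. Everything named in it is hypothetical: the Version class, the function names, the rule that a query string override replaces the other sources outright, the rule that the userprop beats the Accept-Language header for the same language, the 0.1 weight step for ordered lists, and the use_quality switch standing in for "make my first choice always win". It also ignores the "*" wildcard and regional subtags beyond the primary language.

```python
from dataclasses import dataclass


@dataclass
class Version:
    alternate: int        # value of the proposed 'alternate' byte; 0 is the original
    language: str         # primary language tag, e.g. "en", "es", "ru"
    quality: float = 1.0  # assumed rating of this version, from 0.0 to 1.0


def parse_accept_language(header):
    """Parse an Accept-Language header into {primary language: weight}."""
    prefs = {}
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        lang, _, params = part.partition(";")
        params = params.strip()
        weight = 1.0
        if params.startswith("q="):
            try:
                weight = float(params[2:])
            except ValueError:
                weight = 0.0
        # Keep only the primary subtag, so "en-GB" counts as "en".
        key = lang.strip().lower().split("-")[0]
        prefs[key] = max(prefs.get(key, 0.0), weight)
    return prefs


def ranked(languages):
    """Turn an ordered list (most preferred first) into descending weights."""
    return {lang.lower(): max(0.1, 1.0 - 0.1 * i)
            for i, lang in enumerate(languages)}


def aggregate_preferences(accept_header="", userprop=(), override=()):
    """Combine the three preference sources (assumed precedence rules)."""
    if override:
        # One-off override from the query string replaces everything else.
        return ranked(override)
    prefs = parse_accept_language(accept_header)
    prefs.update(ranked(userprop))  # userprop wins for shared languages
    return prefs


def choose_version(versions, prefs, use_quality=True):
    """Pick the best-scoring version, or the original if nothing matches."""
    def score(version):
        weight = prefs.get(version.language, 0.0)
        return weight * version.quality if use_quality else weight

    best = max(versions, key=score)
    if score(best) == 0.0:
        # No acceptable language: fall back to the original (lowest alternate).
        return min(versions, key=lambda v: v.alternate)
    return best
```

With an English original rated 0.5 and a Spanish alternate rated 1.0, a reader sending "en, es;q=0.8" gets the Spanish version from choose_version(versions, prefs), and the English one when use_quality=False is passed. A reader whose preferences match neither version gets the original.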
The system can then be set up in such a way that alternate versions of news posts can be created from the translation interface but are stored in the database alongside the original, thus not filling the translation data files with journal data. Also, people can be allowed to make alternate versions of their own posts from a different interface, whose position and layout are a difficult discussion for another time.
When displaying a journal entry on talk pages and other one-entry-only things, a list of alternative versions can be given alongside the entry linking to each version so that users who want to override their preferences temporarily have a way to do so.
Problems with this scheme include:
- It may create too much overhead in friends view generation, since so many different things have to be queried in order to decide on a version and load that version.
- Different versions of the same entry would share the same comments. I'm not sure if this is good or bad. If language can be stored with comments too, perhaps users could optionally be able to exclude comments in languages they can't read.
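
On the shared-comments point, the optional filter could be as simple as the sketch below, assuming each comment carries an optional language tag. The function name, the (id, language) pair shape and the choice to keep comments with no recorded language visible are all assumptions made for illustration.

```python
def visible_comments(comments, readable, hide_unreadable=True):
    """Filter (comment_id, language_or_None) pairs by the reader's languages."""
    if not hide_unreadable:
        return list(comments)
    readable = {lang.lower() for lang in readable}
    return [
        (comment_id, lang)
        for comment_id, lang in comments
        # Comments with no recorded language stay visible (assumed policy).
        if lang is None or lang.lower() in readable
    ]
```

Keeping untagged comments visible means the filter only ever hides comments that are positively known to be in a language the reader has not listed.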
I'd be interested in thoughts on this plan, including thoughts on the parts of the scheme I wasn't able to be specific about (version selection algorithm, version creation interface) and any considerations I should have made but haven't. Of course, you can also completely dis' my idea and propose something else, if you like! ;)