It was really easy, but right now we're spitting out bad XML for journals whose text isn't UTF-8 or a subset of it (plain ASCII, for example).
I'm fixing this by doing everything properly with XML::DOM and Unicode::MapUTF8 ... both great modules.
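To show what "doing everything properly" buys us, here's a rough sketch in Python rather than the Perl modules named above. It assumes the text has already been decoded to Unicode, and the function name and the `entry`/`subject`/`body` element names are made up for illustration. Building the document through a DOM means escaping and the encoding declaration are handled by the serializer instead of by string concatenation.

```python
from xml.dom.minidom import Document


def build_entry_xml(subject, body):
    """Build a small XML document and serialize it as UTF-8 bytes.

    The element names here are hypothetical, not the real output format.
    """
    doc = Document()
    entry = doc.createElement("entry")
    doc.appendChild(entry)
    for tag, text in (("subject", subject), ("body", body)):
        node = doc.createElement(tag)
        # Text nodes are escaped on output, so '&' and '<' stay well-formed.
        node.appendChild(doc.createTextNode(text))
        entry.appendChild(node)
    # Passing an encoding returns bytes and writes a matching XML declaration.
    return doc.toxml(encoding="utf-8")


if __name__ == "__main__":
    print(build_entry_xml("Caf\u00e9 & <tea>", "hello").decode("utf-8"))
```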
We have a 'lang' field in the user table. We'll probably also need a userprop for the default encoding, and then we need to expose both that and the language field.
And we should modify the protocol so a client can tell us which encoding to convert from. Internally we'll store all data as UTF-8.
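The conversion step itself could look something like this. It's only a sketch in Python, not the Perl we'd actually use, and `to_utf8`, `declared` and `user_default` are names I'm inventing here: `declared` stands in for whatever encoding the protocol request names, and `user_default` for the per-user default encoding.

```python
import codecs


def to_utf8(raw, declared=None, user_default="utf-8"):
    """Convert bytes in a named encoding to UTF-8 bytes for storage.

    Uses the encoding the client declared if there is one, otherwise
    the user's default. Both parameters are assumptions for this sketch.
    """
    name = declared or user_default
    try:
        codecs.lookup(name)  # reject encoding names we don't know
    except LookupError:
        raise ValueError("unknown encoding: %r" % name)
    # decode() raises UnicodeDecodeError if the bytes don't match the encoding.
    return raw.decode(name).encode("utf-8")


if __name__ == "__main__":
    koi8 = "\u043f\u0440\u0438\u0432\u0435\u0442".encode("koi8-r")
    print(to_utf8(koi8, declared="koi8-r"))
```

Failing loudly on an unknown or mismatched encoding seems safer than guessing, since a wrong guess would be stored permanently as garbage.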
And if we detect a charset on an HTTP POST, we'll do the conversion automatically. I still have to look into how that works (which HTTP request headers are sent...).
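One place a charset can show up is the `charset` parameter of the request's Content-Type header, though clients aren't required to send one, so this can't be the only mechanism. A minimal Python sketch of pulling it out, assuming we're handed the raw header value (the function name is hypothetical):

```python
from email.message import Message


def charset_from_content_type(header_value):
    """Return the lowercased charset parameter of a Content-Type value,
    or None if the header doesn't carry one."""
    msg = Message()
    msg["Content-Type"] = header_value
    # The stdlib parser deals with quoting and parameter case for us.
    return msg.get_content_charset()


if __name__ == "__main__":
    print(charset_from_content_type(
        "application/x-www-form-urlencoded; charset=ISO-8859-1"))
```

When this returns None we'd fall back to an encoding given in the protocol request, or to the user's default.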
Just wanted to make it known that I really do want LiveJournal to be smart about encodings and non-English languages. It's just slow going doing so much at once.
Anybody interested in working on this? If you need help I can guide you.