November 3rd, 2001

RSS, XML, Encodings, fun

I added RSS support today (example) because somebody asked for it recently.

It was really easy, but right now we're spitting out bad XML on journals that aren't in UTF-8 or a subset.

I'm fixing this by doing everything properly with XML::DOM and Unicode::MapUTF8 ... both great modules.
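To make the "do it properly with a DOM" idea concrete, here is a minimal sketch. It uses Python's standard xml.dom.minidom as a stand-in for the Perl XML::DOM module named above, and the helper name rss_item_xml and the example.com link are made up for illustration. The point is only that a DOM serializer escapes markup characters and declares the output encoding, so the result stays well-formed.

```python
from xml.dom.minidom import Document


def rss_item_xml(title: str, link: str) -> bytes:
    """Serialize one RSS <item> as UTF-8 bytes (hypothetical helper)."""
    doc = Document()
    item = doc.createElement("item")
    doc.appendChild(item)
    for tag, text in (("title", title), ("link", link)):
        node = doc.createElement(tag)
        # Text nodes are escaped on output, so "&" and "<" cannot break the XML.
        node.appendChild(doc.createTextNode(text))
        item.appendChild(node)
    # Passing an encoding returns bytes and writes it into the XML declaration.
    return doc.toxml(encoding="utf-8")


if __name__ == "__main__":
    print(rss_item_xml("Caf\u00e9 & <friends>", "http://example.com/").decode("utf-8"))
```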

We have a 'lang' field in the user table. We'll probably also need a default-encoding userprop, and then we need to expose both that and the language field.

And we should modify the protocol so a client can say which encoding its data is in, and we convert from that. Internally we'll store all data as UTF-8.
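A rough sketch of that conversion step, assuming the client-declared encoding arrives as a plain string. It is written in Python rather than with Unicode::MapUTF8, and the function name to_utf8 is hypothetical, not part of any LiveJournal code.

```python
def to_utf8(raw: bytes, source_encoding: str = "utf-8") -> bytes:
    """Convert bytes in a declared encoding to UTF-8 for storage.

    Raises UnicodeDecodeError if the bytes are not valid in that encoding,
    and LookupError if the encoding name is unknown.
    """
    text = raw.decode(source_encoding)
    return text.encode("utf-8")


if __name__ == "__main__":
    # A Latin-1 encoded "café" becomes its two-byte UTF-8 form.
    print(to_utf8(b"caf\xe9", "iso-8859-1"))
```

Failing loudly on bad input is a design choice in this sketch; silently replacing undecodable bytes would hide exactly the kind of mismatch that produces bad XML later.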

And if we detect a charset on an HTTP POST, we'll do the conversion automatically. I still have to look into how that works (which HTTP request headers are sent...).
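One place a charset can show up is the charset parameter of the Content-Type request header. Whether clients actually send it is the open question above, so the sketch below only shows the parsing side and falls back to a default when the parameter is missing. The function name and the UTF-8 default are assumptions.

```python
from email.message import Message


def charset_from_content_type(header_value: str, default: str = "utf-8") -> str:
    """Return the charset parameter of a Content-Type value, or a default."""
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_charset() returns the charset lowercased, or None if absent.
    return msg.get_content_charset() or default


if __name__ == "__main__":
    print(charset_from_content_type(
        "application/x-www-form-urlencoded; charset=ISO-8859-1"))
```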

Just wanted to make it known that I really do want LiveJournal to be smart about encodings and non-English languages. It's just slow going doing so much at once.

Anybody interested in working on this? If you need help I can guide you.

binary files from LJ cvs are checked out as text files ?

i tried to check out the LJ source from the CVS repository using TortoiseCVS, which is a windoze exploder plugin cvs gui.

it looks as if binary files (e.g. mood icon gif files) are checked out as text files, cause they look pretty scrambled when they reach my disk ..

i assumed that my cvs plugin wasn't configured correctly and so i tried several other repositories - but there binary files were checked out correctly.

looking at the cvs entries files for those other repositories, i found out that the "-kb" option (keyword expansion: binary) was set on their binary files.
on the cvs entries for the LJ binary files, this option seems not to be set.
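for anyone who wants to check their own working copy, here is a small sketch that scans the text of a CVS/Entries file and lists binary-looking files that lack "-kb". it assumes the usual entry layout of /name/revision/timestamp/options/tagdate, and both the function name and the list of binary suffixes are made up for illustration.

```python
from pathlib import Path

# Assumption: these suffixes are treated as binary for the purpose of the check.
BINARY_SUFFIXES = {".gif", ".jpg", ".png", ".ico"}


def entries_missing_kb(entries_text: str) -> list:
    """List binary-looking files in CVS/Entries text that lack the -kb option."""
    missing = []
    for line in entries_text.splitlines():
        if not line.startswith("/"):
            continue  # directory lines start with "D", not "/"
        fields = line.split("/")
        # Expected: ['', name, revision, timestamp, options, tagdate]
        if len(fields) < 6:
            continue
        name, options = fields[1], fields[4]
        if Path(name).suffix.lower() in BINARY_SUFFIXES and "-kb" not in options:
            missing.append(name)
    return missing


if __name__ == "__main__":
    entries = Path("CVS/Entries")
    if entries.exists():
        print(entries_missing_kb(entries.read_text()))
```

if the list is not empty, the files were probably added without being marked binary. as far as i know, the usual repair is for someone with write access to the repository to mark them with cvs admin -kb, after which clients fetch them again.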

however, when i access the LJ binary files using the webcvs interface, i can download them correctly (one by one ... yuk). so in the LJ repository the binary files are ok, i just can't check them out correctly.

to me it looks as if the repository contains the binary files correctly, but somehow has "forgotten" that they are binary files, so that they are checked out as text files ???

is my theory complete bullshit ?

any clue what i'm doing wrong ?

  • Current Mood
    confused