I just started testing out UTF-8 stuff and am finding some issues...
Set UNICODE = 0. Make a non-ASCII post, interests, and bio.
Set UNICODE = 1. Restart LJ server.
Go to editinfo.bml. Non-ASCII characters in the bio and interests are question-marked out. If you hit save (say, to set your old encoding so the question marks go away on your /users/ view), then your interests and bio are "lost" (permanently question-mark-tized).
editinfo.bml shouldn't even load if it detects non-ASCII stuff. We discussed this already. There should be a simple page to let you set your old encoding, and any other page that detects weirdness (editinfo.bml, editjournal.bml, etc.) should force you to go to /utf8oldenc.bml or /utf8convert.bml, as appropriate.
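Roughly the kind of check I mean, sketched in Python just for illustration (not the actual LJ code). The function names are made up, and which of the two pages counts as "appropriate" is my assumption here: no old encoding set yet sends you to /utf8oldenc.bml, otherwise to /utf8convert.bml.

```python
from typing import Dict, Optional


def is_valid_utf8(data: bytes) -> bool:
    """True if the bytes decode cleanly as UTF-8 (plain ASCII included)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def redirect_for(fields: Dict[str, bytes], has_oldenc: bool) -> Optional[str]:
    """Return the page to force the user to, or None if the edit page may load.

    `fields` maps a field name (bio, interests, ...) to its raw stored bytes.
    The choice between the two pages is an assumption, not confirmed behavior.
    """
    suspicious = [name for name, raw in fields.items() if not is_valid_utf8(raw)]
    if not suspicious:
        return None
    return "/utf8convert.bml" if has_oldenc else "/utf8oldenc.bml"
```

The point is that the edit page never renders (and so can never save) fields it can't interpret.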
Anyway, don't save yet. Go look at /userinfo.bml. It question-mark-tized the bio, but not the interests. Bug.
Go back and set your old encoding. Now /users looks fine, and /talkread.bml looks fine (I assume talkpost too, since it's all in talklib, right?). But go to /editjournal.bml from the toolbar on talkread. You don't respect oldenc there, and you pass non-UTF-8 8-bit data to the browser, even in UTF-8 mode. Bug.
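What respecting oldenc on output could look like, again as a Python sketch rather than the real code. The function name and the order of fallbacks are my assumptions: try UTF-8 first, then the user's old encoding, and only question-mark the text as a last resort.

```python
from typing import Optional


def to_text_for_display(raw: bytes, oldenc: Optional[str]) -> str:
    """Turn stored bytes into text that is safe to send on a UTF-8 page."""
    # Already valid UTF-8 (this covers plain ASCII too): use it as-is.
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        pass
    # Legacy 8-bit data: interpret it using the user's old encoding, if set.
    if oldenc:
        try:
            return raw.decode(oldenc)
        except (UnicodeDecodeError, LookupError):
            pass  # unknown encoding name, or bytes that don't fit it
    # Last resort: keep ASCII and question-mark everything else.
    return "".join(chr(b) if b < 0x80 else "?" for b in raw)
```

Whatever comes out of this is then encoded as UTF-8 for the browser, so raw 8-bit bytes never reach the page.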
utf8convert.bml doesn't do interests?
utf8convert.bml gives an error if you preview with UTF-8 as the source encoding (which can't ever even happen... nobody does that... it shouldn't be a choice).
Get me a patch for these. I'll keep testing.