Brad Fitzpatrick (bradfitz) wrote in lj_dev,

utf-8 bugs


I just started testing out UTF-8 stuff and am finding some issues...

Set UNICODE = 0. Make a non-ASCII post, interests, and bio.
Turn UNICODE = 1. Restart LJ server.

Go to editinfo.bml. Non-ASCII characters in the bio and interests are question-marked out. If you hit save (like, say, to set your old-encoding to remove the question marks from your /users/ view), then your interests and bio are "lost" (permanently question-mark-tized).

editinfo.bml shouldn't even load if it detects non-ASCII stuff. We discussed this already. There should be a simple page to let you set your old encoding, and any other page that detects weirdness (editinfo.bml, editjournal.bml, etc.) should force you to go to /utf8oldenc.bml or /utf8convert.bml, as appropriate.
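Roughly the guard I mean, sketched in Python rather than LJ's Perl. The names `looks_legacy` and `redirect_for` are made up for illustration, the "non-ASCII and not valid UTF-8" test is only a heuristic stand-in for however the real code decides that data is unconverted, and the rule for picking between the two pages is an assumption (the text above only says "as appropriate").

```python
from typing import Mapping, Optional


def looks_legacy(raw: bytes) -> bool:
    """Heuristic: True if raw has 8-bit bytes that are not valid UTF-8."""
    try:
        raw.decode("utf-8")  # pure ASCII also decodes cleanly
        return False
    except UnicodeDecodeError:
        return True


def redirect_for(fields: Mapping[str, bytes], oldenc: Optional[str]) -> Optional[str]:
    """Return the page to force the user to, or None if the edit page may load.

    Assumption: no old encoding set yet -> /utf8oldenc.bml,
    old encoding already set -> /utf8convert.bml.
    """
    if not any(looks_legacy(value) for value in fields.values()):
        return None
    return "/utf8oldenc.bml" if oldenc is None else "/utf8convert.bml"
```

The point is that the check runs before the form is rendered, so the user never gets a chance to save the question-marked version back over the original bytes.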

Anyway, don't save yet. Go look at /userinfo.bml. It question-mark-tized the bio, but not the interests. Bug.

Go back and set your old encoding. Now /users looks fine, and /talkread.bml looks fine (I assume talkpost too, since it's all in talklib, right?). But go to /editjournal.bml from the toolbar on talkread. You don't respect oldenc there, and pass non-UTF-8 8-bit data to the browser, even in UTF-8 mode. Bug.
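What "respecting oldenc" should amount to on output, again as a Python sketch and not the actual talklib code. `to_utf8` is a hypothetical helper, and the validity check is the same heuristic assumption: bytes that already decode as UTF-8 are passed through, anything else is treated as being in the user's old encoding.

```python
def to_utf8(raw: bytes, oldenc: str) -> bytes:
    """Return raw as UTF-8 bytes, transcoding from oldenc when needed."""
    try:
        raw.decode("utf-8")
        return raw  # ASCII or already UTF-8: send as-is
    except UnicodeDecodeError:
        # Legacy 8-bit data: interpret it in the user's old encoding.
        return raw.decode(oldenc).encode("utf-8")
```

For example, `to_utf8(entry_bytes, "cp1251")` on an old Cyrillic entry gives bytes that are safe to send in a UTF-8 page, and nothing stored in the database is changed.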

utf8convert.bml doesn't do interests?

utf8convert.bml gives an error if previewing with UTF-8 source encoding (which can't ever even happen... nobody does that... shouldn't be a choice)

Get me a patch for these. I'll keep testing.
