July 24th, 2001

(no subject)

Hi again, all.

Thanks for the help earlier. I've knocked up a 'Current music:' analyser in Perl. It uses syncitems to get a list of journal entries (subject to the first-500 limitation), then getevents with 'one' to download each, pushing the current_music value, if any, onto an array. Anything in the form /(.*) - (.*)/ (i.e. Artist - Title) gets tallied in a hash, which is then sorted into descending numerical order, then lexical, and printed.
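
For anyone curious before grabbing the file, the counting and sorting bit boils down to roughly this (a sketch only, assuming the current_music strings have already been collected into @music):

my %count;
for my $music (@music) {
    next unless $music =~ /^(.*) - (.*)$/;    # only "Artist - Title" lines
    $count{"$1 - $2"}++;                      # tally each track
}
# descending numerical on the count, then lexical on the track name:
for my $track (sort { $count{$b} <=> $count{$a} || $a cmp $b } keys %count) {
    printf "%4d  %s\n", $count{$track}, $track;
}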

Anyway. Um. The code's here, and there's a sample session as a text file here.

If anyone wants to help add features such as:
  • functions (!)
  • output to HTML
  • graphing of results using GD::Graph::Lines
  • GTK as a front-end
  • moods as well as music (not as trivial as it sounds, as getevents returns moodids rather than moods; see the sketch after this list)
let me know, or feel free to just hack on the code and post it somewhere here.
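
On the moods point: if memory serves, mode=login with getmoods set returns mood_N_id/mood_N_name pairs, so the first step would be building a lookup table. A sketch, assuming %res holds the parsed flat key/value pairs from the login response:

# Map moodids to mood names using the login response
# (mood_count, mood_1_id, mood_1_name, ...).
my %moodname;
for my $n (1 .. $res{'mood_count'}) {
    $moodname{ $res{"mood_${n}_id"} } = $res{"mood_${n}_name"};
}
# A current_moodid from getevents then resolves as $moodname{$moodid}.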

Hope someone finds this vaguely entertaining. :-)

~C.

calling all search engines

LiveJournal currently uses links of the following format in order to access an individual entry:

http://www.livejournal.com/talkpost.bml?itemid=7520663


Most search engines REFUSE to follow a link with a "?" in it.

The Calendar view (one day at a time) uses a format like the following:

http://revjim.livejournal.com/day/2001/07/18


This IS search engine friendly.

I think talkpost.bml and talkread.bml should be modified to accept a URL like the following:

http://www.livejournal.com/talkpost.bml/7520663


This will make them search engine friendly and allow more LiveJournal content to be spidered (unless of course the user chooses to add the robots.txt entry to stop indexing by search engines).

The existing "?" format should be retained in order to preserve any preexisting links to entries
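
The parsing change needn't be big, either. Something along these lines would do it (a sketch only, not the actual talkpost.bml code; the variable names are made up):

# Accept the itemid from either the new path form
# (/talkpost.bml/7520663) or the old query-string form (?itemid=7520663).
my $itemid;
if (($ENV{'PATH_INFO'} || '') =~ m!^/(\d+)$!) {
    $itemid = $1;    # new, search-engine-friendly form
} elsif (($ENV{'QUERY_STRING'} || '') =~ /(?:^|&)itemid=(\d+)/) {
    $itemid = $1;    # old form, kept so preexisting links keep working
}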


Questions? Comments?

Saving Server/Bandwidth Resources

Reading the entries in lj_maintenance and revjim's comment here in lj_dev, I started thinking about how well it would work to improve the cacheability of LiveJournal.

Benefits would be reduced bandwidth and possibly fewer queries to the databases.

Current state of caching on Livejournal:

The Cache-Control header is set to "private, proxy-revalidate", which means no decently compliant cache will even think about caching the information.

ETag headers are not set properly in all locations. talkpost.bml seems to set one correctly, but my user page does not. At least, the ETag is not recognized by Cacheability Query.


Possible changes:

If a valid ETag is being generated, i.e. one that is unique for the content of the page, then everything is at least somewhat known before the page is spit out. This means it would be possible to check whether there are any friends-only/private posts on the page before the headers are sent. Pages without such posts could then get a Cache-Control header which doesn't include "private".

It would still be possible to make caches revalidate the page before serving it to the end user, or to set some low freshness lifetime. That lifetime could vary between paid and free accounts to save more bandwidth/server resources.

If a valid ETag is being generated, it should be possible to generate a valid Last-Modified header as well. Last-Modified seems to work better for some unknown reason.
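
To make that concrete, the header logic might end up looking something like this (a sketch with made-up variable names; I don't know where the headers actually get set in the codebase):

# Pick cache headers once the page content is known.
# $has_private - true if any friends-only/private post is on the page
# $etag, $lastmod - derived from the entries being shown
# $is_paid - whether the journal owner has a paid account
if ($has_private) {
    print "Cache-Control: private, proxy-revalidate\r\n";
} else {
    my $maxage = $is_paid ? 60 : 300;    # let free accounts be cached longer
    print "Cache-Control: public, max-age=$maxage\r\n";
}
print qq{ETag: "$etag"\r\n};
print "Last-Modified: $lastmod\r\n";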


Now I will admit I am not familiar with the codebase, so some of these changes could possibly cause more harm (DB Activity) than good, but I thought I would spew this forth and see if it made sense.
  • Current Music
    KMFDM - Ikons

S2 Progress

Being away from LJ, I've got a bunch done on S2. It now generates Perl code from the input S2, including doing all the S2 validation. None of this was done until last evening, when I started. (amazing how much I can get done without having to worry about the damn sysadmin crap) The LJ::S2::* API is pretty much defined too (did that on the Ohio roadtrip) and I'm calling into it already from s2compile's Perl output, but I haven't actually written it yet.

All the hard stuff is now done. Now I just need to write the web/Perl crap to tie it all together. Instead of writing the real core.s2 file right away, I'm going to make up a mock one to use for beta-testing so people can get a feel for S2, then give feedback once they understand how it all works. (this isn't a substitute for documentation, though... I have that mostly written, but I just haven't gotten around to committing anything to CVS)

Writing the final core.s2 file will be the last hurdle, but we can deal with that later.

Could someone help me un-gimp my code?

Mystically, a few weeks ago, my BeOS LJ client stopped working... I haven't had much time to work on it until now, and I've looked back a bit in lj_dev for the answer, but haven't yet found it... I _think_ the text I am sending to the LJ server is correct, but perhaps I am just a total gimp... :)

  • Current Music
    Mr. Bungle - The Air-Conditioned Nightmare

Weird weird bug...

This is very odd. When I post *this* HTTP request, it's all fine:


POST http://www.livejournal.com:80/cgi-bin/log.cgi HTTP/1.1
Host: www.livejournal.com
Content-type: application/x-www-form-urlencoded
Content-length: 232

mode=postevent&user=test&
password=test&
event=This+test+%2Ashould%2A+segfault+
clive+because+I+use+
%3CCODE%3E%24EDITOR%3C%2FCODE%3E+
in+it&
subject=Test&lineendings=unix&security=public&allowmask=1&
year=2001&mon=07&day=24&hour=19&min=56



But when I post this other HTTP request, I don't get anything back. It's rather weird:

POST http://www.livejournal.com:80/cgi-bin/log.cgi HTTP/1.1
Host: www.livejournal.com
Content-type: application/x-www-form-urlencoded
Content-length: 310

mode=postevent&user=test&password=test&
event=Hopefully+this+will+bomb+because+I+want+a+failure%2E%2E%2Ebut+then+
again%2C+maybe+it%0Awon%27t%2E++I+
%2Aam%2A+using+%3CCODE%3E%24EDITOR%3C%2FCODE%3E+
to+edit+my+entry%2E&
subject=Test&lineendings=unix&security=public&allowmask=1&
year=2001&mon=07&day=24&hour=19&min=58



Any suggestions?

Editor's Note: Any spurious newlines in the query text are simply due to formatting by the browser.
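
One way to take a miscounted Content-length out of the picture is to let LWP build the request itself. A rough sketch, posting the same second event (swap in a real test account):

#!/usr/bin/perl -w
use strict;
use LWP::UserAgent;
use HTTP::Request::Common qw(POST);

# POST the form via LWP, which urlencodes the values and computes
# Content-length itself, ruling out an error in the hand-built request.
my $ua  = LWP::UserAgent->new;
my $res = $ua->request(POST 'http://www.livejournal.com/cgi-bin/log.cgi', [
    mode        => 'postevent',
    user        => 'test',    # throwaway test account, as above
    password    => 'test',
    event       => "Hopefully this will bomb because I want a failure..."
                 . "but then again, maybe it\nwon't.  "
                 . "I *am* using <CODE>\$EDITOR</CODE> to edit my entry.",
    subject     => 'Test',
    lineendings => 'unix',
    security    => 'public',
    allowmask   => 1,
    year => 2001, mon => '07', day => 24, hour => 19, min => 58,
]);
print $res->is_success ? $res->content : $res->status_line, "\n";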
  • Current Mood
    confused