March 31st, 2002 - LiveJournal Development

random thought. [Mar. 31st, 2002|01:48 am]
LiveJournal Development

lj_dev

[mishakal]
Hey, it's just a small suggestion, but wouldn't it be a neat little feature if, when you viewed someone's user profile, the friends you share in common with that person were highlighted, the way shared interests are? I dunno, heh. Maybe that's stupid, but it just came to me now ;)

Similar Interest Magic Index [Mar. 31st, 2002|06:18 pm]

[bradfitz]
If you have a paid account, you've probably played with the newly reinstated similar interest user search page.

A pre-warning about the Magic Index it uses to sort: not a ton of thought has gone into the constants in there.

I just wanted to get it live. I tweaked the numbers a bit until the results got better for a number of users, but it's far from perfect.

I have 18 hours of airports and flying tomorrow, so maybe I'll do the math and figure out some better constants.

Basically, the root of the magic is:

$magic{$_} = $pt_weight{$_}*20 + $pt_count{$_};

Here $pt_count is one point per shared interest, and $pt_weight is (1 / total users interested in that thing) points per shared interest, so a match on a rare interest is worth more than a match on a popular one.
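To make the arithmetic concrete, here is a minimal sketch of the scoring in Python (not the Perl from interests.bml). The function name magic_index, the in-memory interest_users dict, and the sample users are all made up for illustration; the real code reads from the database and may differ in details.

```python
from collections import defaultdict

def magic_index(me, my_interests, interest_users):
    """Score other users by shared interests.

    interest_users is a hypothetical stand-in for the database:
    it maps an interest to the set of users who list it.
    """
    pt_count = defaultdict(int)     # one point per shared interest
    pt_weight = defaultdict(float)  # 1 / (users with that interest), summed
    for interest in my_interests:
        users = interest_users.get(interest, set())
        for user in users:
            if user == me:
                continue
            pt_count[user] += 1
            pt_weight[user] += 1.0 / len(users)
    # Same constants as the formula in the post: weight * 20 + count.
    return {u: pt_weight[u] * 20 + pt_count[u] for u in pt_count}

# Hypothetical data: "perl" is listed by 4 users, "zeppelins" by only 2.
interest_users = {
    "perl": {"me", "alice", "bob", "carol"},
    "zeppelins": {"me", "alice"},
}
scores = magic_index("me", ["perl", "zeppelins"], interest_users)
# alice: (1/4 + 1/2) * 20 + 2 = 17.0
# bob:   (1/4) * 20 + 1       = 6.0
print(sorted(scores.items(), key=lambda kv: -kv[1]))
```

With these numbers the rare interest contributes 10 of alice's 17 points, which shows how strongly the constant 20 favors weight over raw count.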

I'd be interested to hear thoughts from alanj, toast, evan, and metadaisy in particular.

Relevant code is here:
http://cvs.livejournal.org/browse.cgi/~checkout~/livejournal/htdocs/interests.bml?rev=1.18

Search for "findsim" and read down from there, until the string "Magic Index?".

The reason this feature can be back is that the query time is bounded. We iterate over a user's (up to 150) interests and, for each, query a few hundred random users with that interest. We don't pull them all in because, well... then it wouldn't be bounded and, once again, it wouldn't scale. Remember: you can only have 150 interests, but nothing stops all 512,000 LJ users from being interested in "sex". Besides, it's useless to pull all that in, since we shouldn't weight that interest match much anyway. Doing it "correctly" isn't possible in a reasonable amount of time.

It could be a directory search filter, though, which has that HTTP recheck thing and the check that only one query at a time is going on... but then the dirsearchres2 format would need a header saying which data format it is, and one new format would have to include weights, so the directory.bml UI could show them.
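The bounded per-interest sampling described above can be sketched roughly like this, again in Python rather than the Perl and SQL of the real page. The names sample_candidates and SAMPLE_LIMIT are hypothetical, and 300 is only a guess at "a few hundred"; the actual code gets its bound from a LIMIT clause in the query.

```python
import random

MAX_INTERESTS = 150   # per-user interest cap mentioned in the post
SAMPLE_LIMIT = 300    # assumed value for "a few hundred"

def sample_candidates(my_interests, interest_users,
                      limit=SAMPLE_LIMIT, rng=random):
    """Return at most `limit` random users for each interest.

    interest_users is a hypothetical in-memory stand-in for the
    interest table; the real thing would be one LIMITed query
    per interest.
    """
    sampled = {}
    for interest in list(my_interests)[:MAX_INTERESTS]:
        users = sorted(interest_users.get(interest, ()))
        k = min(limit, len(users))
        sampled[interest] = rng.sample(users, k)
    return sampled

# Total work is capped at MAX_INTERESTS * limit rows,
# no matter how popular any single interest is.
popular = {"sex": {"user%d" % i for i in range(10000)}}
picked = sample_candidates(["sex"], popular)
print(len(picked["sex"]))  # 300
```

The point of the design is that the cost depends only on the two constants, not on how many users share an interest.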

But blah. Later, perhaps. This was a 15-minute hack this morning once I realized the trick was the LIMIT clause.

This post is to solicit weighting change suggestions only, not redoing the whole algorithm.
