November 6th, 2001 - LiveJournal Development
LiveJournal Development


November 6th, 2001

question [Nov. 6th, 2001|12:55 am]
LiveJournal Development
lj_dev
[asciident]
Okay, so it's fairly obvious that the directory will be returning soon. So, um, I was wondering if this will be limited to paid accounts at first?

Basically, I ask because this is a highly anticipated feature return. I'm sure that as soon as word gets out that it's back up, it will get used a LOT right at the beginning. Now, I understand that the rewrite was meant to make it more efficient, but is there any concern that very heavy use right when it comes back will cause problems?

(no subject) [Nov. 6th, 2001|09:56 pm]
LiveJournal Development

lj_dev

[visions]
okay, since posting to lj_support got little feedback.. i'll post here.

recently, something (be it data replication or something else) has resulted in the paid servers offering lagged/stale/outdated/incorrect data to the users.

now, one of the points to the paid servers is that the paid users get faster and more dedicated service. that is a nice benefit... right? I would say so.

now, one would say.. okay... i can understand/deal with a tiny bit of lag with the paid servers' data since it IS being replicated. but what about when that delay is on the order of 10-20 minutes.. and sometimes perhaps even longer?

what happens when you are getting comments from free users on posts that you have made.. and the posts themselves are not even visible yet on the paid servers? is that acceptable? situations like that would make me want to switch over to not use the fast servers just so that i could see the posts when others could.

with that said, it seems that the underlying architecture and design for paid versus free users is flawed. based on an operational dissection of what is happening (purely observed), it seems as if the schema that it operates under is this:

  • all posts go to the main database, regardless of the cookie (the cookie only determines which webserver is posting to the database)

  • that database is periodically replicated to backup databases which serve the paid users.

  • the main database is what is polled if you do not have ljfastserver defined in your cookie.



with that said, i think it is a drastic design flaw. people are in effect being penalized for paying by getting out of date data. perhaps i am misinterpreting something, but from the behavior.. the model seems to be as i described above.

in my opinion it should be reversed... operating under a schema such as this:


  • all posts go to the main database regardless of user status

  • paid user servers (ljfastserver defines) directly query from the main database, or perhaps from one that is replicated on a much quicker basis (on the order of less than a minute lag as a worst case).

  • all other servers (non-paid users or people that explicitly decide not to use the ljfastserver cookie) query replicated databases at ALL times.



anyway, i hope this makes sense.. i was interrupted a lot while typing it. thoughts?

update:
since there was a little confusion, i will clarify... the schema i presented as what is currently going on was an OPERATIONAL schema. by that i mean, that is what it appears to be operating as, not necessarily what it is implemented as.

secondly... i will refine my "ideal" situation since i didn't explain it well the first time...

as noted here...


  • all posts go to the main database regardless of user status

  • paid user servers (ljfastserver defines) directly query from the main database (until a certain load-level is reached), and then at that point queries are load-delegated to one that is replicated on a much quicker basis (on the order of less than a minute lag as a worst case). once the load level of the main database drops to an acceptable level, the load is rebalanced.

  • all other servers (non-paid users or people that explicitly decide not to use the ljfastserver cookie) query replicated databases at ALL times.
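
the read-routing rule in the three points above can be sketched roughly as follows. this is only an illustration in python, not anything from the livejournal codebase: the `Db` class, the `pick_read_db` function, the normalized `load` figure, and the 0.8 threshold are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Db:
    name: str
    load: float  # hypothetical normalized load, 0.0 (idle) to 1.0 (saturated)


def pick_read_db(is_paid, master, fast_replica, slow_replicas, max_master_load=0.8):
    """Choose which database should serve a read query.

    Writes are not modeled here: in the proposal they always go to the master.
    """
    if is_paid:
        # Paid readers use the master until it passes the load threshold,
        # then fall back to the quickly-replicated database.
        if master.load < max_master_load:
            return master
        return fast_replica
    # Everyone else always reads from a replica; take the least loaded one.
    return min(slow_replicas, key=lambda db: db.load)
```

because the check runs on every call, paid reads move back to the master by themselves once its load drops below the threshold, which is the "rebalanced" step described above.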



another possibility is to have a pool of slave dbs that are allocated to paid members (as is currently the case, i believe), and based on load on those servers, re-allocate slave dbs from the free user pool to the paid member pool dynamically to deal with the load.. putting at most n-1 (or n-2.. whatever is the minimum acceptable based on the number of servers in the pool) servers in the paid member pool. anytime the max # of slave db servers are in the paid member pool, some paid dev person (dormando i imagine) should be paged, emailed, or whatever.. all automatically.
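
a minimal sketch of that pool idea, again in python and again purely hypothetical: the `rebalance` function, the `high`/`low` load thresholds, and the `min_free` floor are invented for illustration, and the actual paging/emailing is left to the caller.

```python
def rebalance(paid_pool, free_pool, paid_load, high=0.8, low=0.4, min_free=1):
    """Move at most one slave db between pools, based on paid-pool load.

    paid_pool and free_pool are lists of server names, modified in place.
    Returns True when the paid pool holds the maximum number of slaves
    (the free pool is down to min_free), meaning someone should be alerted.
    """
    if paid_load > high and len(free_pool) > min_free:
        # Paid pool is overloaded: borrow one slave from the free pool.
        paid_pool.append(free_pool.pop())
    elif paid_load < low and len(paid_pool) > 1:
        # Load has dropped: hand one slave back to the free pool.
        free_pool.append(paid_pool.pop())
    return len(free_pool) <= min_free
```

with `min_free=1` this corresponds to the "at most n-1" case; `min_free=2` gives the n-2 variant.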

----

explanation of issue:

from brad (here):

normally all db-slaves are 0-2 seconds behind. replication is pretty much instant.

then we turned on synchronous key writes (or rather, disabled async key writes) because otherwise, mysql shutdowns take 5 minutes or so. while we were fixing all the database key corruption caused by the power outage, we wanted to be able to restart quickly to test.

we would've changed it back sooner, but we were debugging another issue and changing too many things at once wouldn't help our analysis of the problem. besides, the site was holding up fine even with sync key writes. until today. i leave, i come back, things be fucked. dormando made mackey (paid db) be async. i took all traffic off it, let it catch back up, then gave it traffic again. dormando's also going around turning it off on the others.

NOW ... here's the real problem: our load balancing for db connections sucks.

i wrote dbselector to monitor them all and tell web slaves where to go dynamically (web slaves get leases on handles, have to revalidate them every 'n' seconds) but it's not in use yet because I want more people to audit it.
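
(the lease idea described above might look something like the following python sketch. it is not dbselector's actual code or interface, which isn't shown in this post: the `Selector` and `LeasedHandle` classes, the per-db lag table, and the 5 second lease are all assumptions for illustration.)

```python
import time


class Selector:
    """Stand-in for the monitoring side: tracks how far behind each db is."""

    def __init__(self, lag_by_db):
        self.lag_by_db = lag_by_db  # db name -> seconds behind the master

    def best_db(self):
        # Recommend the db with the smallest replication lag.
        return min(self.lag_by_db, key=self.lag_by_db.get)


class LeasedHandle:
    """Web-slave side: keeps using one db until its lease runs out."""

    def __init__(self, selector, lease_seconds=5.0, clock=time.monotonic):
        self.selector = selector
        self.lease_seconds = lease_seconds
        self.clock = clock
        self.db = None
        self.expires_at = 0.0

    def get(self):
        now = self.clock()
        if self.db is None or now >= self.expires_at:
            # Lease missing or expired: ask the selector again.
            self.db = self.selector.best_db()
            self.expires_at = now + self.lease_seconds
        return self.db
```

(the trade-off in a scheme like this is the lease length: a short lease moves traffic off a lagging db sooner, at the cost of more revalidation requests.)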

avva's going to audit/test/improve it.

we're aware of our weaknesses. we know how to fix them. we just need more manpower.

if you want to work on bin/dbselectd.pl, it'd be much appreciated.