Brad Fitzpatrick (bradfitz) wrote in lj_dev,

Massive Death

I woke up to find all the web servers using hundreds of megs of swap and perl processes spinning endlessly.

Kenny didn't like using all its swap and kept crashing.

I finally killed and restarted everything so it's working for now. However, here's the big question:


I imagine two possibilities:

1) We were attacked. There are no limits in LJ code... somebody could potentially upload a 500 MB journal entry and LJ code would choke all over the place trying to deal with it.

2) I can't code. There's some infinite loop eating up memory, which I think is more likely, given that shutting down apache killed all but several perl processes which continued to eat memory and grow.

But... the symptoms of 2) could be caused by 1)... like if it was still processing a 500 MB journal entry.

So, I need to put limits in place pretty fast so this doesn't happen again, and at the same time I'd like people to continue to review my code for things that are shitty.
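For possibility 1), the limit could be as simple as refusing an oversized entry before any other code touches it. Here's a rough sketch of the idea, written in Python rather than LJ's Perl and purely for illustration. The 64 KB cap and the names `check_entry_size` and `EntryTooLarge` are made up for the example and are not anything in the LJ code.

```python
MAX_ENTRY_BYTES = 64 * 1024  # hypothetical cap, not a real LJ setting


class EntryTooLarge(Exception):
    """Raised when a submitted entry exceeds the size cap."""


def check_entry_size(declared_length, body, limit=MAX_ENTRY_BYTES):
    """Reject an oversized entry before any expensive processing happens.

    declared_length: the Content-Length the client sent (int, or None if absent).
    body: the raw entry bytes actually received.
    """
    # Cheap check first: bail out on the declared length alone.
    if declared_length is not None and declared_length > limit:
        raise EntryTooLarge(f"declared {declared_length} bytes, limit is {limit}")
    # The client may lie or omit the header, so check the real size as well.
    if len(body) > limit:
        raise EntryTooLarge(f"received {len(body)} bytes, limit is {limit}")
    return body
```

The point is to do the size check before anything expensive happens to the body, so a 500 MB post gets rejected up front instead of being parsed.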

This sucks. I need more help.

Update: It's happening again. :-(

