Sources:
Section two (storage concepts and terminology) and section four (choosing your RAID configuration, a big comparison between RAID sets) of Dell's "Flexible Array Storage Tool" manual, which came with the PERC2 RAID card in the PowerEdge 2400 server I own.
The SCSI Bus and IDE Interface (Protocols, Applications, and Programming) by Friedhelm Schmidt.
I found out that my book is pretty old, so I didn't really use it.
http://linas.org/linux/raid.html, which was referenced from the raidtools README file.
/usr/doc/Linux-HOWTOs/Software-RAID-HOWTO
(dated January 2000)
Namely, section 7 on performance.
http://linas.org/linux/Software-RAID/Software-RAID-8.html
The only searching I did was for my Dell manual on my floor, the SCSI book I found buried under my perl and crypto books, and a locate search for the RAID HOWTOs, because I'm a lazy bastard.
Stuff:
Database tendencies:
The database does a lot of random reads and random writes, about as much as can be expected from a database. We do try to buffer the writes, but access still bounces around between sections of a table and between different tables, so it's not quite sequential.
We have been limited significantly more by the I/O speed of the system than by the space usage of the database: we have about 40G of space available to the database and use only about 4-4.5G total. A lot of the slowness used to be horrid random-write and random-read problems; now it's mostly random-read problems with a healthy chunk of random-write problems.
I have a link to a logfile of iostat output on cartman, from last Thursday, starting at 3pm and ending at around 12pm EST. Stats were taken every ten seconds. A warning, though: it's a text file just over a megabyte in size.
RAID 5:
Space quotient: (N-1)*S, where N is the number of drives and S is the size of the smallest drive in the set.
Let's say we want ten drives per set (for both the RAID 5 and RAID 10 setups). For RAID 5, we have ten 36G drives. One drive's worth of capacity is lost to parity (the (N-1)*S above), and we set another aside as a hot-spare, since RAID 5 is generally secure, but not quite secure enough (it can only survive one lost drive before losing data). That leaves eight drives' worth of space available in this set, which gives us 288G.
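As a sanity check on that arithmetic, here's a minimal Python sketch (the function name and hot-spare handling are my own illustration, not anything from the sources above):

    # Usable space in a RAID 5 set: (N-1)*S after setting aside hot-spares,
    # since one drive's worth of capacity goes to the distributed parity.
    def raid5_usable(drives, drive_size_gb, hot_spares=0):
        active = drives - hot_spares
        return (active - 1) * drive_size_gb

    print(raid5_usable(10, 36, hot_spares=1))  # -> 288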
RAID 5 read performance, for both random and sequential reads, is better than that of a single disk. Random-write performance is significantly slower, because each small write turns into a parity read-modify-write (read the old data and old parity, then write the new data and new parity). Sequential writes are slightly slower from the parity calculation. (See question 4 from http://linas.org/linux/Software-RAID/Software-RAID-8.html for more information.)
RAID 10:
Space quotient: 50% of the drive space put into the set is lost, since each drive has a redundant mirror; the mirrored pairs are then striped for volume.
Let's say again that we have ten drives per set: the same ten 36G drives, forming five mirrored pairs. We do not need a hot-spare due to the extra redundancy of the setup, and this gives us 180G total.
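The same sort of sanity check for RAID 10 (again, the code is just my own sketch):

    # Usable space in a RAID 10 set: half the total capacity,
    # since every drive is paired with a mirror before striping.
    def raid10_usable(drives, drive_size_gb):
        assert drives % 2 == 0, "RAID 10 needs an even number of drives"
        return (drives // 2) * drive_size_gb

    print(raid10_usable(10, 36))  # -> 180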
RAID 10 read performance is significantly increased for both random and sequential reads, since each drive in each mirror over the entire volume can read data independently. Write performance is slightly hindered: writes are load-balanced over the mirrors, and each mirrored write is slower than a write to a single disk, though still much faster than normal (unstriped) mirrored writes. Random writes are OK thanks to the load-balancing; sequential writes are slower than normal (and even slower than RAID 5).
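To put the random-write difference in rough numbers, here's a back-of-envelope Python sketch using the classic small-write penalties (four disk I/Os per random write for RAID 5's read-modify-write, two for RAID 10's mirrored write). The per-disk IOPS figure is a made-up placeholder, not a measurement from cartman:

    # Rough random-write throughput for a set of drives, each capable of
    # `disk_iops` random I/Os per second, divided by the write penalty.
    def random_write_iops(drives, disk_iops, penalty):
        return drives * disk_iops / penalty

    disk_iops = 100  # placeholder per-drive figure
    print(random_write_iops(9, disk_iops, penalty=4))   # RAID 5, 9 active drives: 225.0
    print(random_write_iops(10, disk_iops, penalty=2))  # RAID 10, 10 drives: 500.0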
Conclusion:
(insert your own conclusion.)
Extra stuff:
I started prodding my contacts, and have been told of two people who have set up systems for extremely large websites in a similar configuration. They should be contacting me sometime soon.
Also, I will not offer my own opinion on this; that is up to the readers. I will comment if something needs clarification.