RAID Primer: What's in a number?
by Dave Robinet on September 7, 2007 12:00 PM EST - Posted in Storage
Introduction
The majority of home users have experienced the agony of at least one hard drive failure. Power users often run into bottlenecks caused by their hard drives when they try to accomplish I/O-intensive tasks. Every IT person who has been in the industry for any length of time has dealt with multiple hard drive failures. In short, hard drives have long caused the majority of support headaches in standard desktop and server configurations, with little hope of improvement in the near term.
With the increased use of computers in daily life worldwide, the dollar value of the data stored on the average computer has steadily risen. MTBF figures have climbed from around 8,000 hours in the 1980s (example: MiniScribe M2006) to current levels of over 750,000 hours (Seagate 7200.11 series drives), but the growing value of data has offset that relative decrease in hard drive failures. The rising value of data, combined with the general unwillingness of most casual users to back up their hard drives on a regular basis, has put increasing focus on technologies that can help users survive a hard drive failure. RAID (Redundant Array of Inexpensive Disks) is one of these technologies.
Drawing on whitepapers produced in the late 1970s, the term RAID was coined in 1987 by researchers at the University of California, Berkeley in an effort to put into practice the theoretical gains in performance and redundancy that could be made by teaming multiple hard drives in a single configuration. While their paper proposed specific levels of RAID, the practical needs of the IT industry have since produced several slightly different approaches. The most common today are:
RAID 0 - Data Striping
RAID 1 - Data Mirroring
RAID 5 - Data Striping with Parity
RAID 6 - Data Striping with Redundant Parity
RAID 0+1 - Data Striping with a Mirrored Copy
Each of these RAID configurations has its own benefits and drawbacks, and is targeted for specific applications. In this article we'll go over each and discuss in which situations RAID can potentially help - or harm - you as a user.
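As a quick illustration of the parity idea that underpins RAID 5 (and, with a second parity calculation, RAID 6), here is a minimal Python sketch of single-parity striping. It is purely conceptual - real RAID implementations work on raw disk blocks in firmware or in the OS storage stack and rotate the parity block across the drives - but it shows how XOR parity lets any one missing block be rebuilt from the survivors.

# Conceptual sketch only: XOR of all data blocks in a stripe gives a parity
# block, and XOR of the surviving blocks reconstructs any one lost block.

def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, b in enumerate(block):
            result[i] ^= b
    return bytes(result)

# One "stripe" spread across three data drives plus one parity drive.
data_blocks = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data_blocks)

# Simulate losing drive 2: rebuild its block from the others plus parity.
survivors = [data_blocks[0], data_blocks[2], parity]
rebuilt = xor_blocks(survivors)
assert rebuilt == data_blocks[1]  # the lost block is recovered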
41 Comments
tynopik - Saturday, September 8, 2007
> So I'm looking for a solution which stores my data in a "normal" way on the discs + one extra disk with the parity (somewhat like RAID 3 but without the striping).
unRAID: http://www.lime-technology.com/
tynopik - Saturday, September 8, 2007
I should point out that:
1. It does NOT join your drives together into one volume; each drive is separate (this is basically necessary for what you want unless you go the WHS route)
2. It has to be run on a dedicated system that it turns into a NAS (you can't run it on your main desktop, for instance)
That said, I really like the idea: almost all of the advantages of the WHS mechanism but much more space-efficient (in most cases; I assume the largest drive will always be 'lost' to parity data)
Dave Robinet - Saturday, September 8, 2007
Really, you're looking for something that is several RAID 1 mirrors of single volumes. I can think of nothing off-the-shelf that fits all those needs, though "rolling your own" may help:
- Buy two drives. Create one large partition (say, D:) on drive 1. Mirror that.
- Buy two more drives. Create another large partition (say, E:) on drive 3. Mirror that.
Etc, etc.
It's still the same volume, but if you do it using software, the two drives won't be dependent on each other in any way.
If you tear one of those drives out of your computer and slap it onto another one (USB connector, etc), then it'll come up just fine, with or without the mirror.
It's inelegant, and really not something I'd ever push on someone - but you've come up with a kind of oddball request, there. :) Might I ask what it's for? Maybe your criteria can be adjusted in some way.
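For anyone who just wants the "pull the drive out and it reads anywhere" property without touching OS-level mirroring at all, a scheduled one-way file copy to a second drive gets much of the way there. The sketch below is a hypothetical, simplified sync script (the drive letters are examples only); it is not the block-level mirror described above, and it only protects data as of its last run.

# Hypothetical one-way sync: copy new or changed files from one drive to
# another so that either drive remains independently readable on its own.
import os
import shutil

SOURCE = "D:\\"        # example paths - adjust to your own drives
DESTINATION = "E:\\"

def sync(src, dst):
    for root, _dirs, files in os.walk(src):
        rel = os.path.relpath(root, src)
        target_dir = os.path.join(dst, rel)
        if not os.path.isdir(target_dir):
            os.makedirs(target_dir)
        for name in files:
            s = os.path.join(root, name)
            d = os.path.join(target_dir, name)
            # Copy if the file is missing on the destination or the source is newer.
            if not os.path.exists(d) or os.path.getmtime(s) > os.path.getmtime(d):
                shutil.copy2(s, d)  # copy2 preserves timestamps

if __name__ == "__main__":
    sync(SOURCE, DESTINATION)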
tynopik - Saturday, September 8, 2007
> but you've come up with a kind of oddball request, there. :) Might I ask what it's for? Maybe your criteria can be adjusted in some way.
I understand what he's getting at:
he wants protection from drive failure, so a 'parity drive' that can rebuild any one drive that fails is handy
but he's also concerned about losing more than 1 drive simultaneously
Having just a plain filesystem on each disk is far more robust than any sort of striping system: if worst comes to worst, you can just yank any surviving drives and recover what's on them.
- a series of RAID 1 arrays (like what you described) works but isn't particularly flexible (needs equal-sized drives)
- WHS is more flexible and powerful, but it still requires double the amount of storage (EXPENSIVE)
- this only requires 1 extra drive and allows it to back up any number of other drives
It comes from a desire for some protection without being able or willing to spend enough for true duplication, plus wanting something that fails gracefully (i.e. not RAID 5).
I would actually like something like that for my system: there's a chance of recovering everything, but if it hits the fan I'll at least be able to recover something, plus it's not that expensive.
Don't forget there may be physical limitations. If you have 4 physical drives filled with data, you might have enough room and power connectors for a 5th drive, but not for 4 more.
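To put rough numbers on that trade-off, here is a small, hypothetical back-of-the-envelope sketch comparing usable space for full duplication (RAID 1 pairs or WHS-style folder duplication) against a single dedicated parity drive on a mixed set of disks. The drive sizes are made up purely for illustration.

# Hypothetical capacity comparison; sizes in GB are examples only.
drives = [500, 400, 320, 250]

# Full duplication: roughly half of the raw capacity is usable.
duplication_usable = sum(drives) // 2

# Single dedicated parity drive (unRAID-style): everything is usable
# except the largest drive, which holds the parity.
parity_usable = sum(drives) - max(drives)

print("raw capacity:        ", sum(drives), "GB")
print("with duplication:    ", duplication_usable, "GB usable")
print("with 1 parity drive: ", parity_usable, "GB usable")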
Sudder - Sunday, September 9, 2007
Thanks, this goes a big step in the right direction.
Since unRAID uses Slackware there should be (at least in theory) a possibility to do this with the Linux "Logical Volume Manager" (although one would probably have to do some work so that the TOC is saved on every disc to still be able to access the data if some of the other discs are gone).
But even without that, separate volumes and the option to access them one by one by mounting the ReiserFS is good enough for me.
> 2. It has to be run on a dedicated system that it turns into a NAS
and that's a big downside.
When I have some time I'll probably try to run it in a virtual machine (the "use a physical disc" option in a VM should reduce the performance penalties significantly), but I'm not that optimistic that this will also work with the "bigger", non-free versions that can handle more than 3 discs (e.g. handling of the registration key, since I already ran into some pre-boot USB issues with a VM when I tried to test the BitLocker feature of Vista in a VM - although it might just have been my old stick or my USB controller ..)
> I assume the largest drive will always be 'lost' to parity data
yes (that's kind of a given) - the option to use discs of different sizes is a nice bonus though, since the array can now grow more "organically" over time (you just buy the disc with the best cost/gig ratio at the moment you need it without limiting yourself to one size like with RAID 5)
> Don't forget there may be physical limitations.
With the port-multiplier option of SATA II (up to 15 drives per cable) and an external casing I think there are ways to cope (and if you plan in advance to have X bays/connectors available, you just have to start a new array if the old one is full - which might be a good idea anyway as soon as you come close to double-digit disc numbers - although that might take some time with modern disc sizes ;-) )
Again: I don't necessarily want to be able to access my data all the time, I just want to switch from my current "DVD storage" to "HD storage".
So what I'm looking for now is the functionality of unRAID (without the limitation on the drive number), able to run in a VM, and for free ;-).
I already checked FreeNAS and NASLite but they all seemed to be fixed either on RAID and/or JBOD without parity .. any suggestions?
tynopik - Sunday, September 9, 2007
> with the port-multiplier option of SATA II (up to 15 drives per cable)
and which consumer-level products support port-multipliers?
it's an optional part of the spec and most don't implement it
If you're willing to do a lot of extra work and hassle and really want offline storage, you can fill a bunch of external drives with a virtual filesystem (like TrueCrypt, for instance) and then, with them all connected, run par to build a par file across all your virtual disk files.
The disadvantages are numerous: you have to be able to connect all disks at once, if you update one little piece of one drive you have to recalculate the par file across all of them, etc.
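As a very rough sketch of that idea, the snippet below builds a single XOR "parity file" across several container files and can rebuild any one of them if it is lost. Real par/par2 tools use Reed-Solomon codes and can repair partial damage, so treat this only as an illustration of the shape of the scheme - and of why the whole parity file has to be recomputed whenever any source file changes. The file names are hypothetical.

# Illustration only: XOR parity across whole files (par2 actually uses
# Reed-Solomon). Shorter files are implicitly padded with zeros.
import os

def build_parity(paths, parity_path):
    size = max(os.path.getsize(p) for p in paths)
    parity = bytearray(size)
    for p in paths:
        with open(p, "rb") as f:
            data = f.read()
        for i, b in enumerate(data):
            parity[i] ^= b
    with open(parity_path, "wb") as f:
        f.write(parity)

def rebuild(missing_path, surviving_paths, parity_path, original_size):
    # XOR the parity with every surviving file to recover the missing one.
    with open(parity_path, "rb") as f:
        recovered = bytearray(f.read())
    for p in surviving_paths:
        with open(p, "rb") as f:
            data = f.read()
        for i, b in enumerate(data):
            recovered[i] ^= b
    with open(missing_path, "wb") as f:
        f.write(recovered[:original_size])  # trim the zero padding

# Example usage (hypothetical file names):
# build_parity(["disk1.tc", "disk2.tc", "disk3.tc"], "parity.bin")
# rebuild("disk2.tc", ["disk1.tc", "disk3.tc"], "parity.bin", original_size=...)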
Sudder - Sunday, September 9, 2007
Most e-SATA ports support it (although some controllers, like the ones using the JMB36X, just support "Command-based Switching"). Solutions with Sil 3132 chips (e.g. many notebook SATA PCMCIA adapters) even support "FIS-based Switching", which works a little like SCSI: the command is sent to one disc and then the bus is free again, so you can get "close" to the theoretical 300 MB/sec with multiple discs, and the bus is not blocked by one working disk (with 4 discs connected, a test showed still 40 MB/sec transfer from each of the 4 parallel working discs ..)
So take, AFAIR, one of the many new Gigabyte motherboards with 2 ports that can be used as e-SATA, put e.g. a "Dawicontrol DC-6510 PM" on the other end of the cable (one is about 100 bucks), add a power supply and a housing, and you are good for 10 extra discs ..
Look at more recent motherboards (e-SATA slowly shows up on more and more boards) and you'll find that it's increasingly supported.
Well, I'm kind of too lazy to do the par thing each and every time I change one little file - or, to be more practical, I can very much imagine myself pushing the rebuild further and further into the future as long as I can foresee that I will add more stuff in the very near future, which will then again require a rebuild .. (If I don't find a usable solution which does it "on the fly" I'll probably end up doing it "by hand" (eventually by adding a small RAID 5 "file buffer" to my system to stretch the write/par intervals), but I _really_ would prefer an automatic solution without a most likely multi-hour rebuild process (reading all discs, calculating, and writing the whole par disk) after each little change ..)
Witling - Saturday, September 8, 2007
Something I don't usually see in articles on RAID is the complete lack of protection from failure due to a virus or installation of a bad driver. Both disks get corrupted. I am a home user of RAID 1 through a controller built into the motherboard, using a popular Redmond, Washington operating system.
Dave Robinet - Saturday, September 8, 2007
Yep - I touched briefly on this in the last part of the article. Users need to look closely at whether an ARCHIVAL system (tape, etc.) is better for their needs than RAID 1.
Let's face it - RAID 1 is for "(almost) always on / critically needed to be working when powered up" configurations ONLY. How many home computers fall into this category... really?
kobymu - Saturday, September 8, 2007
In certain cases RAID 1 will give you better read performance.