To be honest, 4 TB of storage isn't really necessary for home usage. However, I like to collect movies in full DVD or HD quality and so I need some storage.

I decided to build myself a NAS box based on Debian Etch. Samba is used to allow clients to access the data. The machine was initially based on 4 x 0.5 TB disks, using the four SATA ports on the mainboard. With Linux's built-in support for software RAID, I created a RAID 5 array, giving me 1.5 TB of storage space. Since a single movie is around 4 GB, the 1.5 TB turned out to be rather tight.

So I bought 4 x 1 TB disks and a HighPoint RocketRAID 2320 controller (PCI Express x4, eight SATA ports). I put all 8 disks on this controller.

I wanted to create a single RAID 6 array using both the 1 TB disks and the 0.5 TB disks. I didn't want to create two separate arrays: although that would have provided additional space, it wouldn't have given me the same level of safety as RAID 6.

I chose RAID 6 mainly because I cannot afford a backup solution for this amount of data. I'm aware that RAID is no substitute for a proper backup, but that is a risk I accept.

So how do you create a RAID 6 array from drives of different sizes? The solution is fairly simple. Just put two 0.5 TB disks together in one RAID 0 volume and you have a 'virtual' 1 TB disk. Since I had four 0.5 TB disks, I could create 2 'virtual' 1 TB disks.

The only downside is that I had to shave a little storage capacity off the native 1 TB drives, because 2 x 0.5 TB provides slightly less storage space than a single 1 TB disk. We're talking about something like 50 MB here, so it's not a big deal in my opinion.
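
The nesting described above could be set up with mdadm roughly as follows. This is a dry-run sketch that only prints the commands, not a record of what was actually typed. The names /dev/md0, /dev/md1, /dev/md5, /dev/sda1 to /dev/sdd1 and the 128K chunk size are taken from the mdadm output further down; the partition names /dev/sde1 to /dev/sdh1 for the 0.5 TB disks are an assumption.

```shell
#!/bin/sh
# Dry-run sketch: prints the mdadm commands instead of executing them.
# Assumption: the four 0.5 TB disks appear as /dev/sde1 .. /dev/sdh1
# (hypothetical names, not taken from the article).

run() {
    # To execute for real, change the body of this function to: "$@"
    echo "$@"
}

build_arrays() {
    # Two RAID 0 pairs, each acting as one 'virtual' 1 TB disk.
    run mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/sde1 /dev/sdf1
    run mdadm --create /dev/md1 --level=0 --raid-devices=2 /dev/sdg1 /dev/sdh1

    # RAID 6 over the four native 1 TB disks plus the two virtual ones.
    # mdadm sizes every member to the smallest device, which is where the
    # small capacity loss on the native 1 TB drives comes from.
    run mdadm --create /dev/md5 --level=6 --raid-devices=6 --chunk=128 \
        /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/md0 /dev/md1
}

build_arrays
```

The RAID 0 volumes have to exist before the RAID 6 array is created, since they are used as ordinary member devices of it.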

The funny thing is that this array actually performs rather well. The disks are connected to the HighPoint RocketRAID 2320 controller, which is used just for its SATA ports; its on-board RAID functionality is not used. For RAID, I use Linux software RAID, managed with mdadm. This is what the RAID 6 array looks like:

    server:~# mdadm --detail /dev/md5
    /dev/md5:
            Version : 00.90.03
      Creation Time : Thu Jul 24 22:40:26 2008
         Raid Level : raid6
         Array Size : 3906359808 (3725.40 GiB 4000.11 GB)
        Device Size : 976589952 (931.35 GiB 1000.03 GB)
       Raid Devices : 6
      Total Devices : 6
    Preferred Minor : 5
        Persistence : Superblock is persistent
        Update Time : Sun Aug 10 15:36:18 2008
              State : clean
     Active Devices : 6
    Working Devices : 6
     Failed Devices : 0
      Spare Devices : 0
         Chunk Size : 128K
               UUID : 0442e8fa:acd9278e:01f9e43d:ac30fbff (local to host server)
             Events : 0.14170

    Number   Major   Minor   RaidDevice State
       0       8        1        0      active sync   /dev/sda1
       1       8       17        1      active sync   /dev/sdb1
       2       8       33        2      active sync   /dev/sdc1
       3       8       49        3      active sync   /dev/sdd1
       4       9        0        4      active sync   /dev/md0
       5       9        1        5      active sync   /dev/md1
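
As a sanity check on these numbers: the array size should equal the per-device size times the number of data-bearing members, which for RAID 6 is the device count minus two. A small sketch; the function name is made up for illustration.

```shell
#!/bin/sh
# raid6_array_size DEVICES DEVICE_SIZE
# Usable size of a RAID 6 array: two members' worth of space goes to parity.
# The result is in the same unit as DEVICE_SIZE (mdadm reports 1 KiB blocks).
raid6_array_size() {
    echo $(( ($1 - 2) * $2 ))
}

# Values taken from "Raid Devices" and "Device Size" in the output above.
raid6_array_size 6 976589952   # prints 3906359808, matching "Array Size"
```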

And this is how this array performs:

    server:~# dd if=/storage/test.bin of=/dev/null bs=1M
    10000+0 records in
    10000+0 records out
    10485760000 bytes (10 GB) copied, 45.7107 seconds, 229 MB/s

    server:~# dd if=/dev/zero of=/storage/test.bin bs=1M count=10000
    10000+0 records in
    10000+0 records out
    10485760000 bytes (10 GB) copied, 81.0798 seconds, 129 MB/s

With 229 MB/s read performance and 129 MB/s write performance using RAID 6, I think I should be content.

How important is the availability of an information system to you and your company? What are the costs of, let's say, a couple of hours of downtime and maybe the loss of all the work since the last backup?

Depending on the information system, the impact can be quite grave, I presume. So what are the biggest risks regarding the availability of your systems? Human error is probably number one. Number two might be the hardware.

One of the most unreliable components of the hardware on which your precious information systems run is the good old hard drive. There are not two but three certainties in life: death, taxes and that sooner or later hard drives will fail.

So back in the eighties (a patent had already been awarded in 1978) some smart people invented RAID. Using RAID, your information system can tolerate a disk failure and still continue to operate.

There are many different flavors of RAID, so-called RAID levels. One of the most popular RAID levels is RAID 1: two disks acting as one. If one fails, the other takes over. For performance, you can stripe several such pairs together to get a RAID 10 array. However, 50% of your storage space is wasted, because for every n of storage you need (n / c) * 2 disks, where c represents the capacity of a single drive.

RAID level 5 is a more efficient solution. Using this RAID level, only the capacity of one disk is lost in order to provide redundancy. So for every n of storage, you need (n / c) + 1 disks. It is easy to see that for larger arrays with more disks, RAID 5 is much more efficient. The downside of RAID 5 is mainly (write) performance compared to RAID 10. However, if the performance is sufficient for your workload, that is not an issue. Hence the popularity of RAID 5.

This story is all about risk vs. costs. And RAID 1 and RAID 5 carry a risk that cannot be neglected and should be pointed out. If a drive fails, redundancy is lost. From that moment until the faulty drive has been replaced and the array rebuilt, you run the risk of losing the entire RAID array and all its data if another disk fails.

How big is that risk? Well, that is the weakest point of this article. I honestly don't know. There is some anecdotal "evidence" that it occurs occasionally. And it is not that surprising: rebuilding an array puts extra stress on all the disks involved, which might be fatal for a second drive.

Today, RAID arrays of 10+ disks are not a rarity. With that number of drives, it wouldn't be surprising if a second drive failed during recovery. It's simple: with a 10-disk array, the chance that some disk fails is roughly twice that of a 5-disk array.
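
To put a rough number on that doubling: if every disk fails independently with probability p over some period, the chance that at least one of n disks fails is 1 - (1 - p)^n, which is close to n * p as long as p is small. The sketch below assumes independent failures and a made-up p of 3%; neither comes from measurements, and the function name is invented for illustration.

```shell
#!/bin/sh
# p_any_failure N P
# Probability that at least one of N disks fails, assuming each disk fails
# independently with probability P (a simplifying assumption).
p_any_failure() {
    awk -v n="$1" -v p="$2" 'BEGIN { printf "%.4f\n", 1 - (1 - p) ^ n }'
}

# P = 0.03 is an illustrative value, not a measured failure rate.
p_any_failure 5 0.03
p_any_failure 10 0.03
```

With these assumed values the result is about 14% for 5 disks and about 26% for 10 disks, so slightly less than double.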

The most common solution is to revert to RAID 10. RAID 10 consists of mirrored disk pairs striped together into one big virtual disk. RAID 10 can tolerate the loss of up to 50% of its drives, as long as no more than one member of each pair fails. The caveat is obvious: if a disk fails and the other drive of that pair fails during recovery, the whole array is lost. However, compared with RAID 5, the risk is reduced. In degraded (non-redundant) mode, any further drive failure will destroy a RAID 5 array. RAID 10 can tolerate additional drive failures as long as they do not hit the partner of a drive that has already failed.

So, although the risk that a second drive failure destroys your array is greatly reduced with RAID 10 (compared to RAID 5), there is still a risk that the array is lost if the 'wrong' drive fails.
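
The difference can be made concrete by counting, for an array that has already lost one disk, how many of the remaining disks would take the whole array down if they failed next. A minimal sketch, assuming RAID 10 built from two-disk mirror pairs; the function names are made up for illustration.

```shell
#!/bin/sh
# Number of surviving disks whose failure would destroy an N-disk array
# that is already running with one failed disk.

fatal_disks_raid5() {
    # Degraded RAID 5 has no redundancy left: any of the N-1 survivors is fatal.
    echo $(( $1 - 1 ))
}

fatal_disks_raid10() {
    # Only the mirror partner of the failed disk is fatal (two-disk mirrors),
    # so the answer does not depend on the array size given in $1.
    echo 1
}

n=10
echo "RAID 5:  $(fatal_disks_raid5 $n) of $(( n - 1 )) remaining disks are fatal"
echo "RAID 10: $(fatal_disks_raid10 $n) of $(( n - 1 )) remaining disks is fatal"
```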

So the solution should be that redundancy is not lost when a single drive fails. RAID 6 provides that solution. RAID 6 is similar in nature to RAID 5, but an additional drive is sacrificed for additional redundancy. So for every n of storage, you need (n / c) + 2 disks. If you need 10 TB of storage using 1 TB disks, you need 12 disks. If a disk fails, the array is still redundant. Even a second drive can fail and the array will still continue to operate. I think the chance that a third drive fails before the array is rebuilt is so low that it is an acceptable risk.
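
The three disk-count formulas from this article can be put side by side. A small sketch using the 10 TB on 1 TB disks example; the function name is made up, and it assumes the required storage is a whole multiple of the disk capacity.

```shell
#!/bin/sh
# disks_needed LEVEL STORAGE DISK_CAPACITY
# Disks required for STORAGE usable space, with both sizes in the same unit
# (e.g. TB). Assumes STORAGE is a whole multiple of DISK_CAPACITY.
disks_needed() {
    data=$(( $2 / $3 ))               # n / c: disks that hold actual data
    case "$1" in
        10) echo $(( data * 2 )) ;;   # every data disk is mirrored
        5)  echo $(( data + 1 )) ;;   # one disk's worth of parity
        6)  echo $(( data + 2 )) ;;   # two disks' worth of parity
        *)  echo "unsupported RAID level: $1" >&2; return 1 ;;
    esac
}

for level in 10 5 6; do
    echo "RAID $level: $(disks_needed "$level" 10 1) disks for 10 TB"
done
```

For 10 TB on 1 TB disks this gives 20 disks for RAID 10, 11 for RAID 5 and 12 for RAID 6.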

For smaller arrays, the risk of a double drive failure might not be high enough to justify RAID 6, but with larger arrays (more drives) RAID 6 might become a necessity.

So there you have it. With current costs of hard drives and the wide support for RAID 6, it is an option that should be taken into account when designing the hardware platform for an information system. 

Afterthought: this article is mainly about considering RAID 6 instead of RAID 5. RAID 5 or 6 may often not be a solution if performance in terms of IO (input/output) is an issue. Please note that when running in degraded mode (after a drive failure) the performance penalty on RAID 5 and RAID 6 will be severe (possibly as much as 80%). RAID 10 will suffer far less in that regard.

20 DISK 18 TERABYTE NAS

Just for fun, I've built myself an 18 TB NAS based on Debian Linux, software RAID, 20 disks and a Norco 4020 case.
