Articles in the RAID category

  1. Speeding Up Linux MDADM RAID Array Rebuild Time Using Bitmaps

    December 22, 2011

When a disk fails or gets kicked out of your RAID array, it often takes a lot of time to recover the array. For my own array of 20 disks, recovering a single drive takes 5 hours.

Wouldn't it be nice if that time could be reduced? Even to 5 seconds?

Although not enabled by default, you can enable so-called 'bitmaps'. As I understand it, a bitmap is basically a map of your RAID array that charts which areas need to be resynced if a drive fails.

This is great, because I have the issue that about once every 30 reboots a disk won't get recognized and the array comes up degraded. Adding the disk back into the array means that the system will be recovering for 5+ hours.

I enabled bitmaps, and after adding a missing disk back into the array, the array was recovered instantly.

    Isn't that cool?

So there are two types of bitmaps:

    1. internal: part of the array itself
    2. external: a file residing on an external drive outside the array

The internal bitmap is integrated into the array itself. Keeping the bitmap up to date will probably affect the performance of the array, but I didn't notice any performance degradation.

The external bitmap is a file that must reside on an EXT2 or EXT3 file system that is not on top of the RAID array. This means that you need an extra drive for it, or you have to use your boot drive. I can imagine that this solution has less impact on the performance of the array, but it is a bit more of a hassle to maintain.

    I enabled an internal bitmap on my RAID arrays like this:

    mdadm --grow /dev/md5 --bitmap=internal
    

    This is all there is to it. You can configure an external bitmap like this:

    mdadm --grow /dev/md5 --bitmap=/some/directory/somefilename
    
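To verify that a bitmap is actually active on the array, you can check /proc/mdstat or the mdadm detail output; a quick sanity check, assuming /dev/md5 is your array:

    cat /proc/mdstat
    mdadm --detail /dev/md5 | grep -i bitmap
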

    There probably will be some performance penalty involved, but it does not seem to affect sequential throughput, which is the only thing that is important for my particular case.

    For most people, I would recommend configuring an internal bitmap, unless you really know why you would have to use an external bitmap.
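
Should you ever want to remove a bitmap again, mdadm can do that with the same --grow syntax (again using my /dev/md5 array as the example):

    mdadm --grow /dev/md5 --bitmap=none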

  2. Do Not Buy a Hardware RAID Controller for Home Use

    November 17, 2010

    Hardware RAID controllers are considered 'the best' solution for high performance and high availability. However, this is not entirely true. Using a hardware RAID controller might even endanger your precious data.

For enterprise environments, where performance is critical, it is more important that the array keeps delivering data at high speed. Professional RAID controllers use TLER with TLER-enabled disks to limit the time spent on recovering bad sectors. If a disk encounters a bad sector, there is no time to pause and try to fix it: the disk is simply dropped from the RAID array after a couple of seconds. At that moment, the array still performs relatively well, but there is no redundancy. If another disk fails (another bad sector?) the array is lost, with all its data.
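
As a side note: whether a drive uses such a short error-recovery timeout (marketed as TLER, ERC or CCTL depending on the vendor) can often be inspected, and on some drives changed, with smartctl. This is just an illustrative sketch; /dev/sda is a placeholder for one of your drives:

    smartctl -l scterc /dev/sda          # show the current error recovery timeouts
    smartctl -l scterc,70,70 /dev/sda    # set read/write timeouts to 7 seconds (values are tenths of a second)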

More and more people are building NAS boxes for centralized storage of data for private home use. Since disks are cheap, it is possible to create lots of storage capacity for little money. Creating backups of terabytes of data is, however, not cheap. Or you would have to build a second NAS box, which is very expensive and not worth the effort.

People seem to spend lots of money on expensive enterprise-level hardware RAID cards, not understanding that the whole TLER mechanism increases the risk to their data. In enterprise environments, budgets are relatively big and data is always backed up. Because of those backups, they can afford the risk of losing a RAID array. But consumers often don't have the money to spend on creating backups of terabytes of data. They just go for RAID 5 or RAID 6 and hope for the best.

    For consumers, if the RAID array goes, all data is lost.

So consumers should choose a RAID solution that will do its best to recover from hardware failure. Performance is not so much an issue; reliability is. So consumers do want disks to spend 'ages' on recovering bad sectors if that means the RAID array will survive.

Linux software RAID and ZFS do not use TLER and are therefore a safer choice for your data than regular hardware RAID controllers. You may still use such controllers (but please test them properly), but only to provide SATA ports for individual disks; the RAID part should be handled by Linux.

    So in my opinion, hardware RAID controllers are more expensive, require more expensive (enterprise) disks and are less safe for your data.

  3. Linux Software RAID Benchmarking Script

    September 29, 2010

    Just a small post.

To benchmark your Linux software RAID array as set up with mdadm, please use my new benchmark script. I used this script to create these results.

You may need to configure some values in the header of this file to make it fit your environment.

     DEVICES="/dev/sd[a-f]"
     NO_OF_DEVICES=6
     ARRAY=/dev/md5
     CHUNKS="4 8 16 32 64 128 256 512 1024"
     MOUNT=/storage
     LOG=/var/log/raid-test.log
     LOGDEBUG=/var/log/raid-test-debug.log
     LEVEL="0 5 6 10"
     TESTFILE=$MOUNT/test.bin
     TESTFILESIZE=10000   # in MB, so this is 10 GB
     TRIES=5              # how many times to run each benchmark
    

By default, the script will format the array using XFS; feel free to format it with another filesystem such as EXT4 or EXT3 or whatever you want to test.
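
For those curious, a single benchmark iteration roughly boils down to steps like the following (a simplified sketch with example values, not the actual script):

     mdadm --create /dev/md5 --level=5 --chunk=64 --raid-devices=6 /dev/sd[a-f]
     mkfs.xfs -f /dev/md5
     mount /dev/md5 /storage
     dd if=/dev/zero of=/storage/test.bin bs=1M count=10000 conv=fdatasync   # sequential write
     echo 3 > /proc/sys/vm/drop_caches                                       # flush the page cache
     dd if=/storage/test.bin of=/dev/null bs=1M                              # sequential read
     umount /storage
     mdadm --stop /dev/md5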

  4. RAID 5 vs. RAID 6 or Do You Care About Your Data?

    August 13, 2010

Storage is cheap. Lots of storage with 10+ hard drives is still cheap. Running 10 drives, however, increases the risk of a drive failure tenfold, so RAID 5 is often used to keep your data up and running if a single disk fails.

But disks are so cheap and storage arrays are getting so vast that RAID 5 does not cut it anymore. With larger arrays, the risk of a second drive failure while the array is in a degraded state (one drive has already failed and the array is rebuilding or waiting for a replacement) is serious.

RAID 6 uses two parity disks, so you lose two disks of capacity, but the rewards in terms of availability are very large, especially for larger arrays.

I found a blog post that shows the results of a big simulation run on the reliability of various RAID setups. One picture from that post is important and is shown below: it shows the risk of the entire RAID array failing within 3 years.

From this picture, the difference between RAID 5 and RAID 6 regarding reliability (availability) is astounding. There is a strong relation between the size of the array (number of drives) and the risk that more than one drive fails, thus destroying the array. Notice the strong contrast with RAID 6.

    Even with a small RAID 5 array of 6 disks, there is already a 1 : 10 chance that the array will fail within 3 years. Even with 60+ drives, a RAID 6 array never comes close to a risk like that.

Creating larger RAID 5 arrays beyond 8 to 10 disks means there is a 1 : 8 to 1 : 5 chance that you will have to recreate the array and restore the contents from backup (which you have of course).

I have a 20-disk RAID 6 array running at home. Even with 20 disks, the risk that the entire array fails due to the failure of more than 2 disks is very small. It is more likely that I will lose my data due to a failing RAID controller, motherboard or PSU than due to dying drives.

There are more graphs that are worth viewing, so take a look at this excellent blog post.

  5. 'Linux RAID Level and Chunk Size: The Benchmarks'

    May 23, 2010

    Introduction

When configuring a Linux RAID array, a chunk size needs to be chosen. But what is the chunk size?

When you write data to a RAID array that implements striping (level 0, 5, 6, 10 and so on), the data sent to the array is broken down into pieces, and each piece is written to a single drive in the array. This is how striping improves performance: the data is written to the drives in parallel.

The chunk size determines how large such a piece will be for a single drive. For example: if you choose a chunk size of 64 KB, a 256 KB file will use four chunks. Assuming that you have set up a 4-drive RAID 0 array, the four chunks are each written to a separate drive, which is exactly what we want.

This also makes clear that choosing the wrong chunk size can hurt performance. If the chunk size were 256 KB, the file would be written to a single drive, so the RAID striping wouldn't provide any benefit, unless many such files were written to the array, in which case the different drives would handle different files.
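
For reference, the chunk size is specified with mdadm at array creation time; for example, a hypothetical 4-drive RAID 0 array with a 64 KB chunk size would be created like this:

    mdadm --create /dev/md0 --level=0 --raid-devices=4 --chunk=64 /dev/sd[b-e]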

In this article, I will provide some benchmarks that focus on sequential read and write performance. Thus, these benchmarks won't be of much relevance if the array must sustain a random I/O workload and needs high random IOPS.

    Test setup

    All benchmarks are performed with a consumer grade system consisting of these parts:

    Processor: AMD Athlon X2 BE-2300, running at 1.9 GHz.

    RAM: 2 GB

    Disks: SAMSUNG HD501LJ (500GB, 7200 RPM)

    SATA controller: Highpoint RocketRaid 2320 (non-raid mode)

    Tests are performed with an array of 4 and an array of 6 drives.

    • All drives are attached to the Highpoint controller. The controller is not used for RAID, only to supply sufficient SATA ports. Linux software RAID with mdadm is used.

• A single drive provides a read speed of 85 MB/s and a write speed of 88 MB/s.

    • The RAID levels 0, 5, 6 and 10 are tested.

• Chunk sizes from 4 KB up to 1024 KB are tested.

    • XFS is used as the test file system.

    • Data is read from/written to a 10 GB file.

• The theoretical maximum throughput of a 4-drive array is 340 MB/s. A 6-drive array should be able to sustain 510 MB/s.

    About the data:

• All tests have been performed by a Bash shell script that accumulated all data; there was no human intervention when acquiring data.

    • All values are based on the average of five runs. After each run, the RAID array is destroyed, re-created and formatted.

    • For every RAID level + chunk size, five tests are performed and averaged.

    • Data transfer speed is measured using the 'dd' utility with the option bs=1M.

    Test results

    Results of the tests performed with four drives:

    Test results with six drives:

    Analysis and conclusion

    Based on the test results, several observations can be made. The first one is that RAID levels with parity, such as RAID 5 and 6, seem to favor a smaller chunk size of 64 KB.

    The RAID levels that only perform striping, such as RAID 0 and 10, prefer a larger chunk size, with an optimum of 256 KB or even 512 KB.

It is also noteworthy that RAID 5 and RAID 6 performance doesn't differ that much.

Furthermore, the theoretical transfer rates that should be achievable based on the performance of a single drive are not met. The exact cause is unknown to me, but overhead and the relatively weak CPU may play a part in this, and the XFS file system may also have a role. Overall, software RAID does not seem to scale well on this system. Since my big storage monster (as seen on the left) performs way better, I suspect that it is a hardware issue, probably because the M2A-VM consumer-grade motherboard can't go any faster.
