Articles in the RAID category

  1. Speeding Up Linux MDADM RAID Array Rebuild Time Using Bitmaps

    Thu 22 December 2011


    Update 2020: Please beware of the impact of bitmaps on random write I/O performance.

    Please note that with a modern Linux distribution, bitmaps are enabled by default. They will not help speed up a rebuild after a failed drive, but they will help resync an array that got out of sync due to a power failure or another intermittent cause.
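
    A quick way to check whether an array already has a bitmap is to look for a 'bitmap:' line in /proc/mdstat, or to query the array with mdadm (/dev/md0 is just an example device name):

    cat /proc/mdstat
    mdadm --detail /dev/md0 | grep -i bitmap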


    When a disk fails or gets kicked out of your RAID array, it often takes a lot of time to recover the array. Recovering a single drive in my own array of 20 disks takes 5 hours.

    Wouldn't it be nice if that time could be reduced? Even to 5 seconds?

    Although not enabled by default, you can enable so-called 'bitmaps'. As I understand it, a bitmap is basically a map of your RAID array that charts which areas need to be resynced if a drive fails.

    This is great, because I have the issue that about once every 30 reboots, a disk won't get recognized and the array comes up degraded. Adding the disk back into the array means that the system will be recovering for 5+ hours.

    I enabled bitmaps, and after adding a missing disk back into the array, the array was recovered instantly.

    Isn't that cool?
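
    For the record, with a bitmap in place, re-adding a dropped disk looks something like this (/dev/sdb is just an example device); mdadm can then resync only the regions that changed while the disk was gone:

    mdadm /dev/md5 --re-add /dev/sdb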

    So there are two types of bitmaps:

    1. internal: part of the array itself
    2. external: a file residing on an external drive outside the array

    The internal bitmap is integrated in the array itself. Keeping the bitmap up to date will probably affect the performance of the array. However, I didn't notice any performance degradation.

    The external bitmap is a file that must reside on an EXT2 or EXT3 based file system that is not on top of the RAID array. This means that you need an extra drive, or you have to use your boot drive. I can imagine that this solution has less impact on the performance of the array, but it is a bit more hassle to maintain.

    I enabled an internal bitmap on my RAID arrays like this:

    mdadm --grow /dev/md5 --bitmap=internal
    

    This is all there is to it. You can configure an external bitmap like this:

    mdadm --grow /dev/md5 --bitmap=/some/directory/somefilename
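
    Note that an external bitmap must also be known at assembly time. As I understand it, you can either pass it with mdadm --assemble --bitmap=... or record it in /etc/mdadm.conf on the ARRAY line, something like this (the UUID is a placeholder):

    ARRAY /dev/md5 UUID=xxxxxxxx:xxxxxxxx:xxxxxxxx:xxxxxxxx bitmap=/some/directory/somefilename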
    

    There will probably be some performance penalty involved, but it does not seem to affect sequential throughput, which is the only thing that is important for my particular case.
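
    If the penalty does matter for your workload, you can trade some of it away: a larger bitmap chunk means fewer bitmap updates during normal writes, at the cost of resyncing more data after a failure. Changing the chunk size requires removing the bitmap and re-creating it; the 128M value below is just an example, and this assumes an mdadm version that accepts a size suffix for --bitmap-chunk:

    mdadm --grow /dev/md5 --bitmap=none
    mdadm --grow /dev/md5 --bitmap=internal --bitmap-chunk=128M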

    For most people, I would recommend configuring an internal bitmap, unless you really know why you would have to use an external bitmap.

  2. Do Not Buy a Hardware RAID Controller for Home Use

    Wed 17 November 2010

    Hardware RAID controllers are considered 'the best' solution for high performance and high availability. However, this is not entirely true. Using a hardware RAID controller might even endanger your precious data.

    For enterprise environments, where performance is critical, it is more important that the array keeps on delivering data at high speed. Professional RAID controllers use TLER with TLER-enabled disks to limit the time spent on recovering bad sectors. If a disk encounters a bad sector, there is no time to pause and try to fix it: the disk is just dropped from the RAID array after a couple of seconds. At that moment, the array still performs relatively well, but there is no redundancy. If another disk fails (another bad sector?), the array is lost, with all its data.

    More and more people are building NAS boxes for centralized storage of data for private home use. Since disks are cheap, it is possible to create lots of storage capacity for little money. Creating backups of terabytes of data is, however, not cheap: you would have to build a second NAS box, and that is very expensive and not worth the effort for most people.

    People seem to spend lots of money on expensive enterprise-level hardware RAID cards, not understanding that the whole TLER mechanism increases the risk to their data. In enterprise environments, budgets are relatively big and data is always backed up. Thanks to those backups, they can afford the risk of losing a RAID array. But consumers often don't have the money to spend on backing up terabytes of data. They just go for RAID 5 or RAID 6 and hope for the best.

    For consumers, if the RAID array goes, all data is lost.

    So consumers should choose a RAID solution that does its best to recover from hardware failure. Performance is not so much an issue; reliability is. Consumers do want disks to spend 'ages' on recovering bad sectors if that means the RAID array will survive.

    Linux software RAID and ZFS do not rely on TLER and are therefore a safer choice for your data than regular hardware RAID controllers. You may still use such controllers (but please test them properly), but only to provide SATA ports for individual disks; the RAID part should be handled by Linux.
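
    On that note, the TLER-style timeout on SATA disks is exposed through SCT Error Recovery Control, which you can inspect and change with smartctl on drives that support it (/dev/sda is just an example; values are in tenths of a second, and 0 disables the timeout so the disk keeps trying to recover):

    smartctl -l scterc /dev/sda       # show the current error recovery timeouts
    smartctl -l scterc,0,0 /dev/sda   # disable the read/write timeouts on supporting drives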

    So in my opinion, hardware RAID controllers are more expensive, require more expensive (enterprise) disks and are less safe for your data.

    Tagged as: Uncategorized
  3. Linux Software RAID Benchmarking Script

    Wed 29 September 2010

    Just a small post.

    To benchmark your Linux software RAID array as set up with MDADM, please use my new benchmark script. I used this script to create these results.

    You may need to configure some values within the header of this file to make it fit your environment.

     DEVICES="/dev/sd[a-f]"
     NO_OF_DEVICES=6
     ARRAY=/dev/md5
     CHUNKS="4 8 16 32 64 128 256 512 1024"
     MOUNT=/storage
     LOG=/var/log/raid-test.log
     LOGDEBUG=/var/log/raid-test-debug.log
     LEVEL="0 5 6 10"
     TESTFILE=$MOUNT/test.bin
     TESTFILESIZE=10000   # in MB, so this is 10 GB
     TRIES=5              # how many times to run each benchmark
    

    By default, the script will format the array using XFS; feel free to format it with another filesystem such as EXT4 or EXT3 or whatever you want to test.
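
    For reference, a single benchmark iteration boils down to something like the sketch below. This is not the actual script, just a hand-written illustration using the variables from the header above; the dd-based sequential throughput test is my assumption:

     # hypothetical single iteration: RAID 5 with a 64 KB chunk
     mdadm --create $ARRAY --run --level=5 --chunk=64 \
           --raid-devices=$NO_OF_DEVICES $DEVICES       # --run skips the confirmation prompt
     mkfs.xfs -f $ARRAY
     mount $ARRAY $MOUNT
     dd if=/dev/zero of=$TESTFILE bs=1M count=$TESTFILESIZE conv=fsync   # sequential write
     echo 3 > /proc/sys/vm/drop_caches                                   # flush the page cache
     dd if=$TESTFILE of=/dev/null bs=1M                                  # sequential read
     umount $MOUNT
     mdadm --stop $ARRAY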

    Tagged as: Uncategorized
