When a disk fails or gets kicked out of your RAID array, it often takes a lot of time to recover the array. It takes 5 hours for my own array of 20 disks to recover a single drive.
Wouldn't it be nice if that time could be reduced? Even to 5 seconds?
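Although they are not enabled by default, you can enable so-called 'bitmaps' (write-intent bitmaps). As I understand it, a bitmap is basically a map of your RAID array that records which regions may be out of sync, so that when a drive drops out and is added back, only those regions need to be resynced.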
This is great, because I have an issue where, roughly once in every 30 reboots, a disk is not recognized and the array comes up degraded. Adding the disk back into the array then means that the system will be recovering for 5+ hours.
I enabled bitmaps, and after adding a missing disk back into the array, the array was recovered instantly.
Isn't that cool?
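For reference, here is a minimal sketch of how a dropped disk can be put back so that mdadm can use the bitmap. It relies on mdadm's --re-add option. The helper name readd_member, the RUN variable and the member device /dev/sdX1 are placeholders of mine, not something from my actual setup, so substitute your own devices.

```shell
# Hypothetical helper: prints the re-add command by default and only
# executes it when RUN=1 is set, so the command can be reviewed first.
readd_member() {
    array=$1     # the array, e.g. /dev/md5
    member=$2    # the disk that dropped out, e.g. /dev/sdX1 (placeholder)
    if [ "${RUN:-0}" = "1" ]; then
        # --re-add puts a former member back into its old slot; with a
        # bitmap, only the regions flagged as dirty need to be resynced.
        mdadm "$array" --re-add "$member"
    else
        echo "mdadm $array --re-add $member"
    fi
}
```

Calling readd_member /dev/md5 /dev/sdX1 only prints the command; prefixing the call with RUN=1 (as root) actually runs it.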
So there are two types of bitmaps:
- internal: part of the array itself
- external: a file residing on an external drive outside the array
The internal bitmap is integrated in the array itself. Keeping the bitmap up to date will probably affect the performance of the array. However, I didn't notice any performance degradation.
The external bitmap is a file that must reside on an ext2 or ext3 file system that is not on top of the RAID array. This means that you need an extra drive for it, or you need to use your boot drive. I can imagine that this solution has less impact on the performance of the array, but it is a bit more hassle to maintain.
I enabled an internal bitmap on my RAID arrays like this:
mdadm --grow /dev/md5 --bitmap=internal
This is all there is to it. You can configure an external bitmap like this:
mdadm --grow /dev/md5 --bitmap=/some/directory/somefilename
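To confirm that a bitmap is actually active after either command, you can look for a "bitmap:" line under the array in /proc/mdstat. Below is a small sketch that automates this check. The function name has_bitmap is my own invention, and it assumes the usual /proc/mdstat layout in which each array's entry ends with a blank line.

```shell
# Hypothetical helper: succeeds (exit status 0) if the given md array has
# a "bitmap:" line in its mdstat entry, and fails otherwise.
has_bitmap() {
    # $1: array name without /dev/, e.g. md5
    # $2: mdstat file to read (defaults to /proc/mdstat)
    awk -v dev="$1" '
        $1 == dev && $2 == ":"    { in_dev = 1; next }  # start of our array
        in_dev && NF == 0         { in_dev = 0 }        # blank line ends it
        in_dev && $1 == "bitmap:" { found = 1 }         # bitmap is reported
        END { exit !found }
    ' "${2:-/proc/mdstat}"
}
```

For example, has_bitmap md5 && echo "bitmap active" prints the message only when md5 reports a bitmap. Running mdadm --detail /dev/md5 should show similar information.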
There will probably be some performance penalty involved, but it does not seem to affect sequential throughput, which is the only thing that matters in my particular case.
For most people, I would recommend configuring an internal bitmap, unless you really know why you would have to use an external bitmap.