RAID Array Size and Rebuild Speed

Sat 03 April 2010 Category: Uncategorized

When a disk fails of a RAID 5 array, you are no longer protected against (another) disk failure and thus data loss. During this rebuild time, you are vulnerable. The longer it takes to rebuild your array, the longer you are vulnerable. Especially during a disk-intensive period, because the array must be reconstructed.

When one disk fails of a RAID 6 array, you are still protected against data loss because the array can take a second disk failure. So RAID 6 is almost always the better choice. Especially with large disks (1+ TB), because the rebuild time largely depends on the size of a single disk, not on the size of the entire RAID array.

However, there is one catch. The size of the RAID array matters when it becomes big, 10+ drives or more. No matter if you use hardware- or software- based RAID, the processor must read the contents of all drives simultaneously and use that information to rebuild the replaced drive. When creating a large RAID array, such as in my storage array, with 20 disks, the check and rebuild of the array becomes CPU-bound.

This is because the CPU must process 1,1 GB/s (as in gigabyte!) of data and use that data stream to rebuild that single drive. Using 1 TB drives, it checks or rebuilds the array at about 50 MB/s, which is less than half what the drives are capable of (100+ MB/s). Top shows that indeed the CPU is almost saturated (95%). Please note that a check or rebuild of my storage server takes about 5 hours currently, but that could be way shorter if the CPU was not saturated.

My array is not for professional use and fast rebuild times are not that of an issue. But if you're more serious about your setup, it may be advised to create more smaller RAID vollumes and glue them together using LVM or some similar solution.

Comments