Experiences Running ZFS on Ubuntu Linux 12.04

Thu 18 October 2012 Category: ZFS

I really like ZFS because, with the size of current data sets, I believe that data corruption is starting to become a real issue. Unfortunately, the license under which ZFS is released (the CDDL) does not permit it to be included in the Linux kernel. There is hope, though: a project called 'ZFS on Linux' provides ZFS support through a separately distributed kernel module, which sidesteps the license issue.

As ZFS is a true next-generation file system and, as far as I can tell, the only one in its class stable enough for production use, I decided to give it a try.

I used my existing download server running Ubuntu 12.04 LTS. I followed these steps:

  1. move all data to my big storage NAS;
  2. destroy the existing MDADM RAID arrays;
  3. recreate a new storage array through ZFS;
  4. move all data back to the new storage array.

Installation of ZFS is straightforward and well documented by the ZFS on Linux project. The main decision is how you set up your storage. My download server has six 500 GB disks and four 2 TB disks, ten drives in total. I decided to create a single zpool (the storage pool, comparable to a logical volume) consisting of two vdevs (arrays): one raidz1 vdev of the six 500 GB drives and a second raidz1 vdev of the four 2 TB drives.
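As a rough sketch of how a pool with two raidz1 vdevs can be created in one go (not necessarily the exact command used here), the helper below only prints a `zpool create` command instead of running it. The function name and the disk names are placeholders for illustration; substitute your own entries from /dev/disk/by-path/.

```shell
#!/bin/sh
# Print (not execute) a zpool create command for one pool with two raidz1 vdevs.
# All disk names below are placeholders, not the drives from this server.
print_create_cmd() {
    pool=$1      # pool name
    vdev_a=$2    # space-separated disks for the first raidz1 vdev
    vdev_b=$3    # space-separated disks for the second raidz1 vdev
    echo "zpool create $pool raidz1 $vdev_a raidz1 $vdev_b"
}

print_create_cmd zpool "diskA1 diskA2 diskA3 diskA4" \
    "diskB1 diskB2 diskB3 diskB4 diskB5 diskB6"
```

Review the printed command before running it as root. Because the two vdevs have a different number of drives, zpool may warn about a mismatched replication level and ask for -f to confirm.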

root@server:~# zpool status
  pool: zpool
 state: ONLINE
 scan: scrub repaired 0 in 1h12m with 0 errors on Fri Sep  7 

    NAME                               STATE   READ WRITE CKSUM
    zpool                              ONLINE     0     0     0
      raidz1-0                         ONLINE     0     0     0
        pci-0000:03:04.0-scsi-0:0:1:0  ONLINE     0     0     0
        pci-0000:03:04.0-scsi-0:0:2:0  ONLINE     0     0     0
        pci-0000:03:04.0-scsi-0:0:3:0  ONLINE     0     0     0
        pci-0000:03:04.0-scsi-0:0:4:0  ONLINE     0     0     0
      raidz1-1                         ONLINE     0     0     0
        pci-0000:00:1f.2-scsi-2:0:0:0  ONLINE     0     0     0
        pci-0000:00:1f.2-scsi-3:0:0:0  ONLINE     0     0     0
        pci-0000:03:04.0-scsi-0:0:0:0  ONLINE     0     0     0
        pci-0000:03:04.0-scsi-0:0:5:0  ONLINE     0     0     0
        pci-0000:03:04.0-scsi-0:0:6:0  ONLINE     0     0     0
        pci-0000:03:04.0-scsi-0:0:7:0  ONLINE     0     0     0

So the zpool consists of two raidz1 vdevs, one with four physical drives and one with six.
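The usable space of such a layout can be estimated with simple arithmetic: a raidz1 vdev gives up roughly one drive's worth of capacity to parity. The sketch below applies that rule to six 500 GB drives and four 2 TB drives. It is only an approximation; it ignores metadata overhead and the GB/GiB difference, and the function name is made up for this example.

```shell
#!/bin/sh
# Rough usable capacity of a raidz1 vdev: (number of disks - 1) * disk size.
raidz1_usable_gb() {
    disks=$1      # number of drives in the vdev
    size_gb=$2    # size of each drive in GB
    echo $(( (disks - 1) * size_gb ))
}

small=$(raidz1_usable_gb 6 500)     # six 500 GB drives
large=$(raidz1_usable_gb 4 2000)    # four 2 TB drives
echo "small vdev: ${small} GB, large vdev: ${large} GB, pool: $(( small + large )) GB"
# prints "small vdev: 2500 GB, large vdev: 6000 GB, pool: 8500 GB"
```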

Everything has gone smoothly so far. I did have one issue, though. I decided to remove a separate disk drive from the system that was no longer needed. As I had initially set up the arrays based on device names (/dev/sda, /dev/sdb), the array broke when the device names changed due to the missing drive.

I repaired that by issuing these commands:

zpool export zpool
zpool import -d /dev/disk/by-path/ zpool

It's important to read the ZFS on Linux FAQ carefully and understand that you should not use regular device names like /dev/sda for your ZFS array. The recommendation is to use the names in /dev/disk/by-path/ or /dev/disk/zpool/, precisely to prevent the issue I had when the drive disappeared.
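To check whether a pool still refers to plain kernel device names, one option is to scan the output of `zpool status` for entries such as sda or sdb. The filter below is a minimal sketch: the function name is invented for this example, the sample input is made up, and the pattern only covers sdX-style names.

```shell
#!/bin/sh
# Read `zpool status` output on stdin and print pool members that look like
# plain kernel names (sda, sdb1, ...). Exit status is 0 if any were found.
find_kernel_names() {
    awk '$1 ~ /^sd[a-z]+[0-9]*$/ { print $1; found = 1 } END { exit !found }'
}

# Made-up sample input, for illustration only.
find_kernel_names <<'EOF'
    zpool       ONLINE  0 0 0
      raidz1-0  ONLINE  0 0 0
        sda     ONLINE  0 0 0
        sdb     ONLINE  0 0 0
EOF
```

On a real system you would pipe the live status into it, for example `zpool status | find_kernel_names`. If it prints anything, exporting the pool and importing it again with -d /dev/disk/by-path/ switches it to stable names.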

As discussed in my blog entry on why I decided not to use ZFS for my big 18 TB storage NAS, ZFS does not support 'growing' an existing array by adding drives to it, as Linux software RAID does.
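The difference can be sketched by comparing the commands involved. With MDADM you add a single disk and reshape the existing array; with ZFS the usual way to add capacity is to add a whole new vdev to the pool. The helpers below only print the commands, and all function, array and disk names are placeholders.

```shell
#!/bin/sh
# Print the commands each tool would use to add capacity; nothing is executed.

# MDADM: add one disk to an existing array, then reshape to the new disk count.
print_mdadm_grow() {
    array=$1; new_disk=$2; new_count=$3
    echo "mdadm --add $array $new_disk"
    echo "mdadm --grow $array --raid-devices=$new_count"
}

# ZFS: extend the pool with an additional raidz1 vdev made of several disks.
print_zpool_grow() {
    pool=$1; shift
    echo "zpool add $pool raidz1 $*"
}

print_mdadm_grow /dev/md0 /dev/sdX 5
print_zpool_grow zpool diskC1 diskC2 diskC3
```

After an MDADM reshape the file system on top still has to be resized separately, whereas space from a new vdev becomes available to the pool right away.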

As the zpool consists of different hard disk types, performance tests are not consistent. I've seen 450 MB/s read speeds on the zpool, which is more than sufficient for me.

ZFS on Linux works, is fast enough and is easy to set up. If I had to set up my big storage NAS today, I would probably choose ZFS on Linux. I would accept that I could not just expand the array with extra drives the way MDADM permits you to grow an array.

In a way, ZFS on Linux combines the best of both worlds: one of the best modern file systems with a modern and well-supported Linux distribution. Only the ZFS module itself may be the weak link, as it is fairly new on Linux and not optimised yet.

Or we might just have to wait until Btrfs is mature enough for production use.