1. Understanding Backups

    September 07, 2015

    Just imagine for a moment that you lost all your pictures. I think you would feel pretty horrible. You probably want to prevent this from ever happening to you.

    You may have heard about the solution to this kind of disaster: 'make backups'. What does that really mean?

    What is a backup?

    This is my definition of a backup:

    A backup is a copy of your files on a separate physical device

    The second part is what a backup is all about. If you create a copy of a file and store the copy on the same computer, that is not a backup; it's only a version. Only if you store the copy on a different device, such as a USB stick or an external USB hard drive, may we call it a 'backup'.

    Why do I need backups?

    If you don't make backups you will lose all your data at some point.

    This is how you will lose your precious data:

    1. Human error: You accidentally delete or overwrite a file or entire folder.
    2. Hardware failure: Your computer dies and with it, all your data.
    3. Disaster: Your house burns down or all your computer equipment gets stolen.

    A good backup will protect you against all three risks.

    Why are manual backups bad?

    A backup solution that requires human interaction is a backup solution that is set up to fail. Automate your backups so they contain fresh, recent data.

    A six-month-old backup is better than nothing, but discovering that you lost all the photos you took in the last six months is still a bitter pill to swallow.

    The only proper way to make backups is to make them automatically.

    To automate the backup process, you need backup software. Both Windows and Mac OS X have built-in backup software and I really recommend using it, in combination with a USB disk or NAS.

    Backup software


    Windows

    Windows 10 has a nice new feature called 'File History'. I would really recommend using this feature. It creates a backup every hour and only stores changed or added files. By default, modified or deleted files are kept for one year.

    Mac OS X

    For years now, Mac OS X has shipped with "Time Machine". If you connect a USB drive to your computer, you may have seen the question whether or not to use the drive with Time Machine. Apple really tries to make creating backups as easy as possible. Time Machine lets you travel back in time to older 'snapshots' of your computer. Very nice if you need to recover accidentally deleted files or an older version of a file.

    What are the options for storing my backups?

    There are three types of storage for your backups:

    1. On a USB hard drive (Directly-Attached Storage)
    2. On a NAS (Network-Attached Storage)
    3. In the cloud

    USB hard drive

    Cheap and straightforward to set up: just attach a USB disk to your computer. It's ideal for desktops, as they don't move around, the drive can always stay attached, and USB 3 is very fast.

    It's not an ideal solution for laptops because it requires human intervention: attaching and detaching cables. Long stretches of time may pass between backups, increasing the risk that you lose significant amounts of data.

    Some people buy multiple USB hard drives and rotate between them on a weekly or monthly basis. They store the drives that are not in use at somebody else's house to protect their data against theft or fire. Although this approach works, there are three downsides:

    • Unless you use encryption, other people can access your data stored on your backup hard drive.

    • There is probably a gap of weeks or months between rotations, so recent data is at risk of being lost anyway.

    • It's a manual process: switching drives and moving them to another location can be forgotten, so no backups are made or no recent copy is stored outside of the house.

    NAS (Network-Attached Storage)

    A NAS or home server attached to your home network has the benefit of being accessible through your wireless network. Backups can then be made wirelessly so there's no messing around with cables.

    The storage provided by your NAS or home server can be shared with multiple computers, so a NAS allows you to back up multiple computers to the same storage. You don't need to buy a separate disk for each of them.


    The cloud

    Storing your files in the cloud means that they are stored outside of your house. Storing files on a USB drive or NAS doesn't protect against the risk of theft or fire, so cloud-based backup should probably be your first pick.

    Backup media that are not recommended:

    1. USB-sticks: are often of low capacity and quality.
    2. Optical media: Blu-ray discs and DVDs are expensive, have relatively low capacities and can be scratched easily, rendering data inaccessible.

    Recommended backup strategy for consumers

    The best way to deal with the three risks outlined previously is to employ multiple backup solutions.

    1. Online or cloud backup (mandatory)
    2. Local file backup to USB disk or NAS (recommended)
    3. Disk clone stored on USB disk or NAS (optional)

    Cloud backup is ideal because it addresses all three risks outlined previously in a single solution. If you only want to invest in one backup solution, use a cloud backup provider: cloud backup should be your number one priority.

    Downloading your files from the cloud may take a while. Also, the backup history of cloud backup solutions is often limited to 30 days. This is why it's advisable to combine cloud backup with a local disk or NAS backup.

    If you want quick access to your files, it's recommended to create local backups, that are stored either on a USB-drive or on a NAS (Network-Attached Storage).

    Additionally, built-in software such as Windows Backup or Mac OS X "Time Machine" allows you to keep a longer backup history. If you find out that you accidentally deleted a file two months ago, most cloud backup providers can't help you anymore: the file has been deleted from the backup as well.

    A full disk clone will help to quickly restore your computer to a working state after a drive failure. This is probably not that important for most home users, but it can be a nice touch.

    Windows 10 has 'System Image Backup' built in. This tool makes an exact copy of your hard drive or solid state drive so your computer can be quickly restored.

    On Mac OS X, you can use Carbon Copy Cloner or SuperDuper to create a secondary bootable hard drive. Since you can easily boot from USB 3 on a Mac, this could be an interesting option.

    Cloud or online backup first, local backups second

    Storing your backups in 'the cloud' means that your backups are stored far away from your home. In case your house burns down or all your computer equipment is stolen, your data will still be safe in the cloud.

    I have no affiliation with any cloud-based backup provider. Take a look at this list of online cloud-based backup providers.

    Here is a table (2015) comparing the different vendors.

    Cloud backups have some additional benefits:

    1. Fully automated backups: cloud backup providers require you to install backup software that creates backups automatically, often every hour. Cloud backup providers target everybody - especially novice computer users - by making their software very easy to set up and very robust. The software is foolproof, and this helps assure you that backups are made regularly, without you having to babysit the process. Most vendors alert you by email if backups have not been made for a while.

    2. If you are a laptop user, backups continue to be made as long as you have internet access, even if you are traveling.

    3. Cloud backup providers often keep a history of your files for a certain number of days. If you need an older version of a file, or need to recover a file you deleted two weeks ago, cloud backup can also help.

    Cloud backup solutions do have some downsides:

    1. Restore speed: the speed at which you will be able to restore your data depends on the speed provided by your cloud provider and the speed of your internet connection. Restoring all your data could take days or even weeks. Some providers will ship you a hard drive filled with your data at an additional cost.

    2. Privacy & Security: You will be storing your data on someone else's computer. You must accept or assume that if they want to, your cloud backup provider will be able to access your files as stored on their computers.

    If your cloud backup provider ever gets hacked, you must assume that your data is compromised.

    Some cloud backup providers offer an extra - more advanced - security option where data is encrypted on your computer with a password only you know. Neither the cloud provider nor a hacker who compromises them will be able to access your data. Be aware, though, that if you ever forget this password, nobody will be able to decrypt your backups, not even the provider.

    It is up to you to decide if you want to trade the risk of losing all of your files for the risk of losing your privacy.

    What about Dropbox, Google Drive, Skydrive or iCloud Drive?

    Storing a copy of a file online, such as on Dropbox, Google Drive, SkyDrive or iCloud Drive, also counts as a backup, because you are storing the file on a different computer, owned by Dropbox, Google, Microsoft or Apple.

    Dropbox - for instance - also allows you to recover accidentally deleted files or folders for up to 30 days.

    However, these solutions do not provide full backup software. They don't offer the option to create a complete backup of your computer - including all your photos - at a reasonable price, so they are not a real substitute for a true online backup solution.

    Test a restore or you won't know if your backup works

    You need to check your backups periodically. Are you able to restore a few files? There are too many anecdotes of people who did set up a backup solution that was silently failing for a long time. Then, when disaster struck, they discovered that there was no recent backup available. Horrible.

    Make sure you do a test restore once in a while. Check your backups.

    Closing words

    I hope this post helps you understand backups and how to approach them. Feel free to ask any questions or leave any comments. Feedback is appreciated.

    Tagged as : Backup
  2. ZFS Performance on HP Proliant Microserver Gen8 G1610T

    August 14, 2015

    I think the HP Proliant Microserver Gen8 is a very interesting little box if you want to build your own ZFS-based NAS. The benchmarks I've performed seem to confirm this.

    The Microserver Gen8 has nice features such as:

    • iLO (KVM over IP with dedicated network interface)
    • support for ECC memory
    • 2 x Gigabit network ports
    • Free PCIe slot (half-height)
    • Small footprint
    • Fairly silent
    • good build quality

    The Microserver Gen8 can be a better solution than the offerings of - for example - Synology or QNAP because you can create a more reliable system based on ECC-memory and ZFS.


    Please note that the G1610T version of the Microserver Gen8 does not ship with a DVD/CD drive.

    The Gen8 can be found fairly cheap on the European market at around 240 Euro including taxes. If you put in an extra 8 GB of memory on top of the 2 GB already installed, you have a total of 10 GB, which is more than enough to support ZFS.

    The Gen8 has room for 4 x 3.5" hard drives, so with today's large disk sizes you can pack quite a bit of storage inside this compact machine.


    Net storage capacity

    This table gives you a quick overview of the net storage capacity you would get, depending on the chosen drive size and redundancy level.

    | Drive size | RAIDZ | RAIDZ2 or Mirror |
    | 3 TB       |  9 TB |  6 TB            |
    | 4 TB       | 12 TB |  8 TB            |
    | 6 TB       | 18 TB | 12 TB            |
    | 8 TB       | 24 TB | 16 TB            |

    Boot device

    If you want to use all four drive slots for storage, you need to boot this machine from either the fifth internal SATA port, the internal USB 2.0 port or the microSD card slot.

    The fifth SATA port is not bootable if you disable the on-board RAID controller and run in pure AHCI mode. AHCI is probably the best mode for ZFS, as there seems to be no RAID controller firmware active between the disks and ZFS. However, in this mode only the four 3.5" drive bays are bootable.

    The fifth SATA port is bootable if you configure SATA to operate in Legacy mode. This is not recommended as you lose the benefits of AHCI such as hot-swap of disks and there are probably also performance penalties.

    The fifth SATA port is also bootable if you enable the on-board RAID controller, but do not configure any RAID arrays with the drives you plan to use with ZFS (Thanks Mikko Rytilahti). You do need to put the boot drive in a RAID volume in order to be able to boot from the fifth SATA port.

    The unconfigured drives will just be passed as AHCI devices to the OS and thus can be used in your ZFS array. The big question here is what happens if you encounter read errors or other drive problems that ZFS could handle, but would be a reason for the RAID controller to kick a drive off the SATA bus. I have no information on that.

    I myself used an old 2.5" hard drive with a SATA-to-USB converter which I stuck in the case (use double-sided tape or velcro to mount it to the PSU). Booting from USB stick is also an option, although a regular 2.5" hard drive or SSD is probably more reliable (flash wear) and faster.

    Boot performance

    The Microserver Gen8 takes about 1 minute and 50 seconds just to pass the BIOS boot process and start booting the operating system (you will hear a beep).

    Test method and equipment

    I'm running Debian Jessie with the latest stable ZFS-on-Linux 0.6.4. Please note that reportedly FreeNAS also runs perfectly fine on this box.

    I had to run my tests with the disks I had available:

    root@debian:~# show disk -sm
    | Dev | Model              | GB   |   
    | sda | SAMSUNG HD103UJ    | 1000 |   
    | sdb | ST2000DM001-1CH164 | 2000 |   
    | sdc | ST2000DM001-1ER164 | 2000 |   
    | sdd | SAMSUNG HM250HI    | 250  |   
    | sde | ST2000DM001-1ER164 | 2000 |   

    The 250 GB drive is a portable disk connected to the internal USB port and is used as the OS boot device. The other disks - 1 x 1 TB and 3 x 2 TB - are put together in a single RAIDZ pool, which results in 3 TB of storage.
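
    For reference, a pool like this can be created with a single command. This is just a sketch (the device names are the ones from the zpool status listing below, referenced via /dev/disk/by-id; the ashift setting is a common choice for 4K-sector drives, not necessarily what I used):

    zpool create -o ashift=12 testpool raidz \
        /dev/disk/by-id/wwn-0x50000f0008064806 \
        /dev/disk/by-id/wwn-0x5000c5006518af8f \
        /dev/disk/by-id/wwn-0x5000c5007cebaf42 \
        /dev/disk/by-id/wwn-0x5000c5007ceba5a5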

    Tests with 4-disk RAIDZ VDEV

    root@debian:~# zfs list
    testpool  48.8G  2.54T  48.8G  /testpool
    root@debian:~# zpool status
      pool: testpool
     state: ONLINE
      scan: none requested
        NAME                        STATE     READ WRITE CKSUM
        testpool                    ONLINE       0     0     0
          raidz1-0                  ONLINE       0     0     0
            wwn-0x50000f0008064806  ONLINE       0     0     0
            wwn-0x5000c5006518af8f  ONLINE       0     0     0
            wwn-0x5000c5007cebaf42  ONLINE       0     0     0
            wwn-0x5000c5007ceba5a5  ONLINE       0     0     0
    errors: No known data errors

    Because a NAS will mostly face data transfers that are sequential in nature, I've done some tests with 'dd' to measure sequential performance.

    Read performance:

    root@debian:~# dd if=/testpool/test.bin of=/dev/null bs=1M
    50000+0 records in
    50000+0 records out
    52428800000 bytes (52 GB) copied, 162.429 s, 323 MB/s

    Write performance:

    root@debian:~# dd if=/dev/zero of=/testpool/test.bin bs=1M count=50000 conv=sync
    50000+0 records in
    50000+0 records out
    52428800000 bytes (52 GB) copied, 169.572 s, 309 MB/s

    Test with 3-disk RAIDZ VDEV

    After the previous test I wondered what would happen if I would exclude the older 1 TB disk and create a pool with just the 3 x 2 TB drives. This is the result:

    Read performance:

    root@debian:~# dd if=/testpool/test.bin of=/dev/null bs=1M conv=sync
    50000+0 records in
    50000+0 records out
    52428800000 bytes (52 GB) copied, 149.509 s, 351 MB/s

    Write performance:

    root@debian:~# dd if=/dev/zero of=/testpool/test.bin bs=1M count=50000 conv=sync
    50000+0 records in
    50000+0 records out
    52428800000 bytes (52 GB) copied, 144.832 s, 362 MB/s

    The performance is clearly better, even though there is one disk less in the VDEV. I would have liked to test what kind of performance four 2 TB drives would achieve, but I only have three.

    The result does show that the pool is more than capable of sustaining gigabit network transfer speeds.

    This is confirmed when performing actual network file transfers. In the example below, I simulate copying a 50 GB test file from the Gen8 to a test system over NFS. Tests are performed using the 3-disk pool.

    NFS read performance:

    root@nano:~# dd if=/mnt/server/test2.bin of=/dev/null bs=1M
    50000+0 records in
    50000+0 records out
    52428800000 bytes (52 GB) copied, 443.085 s, 118 MB/s

    NFS write performance:

    root@nano:~# dd if=/dev/zero of=/mnt/server/test2.bin bs=1M count=50000 conv=sync 
    50000+0 records in
    50000+0 records out
    52428800000 bytes (52 GB) copied, 453.233 s, 116 MB/s

    I think these results are excellent. Tests with the 'cp' command give the same results.

    I've also done some tests with the SMB/CIFS protocol, using a second Linux box as a CIFS client connecting to the Gen8.

    CIFS read performance:

    root@nano:~# dd if=/mnt/test/test.bin of=/dev/null bs=1M
    50000+0 records in
    50000+0 records out
    52428800000 bytes (52 GB) copied, 527.778 s, 99.3 MB/s

    CIFS write performance:

    root@nano:~# dd if=/dev/zero of=/mnt/test/test3.bin bs=1M count=50000 conv=sync
    50000+0 records in
    50000+0 records out
    52428800000 bytes (52 GB) copied, 448.677 s, 117 MB/s

    Hot-swap support

    Although it's even printed on the hard drive caddies that hot-swap is not supported, it does seem to work perfectly fine if you run the SATA controller in AHCI mode.

    Fifth SATA port for SSD SLOG/L2ARC?

    If you buy a converter cable that converts a floppy power connector to a SATA power connector, you could install an SSD. This SSD can then be used as a dedicated SLOG device and/or L2ARC cache if you have a need for this.
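
    As a rough sketch of what that could look like - the device name /dev/sdf and the two partitions are hypothetical, you would partition the SSD first:

    zpool add testpool log /dev/sdf1     # dedicated SLOG (ZIL) device
    zpool add testpool cache /dev/sdf2   # L2ARC read cache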

    RAIDZ, is that OK?

    If you want maximum storage capacity with redundancy, RAIDZ is the only option. RAIDZ2 (RAID6) or two mirrored VDEVs is more reliable, but will reduce the available storage space by a third.

    The main risk of RAIDZ is a double-drive failure. With larger drive sizes, a resilver of a VDEV will take quite some time; it could take more than a day before the pool is resilvered, during which you run without redundancy.

    With the low number of drives in the VDEV, the risk of a second drive failure may be low enough to be acceptable. That's up to you.
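
    For what it's worth, replacing a failed drive and keeping an eye on the resilver is straightforward; a minimal sketch, with a made-up replacement device:

    zpool replace testpool wwn-0x5000c5006518af8f /dev/disk/by-id/<new-drive>
    zpool status testpool   # shows resilver progress and an estimated completion time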

    Noise levels

    In the past, there have been reports about the Gen8 making tons of noise because the rear chassis fan spins at a high RPM if the RAID card is set to AHCI mode.

    I myself have not encountered this problem. The machine is almost silent.

    Power consumption

    With drives spinning: 50-55 Watt. With drives standby: 30-35 Watt.


    Closing words

    I think my benchmarks show that the Microserver Gen8 could be an interesting platform if you want to create your own ZFS-based NAS.

    Please note that since the Gen9 server platform has been out for some time, HP may release a Gen9 version of the Microserver in the near future. However, as of August 2015, there is no information on this yet and it is not clear if a successor is going to be released.

    Tagged as : ZFS microserver
  3. The Sorry State of CoW File Systems

    March 01, 2015

    I'd like to argue that ZFS and BTRFS are both incomplete file systems with their own drawbacks and that it may still be a long way off before we have something truly great.

    ZFS and BTRFS are both heroic feats of engineering, created by people who are probably ten times more capable and smarter than me. There is no question about my appreciation for these file systems and what they accomplish.

    Still, as an end-user, I would like to see some features that are often either missing or not complete. Make no mistake, I believe that both ZFS and BTRFS are probably the best file systems we have today. But they can be much better.

    I want to start with a terse and quick overview on why both ZFS and BTRFS are such great file systems and why you should take some interest in them.

    Then I'd like to discuss their individual drawbacks and explain my argument.

    Why ZFS and BTRFS are so great

    Both ZFS and BTRFS are great for two reasons:

    1. They focus on preserving data integrity
    2. They simplify storage management

    Data integrity

    ZFS and BTRFS implement two important techniques that help preserve data.

    1. Data is checksummed and the checksum is verified to guard against bit rot due to broken hard drives or flaky storage controllers. If redundancy is available (RAID), errors can even be corrected (see the scrub sketch after this list).

    2. Copy-on-Write (CoW), existing data is never overwritten, so any calamity like sudden power loss cannot cause existing data to be in an inconsistent state.
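
    On ZFS, for example, you can trigger such a verification pass yourself with a scrub; a minimal sketch, assuming a pool named 'tank':

    zpool scrub tank        # read every block and verify it against its checksum
    zpool status -v tank    # shows scrub progress and any checksum errors found (and repaired, if redundancy allows)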

    Simplified storage management

    In the old days, we had MDADM or hardware RAID for redundancy, LVM for logical volume management, and on top of that, the file system of choice (EXT3/4, XFS, ReiserFS, etc.).

    The main problem with this approach is that the layers are not aware of each other, which makes things inefficient and more difficult to administer. Each layer needs its own attention.

    For example, if you simply want to expand storage capacity, you need to add drives to your RAID array and expand it. Then, you have to alert the LVM layer of the extra storage and as a last step, grow the file system.
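
    As a rough sketch of those three separate steps in the legacy stack (the device, volume group and volume names are made up):

    mdadm --add /dev/md0 /dev/sdf            # 1. add the new drive to the RAID array...
    mdadm --grow /dev/md0 --raid-devices=7   #    ...and grow the array onto it
    pvresize /dev/md0                        # 2. tell LVM about the extra space
    lvextend -l +100%FREE /dev/vg0/data      #    grow the logical volume
    resize2fs /dev/vg0/data                  # 3. finally, grow the file system itself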

    Both ZFS and BTRFS make capacity expansion a simple one line command that addresses all three steps above.
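
    In ZFS, for instance, the whole expansion boils down to something like this (pool and device names are made up):

    zpool add tank mirror /dev/sdf /dev/sdg   # add a mirrored VDEV; the extra space is usable immediately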

    Why are ZFS and BTRFS capable of doing this? Because they incorporate RAID, LVM and the file system in one single integrated solution. Each 'layer' is aware of the others; they are tightly integrated. Because of this integration, rebuilds after a drive failure are often faster than with 'legacy RAID' solutions, because only the actual data needs to be rebuilt, not the entire drive.

    And I'm not even talking about the joy of snapshots here.

    The inflexibility of ZFS

    The storage building block of ZFS is a VDEV. A VDEV is either a single disk (not so interesting) or some RAID scheme, such as mirroring, single parity (RAIDZ), dual parity (RAIDZ2) or even triple parity (RAIDZ3).

    To me, a big downside of ZFS is the fact that you cannot expand a VDEV by adding drives. The only way you can expand a VDEV is quite convoluted: you have to replace all of the existing drives, one by one, with bigger ones, and let the VDEV rebuild each time you replace a drive. Then, when all drives are of the higher capacity, you can expand your VDEV. This is quite impractical and time-consuming, if you ask me.
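
    To illustrate how convoluted that is, a sketch of the procedure (pool and device names are made up):

    zpool set autoexpand=on tank   # let the pool grow once all drives have been replaced
    zpool replace tank sda sdf     # swap one old drive for a bigger one
    zpool status tank              # wait for the resilver to finish...
    # ...then repeat the replace/resilver cycle for every remaining drive in the VDEV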

    ZFS expects you just to add extra VDEVS. So if you start with a single 6-drive RAIDZ2 (RAID6), you are expected to add another 6-drive RAIDZ2 if you want to expand capacity.

    What I would want is to just add one or two more drives and grow the VDEV, as has been possible with many hardware RAID solutions and with 'mdadm --grow' for ages.

    Why do I prefer this over adding VDEVs? Because it's quite evident that this is way more economical. If I can just expand my RAIDZ2 from 6 drives to 12 drives, I only sacrifice two drives for parity. If I end up with two VDEVs, each of them RAIDZ2, I sacrifice four drives (16% vs 33% capacity loss).

    I can imagine that in the enterprise world, this is just not that big of a deal, a bunch of drives are a rounding error on the total budget and availability and performance are more important. Still, I'd like to have this option.

    Either you are forced to buy and implement the storage you expect to need in the future upfront, or you must add it later on, wasting drives on parity that you otherwise would not have needed.

    Maybe my wish for a 'grow the VDEV' option is more geared towards hobbyist or home usage of ZFS, and ZFS was always focused on enterprise needs, not the needs of hobbyists. So I'm aware of the context here.

    I'm not done with ZFS, however, because there is another big inflexibility. If you don't put the 'right' number of drives in a VDEV, you may lose a significant portion of storage capacity, as a side-effect of how ZFS works.

    The following ZFS pool configurations are optimal for modern 4K sector hard drives (the pattern is a power-of-two number of data drives, plus the parity drives):
    RAID-Z: 3, 5, 9, 17, 33 drives
    RAID-Z2: 4, 6, 10, 18, 34 drives
    RAID-Z3: 5, 7, 11, 19, 35 drives

    I've seen first-hand with my 71 TiB NAS that if you don't use the optimal number of drives in a VDEV, you may lose whole drives' worth of net storage capacity. In that regard, my 24-drive chassis is very suboptimal.

    The sad state of RAID on BTRFS

    BTRFS has none of the downsides of ZFS described in the previous section, as far as I'm aware. It has plenty of its own, though. First of all: BTRFS is still not stable, and especially the RAID 5/6 part is unstable.

    The RAID 5 and RAID 6 implementations are so new, the ink they were written with is still wet (February 8th, 2015). Not something you want to trust your important data to, I suppose.

    I did set up a test environment to play a bit with this new Linux kernel (3.19.0) and BTRFS to see how it works, and although it is not production-ready yet, I really like what I see.

    With BTRFS you can just add or remove drives from a RAID6 array as you see fit. Add two? Remove three? Whatever; the only thing you have to wait for is BTRFS rebalancing the data over the new or remaining drives.

    This is friggin' awesome.

    If you want to remove a drive, just wait for BTRFS to copy the data from that drive to the other remaining drives and you can remove it. You want to expand storage? Just add the drives to your storage pool and have BTRFS rebalance the data (which may take a while, but it works).
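
    A quick sketch of what that looks like in practice, assuming the file system is mounted at /mnt/storage and the device names are made up:

    btrfs device add /dev/sdg /mnt/storage       # grow: add a new drive
    btrfs balance start /mnt/storage             # spread existing data over all drives
    btrfs device delete /dev/sdc /mnt/storage    # shrink: migrate data off a drive and remove it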

    But I'm still a bit sad, because BTRFS does not support anything beyond RAID6. No multiple RAID6 (RAID60) arrays or triple parity, which ZFS has supported for ages. With my 24-drive file server, putting 24 drives in a single RAID6 starts to feel like I'm asking for trouble. Triple parity or RAID60 would probably be more reasonable. But no luck with BTRFS.

    However, what really frustrates me is this article by Ronny Egner. The author of SnapRAID, Andrea Mazzoleni, has written a functional patch for BTRFS that implements not only triple-parity RAID, but up to six parity disks for a volume.

    The maddening thing is that the BTRFS maintainers are not planning to include this patch in the BTRFS code base. Please read Ronny's blog. The people working on BTRFS are working for enterprises who want enterprise features. They don't care about triple parity or features like that, because they have access to something presumably better: distributed file systems, which may do away with the need for larger disk arrays and thus for triple parity.

    BTRFS has been in development for a very long time and only recently has RAID 5/6 support been introduced. The risk of the write hole, something ZFS addressed ages ago, is still an open issue. Considering all of this, BTRFS is still a very long way off from being the file system of choice for larger storage arrays.

    BTRFS seems to be way more flexible in terms of storage expansion and shrinking, but its slow pace of development makes it unusable for anything serious for at least the next year, I guess.


    Closing words

    BTRFS addresses all the inflexibilities of ZFS, but its immaturity and lack of more advanced RAID schemes make it unusable for larger storage solutions. This is so sad, because by design it seems to be the better, way more flexible option compared to ZFS.

    I do understand the view of the BTRFS developers. With enterprise data sets, at scale, it's better to handle storage and redundancy with distributed file systems than with ever larger arrays in a single system. But that kind of environment is not within reach for many.

    So at the moment, compared to BTRFS, ZFS is still the better option for people who want to set up large, reliable storage arrays.

    Tagged as : ZFS BTRFS
  4. Configuring SCST iSCSI Target on Debian Linux (Wheezy)

    February 01, 2015

    My goal is to export ZFS zvol volumes through iSCSI to other machines. The platform I'm using is Debian Wheezy.
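
    Creating a zvol to export is a one-liner; a minimal sketch, assuming a pool named 'tank' and a made-up volume name:

    zfs create -V 100G tank/iscsivol01    # the block device appears as /dev/zvol/tank/iscsivol01

    In the SCST configuration further down, the 'filename' of a device would then point at this zvol (e.g. /dev/zvol/tank/iscsivol01) instead of /dev/sdb.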

    There are three iSCSI target solutions available for Linux:

    1. LIO
    2. IET
    3. SCST

    I've briefly played with LIO but the targetcli tool is interactive only. If you want to automate and use scripts, you need to learn the Python API. I wonder what's wrong with a plain old text-based configuration file.

    iscsitarget or IET is broken on Debian Wheezy. If you just 'apt-get install iscsitarget', the iSCSI service will crash as soon as you connect to it. This has been the case for years, and I wonder why they don't just drop this package. It is true that you can manually download the "latest" version of IET, but don't bother; it seems abandoned. The latest release stems from 2010.

    It seems that SCST is at least maintained and uses plain old text-based configuration files. So it has that going for it, which is nice. SCST does not require kernel patches to run, but a patch regarding "CONFIG_TCP_ZERO_COPY_TRANSFER_COMPLETION_NOTIFICATION" is said to improve performance:

    To use full power of TCP zero-copy transmit functions, especially
    dealing with user space supplied via scst_user module memory, iSCSI-SCST
    needs to be notified when Linux networking finished data transmission.
    This is done by the CONFIG_TCP_ZERO_COPY_TRANSFER_COMPLETION_NOTIFICATION
    kernel config option. This is highly recommended, but not required.
    Basically, iSCSI-SCST works fine with an unpatched Linux kernel with the
    same or better speed as other open source iSCSI targets, including IET,
    but if you want even better performance you have to patch and rebuild
    the kernel.

    So in general, patching your kernel is not always required, but an example will be given anyway.

    Getting the source

    cd /usr/src

    We need the following files:

    wget http://heanet.dl.sourceforge.net/project/scst/scst/scst-3.0.0.tar.bz2
    wget http://heanet.dl.sourceforge.net/project/scst/iscsi-scst/iscsi-scst-3.0.0.tar.bz2
    wget http://heanet.dl.sourceforge.net/project/scst/scstadmin/scstadmin-3.0.0.tar.bz2

    We extract them with:

    tar xjf scst-3.0.0.tar.bz2
    tar xjf iscsi-scst-3.0.0.tar.bz2
    tar xjf scstadmin-3.0.0.tar.bz2

    Patching the kernel

    You can skip this part if you don't feel like you need or want to patch your kernel.

    apt-get install linux-source kernel-package

    We need to extract the kernel source:

    cd /usr/src
    tar xjf linux-source-3.2.tar.bz2
    cd linux-source-3.2

    Now we first copy the kernel configuration from the current system:

    cp /boot/config-3.2.0-4-amd64 .config

    We patch the kernel with two patches:

    patch -p1 < /usr/src/scst-3.0.0/kernel/scst_exec_req_fifo-3.2.patch
    patch -p1 < /usr/src/iscsi-scst-3.0.0/kernel/patches/put_page_callback-3.2.57.patch

    It seems that for many different kernel versions, separate patches can be found in the above paths. If you follow these steps at a later date, please check the version numbers.

    The patches are based on stock kernels from kernel.org. I've applied the patches against the Debian-patched kernel and faced no problems, but your mileage may vary.

    Let's build the kernel (will take a while):

    yes | make-kpkg -j $(nproc) --initrd --revision=1.0.custom.scst kernel_image

    The 'yes' is piped into the make-kpkg command to answer some questions with 'yes' during compilation. You could also add the appropriate value in the .config file.

    The end-result of this command is a kernel package in .deb format in /usr/src. Install it like this:

    dpkg -i /usr/src/<custom kernel image>.deb

    Now reboot into the new kernel:

    reboot

    Compile and install SCST

    cd /usr/src/scst-3.0.0
    make install
    cd /usr/src/iscsi-scst-3.0.0
    make install
    cd /usr/src/scstadmin-3.0.0
    make install

    Make SCST start at boot

    On Debian Jessie:

    systemctl enable scst.service

    Configure SCST

    Copy the example configuration file to /etc:

    cp /usr/src/iscsi-scst-3.0.0/etc/scst.conf /etc

    Edit /etc/scst.conf to your liking. This is an example:

    HANDLER vdisk_fileio {
            DEVICE disk01 {
                    filename /dev/sdb
                    nv_cache 1
            }
    }

    TARGET_DRIVER iscsi {
            enabled 1
            TARGET iqn.2015-10.net.vlnb:tgt {
                    IncomingUser "someuser somepasswordof12+chars"
                    HeaderDigest   "CRC32C,None"
                    DataDigest   "CRC32C,None"
                    LUN 0 disk01
                    enabled 1
            }
    }

    Please note that the password must be at least 12 characters.

    After this, you can start the SCST module and connect your initiator to the appropriate LUN.

    /etc/init.d/scst start
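
    On the initiator side, a Linux client with open-iscsi can discover and log in to the target roughly like this; the IP address is made up and the CHAP credentials must match the IncomingUser line in scst.conf:

    apt-get install open-iscsi
    iscsiadm -m discovery -t sendtargets -p 192.168.1.100
    iscsiadm -m node -T iqn.2015-10.net.vlnb:tgt -p 192.168.1.100 \
        --op update -n node.session.auth.authmethod -v CHAP
    iscsiadm -m node -T iqn.2015-10.net.vlnb:tgt -p 192.168.1.100 \
        --op update -n node.session.auth.username -v someuser
    iscsiadm -m node -T iqn.2015-10.net.vlnb:tgt -p 192.168.1.100 \
        --op update -n node.session.auth.password -v somepasswordof12+chars
    iscsiadm -m node -T iqn.2015-10.net.vlnb:tgt -p 192.168.1.100 --login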

    Closing words

    It turned out that setting up SCST and compiling a kernel wasn't that much of a hassle. The main issue with patching kernels is that you have to repeat the procedure every time a new kernel version is released. And there is always a risk that a new kernel version breaks the SCST patches.

    However, the whole process can be easily automated and thus run as a test in a virtual environment.

    Tagged as : iSCSI SCST
  5. Why I Do Use ZFS as a File System for My NAS

    January 29, 2015

    In February 2011, I posted an article about my motivations for not using ZFS as the file system for my 18 TB NAS.

    I believe the arguments in that article were relevant at the time, but much has changed since then, and I believe that article is no longer relevant.

    I really recommend giving ZFS a serious consideration if you are building your own NAS. It's probably the best file system you can use if you care about data integrity.

    ZFS may only be available for non-Windows operating systems, but there are quite a few easy-to-use NAS distros available that turn your hardware into a full-featured home NAS box that can be managed through your web browser. A few examples are FreeNAS and NAS4Free.

    Arstechnica has an article about FreeNAS vs NAS4free.

    If you are quite familiar with FreeBSD or Linux, I do recommend this ZFS how-to article from Arstechnica. It offers a very nice introduction to ZFS and explains terms like 'pool' and 'vdev'.

    My historical reasons for not using ZFS at the time

    When I started with my 18 TB NAS in 2009, there was no such thing as ZFS for Linux. ZFS was only available in a stable version for OpenSolaris, and we all know what happened to OpenSolaris (it's gone).

    So you might ask: "Why not use ZFS on FreeBSD then?". Good question, but it was bad timing:

    The FreeBSD implementation of ZFS only became stable in January 2010, six months after I built my NAS (summer 2009). So FreeBSD was not an option at that time.

    One of the other objections against ZFS is the fact that you cannot expand your storage by adding single drives and growing the array as your data set grows.

    A ZFS pool consists of one or more VDEVs. A VDEV is a traditional RAID array. You expand storage capacity by expanding the ZFS pool, not the VDEVs: you cannot expand a VDEV itself, you can only add VDEVs to a pool.
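
    To illustrate, expanding a pool means adding a whole new VDEV to it; a minimal sketch with a made-up pool name and device names:

    zpool add tank raidz2 sdg sdh sdi sdj sdk sdl   # adds a second 6-drive RAIDZ2 VDEV to pool 'tank'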

    So ZFS either forces you to invest in storage you don't need upfront, or it forces you to invest later on, because you may waste quite a few extra drives on parity. For example, if you start with a 6-drive RAIDZ2 (RAID6) configuration, you will probably expand with another 6 drives. The pool then has 4 parity drives out of 12 drives in total (33% loss). Investing upfront in 10 drives instead of 6 would have been more efficient, because you only lose 2 drives out of 10 to parity (20% loss).

    However, despite all this, I still think that even at home it is worth biting the bullet, investing a bit more in hard drives and switching to ZFS.

    So at the time, I found it reasonable to stick with what I knew: Linux & MDADM.

    But my new 71 TiB NAS is based on ZFS.

    I wrote an article about my worry that ZFS might die with FreeBSD as its sole backing, but fortunately, I've been proven very, very wrong.

    ZFS is now supported on FreeBSD and Linux. Despite some licensing issues that prevent ZFS from being integrated into the Linux kernel itself, it can still be used as a regular kernel module and it works perfectly.

    There is even an open-source ZFS consortium that brings together all the developers for the different operating systems supporting ZFS.

    ZFS is here to stay for a very long time.

    Tagged as : ZFS
