Build backups: an unexpected journey

Published: Jun 16, 2016 by luxagen

As time goes by, I’ve become more picky about my Windows setup, and it’s important to me to avoid spending days reconstructing Windows builds on anything like a regular basis.

On Windows XP, catastrophic power-down of a Windows session can and does occasionally result in wholesale corruption of NTFS volumes, including Windows itself; at the very least, if it happens while a write is in progress to the registry, all bets are off as to whether that install will ever work properly again. I long ago purchased data-recovery software to deal with these kinds of situations on my data partitions when my multi-level backup strategy fails, but that doesn’t solve the builds problem. So, a long time ago, I set about separating as much of my software configuration and user data as possible from my Windows partitions so that they could be kept to a manageable size. That allowed me to back up my builds to image files using Norton GHOST, which also had the advantage of not wastefully storing the partition’s free space in the image.

That worked swimmingly until one day I discovered that an NTFS partition backup wouldn’t restore: it just bombed out partway through. Far worse, if I recall correctly, GHOST’s verification feature had passed the image as okay. This charming habit of occasionally producing unusable NTFS images recurred once or twice before I realised that I was stuck with a backup system where peace of mind hinged on restoring every backup to a spare disk in order to be sure that it was actually valid. I quickly decided that GHOST’s file-based imaging wasn’t fit for purpose.

On an unrelated note, I also used PartitionMagic, at the time, to move and resize partitions, but discovered that it, too, had a fatal flaw: it was perfectly capable of discovering some unspecified problem halfway through a partition resize and bailing out, leaving the partition effectively destroyed. My response was decisive: I physically snapped my PartitionMagic CD and uninstalled it to make absolutely sure I was never tempted to use it again. While this might strike some readers as an overreaction, and prompt them to start tutting about good backup policy, losing data wasn’t the problem — it was just that if I could afford the time to recopy a disk’s worth of data onto a new partition, I’d have deleted and recreated it in the first place and avoided wasting time on a pointless resize attempt!

I digress; to circumvent the problem, I ended up switching to a partition-cloning workflow with GHOST and got on with my life. Later, I switched to PartedMagic for both jobs and never looked back; it’s never once given up mid-resize or lost any of my data. The downside is its apparent lack of a user interface for making image files rather than just cloning, so I stuck to cloning partitions as I had with GHOST. I’m sure many of you will be shouting at your monitors by now that I should “just use dd already!”, but although I am steadily turning into a *NIX weenie, it’s through a process of slow, conservative change so as to avoid as many self-inflicted yak wounds as possible. At this point, I might have heard of dd once or twice, but it was dark magic to me.

More recently, I’ve started to resent the inconvenience of using a bulky, slow, old 3.5” SATA disk for this job, especially as it requires an external power-supply when plugged into a laptop (my desktops have HDD bays). It also occurred to me that it would be far better to keep a bootable Linux build (perhaps PartedMagic) on the same disk so that I could use one USB port instead of two.

This problem could be solved by housing a laptop-sized SATA disk in a handy, affordable USB3 enclosure bearing the charming epigram EASY YOUR PC, but would require cloning the current backup partitions onto the new drive and verifying that the copies were binary-identical. Simple, right?

Wrong.

By this point I’ve long known about Windows’s obnoxious habit, when attaching a disk, of mounting every intelligible partition it hasn’t seen before under a drive letter, and the absence of any per-disk configuration to politely tell it to keep its grubby automounting mitts off a particular disk is especially galling. What I didn’t know is that Windows also writes to existing unmounted partitions as soon as they’re available after an overwrite. My attempts to copy the partitions over with Cygwin and dd, and verify the copies, consistently failed because as soon as dd released the target partition, Windows would waltz over and futz with something (an MFT entry perhaps) lying between one and a few hundred kilobytes into the disk.
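For reference, the copy-and-verify step I was attempting looks roughly like the sketch below. The function name and the device names in the usage comment are placeholders of mine rather than my actual disks, and the comparison assumes that source and target are exactly the same size.

```shell
#!/bin/sh
# Sketch: clone one partition (or file) onto another, then compare byte for byte.
# clone_and_verify is a hypothetical helper name used for illustration only.
clone_and_verify() {
    src=$1
    dst=$2
    dd if="$src" of="$dst" bs=1M || return 1
    sync                  # flush buffered writes before reading the copy back
    cmp "$src" "$dst"     # prints nothing and exits 0 only if the two match
}

# Hypothetical usage (device names are assumptions):
#   clone_and_verify /dev/sdb1 /dev/sdc1 && echo "copy verified"
```

The check only means something if nothing else writes to the target between the dd and the cmp, which is exactly the guarantee Windows refused to give me.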

At first I was tempted to attribute the differences to a faulty disk, but that idea didn’t stand up given the problem’s consistency. I was tempted to break out a hex editor, but the idea of trying to decipher a hex dump of an NTFS volume didn’t appeal in the least. My next thought was that the Windows Disk Management utility was to blame, but the same thing happened even with Disk Management closed and no mount points or drive letters assigned to the partitions concerned.

At this point I was pretty certain that the Windows disk subsystem was the problem, and fumed that behaviour like this, in critical core operating-system code, makes it ever harder to be a Microsoft apologist because it shows just what the median quality of their thinking is. I’ve always trusted Windows not to mess with unpartitioned space on my drives, and although, to my knowledge, that trust isn’t misplaced, writing to unmounted partitions is a violation of the same principle.

Windows rage aside, I didn’t really care about the writes, but wanted an opportunity to verify the copies before Windows had a chance to desecrate them; that would mean booting into a sane operating system for a couple of hours and losing the ability to use my workstation normally. What to do?

In the end I decided that given my current familiarity with dd — thanks to the crash course I’ve been taking on tape drives for the last couple of years — I could just go back to image files stored on one giant backups partition. While exFAT might seem like the best choice for that partition, support for it is still spotty on the big three operating systems at this moment; meanwhile, FAT32’s inability to store files larger than 4 GB rules it out, and the persistent lack of a reliable Windows filesystem driver for ext4 leaves NTFS as the best choice.

I also realised that it would be nice to pipe the images through a fast compressor on their way to the backup disk in order to stretch it a bit farther. While I knew that bzip2, being roughly equivalent in compression ratio to WinRAR, would take forever to process tens of gigabytes of data, I was hoping that gzip might be up to the job on its fastest setting. My research quickly showed it to be too slow, with one clear contender for the job of compressing this firehose of data without slowdown: lz4. Thanks to Cygwin’s binary package, I was quickly able to test it on one of my image files and found its compression surprisingly good: just less than 2:1 for a 48 GB Windows 7 partition.
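A rough way to repeat this kind of comparison is sketched below: it runs each compressor at its fastest setting over a sample file and reports the output size and elapsed whole seconds. The function name and the sample path in the usage comment are hypothetical, and one-second timing resolution is only meaningful on multi-gigabyte inputs.

```shell
#!/bin/sh
# Sketch: compare fast compressors on one sample file.
# bench is a hypothetical helper; pass the sample path, then the tool names.
bench() {
    sample=$1
    shift
    size=$(wc -c < "$sample" | tr -d ' ')
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool: not installed"
            continue
        fi
        start=$(date +%s)
        # -1 selects the fastest level; -c writes to stdout and keeps the input
        packed=$("$tool" -1 -c "$sample" | wc -c | tr -d ' ')
        end=$(date +%s)
        echo "$tool: $size -> $packed bytes in $((end - start))s"
    done
}

# Hypothetical usage (the image path is an assumption):
#   bench /cygdrive/e/win7.img gzip bzip2 lz4
```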

Sadly, Cygwin turned out to be pioneering in this respect: packaged lz4 builds for any but the most mainstream Linux distributions are hard to come by. Although it’s already replaced gzip for kernel compression in some distributions, it’s still a fringe tool in user space despite its minimalist awesomeness.

I hope you’ll forgive me if I digress to point out that David Bryant’s WavPack is, in my opinion, a game-changer for musicians working on computers: I’ve now been using its --fast mode inside all my recording projects for about five years. When I point out that some of these projects are 80-track 24/96 monsters that only just manage real-time playback on a quad-core 3 GHz system, you’ll understand how important it is not to waste CPU time frivolously, and therefore how seriously I had to think before adopting this practice.

As far as I can tell, lz4 is the equivalent for non-audio data, and because disk capacity has been outgrowing memory bandwidth and CPU power for so long, large amounts of data generally need to be processed as quickly as possible to make them worth handling at all. I therefore expect lz4 to quietly pop up all over the place during the next few years.

Prognostication aside, I was now faced with the problem of finding a portable Linux distribution suitable for my needs with a package manager for which lz4 was available; this turned out to be such a time-consuming research job as to be effectively impossible, so I girded my loins and faced up, quaking in my boots, to the only other option: building from source.

Building from source code is something I’ve studiously avoided on every form of Linux. As far as I can tell from afar, it’s the kind of mutant Arcturan MegaYak that can eat weeks of time before spitting one out unsatisfied, still wondering dazedly which configure options will make the damned thing compile properly, let alone link.

In this case I was wrong again: on the two or three portable distributions I tried, the hardest part was installing development tools. You can only imagine my astonishment when cloning the repository from GitHub and running one make install command in the programs directory resulted in ready-to-use lz4 and lz4c binaries.
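In outline, the build amounts to something like the following sketch. The repository URL is my assumption about where the project lives on GitHub, the function name and working directory are placeholders, and the install step will usually need root.

```shell
#!/bin/sh
# Sketch: fetch lz4 and install it from the programs directory.
# build_lz4 is a hypothetical wrapper; it runs in a subshell so the
# caller's working directory is left alone.
build_lz4() (
    workdir=${1:-$HOME/src}        # assumed scratch location
    mkdir -p "$workdir" || exit 1
    cd "$workdir" || exit 1
    # Assumed repository location; adjust if the project has moved.
    git clone https://github.com/lz4/lz4.git || exit 1
    make -C lz4/programs install
)

# Hypothetical usage, once git, make and a C compiler are installed:
#   build_lz4 /tmp/build
```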

I now have a USB3 external disk that I can plug into any machine to boot Slacko Puppy. Once there, I mount the images partition by clicking a desktop icon, start a terminal, and run:

dd if=/dev/sda1 bs=1M | lz4 -1 > "/mnt/$BACKUP_TARGET/$IMG_NAME.lz4"
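The reverse direction can be sketched in the same style. The two function names below are hypothetical, and the verification only makes sense while the partition is still untouched, for example straight after taking the backup and before rebooting into Windows.

```shell
#!/bin/sh
# Sketch: check an lz4 image against its partition, or write it back.
# verify_image and restore_image are hypothetical helper names.

# Decompress the image and compare it byte for byte with the partition.
verify_image() {
    lz4 -dc "$1" | cmp - "$2"
}

# Decompress the image over the target. This destroys the target's contents.
restore_image() {
    lz4 -dc "$1" | dd of="$2" bs=1M
}

# Hypothetical usage, with the same variables as the backup command:
#   verify_image  "/mnt/$BACKUP_TARGET/$IMG_NAME.lz4" /dev/sda1
#   restore_image "/mnt/$BACKUP_TARGET/$IMG_NAME.lz4" /dev/sda1
```

Being able to verify without a spare disk is the whole point: it removes the restore-to-be-sure ritual that made me give up on GHOST's image files.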

When backing up a build from SSD, writing to the target HDD is the bottleneck, so lz4 actually improves throughput by reducing the ratio of data written to data read. There’s another major benefit if you’re using an SSD that (a) implements TRIM properly and (b) does so in RZAT (Read Zeroes After TRIM) mode: unused space reads as all zeroes, allowing lz4 to encode unused portions of the disk quickly and with minimal waste in the resulting image.
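To get a feel for how cheaply zeroed space is stored, a sketch like the one below pipes a run of zeroes through a compressor and prints the compressed size in bytes. The function name is hypothetical, and the hdparm line in the comment is my assumption about a convenient way to check what a drive reports about TRIM.

```shell
#!/bin/sh
# Sketch: how many bytes does a block of zeroes occupy once compressed?
# zero_image_size is a hypothetical helper: size in MiB, then the tool
# to use (defaults to lz4).
zero_image_size() {
    mib=$1
    tool=${2:-lz4}
    dd if=/dev/zero bs=1M count="$mib" 2>/dev/null | "$tool" -1 | wc -c | tr -d ' '
}

# Hypothetical usage:
#   zero_image_size 1024        # 1 GiB of zeroes through lz4 -1
#   zero_image_size 1024 gzip   # the same through gzip -1
#
# Assumed check of a drive's reported TRIM behaviour (needs root):
#   hdparm -I /dev/sda | grep -i trim
```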