Running UNMAP on snapshotted VMware hardware 11+ thin VMs may cause them to inflate to full size

Scenario:

  1. ESXi 6.X (6.5 in my case)
  2. VMware hardware 11+ (13)
  3. Thin VM
  4. UNMAP-aware OS (Windows 2012+)

If you snapshot a VM and run UNMAP (for example, a retrim from the defrag utility), the VM may (not always) inflate to full size during the snapshot commit. It also results in really long commit times.

I’ve seen it happen quite a few times, and it’s really annoying if, for some historical reason, you have tons of free space on drives (for example, NTFS dedup was enabled long after deployment); it may even cause datastores to become full (needless to say, really bad). It also tends to happen during backup windows that keep snapshots open for quite a while (usually at night, terrible). I could disable automatic retrim (bad with lots of small file operations, since normal UNMAP isn’t very effective on them due to alignment issues) or UNMAP altogether (even worse), but it’s an acceptable risk for now as long as you keep enough free space on the datastore to absorb the inflation of the biggest VM. You can retrim after the snapshot commit and the VM drops back to normal size quickly (minutes); a manual retrim can be as simple as the sketch below.
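For reference, a minimal sketch of that manual retrim on a Windows 2012+ guest. The drive letter is an example, and the scheduled-task approach to disabling automatic retrim is my assumption about a stock Windows install, not something from the setup described above:

    # Manual retrim after the snapshot commit has finished; D: is an example,
    # run it once per volume that inflated.
    Optimize-Volume -DriveLetter D -ReTrim -Verbose

    # Automatic retrim rides along the scheduled defrag maintenance task;
    # disabling it also disables scheduled defrag, so weigh the trade-off.
    Disable-ScheduledTask -TaskPath '\Microsoft\Windows\Defrag\' -TaskName 'ScheduledDefrag'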

I haven’t seen this documented anywhere else, but I guess I’ll put together a reproducible PoC, contact VMware support and post an update.

vSphere 6.5 virtual NVMe does not support TRIM/UNMAP/Deallocate

Update 2018.10.15

It works more or less fine in 6.7. Known issues/notes so far:

  • Ugly warnings/errors in the Linux kernel log if Discard is blocked (snapshot create/commit) – harmless
  • The Linux NVMe controller has a default timeout of 30s. With VMware Tools, only SCSI gets increased to 180s, so you might want to manually increase the nvme module timeout just in case (see the sketch after this list). “CRAZY FAST, CRAZY LOW LATENCY!!!” you scream? Well, fabrics and transport layers may still have hiccups, and tolerating transient issues might be better than being broken.
  • When increasing VMDK sizes, the Linux NVMe driver doesn’t notice the namespace resize. Newer kernels (4.9+ ?) can rescan the controller via sysfs (also in the sketch after this list); older ones require a VM reboot.
  • One VMFS6 locking issue that may or may not be related to vNVMe. I’ll update if I remember to (or get feedback from VMware).
  • It seems to be VERY slightly faster and to have VERY slightly lower CPU overhead. It’s within the margin of error; in real life it’s basically the same as PVSCSI.
  • The nice thing is that it works with Windows 7 and Windows 2008 R2! Remember that they don’t support SCSI UNMAP, yet NVMe Discard seems to work: delete reclaims space, (ironically) manual defrag frees space, and sdelete zeroing also successfully reclaims space.
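A hedged sketch of the two Linux tweaks mentioned above. The 190s value (mirroring what VMware Tools sets for SCSI), the module parameter spelling, and the initramfs rebuild command all vary by distro and kernel version:

    # Raise the NVMe I/O timeout from the default 30s via a modprobe option:
    echo "options nvme_core io_timeout=190" | sudo tee /etc/modprobe.d/nvme-timeout.conf
    sudo update-initramfs -u    # Debian/Ubuntu; RHEL-family uses dracut -f

    # After growing a VMDK, trigger a namespace rescan on newer kernels
    # (older kernels lack the sysfs knob and need a reboot):
    echo 1 | sudo tee /sys/class/nvme/nvme0/rescan_controller
    # Equivalent with nvme-cli, if installed:
    sudo nvme ns-rescan /dev/nvme0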

I was playing with guest TRIM/UNMAP the other day and looked at the new shiny virtual NVMe controller. While it would not help much in my workloads, cutting overhead never hurts. So I tried to run “defrag /L” in a VM and it returned that the device doesn’t support it.

So I looked up the release notes. Virtual NVMe device: “Supports NVMe Specification v1.0e mandatory admin and I/O commands”.

The thing is that the part of NVMe that deals with Deallocate (ATA TRIM/SCSI UNMAP in NVMe-speak) is optional. So back to PVSCSI for space savings… A quick way to check what the guest actually sees is below.
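If you want to double-check from a Linux guest whether a virtual controller actually advertises discard, a quick look (sda is an example device name):

    # Non-zero DISC-GRAN/DISC-MAX columns mean the device accepts discard:
    lsblk --discard
    # Same information from sysfs for a single disk (0 = no discard support):
    cat /sys/block/sda/queue/discard_max_bytes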