DZ – Page 3 – Don Zoomik

Windows 7 refuses to connect to 802.1X network if server certificate’s subject is empty

If the following are true…

Windows 7 connects to 802.1X enabled network
EAP method has something to do with TLS (PEAP, EAP-TLS…)
Server certificate’s subject field is empty

…then Windows 7 will refuse to connect with useless error messages. You’ll just have to know that Windows 7 doesn’t accept server certificate with empty subject. Some Certificate Services templates (Kerberos Authentication) keep subject empty by default so watch out if you have NPS on DC for example. Windows 8.1+ will work fine.

There’s little information about it online and the issue is quite hard to track down.

vSphere 6.5 guest UNMAP may cause VM I/O latency spikes – fixed in update 1

I converted some VMs to thin and upgraded VM hardware version to 13 to test out savings. Initial retrim caused transient I/O slowdown in VM but the issue kept reappearing randomly. I/O latency just spikes to 400ms for minutes for no apparent reason. It also seems to affect other surrounding VMs, just not as badly. After several days, I converted VMs back to thick and issues disappeared.

I’m not sure where the problem is and I can’t look into it anymore. Might be a bug in vSphere. Might be the IBM v7000 G2 SAN that goes crazy. As I said, I cannot investigate it any further but I’ll update the post if I ever hear anything.

PS! Savings were great, on some systems nearly 100% from VMFS perspective. On some larger VMs with possible alignment issues, reclamation takes several days though. For example, a 9TB thick file server took 3 days to shrink to 5TB.

Update 2017.o6.29:

Veeam’s (or Anton Gostev’s) newsletter mentioned a similar issue just as I came across this issue again in a new vSphere cluster. In the end VMware support confirmed the issue with expected release of 6.5 Update 1 at the end of July.

Update much later in november

I’ve been running Update 1 since pretty much release date and UNMAP works great! No particular performance hit. Sure, it might be a bit slower during UNMAP run but it’s basically invisible for most workloads.

I’ve noticed that for some VM’s, you don’t space back immediately. On some more internally fragmented huge (multi-TB) VMs, particularly those with 4K clusters, space usage seems to reduce slowly over days or weeks. I’m not sure what’s going on but perhaps ESXi is doing some kind of defrag operation in VMDK…? And yeah, doing a defrag (you can do it manually form command line in Windows 2012+) and then UNMAP helps too.

vSphere 6.5 virtual NVMe does not support TRIM/UNMAP/Deallocate

Update 2018.10.15

It works more-less fine in 6.7. Known issues/notes so far:

Ugly warning/errors is Linux kernel log if Discard is blocked (snapshot create/commit) – harmless
Linux NVMe controller has a default timeout of 30s. With VMTools, only SCSI gets increase to 180s so you might want to manually increase nvme module timeout just in case. “CRAZY FAST, CRAZY LOW LATENCY!!!” you scream? Well fabrics and transport layers still may have hickups and tolerating transient issues might be better than being broken.
When increasing VMDK sizes, Linux NVME driver doesn’t notice namespace resize. Newer kernels (4.9+ ?) have configuration device to rescan, older require VM reboot
One VMFS6 locking issue that may or not be related to vNVME. Will update if I remember to (or get feedback from VMware).
It seems to be VERY slightly faster and have VERY slightly lower CPU overhead. It’s within the margin of error, in real life it’s basically the same as PVSCSI.
The nice thing is that it works with Windows 7 and Windows 2008 R2! Remember that they don’t support SCSI UNMAP. However NVME Discard seems to work. Delete reclaims space, (ironically) manual defrag frees space, also sdelete zero successfully reclaims space.

I was playing with guest TRIM/UNMAP the other day and looked at new shiny virtual NVMe controller. While it would not help much in my workloads, cutting overhead never hurts. So I tried to do “defrag /L” in VM and it return that device doesn’t support it.

So I looked up release notes. Virtual NVMe device: “Supports NVMe Specification v1.0e mandatory admin and I/O commands”.

The thing is that NVMe part that deals with Deallocate (ATA TRIM/SCSI UNMAP in NVMe-speak) is optional. So back to pvscsi for space savings…

An unpopular opinion about Vista

I have said it again and again. I think Vista was not a bad OS at all. Not the greatest but somewhere between good and great.

While I missed very early teething issues, I did catch a few. I didn’t get to use Vista until I completed my military service, in summer of 2007. This was the first and last OS that caused me to say “wow” on first boot. It just looked so great! Sure, Linux had all the bells and whistles and XP had WindowBlinds but they never looked as clean and classy. But to get that far, I had to remove some RAM as setup hung when you had more than 2GB… And then I got a BSOD due to Bluetooth stack. 🙂
I did keep on using Vista personally until a few months after 7 came out.

I did plenty of Vista rollouts in 2008 and 2009 and… it worked great. By that time SP1 was out and drivers had stabilized. On most of hardware it ran just fine. Maybe not as fast but XP the difference was not noticeable and people actually liked Vista. For most of enterprises, I think it was a mistake to skip Vista. As tooling and many OS concepts had changed considerably, I saw many people complaining after Windows 7 release. They hadn’t even touched Vista and were surprised how similar Vista and 7 were.

Security was better. UAC was actually great (it had some nice side-effects). Quite a few features actually became usable compared to XP. It had some nice features for sysadmins that went relatively unnoticed. On the other hand, early tools sucked big time. Later WAIKs were much better and by SP2 it pretty much looked as it does today.

I switched jobs in 2010 and didn’t get to professionally touch Vista since. Kind of sad actually. Technology was solid but teething issues caused an unrecoverable PR nightmare.

Clearing Offline Files temporary files from script

There’s a nice button “Delete temporary files” in GUI to clear automatically cached data but no public information how to invoke it from script/API.
I found some nice WMI documentation and after some experimentation I came up with this.
It only runs from admin context. If you want to run it from regular user context, modify flags according to documentation (use only 0x00000002 flag).
It might be a little faster if you filter item list to only include servers (add -Filter ‘itemtype=3’) as default list includes whole UNC trees but I didn’t test it out.

$CSCItemList=(gwmi win32_offlinefilesitem).ItemPath
$CSCWMI = [wmiclass]'\\.\root\cimv2:win32_offlinefilescache'
#0x00000002+0x80000000 to Base10 eq 2147483650
$CSCWMI.DeleteItems($CSCItemList,2147483650)

Workaround script to clean up SCCM 1610 orphaned cache

SCCM 1610 at launch had a bug that caused agent upgrades to forget about cached content. Cached data stays behind until you clean it up manually, not cool for small SSDs. More here https://support.microsoft.com/en-us/kb/3214042

So I wrote a small script to roll out with compliance and remove stale data.

Seems to work but test before use. See comments for PowerShell 2.0 fix.

$CCMCache = (New-Object -ComObject "UIResource.UIResourceMgr").GetCacheInfo().Location
#For some reason it doesn't properly directly select required attribute for returned multi-instance object so I have to loop it. Some strange COM-DotNet interop problem?
$ValidCachedFolders = (New-Object -ComObject "UIResource.UIResourceMgr").GetCacheInfo().GetCacheElements() | ForEach-Object {$_.Location}
$AllCachedFolders = (Get-ChildItem -Path $CCMCache -Directory).FullName

ForEach ($CachedFolder in $AllCachedFolders) {
    If ($ValidCachedFolders -notcontains $CachedFolder) {
        Remove-Item -Path $CachedFolder -Force -Recurse
    }
}

Script to modify SCCM client cache ACL for Peer Cache

SCCM 1610 now supports inter-node content sharing without BranchCache or 3rd party tools. Annoying part is that you have to modify client cache ACL. I threw together some quick-n-dirty bits in a few minutes and it didn’t blow in my face just yet. I rolled it out with a compliance baseline to some pilot systems and it seems to work.
Caution is advised as I didn’t test it fully yet (or if Peer Cache actually works properly). It just adds required ACE for your SCCM network access account.

#SCCM Network Access account. I think it's not possible to query it from client
$NetworkUserAccount = New-Object System.Security.Principal.NTAccount("DOMAIN\User")
#SCCM Cache path from WMI. It's pretty much the same always but just in case...
$CCMCache = (New-Object -ComObject "UIResource.UIResourceMgr").GetCacheInfo().Location

#Enums for NTFS ACLs, static stuff. Could do better but stringbased cast works fine
$ACLFileSystemRights = [System.Security.AccessControl.FileSystemRights]::FullControl
$ACLAccessControlType = [System.Security.AccessControl.AccessControlType]::Allow 
$ACLInheritanceFlags = [System.Security.AccessControl.InheritanceFlags]"ContainerInherit, ObjectInherit"
$ACLPropagationFlags = [System.Security.AccessControl.PropagationFlags]::InheritOnly

#If cache folder doesn't exist, quit with error
If (!(Get-Item -Path $CCMCache)) {
    Exit 1
}

#Current ACL
$ACL = Get-Acl -Path $CCMCache

#Check if ACL already has required entry. If it has, quit cleanly
If ($ACL.Access | Where-Object -FilterScript {
    #Specific checks
    $_.FileSystemRights -eq $ACLFileSystemRights -and 
    $_.AccessControlType -eq $ACLAccessControlType -and
    $_.IdentityReference -eq $NetworkUserAccount -and
    $_.InheritanceFlags -eq $ACLInheritanceFlags -and
    $_.PropagationFlags -eq $ACLPropagationFlags
    }
) {
    #ACL entry exists
    Exit 0
} Else {
    #Modify ACL
    $ACE = New-Object System.Security.AccessControl.FileSystemAccessRule ($NetworkUserAccount, $ACLFileSystemRights, $ACLInheritanceFlags, $ACLPropagationFlags, $ACLAccessControlType) 
    $ACL.AddAccessRule($ACE)
    Set-Acl -Path $CCMCache -AclObject $ACL
}

IBM Tivoli Storage Manager excludes most VSS protected files

Let’s say we’re using IBM TSM with agents on Windows. It supports VSS snapshots so you might expect that when you perform backup, you can restore any file in system.

Wrong!

TSM will hard-exclude any VSS-protected files except for a short list of supported inbox writers. Most recent list is here:
http://www.ibm.com/support/knowledgecenter/SSGSG7_7.1.0/com.ibm.itsm.client.doc/t_bac_sysstate.html
Don’t worry, it hasn’t changed since ever. I count 16.

And now take a look at just the list of Windows inbox writers:
https://msdn.microsoft.com/en-us/library/windows/desktop/bb968827
I currently counted 34 items (it may change in future).
WDS, WID, RMS, Certificate Services are absent in IBM’s list for example.

Now think VSS aware products, like SQL Server, Oracle, Exchange among big names. In some cases you just might not care about application-specific backups, application consistent VSS file-based backup will do just fine. SQL Server database crashed? OK, lets copy database files back in place, start engine – good enough.

Now what will Tivoli do?

VSS snapshot like pretty much every other product
Query VSS for list of writers and writer protected files
It will hard-exclude ANY file protected by ANY VSS writer not included in list

Say you have a WSUS running on WID. WID database are hard-excluded even though they are consistent in VSS snapshot. I repeat, you cannot backup these files as Tivoli will just not let you. You have a WDS to PXE boot systems? Nope. SQL Express running in simple logging mode to run some tool that you only care to have database file in backup. Tough luck, excluded.

The cynical part is that when you query TSM for excluded files, it will say excluded by operating system. No, it is not excluded by the operating system, it is excluded by IBM! When looking around in forums, the same opinion reigns. Wrong! Operating system does not exclude them. Do a backup snapshot with diskshadow and mount it. The files are there.
Also there are claims that these files should be excluded because they may be volatile and inconsistent. Wrong! The point of VSS Writers existence is to make them consistent. Not crash-consistent but cleanly consistent! Do backup snapshot with diskshadow. The files are there. They are consistent. It seems that IBM sales/marketing are really, i mean like REALLY greedy or tech guys are really incompetent.

Oh boy… I guess some guys have only seen LVM snapshots…

When we contacted support, response was “by design”. I cannot comprehend the stupidness of this response. Backup product that refuses to protect OS components.

I dug around a bit and it seems that TSM used to work fine until about version 5.5 when this “functionality” was introduced. https://adsm.org/forum/index.php?threads/files-missing-in-windows-server-2008-backup.17112

Workaround 1: PRESCEDULECMD for pretty much anything to dump or copy data before backup. The bad part is that it is only automatically invoked when backup is started from schedule.

Workaround 2: Dump TSM and get a anything else

Workaround 3: adding these options to your dsm.opt file might help. I didn’t bother to try, I voted with my wallet.
TESTFLAG VSSDISABLEEXCL
TESTFLAG SKIPSYSTEMEXCLUDE

TL;DR: After having been forced to work with Tivoli Storage Manager for a years, avoid it like plague, burn it with fire. Expensive, slow, plain stupid.

Some bugs in Windows 2016 servicing

First. When you add .Net 3.5 compontents to image, disabling them will remove them with bits. Both online and offline, with or without servicing stack updates.

In my case I add bits back to image so they are always available when required (no looking around for SxS folder etc…) but I disable all non-essential components in because they weren’t always required. Now I have to keep .Net 3.5 enabled.

Second. Disabling some components will occasionally (can’t consisntently repro) remove Server Manager. Actually it seems that when you remove too many specific components at once, Server Manager is removed. In my case I removed:

PowerShell ISE
PowerShell v2.0 engine
.Net 3.5

Nothing too bad but annoying.

WinPE manage-bde –protectors –disable C: unexpectedly enables encryption

Final update – my BIOS configuration script had a command to temporarily disable BitLocker in case configuration was applied from already deployed OS. Good old manage-bde –protectors –disable C:

However this command unexpectedly applied BitLocker to FAT32 boot volume. When querying status with manage-bde -status there is no encryption. However volume is actually encrypted. Booting to WinPE on next start would clear encryption so it only showed up when using Linux live media. Duh!

Why would it do that? Don’t know. In the end HP BIOS boots just fine and does not require ESP partition. I edited title to reflect on the actual issue.

Hold your horses! All information below is irrelevant as HP desktop BIOS seems to have a bug. It will not properly enumerate UEFI boot drives after mode switch. It may boot sometimes but not consistently. Currently only workaround is to boot to PXE after mode switch and restart TS.

~~So I was looking at this great guide on conversion from BIOS to UEFI boot in SCCM TS.~~

However my BIOS/UEFI configuration is more locked down and HP professional desktops flat out refuse to boot from plain FAT32 partitions with some options set. I’m guessing it’s because of Removable Media Boot: Disable. But still I needed to work around that. After some tinkering I discovered that boot worked fine if partition was set as EFI boot partition. However this caused WinPE to not mount it at boot. No mount, no task sequence data, fail.

~~So I created 2 partitions, first for EFI boot, second for WinPE. Then I configured BCD to point to second partition and set first as EFI boot partition. Boom, it works!~~

~~Notes:~~

~~As always, could be more efficient but good enough…~~
My configuration is a bit different. I have both x86 and amd64 WinPE data in one package (in subfolders) to support both 32bit and 64bit UEFI implementations in one package and I select relevant boot set with %PROCESSOR_ARCHITECTURE% variable. Package download size is bigger but that’s not an issue for me. This also implies that PXE WinPE image must be the same as target architecture.
In “Format and Partition Disk” step create 2 Primary partitions. First must be smaller than the second one (for example 2GB and 4GB). This is necessary as TS data is stored on the larger partition and EFI partition will not be mounted on next boot. Set first partition variable to EfiDrive and second to BootDrive

~~Call WinPE deployment script as~~



copy.cmd %EfiDrive% %BootDrive%

~~Modified copy.cmd~~



@ECHO OFF
set efidrive=%1
::S:
set bootdrive=%2
::C:
XCOPY %~dp0%PROCESSOR_ARCHITECTURE%\* /s /e /h %bootdrive%\

::https://technet.microsoft.com/en-us/library/hh265131%28v=ws.10%29.aspx

xcopy %bootdrive%\EFI\* %efidrive%\EFI\* /cherkyfs
copy %bootdrive%\windows\boot\EFI\*.efi %efidrive%\EFI\Microsoft\Boot\*
del %efidrive%\EFI\Microsoft\Boot\BCD /f

bcdedit -createstore %efidrive%\EFI\Microsoft\Boot\BCD

bcdedit -store %efidrive%\EFI\Microsoft\Boot\BCD -create {bootmgr} /d "Boot Manager"
bcdedit -store %efidrive%\EFI\Microsoft\Boot\BCD -create {globalsettings} /d "globalsettings"
bcdedit -store %efidrive%\EFI\Microsoft\Boot\BCD -create {dbgsettings} /d "debugsettings"
bcdedit -store %efidrive%\EFI\Microsoft\Boot\BCD -create {ramdiskoptions} /d "ramdiskoptions"
for /f "Tokens=3" %%A in ('bcdedit /store %efidrive%\EFI\Microsoft\Boot\BCD /create /application osloader') do set PEStoreGuid=%%A

bcdedit -store %efidrive%\EFI\Microsoft\Boot\BCD /default %PEStoreGuid%

bcdedit -store %efidrive%\EFI\Microsoft\Boot\BCD -set {bootmgr} device partition=%efidrive%
bcdedit -store %efidrive%\EFI\Microsoft\Boot\BCD -set {bootmgr} path \EFI\Microsoft\Boot\bootmgfw.efi
bcdedit -store %efidrive%\EFI\Microsoft\Boot\BCD -set {bootmgr} locale en-us
bcdedit -store %efidrive%\EFI\Microsoft\Boot\BCD -set {bootmgr} timeout 10

bcdedit -store %efidrive%\EFI\Microsoft\Boot\BCD -set {Default} device ramdisk=[%bootdrive%]\sources\boot.wim,{ramdiskoptions}
bcdedit -store %efidrive%\EFI\Microsoft\Boot\BCD -set {Default} path \windows\system32\winload.efi
bcdedit -store %efidrive%\EFI\Microsoft\Boot\BCD -set {Default} osdevice ramdisk=[%bootdrive%]\sources\boot.wim,{ramdiskoptions} 
bcdedit -store %efidrive%\EFI\Microsoft\Boot\BCD -set {Default} systemroot \windows
bcdedit -store %efidrive%\EFI\Microsoft\Boot\BCD -set {Default} winpe yes
bcdedit -store %efidrive%\EFI\Microsoft\Boot\BCD -set {Default} nx optin
bcdedit -store %efidrive%\EFI\Microsoft\Boot\BCD -set {Default} detecthal yes
bcdedit -store %efidrive%\EFI\Microsoft\Boot\BCD -displayorder {Default} -addfirst

diskpart /s "%~dp0diskpartefiboot.txt"


diskpartefiboot.txt


select disk 0
select partition 1
set id=c12a7328-f81f-11d2-ba4b-00a0c93ec93b


Voilà! It boots!
…occasionally on some systems fastfat driver doesn’t load. It’s load type is 3 – manual (ondemand). Partitions are shown as RAW and TS will fail as data cannot be loaded. Investigating.