Working around slow PST import in Exchange Online

If you’ve tried Exchange Online PST import then you probably know that it’s as slow as molasses in January and sucks in pretty much every way.

  • “PST file is imported to an Office 365 mailbox at a rate of at least 1 GB per hour” is pure fantasy. 0,5 GB per hour should be considered excellent throughput, and in test runs I achieved only ~0,3 GB/h. PSTs in a single batch seem to be imported almost serially, with very little parallelism.
  • Security & Compliance Center is just unusably slow.
  • I had to wait 5 days for the Mail Import Export role to propagate before Import activated. The documented 24 hours? You wish.
  • Feedback
  • I’ll just stop here…

I had a dataset to import and I didn’t plan to wait a month, so I looked around a bit. The only hint was a Google result I have since lost, which said you should split imports into separate batches. However, the GUI is so slow that doing this by hand is just infeasible. So I went poking around in the backend.

This blog looked promising and quite helpful, but it was concerned with other limitations of GUI import. Nevertheless, you should read it to understand the workflow.

PowerShell access exists and works quite well. There’s talk of a “New-o365MailboxImportRequest” cmdlet, but that’s just ancient history. New-MailboxImportRequest works fine; only the source syntax differs from the on-prem version.

Notes:

  • You MUST use generic Azure Blob Storage. The autoprovisioned one ONLY works with the GUI. If you try to access it via PowerShell, you just get a 403 or 404 error for whatever reason.
  • Generate one batch per PST.
  • Azure blob names are case sensitive. Keep that in mind when creating your mapping tables.
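
Since blob names are case sensitive, a pre-flight check of the mapping table can catch mismatches before any import request is created. The following is a minimal sketch, assuming you already have the real blob names as a plain list (for example copied from a storage listing). The function name Test-PstMapping and the sample file names are made up for illustration and are not part of my original script. It relies on the difference between -ccontains (case sensitive) and -contains (case insensitive).

```powershell
# Hypothetical helper: compares PST names from the mapping table against the
# real blob names and reports names that are missing or differ only by case.
function Test-PstMapping {
    param(
        [string[]]$MappedNames,   # Name column from the mapping CSV
        [string[]]$BlobNames      # actual blob names in the container
    )
    foreach ($name in $MappedNames) {
        if ($BlobNames -ccontains $name) {
            $status = 'OK'               # exact, case-sensitive match
        } elseif ($BlobNames -contains $name) {
            $status = 'CaseMismatch'     # matches only when case is ignored
        } else {
            $status = 'Missing'
        }
        [pscustomobject]@{ Name = $name; Status = $status }
    }
}

# Made-up sample data
$blobs  = 'Alice.pst', 'bob.PST'
$mapped = 'Alice.pst', 'bob.pst', 'carol.pst'
$result = Test-PstMapping -MappedNames $mapped -BlobNames $blobs
$result | Format-Table -AutoSize
```

Anything not reported as OK would most likely fail at import time, so it is cheaper to fix the mapping table first.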

So in the end I ran something like this. The script had a lot of additional logic, but I cut the parts unrelated to the problem at hand.

#base URL for PSTs, your blob storage
$azblobaccount = 'https://blablabla.blob.core.windows.net/blablabla'
#the one like '?sv=...'
$azblobkey = 'yourSASkey'
#I used mapping table just as in Microsoft instructions and adapted my script. My locale uses semicolon as separator
$o365mapping = Import-Csv -Path "C:\Dev\o365mapping.csv" -Encoding Default -Delimiter ';'
ForEach ($account in $o365mapping) {
	#In case you have some soft-deleted mailboxes or other name collisions, get real mailbox name
	$activename= (get-mailbox -identity $account.mailbox).name
	#Name = PST filename
	#CASE SENSITIVE!!!
	$pstfile = ($azblobaccount + '/' + $account.name)
	#Just to differentiate jobs
	$batch = $account.mailbox
	#targetrootfolder and baditemlimit are optional. Batchname might be optional but I left it in just in case
	new-mailboximportrequest -mailbox $activename -AzureBlobStorageAccountUri $pstfile -AzureSharedAccessSignatureToken $azblobkey -targetrootfolder '/' -baditemlimit 50 -batchname $batch
}

So how did it work? Quite well actually. I had 68 PSTs to import (total of ~350GB). Creating all batches took roughly an hour as I hit command throttling. But as the jobs already created were running in the meantime, it didn’t really matter.
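
In my run, throttling only slowed the loop down. If it surfaces as errors in your tenant instead, a retry wrapper with a growing delay is one way to keep the loop going. This is a sketch under that assumption only; Invoke-WithRetry, its parameters and the delays are hypothetical and were not part of my script.

```powershell
# Hypothetical wrapper: runs a script block and retries with a growing delay
# when it throws, e.g. because the service rejects a throttled command.
function Invoke-WithRetry {
    param(
        [scriptblock]$Action,
        [int]$MaxAttempts = 5,
        [int]$InitialDelaySeconds = 30
    )
    $delay = $InitialDelaySeconds
    for ($attempt = 1; $attempt -le $MaxAttempts; $attempt++) {
        try {
            return & $Action
        } catch {
            if ($attempt -eq $MaxAttempts) { throw }   # give up, surface the last error
            Write-Warning "Attempt $attempt failed: $($_.Exception.Message). Retrying in $delay s."
            Start-Sleep -Seconds $delay
            $delay = $delay * 2                        # simple exponential backoff
        }
    }
}

# Usage sketch (needs an Exchange Online session; -ErrorAction Stop turns
# cmdlet errors into exceptions so the wrapper can catch them):
# Invoke-WithRetry -Action {
#     New-MailboxImportRequest -Mailbox $activename -AzureBlobStorageAccountUri $pstfile `
#         -AzureSharedAccessSignatureToken $azblobkey -BatchName $batch -ErrorAction Stop
# }
```

Note that the wrapper retries on any error, not just throttling, so keep the attempt count low.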

 (get-mailboximportrequest|measure).count
68

Exchange Online seems to distribute batches heavily across servers, which helps parallel throughput a lot.

((Get-MailboxImportRequest|Get-MailboxImportRequestStatistics).targetserver|select -unique|measure).count
65

As Exchange Online is quite restricted in resources, expect some imports to be stalled at any given time.

Get-MailboxImportRequest|Get-MailboxImportRequestStatistics|group statusdetail|ft count,name -auto

Count Name
----- ----
   43 CopyingMessages
   13 Completed
    8 StalledDueToTarget_Processor
    1 StalledDueToTarget_MdbAvailability
    2 StalledDueToTarget_DiskLatency
    1 CreatingFolderHierarchy

And now numbers

((Get-MailboxImportRequest|Get-MailboxImportRequestStatistics).BytesTransferredPerMinute|%{$_.tostring().split('(')[1].split(' ')[0].replace(',','')}|measure -sum).sum / 1GB
1,41345038358122

That’s 1,4 GB per minute, or roughly 85 GB per hour. That’s like… well over a hundred times faster than the GUI import. I checked it at a random point when the import had been running for a while and some smaller PSTs were already complete. Keep in mind that large PSTs run relatively slower and may still take a while to complete. When processing the last and largest PSTs, throughput slowed to ~0,3 GB/min, but that’s still a lot faster than the GUI. Throughput scales with the number of parallel batches, so more jobs would probably result in even better throughput.
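
The throughput one-liner above picks the byte count out of the size string by splitting on '(' and ' '. A slightly more readable variant is sketched below, assuming the values look like '21.5 MB (22,544,384 bytes)', which is the format the one-liner depends on. ConvertTo-ByteCount is a hypothetical helper name and the sample strings are made up.

```powershell
# Hypothetical helper: pulls the exact byte count out of a size string such as
# '21.5 MB (22,544,384 bytes)'. Returns 0 when the text does not match.
function ConvertTo-ByteCount {
    param([string]$SizeText)
    if ($SizeText -match '\(([\d,]+) bytes\)') {
        return [int64]($Matches[1] -replace ',', '')
    }
    return [int64]0
}

# Made-up sample values standing in for BytesTransferredPerMinute
$samples = '21.5 MB (22,544,384 bytes)', '1.2 GB (1,288,490,188 bytes)', ''
$bytesPerMinute = ($samples | ForEach-Object { ConvertTo-ByteCount $_ } | Measure-Object -Sum).Sum
$gbPerHour = $bytesPerMinute * 60 / 1GB
'{0:N2} GB per hour' -f $gbPerHour
```

Against a live session, the sample array would be replaced by (Get-MailboxImportRequest|Get-MailboxImportRequestStatistics).BytesTransferredPerMinute, converted to strings as in the one-liner.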

Outlook Auto-Mapping and delegation to groups

As discussed here, Outlook doesn’t auto-load a delegated mailbox if the delegation target is a group.

In the backend, Exchange populates the msExchDelegateListLink attribute of the delegated mailbox’s user object with links to the delegated users, based on DN. However, it is not populated for groups, as Exchange is not directly aware of group membership changes. As a workaround, you can populate it yourself with a scheduled job. Here’s a script for that.

Notes:

  • It adds group member DNs to the msExchDelegateListLink attribute and also cleans up removed members (both direct and group members)
  • Logging and internal comments have been removed
  • Script is quite expensive (resource-time wise), in my environment it takes 2-3 minutes to run.
  • I have scheduled it to run every 2-3 hours, adjust to your requirements.
    Outlook should pick up changes in a few minutes after run.
  • Run visible mailbox size checker first so you don’t blow user’s default 50GB OST limit.
  • I’m running Exchange 2016 but 2010 SP1 and up should work.
  • This script writes directly to your AD. Understand and test the script first, and understand the risks.
  • You need to load the Exchange PowerShell snap-in or a remote management session first.

Function Populate-msExchDelegateListLink {
	$MailboxList = get-Mailbox -ResultSize Unlimited
	ForEach ($Mailbox in $MailboxList) {
		$mailboxpermissions = get-mailboxpermission -identity $mailbox.name | where isinherited -EQ $false | where accessrights -EQ 'FullAccess'
		$UserMembers = @()
		$GroupMembers = @()
		ForEach ($MailboxPermission in $mailboxpermissions) {
			$NormalizedName = $mailboxpermission.user.ToString().split('\')[1]
			#This is dumb but... it works!
			$CheckIfGroup = $(Try {Get-AdGroup -Identity $NormalizedName} Catch {$null})
			$CheckIfUser = $(Try {Get-Aduser -Identity $NormalizedName} Catch {$null})
			If ($CheckIfGroup) {
				$GroupMembers += $CheckIfGroup.DistinguishedName
			} ElseIf ($CheckIfUser) {
				$UserMembers += $CheckIfUser.DistinguishedName
			}
		}
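		Foreach ($GroupMember in $GroupMembers) {
			$GroupMemberShip = (Get-ADGroupMember -Identity $GroupMember -Recursive | Where-Object 'ObjectClass' -EQ 'user' | Where-Object 'DistinguishedName' -NE $mailbox.DistinguishedName).DistinguishedName
			#Skip empty groups so no $null entries end up in the list
			If ($GroupMemberShip) { $UserMembers += $GroupMemberShip }
		}
		#A user can be delegated directly and via several groups, keep each DN only once
		$UserMembers = @($UserMembers | Select-Object -Unique)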
		$MailboxDelegateList = (Get-ADUser -Identity $Mailbox.DistinguishedName -Properties msExchDelegateListLink).msExchDelegateListLink
		ForEach ($MailboxDelegateListEntry in $MailboxDelegateList) {
			If ($UserMembers -notcontains $MailboxDelegateListEntry) {
				Set-ADUser -Identity $Mailbox.DistinguishedName -Remove @{msExchDelegateListLink="$MailboxDelegateListEntry"}
			}
		}
		ForEach ($UserMember in $UserMembers) {
			If ($MailboxDelegateList -notcontains $UserMember) {
				Set-ADUser -Identity $Mailbox.DistinguishedName -Add @{msExchDelegateListLink="$UserMember"}
			}
		}
	}
}
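
The core of the script is a comparison between the DNs currently stored in msExchDelegateListLink and the DNs that should be there. That step can be sketched on its own, without touching AD, which makes it easy to dry-run. Get-DelegateListDelta is a hypothetical helper name and the example.com DNs are made up; this is an illustration of the idea, not a replacement for the script above.

```powershell
# Hypothetical helper: given the DNs currently in msExchDelegateListLink and
# the DNs that should be there, returns what to add and what to remove.
function Get-DelegateListDelta {
    param(
        [string[]]$Current = @(),
        [string[]]$Desired = @()
    )
    # Drop empty entries and duplicates from the desired list
    $wanted = @($Desired | Where-Object { $_ } | Select-Object -Unique)
    [pscustomobject]@{
        Add    = @($wanted  | Where-Object { $Current -notcontains $_ })
        Remove = @($Current | Where-Object { $_ -and $wanted -notcontains $_ })
    }
}

# Made-up sample DNs
$current = 'CN=Alice,OU=Users,DC=example,DC=com', 'CN=Bob,OU=Users,DC=example,DC=com'
$desired = 'CN=Bob,OU=Users,DC=example,DC=com', 'CN=Carol,OU=Users,DC=example,DC=com', 'CN=Carol,OU=Users,DC=example,DC=com'
$delta = Get-DelegateListDelta -Current $current -Desired $desired
"Add:    $($delta.Add -join '; ')"
"Remove: $($delta.Remove -join '; ')"
```

Printing the delta for a few mailboxes before letting anything call Set-ADUser is a cheap sanity check. Set-ADUser also supports -WhatIf if you want a dry run of the actual write.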

Calculating size of user’s mailbox and any delegated mailboxes

Outlook by default limits OST to 50GB (modern versions) but some users may have tons of delegated mailboxes and run into this limit. This script retrieves users that have more than 50GB of delegated and personal mailboxes visible. You might not want to increase OST limit for everyone…

A possible use case is a situation where you have delegated several large mailboxes to multiple users. Rather than waiting for tickets to come in as the mailboxes grow, you want to find the problematic users proactively.

This really becomes an issue when you delegate mailboxes to groups. I’ll post script to update msExchDelegateListBL for group memberships in a few days as Exchange doesn’t do that automatically. TL;DR: If you delegate mailbox to group, it doesn’t get autoloaded by Outlook. I have a script to remediate that.

Remarks:

  • This is a slow and ugly one-off. But as I only needed it once, it just works. As always, read the disclaimer on the left.
  • You need Exchange Management Tools installed on your PC. It doesn’t work with remote management PowerShell session as you don’t have proper data types loaded. Install management tools on your PC and run Exchange Management Shell.
  • This script only looks up admin-delegated mailboxes. Any folders, mailboxes or public folders shared and loaded by users themselves are not included. This is a server-side view only.

$userlist = get-aduser -Filter *
foreach ($user in $userlist) {
	$usermailbox = get-mailbox $user.distinguishedname 2>$null
	If ($usermailbox) {
		$DelegationList = (get-aduser -Identity $user.distinguishedname -Properties msExchDelegateListBL).msExchDelegateListBL
		If ($DelegationList) {
			$usermailboxsize = (Get-mailboxstatistics -identity $usermailbox | select @{label="TotalSizeBytes";expression={$_.TotalItemSize.Value.ToBytes()}}).TotalSizeBytes
			$SharedSize = ($DelegationList | %{get-mailbox -Identity $_ | Get-MailboxStatistics | select displayname,@{label="TotalSizeBytes";expression={$_.TotalItemSize.Value.ToBytes()}},totalitemsize} | measure -sum totalsizebytes).sum
			$TotalVisibleSize = ( ($usermailboxsize + $SharedSize) / 1GB)
			If ($TotalVisibleSize -gt 50) {
				Write-Host $user.Name
				Write-Host $TotalVisibleSize
			}
		}
	}
}
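
The script above prints names and sizes with Write-Host, which is fine for a one-off. If you want to sort or export the result, emitting objects is handier. Below is a minimal sketch of just the size arithmetic, assuming the byte counts have already been collected as in the script. New-VisibleSizeRow and the sample numbers are hypothetical.

```powershell
# Hypothetical helper: turns byte counts into a report row and flags users
# whose visible mailboxes together exceed the limit (50 GB by default).
function New-VisibleSizeRow {
    param(
        [string]$User,
        [double]$OwnBytes,
        [double]$SharedBytes,
        [double]$LimitGB = 50
    )
    $totalGB = ($OwnBytes + $SharedBytes) / 1GB
    [pscustomobject]@{
        User      = $User
        TotalGB   = [math]::Round($totalGB, 1)
        OverLimit = $totalGB -gt $LimitGB
    }
}

# Made-up numbers: 30 GB own mailbox plus 25 GB of delegated mailboxes
$row = New-VisibleSizeRow -User 'alice' -OwnBytes (30 * 1GB) -SharedBytes (25 * 1GB)
$row | Format-Table -AutoSize
```

Rows built this way can be piped to Sort-Object or Export-Csv instead of being read off the console.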