MessageOps O365 Exchange Migrator is what I'm picking for the time being. They have a free trial (account needed), and the trial lets you see how it performs on up to 15 items per folder. You can register them as your O365 partner in the O365 portal and they'll get their cut from Microsoft, or you can buy the software outright for your organization size. The whole partner thing was pretty ill-defined, so we bought the site license.
Here we are now, with a 500-seat license for a year, and ~320 users in O365. MessageOps is pretty clear about the limitations on PST ingestion: too many threads or too much data running through a single import account will get you throttled! As such, they recommend running concurrent imports under multiple admin accounts.
OK easy enough. Give an admin account a license and a mailbox (gotta have a mailbox).
We get into the app, logging into our tenant with some pretty sweet credentials (tenant admin, gotta have a mailbox. Did I mention it has to have a mailbox?). Blam, 323 seats recognized.
The next screen is where the magic happens. It's not a pretty sight, but the support page describes every function, a heretical idea these days, it seems.
Here's their linky in case the instructions change:
http://www.messageops.com/office-365-exchange-migrator/import-pst-files-to-office-365/
CSV import! After the initial smoke test, we can take the output we got out of PST Capture Central Server, mix up the columns as needed, and ingest it straight into the tool.
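Here's a rough sketch of the "mix up the columns" step in Python. The column names on both sides are made up for illustration; check the real headers in your PST Capture export and in the MessageOps help page before trusting any of this.

```python
import csv
import io

# HYPOTHETICAL column names (old name -> new name). Neither side is taken
# from PST Capture or MessageOps documentation; replace with the real headers.
COLUMN_MAP = {
    "SourcePath": "PstPath",
    "OwnerMailbox": "TargetMailbox",
}

def remap_columns(src, dst, column_map):
    """Copy rows from src to dst, keeping only the mapped columns,
    renamed and written in the order given by column_map."""
    reader = csv.DictReader(src)
    writer = csv.DictWriter(
        dst, fieldnames=list(column_map.values()), lineterminator="\n"
    )
    writer.writeheader()
    count = 0
    for row in reader:
        writer.writerow({new: row[old] for old, new in column_map.items()})
        count += 1
    return count

# In-memory demo; for real files, open both with newline="" and pass them in.
demo_in = io.StringIO(
    "OwnerMailbox,SourcePath,Extra\njulie@example.com,D:/pst/julie.pst,x\n"
)
demo_out = io.StringIO()
rows_written = remap_columns(demo_in, demo_out, COLUMN_MAP)
print(rows_written, "row(s) written")
print(demo_out.getvalue())
```

Columns that aren't in the map (like `Extra` above) are simply dropped, which is usually what you want when the export has more fields than the import needs.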
For my jobs, I'm selecting the following options:
Left column:
Address Rewriting: Enabled
Thread Settings: 8 Max Threads, 49 Batch size (reference their help for info) [we ended up throttling down the threads to 4/5 to use the same admin account to run two jobs. This worked well and didn't seem to make import jobs run slower. It also prevented the excessive use of privileged service accounts for migration.]
Deleted items setting: (unchecked)
Folder Filtering: (Nothing specified here)
Duplicate Detection: CustId (Best)
Right Column:
I'll assume you assign the appropriate mailbox in question.
Destination: Archive
Destination Root Name: Root of Mailbox (I would like to combine folders, not set up new folder structure for each PST file)
Destination Auth: Use Administrator Credentials
Auto Grant Full Access: Enabled
Skip Auto Remove Access: Disabled (clean up permissions afterward)
Use Throttle Switching: Enabled
Use EWS Impersonation: Enabled
Error Processing - Ignore Single Property Errors: Enabled. I want to know if things really go wrong, but not so much for individual items. I see lots of issues with calendar appointments not getting recognized as the correct item type, but nothing to worry about.
Exclude Folders: Nothing selected
Email Status Notifications: my email address, not yours. Please don't send me your email notifications.
Use MAPI and EWS for Upload: Enabled (this will be set automatically by the tool when you authenticate to the cloud. There is a "right" method to use depending on whether you're on a newer or older Exchange Online tenant)
Use Direct Discover: Unchecked. See help file if you have questions.
Once your options are set, click Add to Import Queue. Once your queue (list of PSTs to be imported) is set, click Start Import.
You can see it going. It takes a minute or two to flesh out permissions and validate some steps, then it will start cooking.
Waiting for good average speed metrics. The tool reports on average rate MB/min and average rate Items/min.
Data Rate/Metrics:
I'm seeing 15 MB/min and 250 items/min on Julie's 614 MB PST file. This should wrap up in ~40 min, so I'll check it when I get home. {this job finished in 53 minutes}
Amy is running on another server, a 7313 MB PST. About 5 min in, it's humming along at 21 MB/min and 367 items/min.
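The ETA math is just size divided by rate. A quick sketch using the numbers above (the function name is mine, not anything from the tool):

```python
def eta_minutes(size_mb, rate_mb_per_min):
    """Naive estimate: assumes the current average rate holds for the whole file."""
    if rate_mb_per_min <= 0:
        raise ValueError("rate must be positive")
    return size_mb / rate_mb_per_min

# Sizes and early-run rates reported above.
for name, size_mb, rate in [("Julie", 614, 15), ("Amy", 7313, 21)]:
    print(f"{name}: ~{eta_minutes(size_mb, rate):.0f} min")
# Julie: ~41 min
# Amy: ~348 min
```

Treat the result as a floor. Julie's job actually took 53 minutes against a 41 minute estimate, so the early rate was a bit optimistic.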
Update 9/16/2014 -
Server 1: 22 MB/min
Server 2: 56 MB/min
Server 3: 44 MB/min
Server 4: 35 MB/min
Average: 39 MB/min. This was a big job, each worker had a list of at least 20 GB of PST files to go through.
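For the curious, here's the arithmetic behind that average, plus a rough per-worker runtime. The 20 GB figure is the "at least" number from above, and I'm assuming 1 GB = 1024 MB, so read the hours as ballpark lower bounds.

```python
from statistics import mean

# MB/min per worker, from the update above.
rates = {"Server 1": 22, "Server 2": 56, "Server 3": 44, "Server 4": 35}

average = mean(list(rates.values()))   # 39.25, reported above as 39
combined = sum(rates.values())         # 157 MB/min across all four workers

def hours_for(size_gb, rate_mb_per_min):
    """Hours to push size_gb through at a steady rate (assumes 1 GB = 1024 MB)."""
    return size_gb * 1024 / rate_mb_per_min / 60

print(f"average {average} MB/min, combined {combined} MB/min")
for server, rate in rates.items():
    print(f"{server}: ~{hours_for(20, rate):.1f} h for a 20 GB list")
```

The slowest worker sets the finish time for the whole batch, so it can be worth giving the slow box a shorter list.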
A bit on how the tool works:
Check out the MessageOps\ExchangeMigrator\ folder in your logged on user profile. In the Working folder, under the appropriate date stamp, you'll see items fly through based on GUID.
I'm keeping an eye on this MessageOps folder, as I'm using VMs with very small C: drives, and there was no configuration option to throw these logs and working directories anywhere else.
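Since there's no option to relocate that folder, I'd rather have a script watch it than remember to look. A minimal sketch: the exact location under the user profile and the free-space threshold are assumptions, so adjust both for your setup.

```python
import os
import shutil
from pathlib import Path

# ASSUMPTION: the folder sits directly under the user profile; adjust if not.
WORK_DIR = Path.home() / "MessageOps" / "ExchangeMigrator"
MIN_FREE_MB = 2048  # hypothetical alert threshold, pick your own

def folder_size_mb(folder):
    """Total size of all files under folder, in MB."""
    total = 0
    for root, _dirs, files in os.walk(folder):
        for name in files:
            try:
                total += os.path.getsize(os.path.join(root, name))
            except OSError:
                pass  # file disappeared mid-scan; the tool churns through these
    return total / (1024 * 1024)

def free_mb(path):
    """Free space in MB on the drive that holds path."""
    return shutil.disk_usage(path).free / (1024 * 1024)

if __name__ == "__main__":
    if WORK_DIR.exists():
        print(f"{WORK_DIR}: {folder_size_mb(WORK_DIR):.0f} MB used")
        if free_mb(WORK_DIR) < MIN_FREE_MB:
            print("WARNING: drive is running low on space")
    else:
        print(f"{WORK_DIR} not found; check the path")
```

Run it from Task Scheduler every few minutes during a big job and you'll know before C: fills up.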
As far as system resources go, I'm using two-core, 4 GB RAM VMs. The cores are on recent servers, and right now I'm seeing more memory faults than anything, but 50% RAM utilization overall. (Update: I've seen no reason to resize these VMs. They cut through SCANPST when needed at a reasonable rate.)
Issues I've run into, and what they mean:
- In an archive's Log File (individual PST files are logged separately from the "job"), the last line of the log reads: "Errors Processing PST File: (pathname) Catastrophic failure"
- This means the PST has errors. Run SCANPST against it. You'll find Outlook will bitch about the file as well. The tool uses Outlook in the background to do the processing.
- It could also be a password-locked file. Ask the user in question.
- Repairs don't work: sometimes a PST has invalid folders, or another issue that SCANPST doesn't fix. Open it in Outlook and go into Folder view, so you can see everything:
Above, a folder with a "blank" name actually has some escaped characters. In this case, check the folder for items, move them if appropriate, and rename/delete the folder. I had to log a support ticket to get to the bottom of this one. I figured SCANPST would resolve this type of issue. To its credit, it tried, but in one case it just created more problem folders.
Oh, and remember to delete it permanently. If it's still in the recycle bin (Deleted Items folder), it will still throw errors!
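With a few hundred PSTs in flight, opening every log to read the last line gets old. Here's a sketch that flags the logs ending in the "Catastrophic failure" message quoted above. The `*.log` pattern and the text encoding are guesses on my part; point it at wherever your per-PST logs actually land.

```python
from pathlib import Path

MARKER = "Catastrophic failure"  # from the log line quoted above

def last_line(path):
    """Return the last non-blank line of a text file, or '' if there is none."""
    lines = path.read_text(errors="replace").splitlines()
    for line in reversed(lines):
        if line.strip():
            return line.strip()
    return ""

def failed_logs(log_dir, pattern="*.log"):
    """List log files under log_dir whose last line contains MARKER.
    The *.log pattern is an ASSUMPTION; adjust to the real log extension."""
    return sorted(
        p for p in Path(log_dir).rglob(pattern) if MARKER in last_line(p)
    )

# Example (hypothetical path):
# for log in failed_logs(Path.home() / "MessageOps" / "ExchangeMigrator"):
#     print("needs SCANPST or a password check:", log)
```

Anything it lists is a candidate for SCANPST, a password check with the user, or a manual look in Outlook's Folder view.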