When deploying an Office 365 solution, directory synchronisation is a critical component. This is the tool that synchronises the on-premises directory data to Windows Azure Active Directory (WAAD). As you can imagine, if synchronisation stops then chances are that you will run into some issues…
Note that this post is intended to cover both DirSync and AAD Sync, hence the generic title.
When onsite with customers it has been very common to find that the Directory Synchronisation tool has been installed many moons/months/years ago and has never been touched again. Tasks that are often neglected include:
Server is not patched for OS updates
Server is never updated with the new build of the synchronisation tool
Server is not monitored. This includes both basic ping monitoring and service/event log monitoring
Previously I've had a customer request that I delete their Office 365 tenant as it was "corrupt". When I investigated why they thought so, it was because the tenant did not accept any of the changes that they were making to on-premises AD. It turned out that DirSync had stopped replicating due to errors caused by not initially correcting the legacy cruft in their directory. That had been the case for 5 months, and they never knew…
There is only a single Directory Synchronisation server deployed as part of the Office 365 solution, so why are so many people overlooking the proper management of that server?
Let's review some of the issues.
Not Using IdFix
Prior to letting Directory Synchronisation try and replicate incorrect or unwanted data to Office 365, IdFix must be executed as step #1.
IdFix proactively identifies objects in the on-premises AD that will have issues replicating to Office 365. Unsupported characters, local UPN suffixes and other issues are detected by the tool. IdFix allows you to bulk edit the objects and apply the changes. It also has the ability to roll them back should there be issues.
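To illustrate the kind of check involved, here is a minimal sketch that flags a UPN containing an unsupported character or a local suffix. It is not IdFix's actual rule set: the function name `find_upn_issues` is hypothetical, the apostrophe is simply the one character this post calls out, and `.local` is assumed as a typical non-routable suffix.

```python
# Illustrative assumptions only; IdFix applies a much broader set of rules.
UNSUPPORTED_CHARS = {"'"}             # the apostrophe case mentioned in this post
NON_ROUTABLE_SUFFIXES = (".local",)   # assumed example of a "local" UPN suffix


def find_upn_issues(upn: str) -> list[str]:
    """Return a list of human-readable problems found in a single UPN."""
    local_part, sep, domain = upn.rpartition("@")
    if not sep or not local_part or not domain:
        return ["not in user@domain form"]

    issues = []
    bad = sorted(c for c in set(upn) if c in UNSUPPORTED_CHARS)
    if bad:
        issues.append("unsupported character(s): " + " ".join(bad))
    if domain.lower().endswith(NON_ROUTABLE_SUFFIXES):
        issues.append("non-routable UPN suffix: " + domain)
    return issues


if __name__ == "__main__":
    for upn in ["jane.doe@contoso.com", "tim.o'brien@contoso.com", "bob@corp.local"]:
        print(upn, "->", find_upn_issues(upn) or "ok")
```

A script like this is no substitute for running IdFix itself, which also handles the bulk edit and rollback.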
Not Updating Directory Synchronisation Tool Version
Over the years there have been multiple releases of DirSync. The build list can be reviewed on the TechNet Wiki, which also includes the issues that are addressed in each of the newer builds. DirSync has been replaced with the new AAD Sync tool. The build history for AAD Sync is available on MSDN, though this has now been superseded by another article which has version information on both AAD Sync and AAD Connect.
Maintaining the installed version of the synchronisation tool allows you to address known issues proactively.
Not Installing Windows Updates
All software needs maintenance, and the OS running the directory synchronisation server must be maintained.
Not Monitoring The Server
One of the cardinal sins that afflict numerous implementations is a lack of monitoring. In most cases there is absolutely no monitoring configured for the Directory Synchronisation server. Even something simple like a ping monitor would alert if the server was down.
The event logs must be monitored to ensure that errors and warnings are detected, and the appropriate operations team can respond. Some monitoring products are able to alert if an expected event was *NOT* logged. For example, if there have been no successful sync messages for 6 hours then an alert could be fired as we have missed two sync attempts for some reason.
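The "expected event was not logged" check can be sketched as follows. This is only an illustration of the logic: the function name `sync_is_overdue` is hypothetical, the 6 hour window comes from the example above, and how you collect the timestamps of successful sync events from the event log is left to your monitoring product.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable


def sync_is_overdue(
    success_times: Iterable[datetime],
    now: datetime,
    max_silence: timedelta = timedelta(hours=6),
) -> bool:
    """Return True when no successful sync was logged within max_silence.

    success_times holds the timestamps of successful sync events that were
    already read from the event log (the collection step is not shown here).
    """
    latest = max(success_times, default=None)
    if latest is None:
        # No success event at all is treated as overdue.
        return True
    return now - latest > max_silence


if __name__ == "__main__":
    now = datetime(2015, 1, 10, 12, 0, tzinfo=timezone.utc)
    events = [now - timedelta(hours=9), now - timedelta(hours=7)]
    if sync_is_overdue(events, now):
        print("ALERT: no successful directory sync in the last 6 hours")
```

Alerting on silence rather than on errors catches the cases where the server is down or the service has stopped and therefore logs nothing at all.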
All services that should be running are to be monitored.
Not Updating Technical Contact Email
In multiple deployments consultants have been contracted to assist with the migration. One common issue is that the technical contact is set to the external consultant, rather than someone from the organisation. Having a third party as the contact is fine if they are responsible for managing the environment. However, I often see this cause problems because the consultant has rolled off the project, is now dedicated to another engagement and ignores any notifications.
To see what is currently set, on the main page of the Office 365 portal (https://portal.office.com) click the tenant name as highlighted in the top right hand corner.
The company profile screen is displayed. Review the current entries, ensuring all of the details are correct. Pay particular attention to the technical contact email and phone number at the bottom.
This is the email address that will receive notifications from Microsoft stating that there have been issues with synchronisation.
Ignoring Sync Issue Alerts
Do not create a rule to automatically delete synchronisation alert emails. They are sent for a reason.
If synchronisation has not occurred for a significant amount of time, Microsoft will send an email to the address listed as the technical contact.
In the above example this is currently sent from MSOnlineServicesTeam@MicrosoftOnline.com – ensure that this is allowed through any anti-spam devices. This email indicates that Windows Azure Active Directory has not registered a synchronisation attempt in the last 24 hours.
In addition to failing to connect, the technical contact will also be notified if there are issues with the objects that are being replicated. For example the below two user objects have an unsupported character, the apostrophe.
Again this is currently sent from: MSOnlineServicesTeam@MicrosoftOnline.com
Not Using A Defined Service Account
Do not use someone's personal global admin account. When the password for that account is changed or expires, synchronisation will start to fail. This also occurs if the person changes their UPN.
Use a dedicated cloud service account for directory synchronisation and carefully manage access to its credentials.
Not Documenting The Installation
Installing the directory synchronisation tool is pretty straightforward (FIM not so much), and this seems to lead to a tendency not to document the directory synchronisation install.
We need to document all of the standard aspects of the server build, and exactly how directory synchronisation was deployed. For example, if you have excluded OUs from synchronising, how are you going to remember which ones if the server dies and you have to rebuild it? If you rebuild the directory synchronisation server and fail to exclude the correct OUs, then additional objects will be synchronised to Office 365.
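One way to make that documentation useful during a rebuild is to keep the excluded OUs as a list and compare it against what is configured on the new server. The sketch below assumes both lists are available as distinguished name strings. The function name `compare_ou_exclusions` and the sample OUs are hypothetical, and the comparison is deliberately naive (it ignores escaped commas inside a DN).

```python
def _normalise_dn(dn: str) -> str:
    """Lower-case a distinguished name and trim spaces around each component."""
    return ",".join(part.strip().lower() for part in dn.split(","))


def compare_ou_exclusions(documented: list[str], configured: list[str]) -> dict[str, list[str]]:
    """Report differences between documented and configured OU exclusions."""
    doc = {_normalise_dn(dn) for dn in documented}
    conf = {_normalise_dn(dn) for dn in configured}
    return {
        # Documented as excluded but not excluded on the server:
        # these objects would be synchronised unexpectedly.
        "missing_from_server": sorted(doc - conf),
        # Excluded on the server but never written down.
        "undocumented_on_server": sorted(conf - doc),
    }


if __name__ == "__main__":
    documented = ["OU=Service Accounts,DC=contoso,DC=com", "OU=Test,DC=contoso,DC=com"]
    configured = ["ou=test, dc=contoso, dc=com"]
    print(compare_ou_exclusions(documented, configured))
```

How the configured list is exported from the synchronisation server depends on the tool and version in use, so that step is left out here.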
Not Planning For Recovery
Plan in advance how you will recover from the directory synchronisation server failing. Since there is only a single directory synchronisation server, decide now how you want to recover from its failure. There are a few options here.
You may want to back up the database and restore it from backup. This is covered in Backup and Restore Instructions for the DirSync Database. Note that this applies to installations that use a full SQL install, and you will need a separate SQL-aware backup/recovery tool.
Alternatively you may choose to simply install a brand new server and perform a fresh deployment. The time taken can be mitigated by pre-installing the directory sync tool, but *NOT* running the configuration wizard. Running the configuration wizard would then complete the deployment and directory synchronisation would be operational.
Note that the initial cycle in the rebuild scenario will be a full sync. This may take hours or even days to complete. For this reason larger deployments will look to the first option. If there are more than 50,000 mail enabled objects then DirSync requires a full installation of SQL Server, rather than the default SQL Express. AAD Sync installs SQL Server Express, which has a 10 GB size limit that should allow you to manage approximately 100,000 objects.
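As a quick planning aid, the sizing guidance above can be written down as a simple lookup. This is a sketch only: the function name `needs_full_sql` and the tool keys are hypothetical, the thresholds are the figures quoted in this post, and the AAD Sync number is an approximation derived from the 10 GB limit rather than a hard cut-off.

```python
# Object counts up to which the bundled SQL Server Express is expected to cope,
# taken from the figures quoted in this post.
SQL_EXPRESS_OBJECT_LIMITS = {
    "dirsync": 50_000,    # above this DirSync requires full SQL Server
    "aadsync": 100_000,   # approximate, based on the 10 GB database size limit
}


def needs_full_sql(tool: str, object_count: int) -> bool:
    """Return True when the deployment should plan for a full SQL Server install."""
    try:
        limit = SQL_EXPRESS_OBJECT_LIMITS[tool.lower()]
    except KeyError:
        raise ValueError(f"unknown synchronisation tool: {tool!r}") from None
    return object_count > limit


if __name__ == "__main__":
    for tool, count in [("DirSync", 30_000), ("DirSync", 75_000), ("AADSync", 75_000)]:
        choice = "full SQL Server" if needs_full_sql(tool, count) else "SQL Express"
        print(f"{tool} with {count:,} objects: {choice}")
```

A deployment that needs full SQL Server is also the one that benefits most from the database backup and restore option, as a full resync at that size is the slowest.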
Ignoring File System Anti-Virus Exclusions
You must ensure that any file system AV product, or any product that performs similar activities, is correctly configured.
Exclusions for SQL can be found in KB 309422.
As discussed in the Exchange articles, you *MUST* ensure that mount points have been correctly excluded.