Much is made about a healthy Active Directory environment being a prerequisite for a healthy Exchange deployment. This can be especially challenging when there are separate teams managing AD & Exchange; meaning sometimes things can slip through the cracks.
A colleague of mine recently ran into an issue when preparing to deploy Exchange 2013 into an existing Exchange Organization. While running Setup /PrepareAD, the process would fail at about 14%, stating the domain controller is not available. It was determined that the DC holding all of the FSMO roles was in the process of a reboot. At first the assumption was that this was coincidental; possibly the work of the AD team. After the server came back up, /PrepareAD was run again & had the exact same result! So it appeared something that the /PrepareAd process was doing was the culprit. The event logs on the DC gave the below output:
Faulting application name: lsass.exe, version: 6.3.9600.16384, time stamp: 0x5215e25f
Faulting module name: ntdsai.dll, version: 6.3.9600.16421, time stamp: 0x524fcaed
Exception code: 0xc0000005
Fault offset: 0x000000000019e45d
Faulting process id: 0x1ec
Faulting application start time: 0x01d0553575d64eb5
Faulting application path: C:\Windows\system32\lsass.exe
Faulting module path: C:\Windows\system32\ntdsai.dll
Report Id: 53c0474e-c12d-11e4-9406-005056890b81
Faulting package full name:
Faulting package-relative application ID:
A critical system process, C:\Windows\system32\lsass.exe, failed with status code c0000005. The machine must now be restarted.
The logs were saying that the Lsass.exe process was crashing, leading to the Domain Controller restarting (see image below).
The easiest path of troubleshooting lead towards moving the FSMO roles to another server & seeing if the issue followed it. Setup /PrepareAD was run again & the issue did in fact follow the FSMO roles.
It was at this point that I was engaged & I had a feeling this was either a performance issue on the domain controllers or something buggy at play. Before too long I was able to find the below MS KB for an issue that seemed to match our symptoms:
“Lsass.exe process and Windows Server 2012 R2-based domain controller crashes when the server runs under low memory”
The customer was more than willing to install the hotfix, but we soon realized that we also had to install the prerequisite update package below (which was sizeable):
Windows RT 8.1, Windows 8.1, and Windows Server 2012 R2 update: April 2014
During this time, the domain controller was also updated to .NET 4.5.2. After all of this was done, Setup /PrepareAD completed successfully. My colleague was 90% certain the hotfix was the fix, but also noted that before the patch the DC’s CPU utilization was consistently running at 60%. After the updates, it now sits in the 20-30% range. So regardless, we saw much better performance & stability after updating the Domain Controllers.
While I understand we can’t all be up to date on our patching 100% of the time, there is some health checking we can do to the environments we manage.
For all Windows servers, I strongly recommend getting a performance baseline of the big 3: Disk, Memory, & CPU. I like to say that you can’t truly say what bad performance is defined by if you don’t have a definition of good performance in the first place. Staying up to date with Windows Updates can greatly help with this. Even though a system may have performed to a certain level at one point in time, doesn’t mean any number of other variable couldn’t have changed since then to result in poor performance today; often times, vendor updates can remedy this.
As for Domain Controllers, they’re one of the easiest workloads to test with, since a new DC can be created with relative ease. You can use a test environment (recommended) or simply deploy Windows updates to a select number of domain controllers & then compare the current behavior with your baseline.
In this customer’s case, these performance/stability issues could have resulted in any number of applications to fail that relied on AD. Some failures may have been silent, while others could’ve been show stoppers like this one.