Failures when proxying HTTP requests from Exchange 2013 to a previous Exchange version


Overview

I’ve seen this issue a few times over the past months & most recently this past week with a customer. Luckily there’s a fairly simple fix to the issue published by Microsoft, but realizing not everyone remembers every Microsoft KB that gets released I thought I’d shine a spotlight on this one.

Scenario

As part of the migration process, when customers move their namespace from Exchange 2007 or 2010 to 2013, HTTP connections start proxying through 2013 to the legacy Exchange servers and some users experience failures. The potentially affected workloads are:

  • AutoDiscover
  • Exchange Web Services (Free/Busy)
  • ActiveSync
  • OWA
  • Outlook

Test or new mailboxes may not be affected.

Resolution

The cause of this is the age-old problem of token bloat: users who are members of too many groups end up with large Kerberos tokens, and therefore large HTTP request headers.
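For a rough sense of whether a given user is a token bloat candidate, the estimate Microsoft publishes in its MaxTokenSize guidance (1200 + 40d + 8s) can be sketched as below. This is only an approximation: the formula comes from that guidance rather than from the KB referenced in this post, the function name and sample group counts are hypothetical, and the Base64 step is an assumption about how much the ticket grows once it is carried in an HTTP Authorization header.

```powershell
# Rough token size estimate: TokenSize = 1200 + 40d + 8s
#   d = domain local groups + universal groups outside the account domain + SIDHistory entries
#   s = global groups + universal groups inside the account domain
function Get-EstimatedTokenSize {
    param(
        [int]$DomainLocalAndForeignUniversalGroups,   # "d" in the formula
        [int]$GlobalAndLocalUniversalGroups           # "s" in the formula
    )

    $tokenBytes = 1200 + (40 * $DomainLocalAndForeignUniversalGroups) + (8 * $GlobalAndLocalUniversalGroups)

    # Base64 turns every 3 bytes into 4 characters (assumed header encoding).
    $base64Bytes = [int](4 * [math]::Ceiling($tokenBytes / 3))

    [pscustomobject]@{
        TokenBytes  = $tokenBytes
        Base64Bytes = $base64Bytes
    }
}

# Hypothetical user: 150 "d" groups and 300 "s" groups.
Get-EstimatedTokenSize -DomainLocalAndForeignUniversalGroups 150 -GlobalAndLocalUniversalGroups 300
```

Treat the result as a hint for which users to test with, not as a measurement of the actual header size.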

The fix is to implement the changes in the below Microsoft KB article:

“HTTP 400 Bad Request” error when proxying HTTP requests from Exchange Server 2013 to a previous version of Exchange Server
https://support.microsoft.com/en-us/kb/2988444

The interesting thing in this scenario is that the issue was not experienced in the legacy version of Exchange & even if you look at the tokens themselves, they may not seem overly large. It seems that the process of proxying Exchange traffic is much more sensitive to this issue. Also, in a recent case that went to Microsoft, setting the values to a number that was merely higher than our current header size did not have the desired effect. In our case we had to set the MaxRequestBytes & MaxFieldLength values to exactly match the values in the Microsoft KB (65536, decimal).
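As a minimal sketch, the two values can be set from an elevated PowerShell prompt as shown below. The value names and the 65536 figure come from this post; the registry path is the standard HTTP.sys Parameters key. Which servers to change and what to restart afterwards should be taken from the KB itself, not from this snippet.

```powershell
# Run elevated on the server(s) the KB names.
$httpParams = 'HKLM:\SYSTEM\CurrentControlSet\Services\HTTP\Parameters'

# Show the current values (a blank column means the value is not set).
Get-ItemProperty -Path $httpParams |
    Select-Object MaxFieldLength, MaxRequestBytes

# Set both values to 65536 (decimal), matching the KB exactly.
foreach ($name in 'MaxFieldLength', 'MaxRequestBytes') {
    New-ItemProperty -Path $httpParams -Name $name -PropertyType DWord -Value 65536 -Force | Out-Null
}
```

HTTP.sys reads these values when it starts, so plan for the restart the KB calls for before retesting.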

For further reading, please see the below articles.

Complementary Articles

“HTTP 400 – Bad Request (Request Header too long)” error in Internet Information Services (IIS)
https://support.microsoft.com/en-us/kb/2020943

How to use Group Policy to add the MaxTokenSize registry entry to multiple computers
https://support.microsoft.com/en-us/kb/938118

 

Additional Note

As an FYI, another issue I commonly see when namespaces get transitioned to 2013 is authentication popups when connections proxy to the legacy Exchange Servers. Please see the below KB for that issue.

Outlook Anywhere users prompted for credentials when they try to connect to Exchange Server 2013
https://support.microsoft.com/en-us/kb/2990117

I also blogged about it here
https://exchangemaster.wordpress.com/2014/10/30/exchange-2010-outlook-anywhere-users-receiving-prompts-when-proxied-through-exchange-2013/

Remember the basics when working with Dynamic Distribution Groups (I didn’t)


Overview:

I recently had a customer come to me with a simple issue of mail not being received in his Exchange 2013 environment when sending to a Dynamic Distribution Group he had just created. Well it certainly seemed like an easy issue to track down (which it technically was) but unfortunately I was a little too confident in my abilities & made the age-old mistake of overlooking the basics. Hopefully others can avoid that mistake after giving this a read.

Scenario:

Create a Dynamic Distribution Group named TestDL#1 whose membership is defined by a Universal Security Group named TestSecurityGroup using the following command in shell:

New-DynamicDistributionGroup -Name "TestDL#1" -RecipientFilter {MemberOfGroup -eq "CN=TestSecurityGroup,OU=End_Users,OU=Company_Users,DC=ASH,DC=NET"}

Note: This command places the Dynamic DL object into the default Users container & also sets msExchDynamicDLBaseDN to that container’s Distinguished Name (CN=Users,DC=ASH,DC=NET). This will become important later.

I can verify the membership of this group by running:

$var = Get-DynamicDistributionGroup "TestDL#1"

Get-Recipient -RecipientPreviewFilter $var.RecipientFilter

In my case, the members show up correctly as John, Bob, Sam, & Dave. However, if I send emails to this group nobody gets them. When looking at message tracking, the recipients show as {} (see below screenshot).

[Screenshot 1: message tracking results showing the recipients as {}]

Now here’s the really interesting part. My security group, as well as my users, are in the OU=End_Users,OU=Company_Users,DC=ASH,DC=NET Organizational Unit. However (as mentioned before in my Note), my Dynamic DL is in the CN=Users,DC=ASH,DC=NET container. Now if I move my users into the Users container, then they receive the email & show up as valid recipients.

[Screenshot 2: message tracking results showing the users as valid recipients]

Now no matter which OU I move my Dynamic Distribution Group (TestDL#1) to, this behavior is the same.

For instance, if I had run the below command instead, I never would have noticed an issue because the Dynamic DL would’ve been created in the same OU as the users & the Security Group.

New-DynamicDistributionGroup -Name "TestDL#1" -OrganizationalUnit "ash.net/Company_Users/End_Users" -RecipientFilter {MemberOfGroup -eq "CN=TestSecurityGroup,OU=End_Users,OU=Company_Users,DC=ASH,DC=NET"}

The last head scratcher is if I move the actual AD Security Group (TestSecurityGroup) that I’m using to filter against to a different OU, I get the same behavior (no emails).

So it would seem that the solution is to always place the Dynamic Distribution Group in the same OU as the security group itself and ALL of its members.

This seemed crazy so I had to assume I wasn’t creating the filter correctly. It was at this point I pinged some colleagues of mine to see where I was going wrong.

Tip: Always get your buddies to peer review your work. A second set of eyes on an issue usually goes a long way to figuring things out.

Solution:

As it turned out, there were two things I failed to understand about this issue.

  1. When you create a Dynamic Distribution Group, by default, the RecipientContainer setting for that group is set to the OU where the DDG is placed. Because I initially did not specify an OU for the DDG, it was placed in the Users container (CN=Users,DC=ASH,DC=NET). So when Exchange was performing its query to determine membership, it could only see members that were in the Users container. So the solution in my scenario would be to use the -RecipientContainer parameter when creating the DDG & specify the entire domain.

EX: New-DynamicDistributionGroup -Name "TestDL#1" -RecipientFilter {MemberOfGroup -eq "CN=TestSecurityGroup,OU=End_Users,OU=Company_Users,DC=ASH,DC=NET"} -RecipientContainer "ASH.NET"

This one was particularly embarrassing because the answer was clearly in the TechNet article for the New-DynamicDistributionGroup cmdlet.

  2. The other thing I didn’t realize was the reason my DDG broke when moving the Security Group I was filtering against. It was breaking because I specified the Security Group using its Distinguished Name, which included the OU it resided in (CN=TestSecurityGroup,OU=End_Users,OU=Company_Users,DC=ASH,DC=NET). So by moving the group I was making my query come up empty. Now the first thing I thought of was whether I could specify the group using the common name or the GUID instead. Unfortunately, you cannot because of an AD limitation:

“MemberOfGroup filtering requires that you supply the full AD distinguished name of the group you’re trying to filter against. This is an AD limitation, and it happens because you’re really filtering this calculated back-link property from AD, not the simple concept of “memberOf” that we expose in Exchange.”

So the important thing to remember here is to either not move the Security Group you’re filtering against, or if you move it, to update your filter.
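Both points can be checked and corrected on an existing group from the Exchange Management Shell. The sketch below assumes the TestDL#1 group and ASH.NET domain from this scenario; the new OU for the moved security group is a made-up example. Passing -OrganizationalUnit to the preview matters here, because it limits the preview to the same container Exchange searches at delivery time.

```powershell
# Run in the Exchange Management Shell.
$ddg = Get-DynamicDistributionGroup -Identity 'TestDL#1'

# Which container and filter is the group actually using?
$ddg | Format-List Name, RecipientContainer, RecipientFilter

# Preview membership inside the group's own RecipientContainer,
# instead of previewing the filter against the whole directory.
Get-Recipient -RecipientPreviewFilter $ddg.RecipientFilter -OrganizationalUnit $ddg.RecipientContainer

# Point 1: widen the scope of an existing group to the entire domain.
Set-DynamicDistributionGroup -Identity 'TestDL#1' -RecipientContainer 'ASH.NET'

# Point 2: after moving the security group, update the filter with its new DN.
# The OU in this DN is hypothetical.
$newDn = 'CN=TestSecurityGroup,OU=Groups,DC=ASH,DC=NET'
Set-DynamicDistributionGroup -Identity 'TestDL#1' -RecipientFilter "MemberOfGroup -eq '$newDn'"
```

If the scoped preview comes back empty while the unscoped one lists members, the container is the problem and not the filter.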

Thanks go to MVPs Tony Redmond & Tony Murray for pointing these two important facts out to me.

Conclusion:

As I found out, a strong foundational knowledge of Active Directory is key to being a strong Exchange Admin/Consultant/Support Engineer. But even when you feel confident in your abilities for a given topic, don’t be afraid to ask people you trust. You might find out you’re either a bit rusty or not as knowledgeable as you thought you were 🙂

Bad NIC Settings Cause Internal Messages to Queue with 451 4.4.0 DNS query failed (nonexistent domain)


Overview:

I’ve come across this with customers a few times now & it can be a real head scratcher. However, the resolution is actually pretty simple.

 

Scenario:

Customer has multiple Exchange servers in the environment, or has just installed a 2nd Exchange server into the environment. Customer is able to send directly out & receive in from the internet just fine but is unable to send email to/through another internal Exchange server.

This issue may also manifest itself as intermittent delays in sending between internal Exchange servers.

In either scenario, messages will be seen queuing & if you run “Get-Queue -Identity QueueID | Format-List” you will see a “LastError” of “451 4.4.0 DNS query failed. The error was: SMTPSEND.DNS.NonExistentDomain; nonexistent domain”.
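If you don’t have a specific queue identity handy, a quick way to surface only the affected queues is to filter on the error text. This is a small sketch for the Exchange Management Shell; the wildcard string is simply taken from the error quoted above.

```powershell
# List every queue whose last error mentions the nonexistent-domain DNS failure.
Get-Queue |
    Where-Object { $_.LastError -like '*NonExistentDomain*' } |
    Format-List Identity, DeliveryType, NextHopDomain, MessageCount, LastError
```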

 

Resolution:

This issue can occur because the properties of the Exchange Server’s NIC have an external DNS server listed in them. Removing the external DNS server(s), leaving only internal DNS servers (Microsoft DNS/Active Directory Domain Controllers in most customer environments), & then restarting the Microsoft Exchange Transport Service should resolve the issue.
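To see which of the configured DNS servers can actually resolve an internal Exchange server, something like the following can help. It is a sketch that assumes Windows Server 2012 or later (for the DnsClient cmdlets) and uses Ex10.contoso.local, the example host name from this post, as the name to look up; substitute one of your own internal servers.

```powershell
# Internal host to test; replace with one of your own Exchange servers.
$internalHost = 'Ex10.contoso.local'

# Collect every IPv4 DNS server configured on this server's adapters.
$dnsServers = Get-DnsClientServerAddress -AddressFamily IPv4 |
    ForEach-Object { $_.ServerAddresses } |
    Sort-Object -Unique

# Ask each DNS server for the A record and report whether it answered.
foreach ($server in $dnsServers) {
    $answer = Resolve-DnsName -Name $internalHost -Type A -Server $server -ErrorAction SilentlyContinue
    [pscustomobject]@{
        DnsServer = $server
        Resolved  = [bool]$answer
    }
}

# Once the external DNS servers are removed from the NIC, restart transport:
# Restart-Service -Name MSExchangeTransport
```

Any server that reports Resolved = False for an internal name is a candidate for removal from the NIC.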

 

Summary:

The Default Configuration of an Exchange Server is to use the local Network Adapter’s DNS settings for Transport Service lookups.

(FYI: You can alter this in Exchange 07/10 via EMS using the Set-TransportServer command or in EMC>Server Configuration>Hub Transport>Properties of Server. Or in Exchange 2013 via EMS using the Set-TransportService command or via EAC>Servers>Edit Server>DNS Lookups. Using any of these methods, you can have Exchange use a specific DNS Server.)
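For the Exchange 2013 shell route, a hedged example looks like this. The server name EX13 and the two IP addresses are placeholders; the parameters are the internal DNS lookup settings on Set-TransportService.

```powershell
# Run in the Exchange Management Shell (Exchange 2013).
# 'EX13' and the IP addresses below are placeholders.
Get-TransportService -Identity 'EX13' |
    Format-List Name, InternalDNSAdapterEnabled, InternalDNSServers, ExternalDNSAdapterEnabled, ExternalDNSServers

# Stop using the NIC's DNS settings for internal lookups and name the servers explicitly.
Set-TransportService -Identity 'EX13' -InternalDNSAdapterEnabled $false -InternalDNSServers 10.0.0.10, 10.0.0.11
```

This only changes what the Transport service uses; it does not fix a NIC that still points at an external DNS server.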

Because the default behavior is to use the local network adapter’s DNS settings, Exchange was finding itself using external DNS servers for name resolution. Now this seemed to work fine when it had to resolve external domains/recipients but a public DNS server would likely have no idea what your internal Exchange servers (i.e. Ex10.contoso.local) resolve to. The error we see is due to the DNS server responding but simply not having the A record for the internal host that we require. If the DNS server you had configured didn’t exist or wasn’t reachable you would actually see slightly different behavior (like messages sitting in “Ready” status in their respective queues).

 

An Exchange server, or any Domain-joined server for that matter, should not have its NIC’s DNS settings set to an external/ISP’s DNS server (even as secondary). Instead, they should be set to internal DNS servers which have all the necessary records to discover internal Exchange servers.

 

References

http://support.microsoft.com/kb/825036

http://technet.microsoft.com/en-us/library/bb124896(v=EXCHG.80).aspx

“The DNS server address that is configured on the IP properties should be the DNS server that is used to register Active Directory records.”

http://technet.microsoft.com/en-us/library/aa997166(v=exchg.80).aspx

http://exchangeserverpro.com/exchange-2013-manually-configure-dns-lookups/

http://thoughtsofanidlemind.com/2013/03/25/exchange-2013-dns-stuck-messages/

 

Outlook 2013 Security Update breaks Out of Office for Exchange 2007 Mailboxes


Issue

In the past two weeks I’ve had two customer environments come to me with the same issue. Random Outlook 2013 clients are unable to configure their Out Of Office settings. Specifically, they would get an error message when trying to open their OOF settings to set them. Some users had no issue while others got the error. Both environments were Exchange 2007 on latest Service Pack/Update Rollup & had a few users running Outlook 2013; only a few of which were having this issue.

Resolution

After some searching online I found this thread pointing to a recent (November 12th 2013) Outlook Security Update as the culprit. In my case, Out Of Office started working again after uninstalling KB2837618, rebooting the client, & re-creating the Outlook Profile (a profile repair did not work).
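One way to check whether a client has picked up the update is to search its Windows Update history, sketched below. This uses the Windows Update Agent COM interface, so it only sees updates installed through Microsoft Update or WSUS; an update installed by hand from a downloaded package may not appear.

```powershell
# Query the local Windows Update history for the Outlook security update.
$session  = New-Object -ComObject Microsoft.Update.Session
$searcher = $session.CreateUpdateSearcher()
$total    = $searcher.GetTotalHistoryCount()

if ($total -gt 0) {
    $searcher.QueryHistory(0, $total) |
        Where-Object { $_.Title -match 'KB2837618' } |
        Select-Object Date, Title, ResultCode
}
```

No output means the update was not found in the history on that client.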

In fact, after reading the full KB it appears this is a known issue:

  • If you are using Outlook to connect to a Microsoft Exchange Server 2007 mailbox:
    • You receive an error message that resembles either of the following when you try to configure Automatic Replies (Out of Office): 
      Your automatic reply settings cannot be displayed because the server is currently unavailable. Try again later. 
    • You cannot retrieve Free/Busy data for calendar scheduling.
    • Add-ins that use the Account.SmtpAddress property no longer work.

    These features rely on an underlying Autodiscover technology. After you install this security update, Autodiscover may fail for Exchange 2007 configurations. Therefore, Outlook features that rely on Autodiscover will also stop working.

    Microsoft is researching this problem and will post more information in this article when it becomes available.

It appears the reason only random Outlook 2013 clients were experiencing the issue was because not all of them were up to date via Microsoft Update. Of course most Exchange folks will never read all the release notes of KBs pushed to Outlook via Microsoft Update ahead of time. I suppose the answer to an actual admin (which I am not) who complains will be “What, you don’t use WSUS & test every update for Office?” 😀

Update:

As Jim Morris pointed out in the comments below, the issue is addressed in http://support.microsoft.com/kb/2850061

Common Support Issues with Transport Agents


This is a fairly basic post but it happens enough that I’d like to call out the basics of troubleshooting it. I’ve seen many cases over time where mail flow has either halted or become sluggish due to a third-party transport agent (I actually saw 3 instances of this happening this past month which prompted this post).

Examples of Transport Agents could be Anti-Virus software, Anti-Spam software, DLP software, agents which add disclaimers to email messages, or email archiving solutions. I won’t call out specific vendors as I don’t think there’s necessarily anything wrong with any particular one. Sometimes an install of a piece of software just becomes corrupted or there’s some unforeseen incompatibility between the third-party software & Exchange; or some other software in the environment. However, sometimes the Agent can indeed have a bug which needs to be addressed with the vendor.

Anyways, here are the ways in which I’ve seen these issues manifest themselves:

  • Messages Stuck in the Submission queue
  • A delay in SMTP response (when you telnet to the Exchange Server over 25, it takes longer than expected for the server’s SMTP banner to be displayed)
  • Messages are slow to flow through the transport pipeline (general slow delivery)
  • Microsoft Exchange Transport Service will not start or repeatedly crashes

To highlight more recent examples, last week I had a colleague come to me saying he had two Exchange 2010 Hub/CAS boxes, with the same config, yet one of them would have a slower connection when he would telnet to it; the banner would take at least 20 seconds to be displayed. This also resulted in the health checks for the hardware load balancer in place to mark the server as down. Each server had the same Anti-V/Anti-SPAM software installed, yet only one was showing the symptoms. For testing purposes he “disabled” the third-party software using its management interface but the issue persisted.
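If you want a number rather than a feeling when comparing servers, the telnet test can be approximated in PowerShell. This is only a sketch: the function name is made up, it simply times how long the first line from port 25 takes to arrive, and it makes no attempt to speak SMTP beyond that.

```powershell
# Time how long a server takes to present its SMTP banner.
function Measure-SmtpBanner {
    param(
        [Parameter(Mandatory = $true)][string]$Server,
        [int]$Port = 25,
        [int]$TimeoutSeconds = 60
    )

    $client = New-Object System.Net.Sockets.TcpClient
    $timer  = [System.Diagnostics.Stopwatch]::StartNew()
    try {
        $client.Connect($Server, $Port)
        $stream = $client.GetStream()
        $stream.ReadTimeout = $TimeoutSeconds * 1000

        # The banner is the first line the server sends after the connection opens.
        $reader = New-Object System.IO.StreamReader($stream)
        $banner = $reader.ReadLine()
        $timer.Stop()

        [pscustomobject]@{
            Server  = $Server
            Seconds = [math]::Round($timer.Elapsed.TotalSeconds, 2)
            Banner  = $banner
        }
    }
    finally {
        $client.Close()
    }
}

# Example (server names are placeholders):
# 'hub01', 'hub02' | ForEach-Object { Measure-SmtpBanner -Server $_ }
```

Running it against both servers in a pair makes a difference like the one described above easy to show to a vendor.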

However, after running a “Get-TransportAgent” on the server, the Transport Agent still showed as being “Enabled”. This demonstrates a point I frequently make with customers, that disabling Anti-Virus software rarely serves as a useful troubleshooting step (even file-based Anti-V). This is because the TransportAgent is typically still enabled. For file-based Anti-Virus, even with the Services disabled there is usually still a network filter driver that is sitting on the TCP/IP stack which could be causing issues (only an uninstall of the 3rd-party product removes it).
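Checking the agent state from the Exchange Management Shell looks roughly like this. The agent name is a placeholder; use whatever Get-TransportAgent reports for your product. Disabling the agent this way is a troubleshooting step only and, as noted above, does not remove any filter drivers the product installed.

```powershell
# Run in the Exchange Management Shell on the affected server.
Get-TransportAgent | Format-Table Identity, Enabled, Priority -AutoSize

# Disable a single agent for testing ('Contoso AV Agent' is a placeholder name).
Disable-TransportAgent -Identity 'Contoso AV Agent' -Confirm:$false

# Agent changes take effect once the Transport service restarts.
Restart-Service -Name MSExchangeTransport
```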

Bottom-line, an uninstall is still the best method to remove potentially problematic Anti-V/Anti-SPAM/Anti-Malware software. So in this case the issue was a bad/corrupted install of the product on that server.

Another scenario (also Exchange 2010) was where messages were stuck in the Submission Queue for extended periods of time. The Application Logs were filled with Event 1050 MSExchange Extensibility events which were stating the installed agent was taking an unusual amount of time to process an event; thus causing the delay in transport (Reference 1 2 3).
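To pull those events without scrolling through Event Viewer, a filter along these lines can be used. It assumes the provider name appears in the Application log exactly as "MSExchange Extensibility", as quoted above.

```powershell
# Show the 20 most recent Event 1050 entries logged by MSExchange Extensibility.
Get-WinEvent -FilterHashtable @{
    LogName      = 'Application'
    ProviderName = 'MSExchange Extensibility'
    Id           = 1050
} -MaxEvents 20 -ErrorAction SilentlyContinue |
    Select-Object TimeCreated, Id, Message |
    Format-List
```

The message text normally names the agent that was slow, which tells you which vendor to call.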

After running Get-TransportAgent I was actually greeted by an error message saying it was unable to access a file located in the “C:\Program Files\Microsoft\Exchange Server\V14\TransportRoles\agents” directory. This is where the files associated with your Transport Agents are stored. So again, the issue was a corrupted install of the product. Reinstalling the software resolved the issue.

So nothing fancy about this one. Just check Event Viewer for Transport events or use process of elimination if you’re experiencing any of the symptoms above. Having worked with Microsoft Support many times in the past, they will almost always ask you to remove third-party components such as Anti-V if they are unable to pinpoint the issue to its source; so save yourself some time & rule it out first.

I know some people work for companies where this is like pulling teeth but it’s always going to be a battle between usability & security. If your management requires you spend 40 hours on the phone working with a vendor or Microsoft before finally being told you’re going no further until removing the third-party component then I give you my best & suggest you get the coffee started. We all know the most important acronym in IT is CYA after all 😉

For great reading on Exchange Transport Agents see MCM/MCSM/MVP Brian Reid’s two posts on the topic

Creating a Simple Exchange Server Transport Agent

Exchange 2013 Transport Agents

Once again, Unchecking IPv6 on a NIC Breaks Exchange 2013


Background:

It seems like this sentiment has been preached widely but yet I still see customers do this. In fact I’m writing this today because earlier this week I had a customer whose Information Store Service, as well as the Exchange Transport Services, on Exchange 2013 would not start. Then earlier today a coworker actually did this in a lab which caused the same issue.

Summary:

Let’s start off with this: the Exchange Server Product Team performs zero testing or validation on systems with IPv6 disabled. So that right there should be a good indicator that you’re trailblazing on your own in the land of Exchange (bring a flashlight, it’s dark & scary).

So I’m going to cover two very different things here:

  • Unchecking IPv6 on the NIC adapter (BAD)
  • Properly Disabling IPv6 in the registry (Ok but not recommended by MS)

Unchecking Method (BAD):

Let’s first talk about un-checking IPv6 on your NIC adapters. The problem with this is while the OS still thinks it can & should be using IPv6, the NIC is unable to do so which leads to communications issues. An easy way to test that your OS is still trying to use IPv6 is to ping localhost after you have unchecked IPv6 on your NIC & rebooted. You’ll see that you still get an IPv6 response. I actually did a write-up about this topic on the Sysadmin community on Reddit awhile back which you can find here. As a side note, check out the Exchange community a colleague & I moderate on reddit here.
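The same check can be scripted. The sketch below asks the OS what localhost resolves to (a rough stand-in for the ping test) and, where the NetAdapter cmdlets are available (Windows Server 2012 or later), lists any adapters that have the IPv6 binding unchecked. The function name is made up for illustration.

```powershell
# Does the OS still hand back an IPv6 address for localhost?
function Test-LocalhostResolvesToIPv6 {
    $addresses = [System.Net.Dns]::GetHostAddresses('localhost')
    [bool]($addresses | Where-Object { $_.AddressFamily -eq 'InterNetworkV6' })
}

Test-LocalhostResolvesToIPv6

# List adapters where the IPv6 (ms_tcpip6) binding has been unchecked.
if (Get-Command Get-NetAdapterBinding -ErrorAction SilentlyContinue) {
    Get-NetAdapterBinding -ComponentID ms_tcpip6 |
        Where-Object { -not $_.Enabled } |
        Select-Object Name, DisplayName, Enabled
}
```

A server that returns True from the first check while also listing unchecked adapters is in exactly the mismatched state described above.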

While doing this has always caused sporadic issues with Exchange, Exchange 2013 seems to be even more sensitive in this regard. Since RTM, I’ve seen half a dozen Exchange 2013 issues that were resolved by re-checking IPv6 on the NIC adapter & rebooting. Here’s what I’ve seen so far:

  • Having IPv6 unchecked when performing an Exchange 2013 install will result in a failed/incomplete installation which will result in having to perform a messy cleanup operation before you can continue.
  • Microsoft Exchange Active Directory Topology Service may not start if the Exchange 2013 server is also a Domain Controller and IPv6 has been unchecked. The solution is to re-check it & reboot the server.
  • Microsoft Exchange Transport Service as well as the Microsoft Exchange Frontend Transport, Microsoft Exchange Transport Submission, & Microsoft Exchange Transport Delivery services may not start if IPv6 has been unchecked on the NIC adapter of an Exchange 2013 Server.
  • Microsoft Exchange Information Store Service may not start if IPv6 has been unchecked on an Exchange 2013 Server.
  • NEW – See MVP Michael Van Horenbeeck’s post on how this can break the Hybrid Configuration Wizard

Disabling IPv6 in the Registry:

I started this post saying that MS does no testing or validation for systems with IPv6 disabled in ANY WAY. However, some customers may actually have reasons for disabling IPv6. I’m actually interested in hearing them but I also know some customers are very adamant about it. There actually was an issue in the past where Outlook Anywhere wouldn’t work in certain scenarios with IPv6 enabled but this should not be a problem with a fully updated Exchange Server (reference).

I’ll also say that I personally have never had any issues with properly disabling IPv6 in the registry using this method. You basically add a DisabledComponents DWORD value to the registry with 8 F’s (ffffffff) as its data & then reboot the server. After this point IPv6 should be fully disabled. I’ve also spoken with a couple Microsoft Support Engineers who have also said that they have personally never seen any issues with disabling it this way; with Windows or Exchange. However, in my opinion you should have a good reason for doing so (and saying you don’t like IPv6 is NOT a good reason).
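For reference, the registry change looks roughly like this from an elevated PowerShell prompt. The value name and data are the ones described above, and the key is the standard Tcpip6 Parameters key. It is worth checking Microsoft’s current IPv6 guidance before rolling this out, since later revisions of that guidance discuss 0xFF as an alternative to 0xFFFFFFFF.

```powershell
# Run elevated. reg.exe is used so the hex data is written exactly as typed.
$key = 'HKLM\SYSTEM\CurrentControlSet\Services\Tcpip6\Parameters'

# Show the current value (reg.exe reports an error if it has never been set).
reg.exe query $key /v DisabledComponents

# Write the "8 F's" value described above, then reboot the server.
reg.exe add $key /v DisabledComponents /t REG_DWORD /d 0xffffffff /f
```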

Lastly, I’d like to add that if you’re utilizing iSCSI on your Exchange server, there should be no issues with unchecking IPv6 on your iSCSI NICs if you choose to do so. The article was specifically in relation to NICs connected to your production/public/MAPI networks. As usual, follow your SAN vendor’s best practices when configuring iSCSI NICs.

Also, here’s a shameless plug for the ExchangeServer subreddit (http://www.reddit.com/r/exchangeserver) which I help moderate (username=ashdrewness). There’s always people such as myself answering questions on there.

New behavior in Outlook 2013 causing certificate errors in some environments


Background:

I originally discovered this issue back in early Feb & let a couple people on the Exchange Product Team know about it via the TAP but it seems to be affecting more customers than initially thought so I thought I’d share.

In Outlook 2007 through Outlook 2010 all domain-joined Outlook clients would initially query Active Directory for AutoDiscover information & ultimately find a Service Connection Point (SCP) value that would point them to their nearest Client Access Server’s AutoDiscover virtual directory. If that failed then they would revert to using DNS like any non-domain-joined Outlook client. The order of this non-domain-joined lookup is as follows:

https://company.com/autodiscover/autodiscover.xml

https://autodiscover.company.com/autodiscover/autodiscover.xml

Local XML File

http://autodiscover.company.com/autodiscover/autodiscover.xml (looking for a redirect website)

SRV AutoDiscover Record

Why it ever looked to https://company.com/autodiscover/autodiscover.xml I’ll never really know because honestly I’ve never come across a customer who had it deployed that way; most have https://autodiscover.company.com/autodiscover/autodiscover.xml but I imagine when Exchange 2007 was first being developed they weren’t exactly sure how customers would be implementing AutoDiscover.
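To see what that lookup order means for a given mailbox, the candidates can be generated from the SMTP address. This is purely illustrative: the function name is made up, the redirect host (autodiscover.domain over HTTP) and the SRV record name (_autodiscover._tcp.domain) follow Microsoft’s documented Autodiscover behavior, and nothing here contacts any server.

```powershell
# Build the DNS-based AutoDiscover candidates for an SMTP address, in lookup order.
function Get-AutodiscoverCandidate {
    param([Parameter(Mandatory = $true)][string]$SmtpAddress)

    # Everything after the last @ is the SMTP domain.
    $domain = ($SmtpAddress -split '@')[-1]

    [pscustomobject]@{ Order = 1; Method = 'HTTPS root domain';  Target = "https://$domain/autodiscover/autodiscover.xml" }
    [pscustomobject]@{ Order = 2; Method = 'HTTPS autodiscover'; Target = "https://autodiscover.$domain/autodiscover/autodiscover.xml" }
    [pscustomobject]@{ Order = 3; Method = 'Local XML file';     Target = '(local file, if one is configured)' }
    [pscustomobject]@{ Order = 4; Method = 'HTTP redirect';      Target = "http://autodiscover.$domain/autodiscover/autodiscover.xml" }
    [pscustomobject]@{ Order = 5; Method = 'SRV record';         Target = "_autodiscover._tcp.$domain" }
}

Get-AutodiscoverCandidate -SmtpAddress 'user@company.com' | Format-Table -AutoSize
```

The first row is the one that matters for the rest of this post, because it points at whatever the bare domain name resolves to.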

Issue:

The above methods have served us well since Exchange 2007 timeframe but for some reason the Outlook team decided to try & implement some giddyup into Outlook & try to speed up the process. They decided to have domain-joined Outlook 2013 clients query both the SCP values in AD as well as the DNS records at the same time. If an SCP record was found it would still be used but in the event it failed then it would already have the DNS response ready to go. Great idea, however there’s one problem in the implementation.

If Outlook 2013 encounters any kind of Certificate error while doing the simultaneous DNS query then you will receive a pop-up in Outlook about the cert.

I actually stumbled upon this while in the middle of the scenario below:

[Screenshot: Outlook certificate warning for ash15.com]

That’s right, I actually get a certificate pop-up for my lab’s domain name (ash15.com) & not autodiscover.ash15.com like one would expect if I were to have a certificate issue on Exchange.

When Outlook 2013 does its simultaneous DNS AutoDiscover query the first URL it tries is https://company.com/autodiscover/autodiscover.xml, which in my lab environment resolved to my Domain Controller, which was also serving DNS, as well as a Certificate Authority. Ash15.com resolved to this server because it’s my internal Active Directory domain name & the name server entry resolves to my DC (just ping internaldomainname.local in your AD lab environment & you’ll see the same thing).

Now because I have web enrollment enabled & am listening on 443 in IIS, the server responded. Also, because the server did not have a cert installed with ash15.com in the Subject or Subject Alternative Name, it gave the certificate error we see above.
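To find out ahead of time what Outlook will run into, you can pull the certificate that the bare domain name presents on 443 and look at the names on it. The function below is a sketch with a made-up name; it deliberately accepts any certificate so that it can inspect one that would normally fail validation, so use it for diagnostics only.

```powershell
# Fetch the certificate presented on HostName:443 without validating it.
function Get-RemoteCertificate {
    param(
        [Parameter(Mandatory = $true)][string]$HostName,
        [int]$Port = 443
    )

    $client = New-Object System.Net.Sockets.TcpClient
    try {
        $client.Connect($HostName, $Port)

        # Accept whatever certificate is offered; the goal is to inspect it, not trust it.
        $acceptAny = [System.Net.Security.RemoteCertificateValidationCallback]{ $true }
        $ssl = New-Object System.Net.Security.SslStream($client.GetStream(), $false, $acceptAny)
        $ssl.AuthenticateAsClient($HostName)

        $cert = New-Object System.Security.Cryptography.X509Certificates.X509Certificate2($ssl.RemoteCertificate)

        # 2.5.29.17 is the Subject Alternative Name extension.
        $san = $cert.Extensions |
            Where-Object { $_.Oid.Value -eq '2.5.29.17' } |
            ForEach-Object { $_.Format($false) }

        [pscustomobject]@{
            HostName = $HostName
            Subject  = $cert.Subject
            SAN      = $san
            NotAfter = $cert.NotAfter
        }
    }
    finally {
        $client.Close()
    }
}

# Example, using the lab domain from this post:
# Get-RemoteCertificate -HostName 'ash15.com'
```

If the connection succeeds and neither Subject nor SAN contains the domain name, new Outlook 2013 profiles are likely to see the prompt.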

Resolution:

The error is easy enough to get through & it only occurred on initial profile creation but this can definitely prove painful for some customers. Obviously my lab environment is a corner case but several other customers have reported this issue with Outlook 2013 as well.

Here’s an example scenario.

Imagine you have a public website for andrewswidgets.com hosted by a third-party hosting site & you did not pay for HTTPS/443 services. However if you were to query the website using https then it could respond & obviously not return a certificate with andrewswidgets.com on it (because you haven’t paid for it you cheapskate…). Now imagine you begin deploying users using Outlook 2013 in your internal environment. In the past, they would have found the SCP record that would have pointed them to your internal Exchange 07/10/13 server for AutoDiscover & would have been happy as a clam (one Exchange Product Manager’s favorite way to describe Exchange bliss). However, now they may get a certificate pop-up for andrewswidgets.com when creating a new profile.

There are a couple ways around this. Make sure andrewswidgets.com doesn’t listen on 443, or possibly get a proper cert on your website that is listening on 443. Simply put, just make sure whatever andrewswidgets.com resolves to is something that’s not going to throw a certificate error.

I’ve heard nothing concrete or public but the Outlook team is aware of the issue & listening to customer feedback. I suggest contacting Microsoft Support if your organization is running into this issue.

 

Also, this KB offers methods to control which AutoDiscover methods are used by your Outlook clients.