This is a repro of a customer-reported issue in which email sent from Office 365 to the on-premises Exchange servers was failing intermittently. Mail flow had previously worked without problems, then it started to misbehave. Some email was still flowing from Office 365, but some was delayed or not delivered at all.
There were no issues sending email from the on-premises Exchange servers to Office 365. The issue was the mail flow from cloud to on-premises.
To look at the mail queues on-premises, we can use the Get-Queue cmdlet or Queue Viewer. Queue Viewer can be found in the Exchange Toolbox, which is built into the Exchange 2010 MMC and is a separate Start Menu item in Exchange 2013/2016.
Reviewing Office 365 Message Queues
Reviewing the queues in Office 365 is also straightforward. In the example below, note that the focus is on Mail Flow and then Message Trace. There are pre-canned queries to search for email from the last 24 or 48 hours, and the search can also be customised to suit a specific requirement. This was sufficient to troubleshoot this issue. For more complex situations please review Andrew’s excellent EOP blog, specifically the Parsing an extended message trace post.
After entering the relevant time slot and the expected recipient, the trace was executed. Multiple messages were then seen in the results.
Taking one message as an example, we can see that delivery to on-premises failed and that a specific error code of 4.4.316 was reported. We can see this by clicking the arrow on the left-hand side of the Message Events table to expand the entry.
If we drill into the Message Events table and expand the Defer event entry, we can see the details below:
The reported error was:
Reason: [{LED=450 4.4.316 Connection refused};{MSG=Socket error code 10061};{FQDN=smtp.tailspintoys.ca};{IP=13.92.177.139};{LRT=3/16/2017 4:59:18 PM}]. OutboundProxyTargetIP: 13.92.177.139. OutboundProxyTargetHostName: smtp.tailspintoys.ca
IP 13.92.177.139 corresponds to the on-premises Exchange infrastructure.
IP 104.47.34.97 is the Office 365 IP address which is attempting to send the email to on-premises.
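The reported error, {LED=450 4.4.316 Connection refused};{MSG=Socket error code 10061}, indicates that Office 365 was unable to connect to on-premises Exchange. Socket error 10061 is the Winsock "connection refused" code (WSAECONNREFUSED), so the TCP connection itself was being rejected. Let’s verify the on-premises configuration.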
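When many deferred messages need to be compared, it can help to split the reason string into its fields rather than read each one by eye. The following is a minimal sketch in Python, assuming the reason always uses the {KEY=value} layout seen in the single sample above; that layout is inferred from this one message and is not a documented grammar, and the function name parse_defer_reason is hypothetical.

```python
import re


def parse_defer_reason(reason: str) -> dict:
    """Return the {KEY=value} pairs of a defer reason as a dictionary.

    Assumes each field is wrapped in braces and uses '=' as the separator,
    as in the sample taken from the message trace.
    """
    return dict(re.findall(r"\{(\w+)=([^}]*)\}", reason))


# The reason string copied from the Defer event in the message trace.
reason = (
    "[{LED=450 4.4.316 Connection refused};{MSG=Socket error code 10061};"
    "{FQDN=smtp.tailspintoys.ca};{IP=13.92.177.139};{LRT=3/16/2017 4:59:18 PM}]"
)

fields = parse_defer_reason(reason)

# The LED field starts with the SMTP reply code and the enhanced status code.
smtp_code, enhanced_code = fields["LED"].split()[:2]

# The MSG field ends with the numeric socket error.
socket_error = int(fields["MSG"].rsplit(" ", 1)[-1])

print(smtp_code, enhanced_code, socket_error, fields["FQDN"], fields["IP"])
```

Grouping the parsed results by the IP and LED fields is a quick way to see whether every deferral points at the same target and the same error.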
Reviewing On-Premises Infrastructure
As always, start with the simple things. The IP address 13.92.177.139 is correct and does point to the on-premises Exchange servers. This was validated using nslookup. The name which was used by Office 365 is shown in the Message Event details: smtp.tailspintoys.ca. This resolves to the external IP of the on-premises environment. Since email is encrypted between Office 365 and on-premises Exchange, we also need to verify that the certificate used by the encrypted SMTP connection is valid.
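The nslookup check can also be scripted if it needs to be repeated from several locations. This is only a sketch: the helper name resolves_to is hypothetical, and it relies on whatever DNS servers the machine running it is configured to use, which may not give the same answer that Office 365 sees.

```python
import socket


def resolves_to(hostname: str, expected_ip: str,
                resolver=socket.gethostbyname_ex) -> bool:
    """Return True if the hostname resolves to the expected IPv4 address.

    The resolver argument exists so the lookup can be swapped out;
    by default the operating system resolver is used.
    """
    _name, _aliases, addresses = resolver(hostname)
    return expected_ip in addresses


# Example call using the values from the message trace (performs a live lookup):
# resolves_to("smtp.tailspintoys.ca", "13.92.177.139")
```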
Reviewing the certificate bound to the SMTP service, we can see that the name on the certificate is also correct – mail.tailspintoys.ca. We also check that the certificate is within its validity period, has a private key and chains correctly to the issuing CA. The chaining can be checked on the Certification Path tab. Some CA vendors have their own tools to assist with this validation process. Check with the CA vendor which issued your certificate.
The certificate had also not been changed recently. Changing or renewing the certificate requires that the Hybrid Configuration Wizard is run again to update the new certificate thumbprint in Office 365.
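Two of the certificate checks, the validity period and the name, are easy to express in code. The sketch below assumes the certificate details are already available as a dictionary in the shape returned by Python's ssl.SSLSocket.getpeercert(); the function name check_certificate is hypothetical. It does not check the private key or the chain, and it does not handle wildcard names.

```python
import ssl
import time
from typing import List, Optional


def check_certificate(cert: dict, expected_name: str,
                      now: Optional[float] = None) -> List[str]:
    """Return a list of problems; an empty list means these basic checks passed."""
    now = time.time() if now is None else now
    problems = []

    # notBefore / notAfter use the "Mar 16 16:59:18 2017 GMT" style format.
    if now < ssl.cert_time_to_seconds(cert["notBefore"]):
        problems.append("certificate is not yet valid")
    if now > ssl.cert_time_to_seconds(cert["notAfter"]):
        problems.append("certificate has expired")

    # Collect the DNS entries from the subject alternative names.
    dns_names = [value.lower()
                 for kind, value in cert.get("subjectAltName", ())
                 if kind == "DNS"]
    if expected_name.lower() not in dns_names:
        problems.append(f"{expected_name} is not listed in {dns_names}")

    return problems
```

An empty result only means these two checks passed; the Certification Path tab or the CA vendor's tool is still needed for the chain.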
What is logged in the Exchange Receive Connector logs? We need to review these to ensure that the Office 365 traffic is being processed by the correct receive connector. It is recommended that you enable SMTP send and receive logging on all Exchange servers so that log data is available when an issue needs to be investigated. Otherwise you need to enable the logging and then wait for the issue to re-occur. The logs are located under the Exchange installation folder, and the path is slightly different between Exchange 2010 and 2013/2016.
Exchange 2010
C:\Program Files\Microsoft\Exchange Server\V14\TransportRoles\Logs\ProtocolLog\SmtpReceive
Exchange 2013/2016
C:\Program Files\Microsoft\Exchange Server\V15\TransportRoles\Logs\FrontEndProtocolLog\SmtpReceive
C:\Program Files\Microsoft\Exchange Server\V15\TransportRoles\Logs\MailboxProtocolLog\SmtpReceive
We can take the sending IP address from the Message Events and then search for it in the SmtpReceive log. In the example the IP we wish to search for is 104.47.34.97. The IP is shown below so you can see where it was obtained.
There were no connections from this IP address in any of the Exchange logs. This means that the traffic was not getting to Exchange.
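Searching a folder full of protocol logs by hand is tedious, so the search can be scripted. The sketch below assumes each log file carries a "#Fields:" header line that names a remote-endpoint column, and that the values in that column look like "address:port"; verify both against your own log files before relying on it. The function names are hypothetical.

```python
import csv
from pathlib import Path
from typing import Iterable, Iterator


def find_sessions_from_ip(lines: Iterable[str], ip: str) -> Iterator[dict]:
    """Yield log rows whose remote endpoint address equals the given IP."""
    fields = None
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith("#Fields:"):
            # Column names come from the header, not from a hard-coded list.
            fields = [name.strip() for name in line[len("#Fields:"):].split(",")]
            continue
        if not line or line.startswith("#") or fields is None:
            continue
        row = dict(zip(fields, next(csv.reader([line]))))
        remote = row.get("remote-endpoint", "")
        # Drop the ":port" suffix before comparing.
        if remote.rsplit(":", 1)[0] == ip:
            yield row


def count_hits_in_folder(folder: str, ip: str) -> int:
    """Count matching rows across every *.log file in a folder."""
    hits = 0
    for path in Path(folder).glob("*.log"):
        with open(path, encoding="utf-8", errors="replace") as handle:
            hits += sum(1 for _ in find_sessions_from_ip(handle, ip))
    return hits
```

Calling count_hits_in_folder with one of the SmtpReceive paths listed above and 104.47.34.97 would return 0 in this case, which matches what was seen: the traffic never reached Exchange.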
Now that we have done our due diligence and validation, time to speak to the firewall administrators.
Office 365 IP and URL Restrictions
Microsoft documents the IP addresses and URLs which are required to access the various components of Office 365. The addresses and IPs are often modified, and you can subscribe to the RSS feed to be notified of changes.
Note: Microsoft is developing a REST-based web service for the IP address and FQDN entries on this page. This new service will help you configure and update network perimeter devices such as firewalls and proxy servers. You can download the list of endpoints, the current version of the list, or specific changes. This service will eventually replace the XML document, RSS feed, and the IP address and FQDN entries on this page. To try out this new service, go to Web service.
The firewall admins were asked to check the drop log on their devices for connections from the Office 365 IPs identified above. And lo! The firewall was indeed blocking these connections.
In this case the customer did not update the firewall correctly when they made a recent change to their external firewall ACLs. For some reason they removed some of the EOP IP objects from the ACL. As a result only some of the EOP servers were allowed to communicate with the on-premises SMTP endpoint.
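A partial ACL like this can be caught by comparing the sending IPs seen in the message trace against the ranges the firewall actually permits. The sketch below uses Python's ipaddress module. The allow-list shown is a made-up placeholder, not a real Office 365 range; the real list must come from Microsoft's published endpoints. The function name blocked_senders is hypothetical.

```python
import ipaddress
from typing import Iterable, List


def blocked_senders(sender_ips: Iterable[str],
                    allowed_networks: Iterable[str]) -> List[str]:
    """Return the sender IPs that are not covered by any allowed network."""
    networks = [ipaddress.ip_network(cidr) for cidr in allowed_networks]
    return [ip for ip in sender_ips
            if not any(ipaddress.ip_address(ip) in net for net in networks)]


# Placeholder ACL (a documentation range), standing in for an incomplete rule set.
incomplete_acl = ["192.0.2.0/24"]

# 104.47.34.97 is the sending IP from the message trace; 192.0.2.10 is made up.
senders = ["104.47.34.97", "192.0.2.10"]

print(blocked_senders(senders, incomplete_acl))
```

Any address returned by the function is one the firewall would drop, which explains why only some messages were delivered.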
Once the firewall objects had been corrected, all email was then delivered without further issue.
Cheers,
Rhoderick
Hi,
I am going through a similar problem. We created the Office 365 tenant with cloud mailboxes before we started building hybrid (for a bad reason), so now I have moved on-prem to hybrid and fixed the on-prem to Office 365 mail flow. However, even though our MX records are still pointing to our on-prem servers, cloud mailboxes only deliver email to other cloud mailboxes. Email is not routed to our on-prem server. Any ideas?
Look at the Send Connectors created in the tenant, and review to see what is set.
Did you enable Centralized Mail Transport? And are *ALL* mailboxes, groups etc. in scope of synchronisation on the Azure AD Connect instance?
Cheers,
Rhoderick
hi Rhoderick,
I have seen the same issue with CMT: cloud mailboxes are ignoring CMT and happily mailing each other within EXO only. Do you have any idea what we can check? I thought HCW would take care of this...
Hi BB,
What is the scenario here please? Within the same tenant, or between tenants?
Cheers,
Rhoderick