The PKI infrastructure changed: a new two-tier Certificate Authority (CA) issuing SHA256 (SHA2) certificates was added to replace an old single root CA that issued SHA1 certificates. As a result, the Lync Server 2013 pool in the production environment needed its existing default certificate and OAuth certificate replaced before the old single root CA expired. In this case, the Lync Server 2013 pool has three Front End Servers and is load balanced by an F5 hardware load balancer.
Using Step 3 of the Lync Server 2013 Deployment Wizard, the OAuth and default pool certificates were requested from, and issued by, the new CA. The new root CA (offline) and intermediate CA (online, issuing) were already present on the servers as part of the PKI deployment. The new default certificate was applied to all three Front End Servers. Additionally, the same certificate, with its private key, was imported into the F5 and applied to two VIPs: the VIP that provides HLB functionality for the internal web services, and the VIP for the reverse proxy, where external web traffic from the perimeter network arrives for load balancing.
As soon as the new certificates were assigned to the appropriate places, all web traffic started failing, both internally and externally. Because the change directly affected the production environment, it was performed after hours, which left only a short window to troubleshoot the issue and, if possible, resolve it.
Additionally, browsing to the dial-in web URL, https://dialin.custdom.com, returned an error:
Error 64 – The specified network name is no longer available.
The application event log showed a large number of Schannel errors. The CLSLogger capture showed that the certificate handshake was failing during TLS negotiation. However, neither the event log nor CLSLogger pointed to what was wrong with the certificate, and the failing dial-in page still presented the correct certificate.
With so little data from the event log and CLSLogger, Wireshark was started on all three Front End Servers and packet capture was enabled on the F5. The captured packets made it clear that the necessary cipher suites were missing from the Server Hello sent by the F5.
Because of the missing cipher suites, the web services were broken in the production environment.
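One way to reason about this kind of failure is to model cipher suite selection as a simple intersection. The short Python sketch below is only an illustration: the function name and the suite lists are hypothetical placeholders, not values taken from the actual packet capture, and it assumes the simplest rule, where the first suite in the client's list that the server also supports is chosen.

```python
from typing import Optional, Sequence


def select_cipher_suite(client_offered: Sequence[str],
                        server_supported: Sequence[str]) -> Optional[str]:
    """Return the first client-offered suite the server also supports.

    Simplified model: real TLS stacks may apply server-side preference
    order instead. None means no common suite, so the handshake fails.
    """
    supported = set(server_supported)
    for suite in client_offered:
        if suite in supported:
            return suite
    return None


# Hypothetical lists for illustration only (NOT from the real capture).
client_offered = [
    "TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256",
    "TLS_RSA_WITH_AES_128_CBC_SHA256",
]
old_firmware = ["TLS_RSA_WITH_RC4_128_SHA"]
new_firmware = old_firmware + ["TLS_RSA_WITH_AES_128_CBC_SHA256"]

print(select_cipher_suite(client_offered, old_firmware))  # None
print(select_cipher_suite(client_offered, new_firmware))  # TLS_RSA_WITH_AES_128_CBC_SHA256
```

When the result is None, there is nothing for the two sides to agree on, which matches the symptom of a handshake that fails even though the certificate itself looks correct.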
The F5 was running firmware version 11.4.x, and the F5 support link indicates that this is a known issue on F5 firmware older than 11.5.0.
Updating the F5 firmware from 11.4.1 to 11.5.3 fixed the issue.
Lesson Learned: When changing the default certificate from SHA1 to SHA256 (SHA2) on a Front End pool, make sure the hardware load balancer and any other third-party hardware devices are running the latest firmware that supports the appropriate cipher suites.
Note: The issue occurred with a Lync Server 2013 pool. However, it can happen with a Skype for Business Server 2015 pool as well, for the same reasons.
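As a rough aid for the lesson above, a small probe can be run against each Front End Server and each VIP before and after a certificate change to see whether a TLS handshake completes and which cipher suite is negotiated. This is a minimal sketch using only the Python standard library. The function name probe_tls is hypothetical, and the result also depends on the TLS versions and cipher suites that the local Python and OpenSSL build allow, so a failure here is a hint to investigate, not proof of a load balancer problem.

```python
import socket
import ssl


def probe_tls(host: str, port: int = 443, timeout: float = 5.0,
              verify: bool = False) -> dict:
    """Attempt one TLS handshake and report what was negotiated.

    verify=False skips certificate validation (diagnostics only), which
    helps when the internal CA is not in the local trust store.
    """
    context = ssl.create_default_context()
    if not verify:
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with context.wrap_socket(raw, server_hostname=host) as tls:
                name, _protocol, bits = tls.cipher()
                return {"ok": True, "protocol": tls.version(),
                        "cipher": name, "bits": bits}
    except (ssl.SSLError, OSError) as exc:
        # Covers handshake failures, refused connections, and timeouts.
        return {"ok": False, "error": f"{type(exc).__name__}: {exc}"}
```

For example, calling probe_tls("dialin.custdom.com") returns a dictionary whose "ok" key is True when the handshake succeeds; when it is False, the "error" text gives a starting point for a packet capture.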