Background
Last week we installed a new SSL certificate on the Tomcat instances in production. The process involved:
- Create a new Java Keystore
- Generate a new CSR
- Obtain the certificate for our domain along with certificate chain
- Import the certificate with the certificate chain into the keystore
- Update Tomcat server.xml to point to the new keystore
- Restart the Tomcat process
The Tomcat instance hosts a SOAP WebService. The verification steps involved
- Checking the certificate details in multiple browsers
- Verifying SOAP API invocation using SOAP-UI tool
The verification was successful and we applied the change in production.
Issue
Within a few hours, a couple of customers reported that they were unable to access the API. One customer shared the error log:
Caused by: javax.xml.ws.soap.SOAPFaultException: nested fault: SSL protocol error
error:140CF086:SSL routines:SSL_VERIFY_CERT_CHAIN:certificate verify failed
error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed
at com.sun.xml.ws.fault.SOAP11Fault.getProtocolException(SOAP11Fault.java:189)
at com.sun.xml.ws.fault.SOAPFaultBuilder.createException(SOAPFaultBuilder.java:122)
at com.sun.xml.ws.client.sei.SyncMethodHandler.invoke(SyncMethodHandler.java:119)
at com.sun.xml.ws.client.sei.SyncMethodHandler.invoke(SyncMethodHandler.java:89)
at com.sun.xml.ws.client.sei.SEIStub.invoke(SEIStub.java:118)
The first reaction was that we had messed up something in the deployment. Before
reviewing the error or understanding the root cause, we decided to restore the service
first and rolled the change back. Since the old certificate was still valid for a few
more weeks, it was a good decision.
Analysis
Later in the day I analyzed the error and concluded that there was no issue with the
certificate or the deployment. Following is the new certificate chain when viewed in the
browser
Following is the old certificate chain when viewed in the browser
While the root CA is the same, the intermediate CA changed from "Verisign Class 3 Secure Server CA - G3" to "Symantec Class 3 Secure Server CA - G4". This change happened because the new certificate that we requested was SHA2. The Verisign Class 3 intermediate certificate is SHA1, whereas the Symantec Class 3 intermediate certificate is SHA2. Symantec issued new intermediate CA certificates after the Verisign acquisition in 2010.
Clients that don't have the Symantec Class 3 intermediate certificate in their truststore will fail with the error SSL_VERIFY_CERT_CHAIN.
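A browser view of the chain is not necessarily what an API client receives, so it can help to print the chain exactly as the server sends it during the handshake. The following is a minimal sketch, not the check we actually ran. The host name `api.example.com` is a placeholder, and the accept-all trust manager is an assumption made purely for diagnosis (it disables verification so that an untrusted chain can still be listed) and must never be used in a real client.

```java
import java.security.cert.Certificate;
import java.security.cert.X509Certificate;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLSocket;
import javax.net.ssl.TrustManager;
import javax.net.ssl.X509TrustManager;

class ServedChain {

    // Formats one chain entry; kept separate so it can be checked without a network.
    static String describe(int index, String subject, String issuer, String sigAlg) {
        return index + ": subject=" + subject + " | issuer=" + issuer + " | sigAlg=" + sigAlg;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical defaults; pass the real host and port as arguments.
        String host = args.length > 0 ? args[0] : "api.example.com";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 443;

        // DIAGNOSTIC ONLY: accepts any chain so that it can be inspected even when untrusted.
        TrustManager[] acceptAll = { new X509TrustManager() {
            public void checkClientTrusted(X509Certificate[] chain, String authType) { }
            public void checkServerTrusted(X509Certificate[] chain, String authType) { }
            public X509Certificate[] getAcceptedIssuers() { return new X509Certificate[0]; }
        } };
        SSLContext ctx = SSLContext.getInstance("TLS");
        ctx.init(null, acceptAll, null);

        try (SSLSocket socket = (SSLSocket) ctx.getSocketFactory().createSocket(host, port)) {
            socket.startHandshake();
            // Certificates received from the server, leaf first.
            Certificate[] chain = socket.getSession().getPeerCertificates();
            for (int i = 0; i < chain.length; i++) {
                X509Certificate c = (X509Certificate) chain[i];
                System.out.println(describe(i,
                        c.getSubjectX500Principal().getName(),
                        c.getIssuerX500Principal().getName(),
                        c.getSigAlgName()));
            }
        }
    }
}
```

If only a single entry is printed for a CA-issued certificate, the server is not sending its intermediate, and clients have to find it in their own truststore. The `sigAlg` column also shows whether each certificate is signed with SHA1 or SHA2.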
Resolution
To overcome this error, customers must import the intermediate certificate from the following link into their truststore:
https://knowledge.symantec.com/support/ssl-certificates-support/index?page=content&actp=CROSSLINK&id=INFO2045
The following page has instructions for installing the certificate on various platforms:
https://knowledge.symantec.com/support/ssl-certificates-support/index?page=content&id=INFO212
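For a customer whose client runs on Java, the import can also be done programmatically. This is a rough sketch under assumptions, not an official procedure: the truststore path, password and certificate file are supplied as arguments, the alias `symantec-class3-g4` is a made-up name, and the certificate file is assumed to be the downloaded intermediate in PEM or DER form. Clients on other platforms (OpenSSL, .NET) use their own trust stores and need the platform instructions above instead.

```java
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.KeyStore;
import java.security.cert.Certificate;
import java.security.cert.CertificateFactory;

class TruststoreImport {

    // Loads the truststore at the given path, or starts an empty one if the file does not exist.
    static KeyStore loadOrCreate(Path path, char[] password) {
        try {
            KeyStore ks = KeyStore.getInstance(KeyStore.getDefaultType());
            if (Files.exists(path)) {
                try (InputStream in = Files.newInputStream(path)) {
                    ks.load(in, password);
                }
            } else {
                ks.load(null, password); // initialise an empty keystore
            }
            return ks;
        } catch (Exception e) {
            throw new IllegalStateException("Could not load truststore " + path, e);
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 3) {
            System.err.println("usage: TruststoreImport <truststore> <password> <cert-file>");
            return;
        }
        Path truststore = Paths.get(args[0]);
        char[] password = args[1].toCharArray();
        Path certFile = Paths.get(args[2]);
        String alias = "symantec-class3-g4"; // hypothetical alias, any unique name works

        KeyStore ks = loadOrCreate(truststore, password);
        try (InputStream in = Files.newInputStream(certFile)) {
            Certificate cert = CertificateFactory.getInstance("X.509").generateCertificate(in);
            ks.setCertificateEntry(alias, cert); // adds the certificate as a trusted entry
        }
        try (OutputStream out = Files.newOutputStream(truststore)) {
            ks.store(out, password);
        }
        System.out.println("Truststore now holds " + ks.size() + " entries");
    }
}
```

The client JVM then has to be pointed at this truststore, for example through the `javax.net.ssl.trustStore` system property, unless the file is the JVM's default truststore.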
Conclusion
- When installing new certificates, notify customers in advance (a few weeks ahead). Do this even if the change is limited to an extension of the expiry date or a domain name change.
- Any change in hashing algorithm, e.g. SHA1 to SHA2 or SHA2 to SHA3, should be announced to all customers well in advance. Different browsers have different timelines when it comes to migrating from SHA1 to SHA2. The biggest risk of such seemingly minor changes is to API integrations.
- Observe the certificate chain carefully. Just seeing the green page icon in the browser bar is not sufficient. Share the chain with customers if it is different from the existing certificate chain.
Update - 04/20/2016
I missed one important part in my analysis. I verified the certificate chain using a browser but never bothered to look at the chain in the keystore. It turns out the keystore didn't have the full certificate chain, and that caused clients to fail. Browsers can often fill in a missing intermediate on their own (from a cache or by downloading it), which is likely why the browser check looked fine. If clients had had the intermediate certificate in their truststore, it would not have mattered. So the fix on our side was to import the root CA.
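Looking at the chain in the keystore can be scripted. The sketch below is illustrative only: it assumes the keystore path, password and the alias of the private key entry are passed as arguments, and it simply lists the chain stored under that alias. A chain of length one for a CA-issued certificate means the intermediates were never imported.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.KeyStore;
import java.security.cert.Certificate;
import java.security.cert.X509Certificate;
import javax.security.auth.x500.X500Principal;

class KeystoreChainCheck {

    // A certificate whose subject equals its issuer is self-issued, as a root CA is.
    static boolean isSelfIssued(X500Principal subject, X500Principal issuer) {
        return subject.equals(issuer);
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 3) {
            System.err.println("usage: KeystoreChainCheck <keystore> <password> <alias>");
            return;
        }
        KeyStore ks = KeyStore.getInstance(KeyStore.getDefaultType());
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            ks.load(in, args[1].toCharArray());
        }

        // Returns null when the alias is missing or has no certificate chain.
        Certificate[] chain = ks.getCertificateChain(args[2]);
        if (chain == null) {
            System.out.println("No certificate chain found for alias " + args[2]);
            return;
        }
        for (int i = 0; i < chain.length; i++) {
            X509Certificate c = (X509Certificate) chain[i];
            System.out.println(i + ": " + c.getSubjectX500Principal().getName()
                    + " <- issued by " + c.getIssuerX500Principal().getName());
        }
        X509Certificate last = (X509Certificate) chain[chain.length - 1];
        System.out.println("Chain length: " + chain.length + ", ends at a self-issued root: "
                + isSelfIssued(last.getSubjectX500Principal(), last.getIssuerX500Principal()));
    }
}
```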
Update - 04/20/2016
Today we went through another issue related to the SHA1 to SHA2 update. One of our key customers was not prepared, and after the update they were not able to access our services. The client software was running on a Windows 2003 server that was never patched and lacked support for SHA2. They were seeing the following error while connecting to our service:
The underlying connection was closed: Could not establish trust relationship for the SSL/TLS secure channel.
While this key customer was trying to figure out how to patch their system (which is not easy), we put together a workaround so that they could continue to use the server. Here is what we did:
- First, we asked the customer to use the non-secure port. Since the customer connects to our APIs over VPN, this would have been acceptable. However, it was not possible because the URL was hardcoded in the code and nobody knew where the source was or how to build it. So we went with option #2:
- We set up a new server.
- Installed the required software (Java + Tomcat + WAR file)
- Created a new self-signed SHA1 certificate for the domain
- Configured Tomcat to use the new keystore and self-signed certificate
- Shared the certificate with the customer to import into their truststore
- Asked the customer to update /etc/hosts (or the equivalent on Windows) on their machine to point the domain name to the IP of this new server. This avoided the need to change the hardcoded URL in the code.
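After a hosts-file change like the one in the last step, it is worth confirming that the domain really resolves to the new server before testing the client. The following is a small sketch of such a check from any machine with a JVM. The class name and arguments are made up for illustration, and the check relies on Java using the operating system's resolver, which normally honors the hosts file.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

class HostsOverrideCheck {

    // True when any address the name resolves to matches the expected IP.
    static boolean resolvesTo(String hostname, String expectedIp) {
        try {
            for (InetAddress addr : InetAddress.getAllByName(hostname)) {
                if (addr.getHostAddress().equals(expectedIp)) {
                    return true;
                }
            }
            return false;
        } catch (UnknownHostException e) {
            return false; // the name does not resolve at all
        }
    }

    public static void main(String[] args) {
        if (args.length < 2) {
            System.err.println("usage: HostsOverrideCheck <domain> <expected-ip>");
            return;
        }
        System.out.println(args[0] + " resolves to " + args[1] + ": " + resolvesTo(args[0], args[1]));
    }
}
```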
The following links were especially useful when troubleshooting and recommending a solution to the customer for patching their Windows 2003 server: