Troubleshooting "certificate verify failed" errors that occur in the SAS® Cloud Analytic Services server logs in SAS® Viya® 3.x on Linux


This article describes how to resolve certificate verify failed errors that occur in the SAS Cloud Analytic Services (CAS) server logs in SAS Viya 3.x.

For Transport Layer Security (TLS) connections to be successful, the TLS client must be able to verify the certificate presented by the TLS server. Among other requirements, the certificate must chain to a CA certificate that is present in the client's truststore, and the connection must occur within the certificate's validity period.

In addition, for the CAS server to start and function correctly, the following two TLS connections must succeed:

1. The CAS server's startup client connects to the CAS server's binary port (default 5570).

2. The CAS server contacts SASLogon for OAuth operations.

The following sections provide examples for both of these connections. The log snippets show messages that you can find in the /opt/sas/viya/config/var/log/cas/default/cas_<date>_<host>_<pid>.log file on the CAS controller machine.
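As a starting point, you can search those log files for the messages that are quoted in the following sections. The sketch below is an illustrative helper, not a SAS-provided tool. The function name cas_tls_errors is hypothetical, and the search strings are the messages shown in this article.

```shell
# Hypothetical helper: print the lines in the given CAS log files that contain
# the messages discussed in this article, prefixed with file name and line number.
cas_tls_errors() {
    grep -H -n -F \
        -e 'certificate verify failed' \
        -e 'Startup client failed' \
        -e 'getKeySet: Could not send request to server' \
        "$@"
}

# Example (log directory from this article; reading it might require sudo):
#   cas_tls_errors /opt/sas/viya/config/var/log/cas/default/cas_*.log
```

A Startup client failed match points to the first connection, and a getKeySet match points to the second.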

CAS startup client failures

If a Startup client failed message appears in the log and is preceded by SSL verify and validation errors, the startup client was not able to connect to the CAS server's binary port.

Here is an example:

ERROR [00000022] MAIN NoUser MAIN [sslopenssl2.c:6909] - OpenSSL error 337047686 (0x1416f086) occurred in SSL_connect/accept at line 4978, the error message is "error:1416F086:SSL routines:tls_process_server_certificate:certificate verify failed".
ERROR [00000022] MAIN NoUser MAIN [skstssl.c:1291] - Secure communications error status 807ff008 description "10.100.3.210: The certificate sent from the remote host cannot be validated by any of the public keys in the root certificate store specified by SSLCALISTLOC or SSLCACERTDIR."
INFO  [00000022] MAIN NoUser MAIN [casgen_su.c:814] - ...sas/viya/home/SASFoundation/misc/casluaclnt/lua/swat.lua:193: Could not connect to '..ZsAbIpS4eWdQ0dtH.' on port 5570.
ERROR [00000022] MAIN NoUser MAIN [casgen_su.c:625] - Startup client failed: 0x803FC009.

This error means that important CAS startup tasks did not execute, such as initializing global CASLIBS, including Public, SystemData, and any custom global CASLIBS. As a result, you will not be able to find or use any existing data sources until this issue is resolved.

To determine the cause of the problem, run the following command on the CAS controller machine:

SSL_CERT_FILE= SSL_CERT_DIR= openssl s_client -CAfile /opt/sas/viya/config/etc/SASSecurityCertificateFramework/cacerts/trustedcerts.pem -connect $(hostname -f):8777 < /dev/null | openssl x509 -noout -text
 
The CAS controller listens on default ports 5570 and 8777, but only 8777 responds to the OpenSSL s_client command. Because both ports use the same certificate, you can read the certificate information by connecting to port 8777. The command uses 'hostname -f' to obtain the fully qualified domain name (FQDN) of the local machine as the CAS server host name. If this name does not resolve to an IP address for your CAS server, use the host name that appears in the first line of your CAS server log. The output should include the certificate verification results, followed by the certificate that the TLS server presented.

Then, review the output of the command.
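The full output is long, so it can help to reduce it to the lines that matter for the scenarios below. The following is a minimal sketch, assuming that you capture both standard output and standard error of the command. The function name verify_summary is hypothetical.

```shell
# Hypothetical filter: keep only the chain-verification lines and the validity
# dates from combined "openssl s_client | openssl x509 -text" output on stdin.
verify_summary() {
    grep -E '^depth=|^verify (error|return):|Not (Before|After)[[:space:]]*:'
}

# Example: s_client writes the depth= and verify lines to standard error,
# so both streams are merged before filtering.
#   { SSL_CERT_FILE= SSL_CERT_DIR= openssl s_client \
#       -CAfile /opt/sas/viya/config/etc/SASSecurityCertificateFramework/cacerts/trustedcerts.pem \
#       -connect "$(hostname -f)":8777 < /dev/null | openssl x509 -noout -text; } 2>&1 | verify_summary
```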

Scenario 1

If the output begins with the following SAS Viya CAs, your CAS server uses the default TLS configuration. In this configuration, the CAS server obtains a new certificate from Vault during startup.

depth=2 CN = SAS VIYA Root CA
verify return:1
depth=1 CN = SAS VIYA Intermediate CA
verify return:1

In this scenario, a possible cause of the problem is that the system time on the Vault server is not in sync with the system time on the CAS controller machine. You can find evidence of this issue by reviewing the Validity section in the rest of the OpenSSL output.

Here is an example:

        Validity
            Not Before: Sep 13 14:03:50 2023 GMT
 
In addition, the time of the error in the CAS log is 2023-09-13T10:03:22, as shown below: 

2023-09-13T10:03:22,672 ERROR [00000021] MAIN NoUser MAIN [sslopenssl2.c:6909] - OpenSSL error 336134278 (0x14090086) occurred in SSL_connect/accept at line 4978, the error message is "error:14090086:SSL routines:ssl3_get_server_certificate:certificate verify failed".

When you take into account the time zone offset of GMT-4, you can see that the error occurred at Sep 13 14:03:22 2023 GMT, but the certificate is not valid until almost 30 seconds later at Sep 13 14:03:50 2023 GMT.
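The same comparison can be done on the command line. The following is a minimal sketch, assuming GNU date (for the -d option) and assuming that you append the UTC offset of the CAS controller to the log timestamp yourself. The function name seconds_until_valid is hypothetical.

```shell
# Hypothetical helper: print how many seconds after the logged error the
# certificate becomes valid. A positive result means that the certificate
# was not yet valid when the error was logged. Requires GNU date (-d).
seconds_until_valid() {
    local error_epoch valid_epoch
    error_epoch=$(date -u -d "$1" +%s) || return 1   # $1: log time with UTC offset
    valid_epoch=$(date -u -d "$2" +%s) || return 1   # $2: Not Before value
    echo $(( valid_epoch - error_epoch ))
}

# Values from the example above; the log time is local time at GMT-4.
seconds_until_valid "2023-09-13 10:03:22 -0400" "Sep 13 14:03:50 2023 GMT"
```

For the example values, the function prints 28, which matches the gap described above.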

If the time on the Vault server is even a few seconds ahead of the time on the CAS controller, the newly obtained certificate will not be valid at the time that the startup client connects to CAS. 

Ensure that all SAS Viya machines in the deployment are synchronizing time using ntpd or similar services. When the clocks are all in sync, restart the CAS server and check if the problem is resolved.
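One rough way to spot a clock difference is to compare epoch times across machines. The following is a sketch only. It assumes passwordless ssh access from the machine where you run it, the function names and host names are hypothetical, and the ssh round trip adds a small error, so treat the result as approximate.

```shell
# Hypothetical helper: absolute difference, in seconds, between two epoch times.
skew_seconds() {
    local diff=$(( $1 - $2 ))
    echo "${diff#-}"
}

# Hypothetical helper: report the approximate clock skew between this machine
# and each host that is passed as an argument.
report_skew() {
    local host remote
    for host in "$@"; do
        remote=$(ssh -o BatchMode=yes "$host" date -u +%s) || { echo "$host: unreachable"; continue; }
        echo "$host: $(skew_seconds "$remote" "$(date -u +%s)")s"
    done
}

# Example (host names are placeholders):
#   report_skew vault1.example.com cas-controller.example.com
```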

Note: Vault is deployed to the machines assigned to the [consul] group in sas_viya_playbook/inventory.ini.

Scenario 2

A custom CA and an unable to get local issuer certificate message in the OpenSSL output indicate that the root CA certificate, the intermediate CA certificates, or both are not present in the SAS Viya truststore. As a result, the CAS certificate cannot be trusted.

Here is an example:

depth=1 CN = Example Custom CA
verify error:num=20:unable to get local issuer certificate
 

You should obtain all the necessary root and intermediate CA certificates from your CA or site admins and add them to the SAS Viya truststore. For more information, see Add Certificates to the Truststore in the SAS® Viya® 3.5 Administration Guide. Then, you can test if the certificates were added to the truststore properly by re-executing the OpenSSL s_client command above. 

If all the necessary certificates are in the truststore, you should not receive any verify errors. You can then restart the CAS server and verify that the CAS startup error is resolved.
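To see which CA certificates the truststore file currently contains, you can list its entries before and after you add certificates. The following is a minimal sketch with hypothetical function names. It uses the trustedcerts.pem path from the command above and assumes that the file is a plain PEM bundle.

```shell
# Hypothetical helper: count the certificates in a PEM bundle.
count_bundle_certs() {
    grep -c -e '-----BEGIN CERTIFICATE-----' "$1"
}

# Hypothetical helper: print the subject and issuer of every certificate
# in a PEM bundle.
list_bundle_certs() {
    openssl crl2pkcs7 -nocrl -certfile "$1" | openssl pkcs7 -print_certs -noout
}

# Example:
#   list_bundle_certs /opt/sas/viya/config/etc/SASSecurityCertificateFramework/cacerts/trustedcerts.pem
```

If the issuer named in the verify error does not appear as a subject in the list, that CA certificate is still missing.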

CAS fails to connect to SASLogon

If a getKeySet: Could not send request to server message appears in the log and is preceded by SSL verify and validation errors, CAS cannot connect to SASLogon through the web server to perform OAuth operations because of a TLS certificate verification problem.

Here is an example:

ERROR [00000025] MAIN NoUser MAIN [sslopenssl2.c:6909] - OpenSSL error 336134278 (0x14090086) occurred in SSL_connect/accept at line 4978, the error message is "error:14090086:SSL routines:ssl3_get_server_certificate:certificate verify failed".
ERROR [00000025] MAIN NoUser MAIN [skstssl.c:1291] - Secure communications error status 807ff008 description "10.104.30.200: The certificate sent from the remote host cannot be validated by any of the public keys in the root certificate store specified by SSLCALISTLOC or SSLCACERTDIR."
ERROR [00000025] MAIN NoUser MAIN [tkjwk.c:1379] - getKeySet: Could not send request to server.
ERROR [00000025] MAIN NoUser MAIN [tkjwk.c:1380] - Encryption run-time execution error

To determine the cause of the problem, complete the following steps:

1.  Find the cas.servicesbaseurl setting for CAS.

a) On the CAS controller, run the following command:

sudo grep -i servicesbaseurl /opt/sas/viya/config/etc/cas/default/*.lua

Here is an example:

$ sudo grep -i servicesbaseurl /opt/sas/viya/config/etc/cas/default/*.lua
/opt/sas/viya/config/etc/cas/default/casconfig_deployment.lua:cas.servicesbaseurl = 'https://myserver.example.com:443/'
/opt/sas/viya/config/etc/cas/default/casconfig.lua:--cas.servicesbaseurl = ''

Note: Usually, the value is in casconfig_deployment.lua. However, if there is a custom value in casconfig_usermods.lua, that value takes precedence.

2.  Take the host:port value that you found in step 1 and run the following OpenSSL command on the CAS controller:

SSL_CERT_FILE= SSL_CERT_DIR= openssl s_client -CAfile /opt/sas/viya/config/etc/SASSecurityCertificateFramework/cacerts/trustedcerts.pem -connect host:port < /dev/null | openssl x509 -noout -text

Do not include https:// in the OpenSSL s_client command.

Here is an example:

$ SSL_CERT_FILE= SSL_CERT_DIR= openssl s_client -CAfile /opt/sas/viya/config/etc/SASSecurityCertificateFramework/cacerts/trustedcerts.pem -connect myserver.example.com:443 < /dev/null | openssl x509 -noout -text
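Steps 1 and 2 can be combined by extracting the host:port value from the grep output. The following is a minimal sketch. The function name baseurl_hostport is hypothetical, and appending port 443 when the URL has no explicit port is an assumption based on the HTTPS default.

```shell
# Hypothetical helper: read cas.servicesbaseurl lines on stdin (with or without
# the file-name prefix that grep adds) and print host:port for the first
# uncommented https URL. Port 443 is assumed when the URL has no port.
baseurl_hostport() {
    grep -v -E '(^|:)[[:space:]]*--' |
    sed -n -E "s|.*servicesbaseurl[[:space:]]*=[[:space:]]*['\"]https://([^/'\"]+).*|\1|p" |
    head -n 1 |
    awk '{ if ($0 ~ /:[0-9]+$/) print $0; else print $0 ":443" }'
}

# Example:
#   sudo grep -i servicesbaseurl /opt/sas/viya/config/etc/cas/default/casconfig_deployment.lua | baseurl_hostport
```

Because the function returns only the first match, pass it the file whose value is in effect, which is casconfig_usermods.lua if a custom value is set there.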
 

The output should help you understand the cause of the problem. As in Scenario 2 above, if the output displays an unable to get local issuer certificate message, you probably need to add CA certificates to the SAS Viya truststore. For more information, see Add Certificates to the Truststore in the SAS® Viya® 3.5 Administration Guide. Then, you can test whether the certificates were added to the truststore properly by re-executing the OpenSSL s_client command above.

If all the necessary certificates are in the truststore, you should not receive any verify errors. You can then restart the CAS server and verify whether the getKeySet errors are resolved.

If you need further assistance, contact SAS Technical Support. Be prepared to provide the CAS server logs and the OpenSSL output described above.