An internal OpenSearch cluster might fail with "jar hell" and "Permission denied" errors in a new SAS® Viya® platform deployed on Red Hat OpenShift


In a new SAS Viya platform environment deployed on Red Hat OpenShift, the internal OpenSearch cluster that is deployed as part of the SAS Viya platform might fail with "jar hell" and "Permission denied" errors for a druid-<version>.jar file. These errors appear in the logs of your sas-opendistro-* pods as follows:

{"type": "server", "timestamp": "2024-12-05T21:41:05,738Z", "level": "ERROR", "component": "o.o.b.OpenSearchUncaughtExceptionHandler", "cluster.name": "sas-opendistro", "node.name": "sas-opendistro-default-0", "message": "uncaught exception in thread [main]",

"stacktrace": ["org.opensearch.bootstrap.StartupException: java.lang.IllegalStateException: failed to load plugin opensearch-sql due to jar hell",

"at org.opensearch.bootstrap.OpenSearch.init(OpenSearch.java:185) ~[opensearch-2.17.0.jar:2.17.0]",

"at org.opensearch.bootstrap.OpenSearch.execute(OpenSearch.java:172) ~[opensearch-2.17.0.jar:2.17.0]",

"Caused by: java.io.FileNotFoundException: /usr/share/elasticsearch/plugins/opensearch-sql/druid-1.1.20.jar (Permission denied)",

"at java.base/java.io.RandomAccessFile.open0(Native Method) ~[?:?]",

Similarly, the logs of other pods or services that interact with or depend on OpenSearch (for example, the sas-svi-datahub and sas-svi-sand pods in SAS® Visual Investigator) might show a Waiting for OpenSearch to start message. This message indicates that the pod or service is waiting for OpenSearch to start.
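
To confirm that a sas-opendistro-* pod is failing with this specific signature, you can scan its log for both error strings. The following is a minimal sketch, not a SAS-provided tool: the helper name has_jar_hell_signature is hypothetical, the namespace viya is a placeholder, and the pod name sas-opendistro-default-0 is taken from the node.name value in the log excerpt above and might differ in your environment.

```shell
#!/bin/sh
# Sketch: detect the startup-failure signature described above.
# has_jar_hell_signature is a hypothetical helper, not part of oc or SAS Viya.

# Reads log text on stdin. Succeeds only if BOTH error strings are present.
has_jar_hell_signature() {
  _logs=$(cat)
  printf '%s\n' "$_logs" | grep -q 'due to jar hell' &&
    printf '%s\n' "$_logs" | grep -q 'Permission denied'
}

# Usage (assumes the oc CLI is logged in to the cluster):
#   oc -n viya logs sas-opendistro-default-0 | has_jar_hell_signature \
#     && echo "jar hell / Permission denied signature found"
```

If the helper reports a match, continue with the checks in the Workaround section.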

These errors usually occur when OpenShift pre-installation steps that are specific to OpenSearch were missed, especially the steps related to the service account, UID, and security context constraints (SCC) that must be completed before the SAS Viya platform is deployed. Those pre-installation steps are described in the OpenSearch Requirements and Apply and Bind the Security Context Constraints sections of the SAS Viya platform documentation. Use the version of the documentation that matches the version of your SAS Viya platform environment.

Workaround

Confirm that you completed all the applicable steps from the documentation above before you deploy the SAS Viya platform:

  1. Confirm that all the requirements and steps from the OpenSearch Requirements that apply to an internal OpenSearch cluster deployment on a Red Hat OpenShift environment are in place.

  2. If you use a run-user-transformer.yaml file in your kustomization.yaml to specify a custom user ID instead of the default value of 1000 for the OpenSearch processes, confirm that you also updated the uid property of the runAsUser option in the sas-opendistro-scc.yaml file before applying it, as described in Apply and Bind the Security Context Constraints.
    1. Also, exec into one of your sas-opendistro-* pods and run the id command. The UID that the id command returns must match the UID specified in the run-user-transformer.yaml and sas-opendistro-scc.yaml files. Note that the UID cannot be changed after the deployment is complete. You must choose and configure the UID at the time of the initial deployment. See the $deploy/sas-bases/examples/configure-elasticsearch/internal/run-user/README.md file of your deployment assets for additional details.

  3. Make sure that you edited and applied the sas-opendistro-scc.yaml file correctly, as described in Apply and Bind the Security Context Constraints. Run oc -n <Viya namespace> get scc sas-opendistro -o yaml and review the YAML output. Verify that the correct uid appears under runAsUser. Also, if you used the sysctl-transformer.yaml file to set the vm.max_map_count parameter, verify that the allowPrivilegeEscalation and allowPrivilegedContainer privileges are set to true in the output.
     
  4. Confirm that the sas-opendistro SCC is bound to the sas-opendistro service account.

    To check, run oc adm policy who-can use scc sas-opendistro. In the output, under Users, the sas-opendistro service account must appear along with your SAS Viya namespace. For example, if your SAS Viya namespace is “viya”, you should see the following:

    Namespace: viya
    Verb:      use
    resource:  securitycontextconstraints.security.openshift.io
    Users: system:serviceaccount:viya:sas-opendistro
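
The UID and binding checks in steps 2 through 4 can be scripted roughly as follows. This is a sketch under stated assumptions rather than a SAS-provided procedure: the function names uids_match, scc_bound_to_sa, and verify_opendistro_scc are hypothetical, the pod name sas-opendistro-default-0 is taken from the log excerpt and might differ in your deployment, and the jsonpath expression assumes that the SCC stores the UID at runAsUser.uid, as step 3 describes. The value in run-user-transformer.yaml still has to be compared by hand.

```shell
#!/bin/sh
# Sketch of the checks in steps 2-4. All function names are hypothetical.

# Succeeds when the UID reported by the pod is non-empty and equals the SCC UID.
uids_match() {
  [ -n "$1" ] && [ "$1" = "$2" ]
}

# Reads "oc adm policy who-can use scc sas-opendistro" output on stdin.
# Succeeds if the sas-opendistro service account of namespace $1 is listed.
scc_bound_to_sa() {
  grep -Eq "(^|[[:space:]])system:serviceaccount:$1:sas-opendistro([[:space:]]|\$)"
}

# Runs both checks against a live cluster (assumes oc is logged in).
verify_opendistro_scc() {
  ns=$1
  # UID that the OpenSearch container actually runs as
  pod_uid=$(oc -n "$ns" exec sas-opendistro-default-0 -- id -u)
  # UID configured under runAsUser in the applied SCC
  scc_uid=$(oc get scc sas-opendistro -o jsonpath='{.runAsUser.uid}')
  if uids_match "$pod_uid" "$scc_uid"; then
    echo "UID OK: $pod_uid"
  else
    echo "UID mismatch: pod=$pod_uid scc=$scc_uid"
  fi
  if oc adm policy who-can use scc sas-opendistro | scc_bound_to_sa "$ns"; then
    echo "SCC is bound to system:serviceaccount:$ns:sas-opendistro"
  else
    echo "SCC is NOT bound to system:serviceaccount:$ns:sas-opendistro"
  fi
}

# Usage, with "viya" as a placeholder namespace:
#   verify_opendistro_scc viya
```

The two small helpers take plain text and values as input, so they can be tried against saved command output before running anything on the cluster.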

Note: If you missed any of these steps, or if the UID that the id command returns in one of the sas-opendistro-* pods does not match the UID in the run-user-transformer.yaml and sas-opendistro-scc.yaml files, do the following:

  1. Remove the entire OpenSearch cluster.

  2. Complete all the required steps from the documentation again without missing anything.

  3. Redeploy so that the cluster is re-created correctly from scratch. If you need to remove the OpenSearch cluster, refer to the Remove an OpenSearch Cluster Resource documentation. (Be sure to use the version of the documentation that matches the version of your SAS Viya platform environment.)

Removing the OpenSearch cluster results in the loss of index-related data if anything has been indexed. However, you can restore that data by reindexing everything manually after you redeploy and the cluster is re-created. A newly deployed SAS Viya platform environment most likely has nothing indexed, so it should be safe to remove and re-create the cluster from scratch. If you do have data, SAS Technical Support recommends taking a full-system backup of the entire SAS Viya platform before you remove your OpenSearch cluster. See SAS® Viya® Platform: Backup and Restore for backup and restore documentation. (Be sure to use the version of the documentation that matches the version of your SAS Viya platform environment.)