Many pods are stuck in a "Pending" state, and an event identifies a "volume node affinity conflict"


SAS® Viya® 2020 or later is deployed on a Microsoft Azure Kubernetes Service (AKS) cluster. After you run the kubectl apply commands to deploy SAS Viya, nearly all pods of a given node pool remain in a "Pending" state and never proceed to the "Ready" state. Running the kubectl describe command on any of these pods shows the following warning:

Events:
  Type     Reason            Age                From               Message
  ----     ------            ----               ----               -------
  Warning  FailedScheduling  27s (x2 over 27s)  default-scheduler  0/3 nodes are available: 1 node(s) had volume node affinity conflict, 2 node(s) didn't match node selector.

The "volume node affinity conflict" error occurs when the persistent volume that the pod claims resides in a different availability zone from the node on which the pod is scheduled. The pod cannot be scheduled because it cannot attach a volume from another zone.

The issue is caused by mixing node pools that have an availability zone configuration with node pools that do not. When you create node pools, AKS automatically assigns labels to the nodes based on their configuration. If no availability zone is configured, the failure-domain.beta.kubernetes.io/zone label (topology.kubernetes.io/zone in newer Kubernetes versions) is set to the fault domain, such as 0. If a pod requests a persistent volume that resides in a nonzero availability zone, no node in that pool can satisfy the request, and the scheduler reports the volume node affinity conflict.
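To confirm that mismatched zone labels are the cause, you can compare the zone labels on the nodes with the node affinity recorded on the persistent volume. The following commands are a diagnostic sketch; the persistent volume name is a placeholder, and they must be run against your cluster:

```shell
# List nodes with their zone labels. Node pools created without availability
# zones show a bare fault-domain value such as "0"; zoned pools show values
# such as "eastus2-1". (On older clusters, substitute the label
# failure-domain.beta.kubernetes.io/zone.)
kubectl get nodes -L topology.kubernetes.io/zone

# Inspect the node affinity stamped on the persistent volume when it was
# provisioned; "pvc-example" is a placeholder volume name taken from
# "kubectl get pv".
kubectl describe pv pvc-example
```

If the "Node Affinity" section of the volume names a zone that no node in the pool carries, the pod that claims that volume cannot be scheduled there.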

To avoid this problem, it is highly recommended that you follow the guidance in the Microsoft documentation, Create an Azure Kubernetes Service (AKS) cluster that uses availability zones.
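As an illustration of that guidance, a cluster and its node pools can be created with a consistent zone configuration by using the Azure CLI --zones parameter. The resource group, cluster, and pool names below are placeholders; consult the Microsoft documentation for the full set of options:

```shell
# Create an AKS cluster whose default node pool spans three availability
# zones (names are placeholders).
az aks create \
  --resource-group myResourceGroup \
  --name myAKSCluster \
  --node-count 3 \
  --zones 1 2 3

# Give any additional node pool the same zone configuration so that all
# pools carry consistent zone labels.
az aks nodepool add \
  --resource-group myResourceGroup \
  --cluster-name myAKSCluster \
  --name mynodepool \
  --node-count 3 \
  --zones 1 2 3
```

Keeping every node pool either zoned or unzoned in the same way prevents persistent volumes from being pinned to a zone that another pool's nodes do not advertise.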

Alternatively, you can circumvent this issue by using either of the following methods: