RKE2 - CSI for FlashArray not working

Generated a DaemonSet spec using 1.23.6+rke2r2, cloud, Pure FlashArray via FC. (link)

The containerd socket (the k3s path) is correctly detected and mounted, as seen in the spec.
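For reference, the socket mount in such a spec typically looks something like the sketch below. The volume and mount names are illustrative, not copied from the actual generated spec; the `/run/k3s/...` path is the containerd socket location RKE2 inherits from k3s:

```yaml
# Illustrative fragment of the generated DaemonSet pod spec
# (volume/mount names are assumptions, not the generator's actual names)
volumes:
  - name: containerd-sock
    hostPath:
      path: /run/k3s/containerd/containerd.sock   # RKE2/k3s containerd socket on the host
containers:
  - name: portworx
    volumeMounts:
      - name: containerd-sock
        mountPath: /run/containerd/containerd.sock  # default containerd path inside the container
```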

Everything seems to deploy fine except the portworx container fails and logs this:

time="2022-09-01T19:41:07Z" level=info msg="Input arguments: /px-oci-mon -c px-cluster-44ffacd3-508c-4b9b-9d83-9dc9a55c0fdb -a -secret_type k8s -j auto -b --oem esse -x kubernetes"
time="2022-09-01T19:41:07Z" level=info msg="Updated arguments: /px-oci-mon -c px-cluster-44ffacd3-508c-4b9b-9d83-9dc9a55c0fdb -a -secret_type k8s -j auto -b -x kubernetes" install-opts=--upgrade
time="2022-09-01T19:41:07Z" level=info msg="OCI-Monitor computed version v2.11.2-g4faee841-dirty"
time="2022-09-01T19:41:07Z" level=info msg="REAPER: Starting ..."
time="2022-09-01T19:41:07Z" level=info msg="Service handler initialized via as DBus{type:dbus,svc:portworx.service,id:0xc000889a00}"
time="2022-09-01T19:41:07Z" level=info msg="Setting up container handler"
time="2022-09-01T19:41:07Z" level=info msg="> run-host: /bin/sh -c cat /etc/crictl.yaml || cat /var/vcap/store/crictl.yaml"
time="2022-09-01T19:41:07Z" level=warning msg="Could not retrieve my container ID  (will search by labels)" error="Container not found"
time="2022-09-01T19:41:07Z" level=info msg="Locating my container handler"
time="2022-09-01T19:41:07Z" level=info msg="> Attempt to use Docker as container handler failed, trying next..." error="/var/run/docker.sock not a socket-file"
time="2022-09-01T19:41:07Z" level=info msg="> Using ContainerD as container handler"
time="2022-09-01T19:41:07Z" level=info msg="Activating REST server"
time="2022-09-01T19:41:07Z" level=info msg="Activating node-watcher"
time="2022-09-01T19:41:07Z" level=error msg="Could not extract my container's configuration: could not load container : container \"\" in namespace \"k8s.io\": not found"

I should add, the cluster is operating normally and I can find the container:

root@node-1:/# export CRI_CONFIG_FILE=/var/lib/rancher/rke2/agent/etc/crictl.yaml
root@node-1:/# /var/lib/rancher/rke2/bin/crictl ps -a
CONTAINER           IMAGE               CREATED             STATE               NAME                            ATTEMPT             POD ID
d892498f255c2       35d7690fd305e       20 seconds ago      Exited              portworx                        12                  891b5be65501e
b75e1da0b87f4       cb03930a2bd42       36 minutes ago      Running             csi-node-driver-registrar       0                   891b5be65501e

Update: I attempted this with the 2.11 operator (which, oddly, uses different/newer images for stork and autopilot than the 2.11 DaemonSet) and I get the exact same error in the portworx container.

Can you share the output of kubectl get nodes -o wide?

Just wanted to confirm: if you are running Ubuntu 22.04 LTS, it's not supported yet, but a fix is planned for an upcoming release.

They are on 20.04:

NAME     STATUS   ROLES                              AGE   VERSION          OS-IMAGE             KERNEL-VERSION      CONTAINER-RUNTIME
node-1   Ready    control-plane,etcd,master,worker   20d   v1.23.6+rke2r2   Ubuntu 20.04.5 LTS   5.15.0-46-generic   containerd://1.5.11-k3s2
node-2   Ready    worker                             20d   v1.23.6+rke2r2   Ubuntu 20.04.5 LTS   5.15.0-46-generic   containerd://1.5.11-k3s2

Run a Kubernetes version lower than v1.23.0 and it should work. You may also want to make sure you are on a supported kernel version.

Hey, just a small update on this. The Kubernetes version was the problem. I'm on 1.22 now and it's working great.

As a recommendation, the spec generator in px-central should probably check for this. If it already parses the version to detect the +rke2/k3s suffix, then enforcing a maximum/minimum supported version should also be in place.
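A gate like that could be sketched roughly as follows. The version bounds are assumptions drawn from this thread (1.23 broken, 1.22 working), the function name is hypothetical, and the comparison relies on GNU `sort -V`:

```shell
#!/bin/sh
# Hypothetical version gate for a spec generator.
# MIN/MAX are assumptions based on this thread, not documented limits.
MIN="v1.18.0"
MAX="v1.22.99"

check_version() {
  raw="$1"
  # Distro detection from the version suffix (the generator reportedly
  # already does something like this for +rke2/k3s).
  case "$raw" in
    *+rke2*|*k3s*) distro="rke2/k3s" ;;
    *)             distro="generic" ;;
  esac
  base=$(printf '%s' "$raw" | cut -d+ -f1)   # strip "+rke2r2" etc.
  low=$(printf '%s\n%s\n' "$base" "$MIN" | sort -V | head -n1)
  high=$(printf '%s\n%s\n' "$base" "$MAX" | sort -V | tail -n1)
  if [ "$low" = "$MIN" ] && [ "$high" = "$MAX" ]; then
    echo "ok ($distro)"
  else
    echo "unsupported ($distro): need $MIN..$MAX"
  fi
}

check_version "v1.22.7+rke2r1"
check_version "v1.23.6+rke2r2"
```

Running this prints "ok (rke2/k3s)" for the 1.22 version and flags v1.23.6+rke2r2 as unsupported, which is exactly the check that would have caught the problem in this thread up front.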

Also, (at the time of writing) this page directly states support for 1.23 and 1.24! That should probably be corrected or tested better.