I do not have access to a standard cluster, so I am trying to use OpenShift CodeReady Containers (CRC), the follow-on to Minishift, which runs an OpenShift cluster in a VM. Specifically, it is a virt-manager KVM VM on Ubuntu (Pop!_OS) 20.04 LTS.
I am trying to recreate the steps in https://portworx.com/run-ha-sql-server-red-hat-openshift/ for an HA SQL Server. All goes well until I apply the StorageCluster YAML (oc apply -f px-spec.yaml), after which no Portworx pods start. The error event for the StorageCluster is:
Type Reason Age From Message
Warning FailedSync 25s (x7 over 6m25s) storagecluster-controller error connecting to GRPC server [172.25.98.51:9020]: Connection timed out
I am not sure what it wants, or whether this is even possible in CodeReady Containers, but I would appreciate any insight.
Of the two most common issues we’ve seen, the first is one you may already be hitting, judging from your log entry: TCP ports 9001-9022 (and UDP port 9002) must be reachable between all of the nodes running Portworx. The iptables firewall that CRC sets up likely does not open these ports, so the traffic is blocked. You’ll need to make sure CRC opens these ports between nodes. (As of OpenShift 4.3, the range begins at 17001 rather than 9001.)
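To see whether the ports are actually reachable, something like the following could be run from one node against another. It is only a rough sketch: it relies on bash's built-in /dev/tcp and the coreutils `timeout` command, the function names are made up for illustration, and it covers TCP only (not UDP 9002). A "timeout" result suggests packets are being dropped (as in the "Connection timed out" event), while "refused" means the host answered but nothing is listening on that port.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for probing a TCP port range on a peer node.

# Print "open", "timeout", or "refused" for one host/port pair.
check_port() {
  local host=$1 port=$2 rc=0
  # `timeout` exits with 124 when the connect attempt hangs (dropped packets).
  timeout 2 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null || rc=$?
  case $rc in
    0)   echo "open" ;;
    124) echo "timeout" ;;
    *)   echo "refused" ;;   # connection refused or another connect error
  esac
}

# Probe every port from first to last (inclusive), one result per line.
check_range() {
  local host=$1 first=$2 last=$3 port
  for ((port = first; port <= last; port++)); do
    printf '%s/tcp %s\n' "$port" "$(check_port "$host" "$port")"
  done
}

# Example, using the address from the event above:
#   check_range 172.25.98.51 9001 9022
# On OpenShift 4.3 and later, try the range starting at 17001 instead.
```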
The second issue to watch for is that Portworx requires unused block devices on each Portworx node in order to come up properly in normal mode (otherwise it starts as a storageless node). These devices let it form the StorageCluster, which provides the storage overlay services it is designed for (we also set up a separate dedicated etcd cluster for kvdb purposes, which won’t include storageless nodes). I don’t know enough about CRC yet to say whether it is configured with such block devices or, if not, what needs to be done to add them, but I can look into it more if needed.
For now, please make sure the minimum required resources are available, per our documentation available here. Once these requirements are met, you’d typically go to the OpenShift web interface, go to Operators, install the Portworx operator, and get a StorageCluster spec from install.portworx.com.
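A quick way to check for candidate devices inside the CRC VM is to filter `lsblk` output. The sketch below is a heuristic of my own, not Portworx's actual detection logic: it treats a whole disk as "unused" when it has no filesystem signature, no mountpoint, and no partitions or other child devices. The function name is hypothetical.

```shell
#!/usr/bin/env bash
# Reads `lsblk -P -o NAME,TYPE,FSTYPE,MOUNTPOINT,PKNAME` output on stdin
# and prints whole disks that look unused. The column order must match.
filter_unused_disks() {
  awk -F'"' '
    { name = $2; type = $4; fstype = $6; mount = $8; parent = $10 }
    # Remember every device that has a partition or other child.
    parent != "" { has_child[parent] = 1 }
    # Whole disks with no filesystem signature and no mountpoint are candidates.
    type == "disk" && fstype == "" && mount == "" { cand[++n] = name }
    END {
      for (i = 1; i <= n; i++)
        if (!(cand[i] in has_child)) print "/dev/" cand[i]
    }
  '
}

# Example, from a shell on the node (for instance after
# `oc debug node/<node>` and `chroot /host`):
#   lsblk -P -o NAME,TYPE,FSTYPE,MOUNTPOINT,PKNAME | filter_unused_disks
```

If this prints nothing, the VM probably has no spare disk, and one would need to be attached (for example through virt-manager) before Portworx can start in normal mode.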
I suspect the CRC node, being a master as well, has a taint that prevents pods from being scheduled on it. Also, by default, if you do not give any placement constraints, the Portworx operator will exclude master/infra nodes. That’s because Portworx generally doesn’t run on master nodes, as no apps run there.
Can you paste the output of the following:
spec.taints from the openshift node
oc get storagecluster -oyaml
oc get pods -n <storagecluster_namespace>
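For convenience, the three items above could be gathered in one go with something like this. It is only a sketch: the function names are made up, and the namespace argument is a placeholder for wherever the StorageCluster was created. The jsonpath expression prints each node name followed by its spec.taints (empty if the node has none).

```shell
#!/usr/bin/env bash
# Hypothetical helpers that collect the requested diagnostics with `oc`.

# Print each node name and its taints, tab-separated.
show_node_taints() {
  oc get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.taints}{"\n"}{end}'
}

# Collect taints, the StorageCluster object, and the pods in one namespace.
collect_diagnostics() {
  local ns=$1
  echo "== node taints =="
  show_node_taints
  echo "== storagecluster =="
  oc get storagecluster -n "$ns" -o yaml
  echo "== pods =="
  oc get pods -n "$ns" -o wide
}

# Example:
#   collect_diagnostics <storagecluster_namespace> > px-diagnostics.txt
```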
If the CRC node has a taint, you can add a toleration in the StorageCluster spec (spec.placement.tolerations) to tolerate the master taint. Otherwise, changing the StorageCluster placement (spec.placement.nodeAffinity) should work too.
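As a rough illustration of the toleration route, a merge patch along these lines might work. It assumes the taint on the node is the standard `node-role.kubernetes.io/master` key with effect `NoSchedule`, so check the actual spec.taints output first and adjust the key and effect to match. The function name is hypothetical, and the StorageCluster name and namespace are placeholders.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints a JSON merge patch that adds a toleration
# for the (assumed) master taint under spec.placement.tolerations.
toleration_patch() {
  cat <<'EOF'
{"spec":{"placement":{"tolerations":[{"key":"node-role.kubernetes.io/master","operator":"Exists","effect":"NoSchedule"}]}}}
EOF
}

# Example (fill in your own name and namespace):
#   oc patch storagecluster <storagecluster_name> \
#     -n <storagecluster_namespace> --type merge -p "$(toleration_patch)"
```

Note that a merge patch replaces the whole tolerations list, so any tolerations already present in the spec would need to be included in the patch as well.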