Readiness probe failed

  1. I created a block storage volume on IBM Cloud.
  2. I attached the storage to a worker node IP.
  3. I installed Portworx from IBM Cloud.
  4. Two worker nodes do not reach the Running state because of the following error.

Events:
Type     Reason     Age                From                     Message
Normal   Scheduled                     default-scheduler        Successfully assigned kube-system/portworx-vc6np to 10.221.167.174
Normal   Pulling    46s                kubelet, 10.221.167.174  Pulling image "portworx/oci-monitor:2.5.2"
Normal   Pulled     45s                kubelet, 10.221.167.174  Successfully pulled image "portworx/oci-monitor:2.5.2"
Normal   Created    45s                kubelet, 10.221.167.174  Created container portworx
Normal   Started    45s                kubelet, 10.221.167.174  Started container portworx
Warning  Unhealthy  5s (x4 over 35s)   kubelet, 10.221.167.174  Readiness probe failed: HTTP probe failed with statuscode: 503

I am facing the same issue with OpenShift 4.4 in AWS.
Were you able to get around this problem?

Can you paste the Portworx pod logs from the kube-system namespace here?

Vinayak,

Here are the logs. I assume that by Portworx logs you mean the portworx-api pods.

autopilot-d494f7f4f-tztct 1/1 Running 0 6h13m
portworx-api-m5r2q 1/1 Running 0 43s
portworx-api-pg9sw 1/1 Running 0 43s
portworx-api-rgm47 1/1 Running 0 34s
px-cluster-e3b0849c-d25d-4b26-9e54-013cd3ab0811-867gf 1/2 Running 0 8m3s
px-cluster-e3b0849c-d25d-4b26-9e54-013cd3ab0811-l722q 2/2 Running 0 8m3s
px-cluster-e3b0849c-d25d-4b26-9e54-013cd3ab0811-nc7mc 1/2 Running 0 8m3s
px-csi-ext-8467cd4bb6-2sg6w 3/3 Running 0 6h13m
px-csi-ext-8467cd4bb6-4xhmn 3/3 Running 0 6h13m
px-csi-ext-8467cd4bb6-r67nz 3/3 Running 3 6h13m
stork-9f8c45d44-fvwfc 1/1 Running 0 6h13m
stork-9f8c45d44-hg6r4 1/1 Running 0 6h13m
stork-9f8c45d44-ss5sj 1/1 Running 0 6h13m
stork-scheduler-8689987c6f-57qfw 1/1 Running 0 6h13m
stork-scheduler-8689987c6f-6chdw 1/1 Running 0 6h13m
stork-scheduler-8689987c6f-rvhnr 1/1 Running 0 6h13m
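
In the listing above, two of the px-cluster pods show 1/2 in the READY column. As a rough sketch, a small filter like this could pick out such pods from a listing. The function name `not_ready_pods` is made up for illustration, and it assumes the usual NAME READY STATUS RESTARTS AGE column order.

```shell
# Print the name and READY value of pods whose ready count is below the
# expected container count (for example 1/2).
# Expects `oc get pods --no-headers` style lines on stdin:
#   NAME READY STATUS RESTARTS AGE
not_ready_pods() {
  awk '{ split($2, r, "/"); if (r[1] != r[2]) print $1, $2 }'
}

# Usage against a live cluster (uses the current project):
#   oc get pods --no-headers | not_ready_pods
```

Pods that appear in the output are the ones whose logs are worth checking first.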

[root@upstreamcontroller portwork]# oc logs portworx-api-pg9sw
[root@upstreamcontroller portwork]# oc logs portworx-api-rgm47
[root@upstreamcontroller portwork]# oc logs portworx-api-m5r2q
time="2020-07-29T07:56:53Z" level=warning msg="Timed out while waiting for StartTransientUnit(crio-416d3b9eb933cd6086af74017d39fa24474fa09f502c75acd170277518d90c02.scope) completion signal from dbus. Continuing..."

I am working along with koch, as this is an important PoC for one of our clients.

thanks

I meant the logs from the px-cluster-e3b0849c-d25d-4b26-9e54-013cd3ab0811-867gf pod, not from the API pods.
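
A minimal sketch of how those logs could be collected, assuming the `oc` client is logged in to the cluster. The helper name `px_pod_logs` is hypothetical, and the optional namespace argument is an assumption; without it the current project is used, as in the `oc logs` commands earlier in the thread.

```shell
# Fetch the last 200 log lines from every container in one pod.
# $1 = pod name
# $2 = namespace (optional; when omitted, the current project is used)
px_pod_logs() {
  oc logs "$1" ${2:+--namespace "$2"} --all-containers=true --tail=200
}

# Usage:
#   px_pod_logs px-cluster-e3b0849c-d25d-4b26-9e54-013cd3ab0811-867gf
```

Because the pod shows 1/2 ready, `--all-containers=true` is used so that the output of both containers is captured.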

The issue was related to the node labels for KVDB: only one node was labeled. We added the px/metadata-node=true label to the other nodes, and the KVDB cluster formed. We also did a complete wipe, because the ports had initially not been open on the firewall.

You don’t need to label nodes if your cluster size is 3. Labeling is helpful if you have a larger cluster and want to dedicate specific nodes to KVDB.
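
For the larger-cluster case, labeling the dedicated KVDB nodes could look roughly like this. It is only a sketch: the helper name `label_kvdb_nodes` and the node names in the usage comment are placeholders, not values from this cluster.

```shell
# Apply the KVDB (metadata) node label to each node passed as an argument.
# --overwrite keeps the command from failing if the label is already set.
label_kvdb_nodes() {
  for node in "$@"; do
    oc label node "$node" px/metadata-node=true --overwrite
  done
}

# Usage (node names are placeholders):
#   label_kvdb_nodes worker-1 worker-2 worker-3
#   oc get nodes -l px/metadata-node=true   # list the labeled nodes
```

Listing the nodes by label afterwards is a quick way to confirm that enough nodes carry it for the KVDB cluster to form.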

To summarize:

  1. Ports 17001-17020 were opened on AWS.
  2. The node.openshift.io/os_id=rhcos,px label was added to only one of the worker nodes, so the cluster could not form a quorum.

The issue was solved quickly by the Portworx team. Thanks, Sanjay, for the quick help here.
