Two storage nodes in my dev cluster are showing as offline (storage down). Please let me know how I can bring them back up.
Portworx (PX) is installed on a single-zone OpenShift cluster (v4.3).
Below are log snippets from these nodes:
1st node log
@kube-bqg6g7nd0lnhi47g0f90-gen2nonprod-large32-00000801.iks.ibm portworx[884771]: time="2020-09-24T18:52:03Z" level=info msg="starting pool expansion watcher…"
@kube-bqg6g7nd0lnhi47g0f90-gen2nonprod-large32-00000801.iks.ibm portworx[884771]: time="2020-09-24T18:52:05Z" level=info msg="Starting REST service on socket : /run/docker/plugins/pxd.sock"
@kube-bqg6g7nd0lnhi47g0f90-gen2nonprod-large32-00000801.iks.ibm portworx[884771]: time="2020-09-24T18:52:05Z" level=info msg="Starting REST service on socket : /var/lib/osd/driver/pxd.sock"
@kube-bqg6g7nd0lnhi47g0f90-gen2nonprod-large32-00000801.iks.ibm portworx[884771]: time="2020-09-24T18:52:05Z" level=info msg="PX is ready on Node: 35277163-6148-4064-9892-f2dd1d6adcfe. CLI accessible at /opt/pwx/bin/pxctl."
@kube-bqg6g7nd0lnhi47g0f90-gen2nonprod-large32-00000801.iks.ibm portworx[884771]: time="2020-09-24T18:52:05Z" level=info msg="Detected node e9314076-0a58-4148-8482-8411b95d58d2 to be in the cluster."
@kube-bqg6g7nd0lnhi47g0f90-gen2nonprod-large32-00000801.iks.ibm portworx[884771]: time="2020-09-24T18:52:05Z" level=info msg="Detected node 10c7773c-a27e-455b-9b14-e08980c7ba80 to be in the cluster."
@kube-bqg6g7nd0lnhi47g0f90-gen2nonprod-large32-00000801.iks.ibm portworx[884771]: time="2020-09-24T18:52:05Z" level=info msg="Detected node a81e5f9e-d202-4a14-8d08-90d03dd0a8bd to be in the cluster."
@kube-bqg6g7nd0lnhi47g0f90-gen2nonprod-large32-00000801.iks.ibm portworx[884771]: time="2020-09-24T18:52:05Z" level=info msg="Detected node 8b10210c-25ef-4726-ae9b-5c9b458fc4e0 to be in the cluster."
@kube-bqg6g7nd0lnhi47g0f90-gen2nonprod-large32-00000801.iks.ibm portworx[884771]: time="2020-09-24T18:52:05Z" level=info msg="Detected node 0f2f74c4-b189-426f-b1e0-ccc3512c1556 to be in the cluster."
@kube-bqg6g7nd0lnhi47g0f90-gen2nonprod-large32-00000801.iks.ibm portworx[884771]: time="2020-09-24T18:52:05Z" level=info msg="Detected node 1790a87e-d2f1-4f25-babd-d878934f56ff to be in the cluster."
@kube-bqg6g7nd0lnhi47g0f90-gen2nonprod-large32-00000801.iks.ibm portworx[884771]: time="2020-09-24T18:52:05Z" level=info msg="Detected node 2fa52c6d-d81e-4be5-be4a-1aed0c8c0717 to be in the cluster."
@kube-bqg6g7nd0lnhi47g0f90-gen2nonprod-large32-00000801.iks.ibm portworx[884771]: time="2020-09-24T18:52:05Z" level=info msg="Detected node f6ab2e39-53e0-4bd4-b8a3-468c9566b46a to be in the cluster."
@kube-bqg6g7nd0lnhi47g0f90-gen2nonprod-large32-00000801.iks.ibm portworx[884771]: time="2020-09-24T18:52:05Z" level=warning msg="Unable to list containers. Some containers using shared volumes may need to be restarted manually: Docker not yet initialized"
@kube-bqg6g7nd0lnhi47g0f90-gen2nonprod-large32-00000801.iks.ibm portworx[884771]: time="2020-09-24T18:52:05Z" level=warning msg="Unable to list containers. Some containers using shared volumes may need to be restarted manually: Docker not yet initialized"
@kube-bqg6g7nd0lnhi47g0f90-gen2nonprod-large32-00000801.iks.ibm portworx[884771]: time="2020-09-24T18:52:07Z" level=warning msg="503 Node status not OK (STATUS_STORAGE_DOWN)" Driver="Cluster API" ID=nodeHealth Request="Cluster API"
@kube-bqg6g7nd0lnhi47g0f90-gen2nonprod-large32-00000801.iks.ibm portworx[884771]: time="2020-09-24T18:52:17Z" level=warning msg="503 Node status not OK (STATUS_STORAGE_DOWN)" Driver="Cluster API" ID=nodeHealth Request="Cluster API"
@kube-bqg6g7nd0lnhi47g0f90-gen2nonprod-large32-00000801.iks.ibm portworx[884771]: time="2020-09-24T18:52:21Z" level=info msg="Task to add k8s node watch is canceled"
@kube-bqg6g7nd0lnhi47g0f90-gen2nonprod-large32-00000801.iks.ibm portworx[884771]: time="2020-09-24T18:52:27Z" level=warning msg="503 Node status not OK (STATUS_STORAGE_DOWN)" Driver="Cluster API" ID=nodeHealth Request="Cluster API"
time="2020-09-24T18:52:37Z" level=warning msg="Could not retrieve PX node status" error="Node status not OK (STATUS_STORAGE_DOWN)\n"
@kube-bqg6g7nd0lnhi47g0f90-gen2nonprod-large32-00000801.iks.ibm portworx[884771]: time="2020-09-24T18:52:37Z" level=warning msg="503 Node status not OK (STATUS_STORAGE_DOWN)" Driver="Cluster API" ID=nodeHealth Request="Cluster API"
time="2020-09-24T18:52:47Z" level=warning msg="Could not retrieve PX node status" error="Node status not OK (STATUS_STORAGE_DOWN)\n"
@kube-bqg6g7nd0lnhi47g0f90-gen2nonprod-large32-00000801.iks.ibm portworx[884771]: time="2020-09-24T18:52:47Z" level=warning msg="503 Node status not OK (STATUS_STORAGE_DOWN)" Driver="Cluster API" ID=nodeHealth Request="Cluster API"
2nd node log
time="2020-09-24T14:09:34-05:00" level=info msg="/etc/systemd/system/portworx.socket content unchanged [e80e04204b7a7d113db36c53f420d635 /etc/systemd/system/portworx.socket]"
time="2020-09-24T14:09:34-05:00" level=info msg="/etc/systemd/system/portworx-output.service content unchanged [7340d8be39a32f3b7d296ac8275bc2e1 /etc/systemd/system/portworx-output.service]"
time="2020-09-24T14:09:34-05:00" level=info msg="/etc/systemd/system/portworx.service content unchanged [4c6515ccf9c8cb3f81745795a981723c /etc/systemd/system/portworx.service]"
time="2020-09-24T19:09:35Z" level=info msg="runC spec unchanged"
time="2020-09-24T19:09:35Z" level=info msg="Portworx service restart not required."
time="2020-09-24T19:09:35Z" level=info msg="Activating node-watcher"
time="2020-09-24T19:09:35Z" level=info msg="Portworx service is ACTIVE"
time="2020-09-24T19:09:35Z" level=info msg="REST: Changing install-state: ST_INSTALL -> ST_FINISH"
time="2020-09-24T19:09:35Z" level=info msg="Start tailing portworx.service logs"
time="2020-09-24T19:09:36Z" level=info msg="> Starting local log-tailer"
time="2020-09-24T19:09:36Z" level=info msg="> run-local: /px-log-tail --follow -P @ -tf -u portworx.service -u portworx-output.service -u init.scope -n 20000 -p 65789"
time="2020-09-24T19:09:36Z" level=info msg="Install done - MAIN exiting"
time="2020-09-24T19:09:36Z" level=info msg="-- Flushing logs for PID 65789 [20000 lines] --"
time="2020-09-24T19:09:36Z" level=info msg="-- Start tailing the logs for portworx.service, portworx-output.service, init.scope --"
@kube-bqg6g7nd0lnhi47g0f90-gen2nonprod-default-00000db8.iks.ibm portworx[65780]: time="2020-09-24T19:09:20Z" level=warning msg="503 Node status not OK (STATUS_STORAGE_DOWN)" Driver="Cluster API" ID=nodeHealth Request="Cluster API"
@kube-bqg6g7nd0lnhi47g0f90-gen2nonprod-default-00000db8.iks.ibm portworx[65780]: time="2020-09-24T19:09:45Z" level=warning msg="503 Node status not OK (STATUS_STORAGE_DOWN)" Driver="Cluster API" ID=nodeHealth Request="Cluster API"
@kube-bqg6g7nd0lnhi47g0f90-gen2nonprod-default-00000db8.iks.ibm portworx[65780]: time="2020-09-24T19:09:52Z" level=warning msg="503 Node status not OK (STATUS_STORAGE_DOWN)" Driver="Cluster API" ID=nodeHealth Request="Cluster API"
@kube-bqg6g7nd0lnhi47g0f90-gen2nonprod-default-00000db8.iks.ibm portworx[65780]: time="2020-09-24T19:09:55Z" level=warning msg="503 Node status not OK (STATUS_STORAGE_DOWN)" Driver="Cluster API" ID=nodeHealth Request="Cluster API"
@kube-bqg6g7nd0lnhi47g0f90-gen2nonprod-default-00000db8.iks.ibm portworx[65780]: time="2020-09-24T19:10:05Z" level=warning msg="503 Node status not OK (STATUS_STORAGE_DOWN)" Driver="Cluster API" ID=nodeHealth Request="Cluster API"
@kube-bqg6g7nd0lnhi47g0f90-gen2nonprod-default-00000db8.iks.ibm portworx[65780]: time="2020-09-24T19:10:15Z" level=warning msg="503 Node status not OK (STATUS_STORAGE_DOWN)" Driver="Cluster API" ID=nodeHealth Request="Cluster API"
@kube-bqg6g7nd0lnhi47g0f90-gen2nonprod-default-00000db8.iks.ibm portworx[65780]: time="2020-09-24T19:10:22Z" level=warning msg="503 Node status not OK (STATUS_STORAGE_DOWN)" Driver="Cluster API" ID=nodeHealth Request="Cluster API"
@kube-bqg6g7nd0lnhi47g0f90-gen2nonprod-default-00000db8.iks.ibm portworx[65780]: time="2020-09-24T19:10:25Z" level=warning msg="503 Node status not OK (STATUS_STORAGE_DOWN)" Driver="Cluster API" ID=nodeHealth Request="Cluster API"
@kube-bqg6g7nd0lnhi47g0f90-gen2nonprod-default-00000db8.iks.ibm portworx[65780]: time="2020-09-24T19:10:35Z" level=warning msg="503 Node status not OK (STATUS_STORAGE_DOWN)" Driver="Cluster API" ID=nodeHealth Request="Cluster API"
time="2020-09-24T19:10:45Z" level=warning msg="Could not retrieve PX node status" error="Node status not OK (STATUS_STORAGE_DOWN)\n"
@kube-bqg6g7nd0lnhi47g0f90-gen2nonprod-default-00000db8.iks.ibm portworx[65780]: time="2020-09-24T19:10:45Z" level=warning msg="503 Node status not OK (STATUS_STORAGE_DOWN)" Driver="Cluster API" ID=nodeHealth Request="Cluster API"
time="2020-09-24T19:10:55Z" level=warning msg="Could not retrieve PX node status" error="Node status not OK (STATUS_STORAGE_DOWN)\n"
@kube-bqg6g7nd0lnhi47g0f90-gen2nonprod-default-00000db8.iks.ibm portworx[65780]: time="2020-09-24T19:10:55Z" level=warning msg="503 Node status not OK (STATUS_STORAGE_DOWN)" Driver="Cluster API" ID=nodeHealth Request="Cluster API"
@kube-bqg6g7nd0lnhi47g0f90-gen2nonprod-default-00000db8.iks.ibm portworx[65780]: time="2020-09-24T19:10:58Z" level=error msg="Failed to create the keys directory (/var/.px/0/.metadata/kvdb_backup): mkdir /var/.px/0/.metadata: input/output error"
@kube-bqg6g7nd0lnhi47g0f90-gen2nonprod-default-00000db8.iks.ibm portworx[65780]: time="2020-09-24T19:10:58Z" level=error msg="Failed to cleanup old kvdb dumps: open /var/.px/0/.metadata/kvdb_backup: input/output error"
time="2020-09-24T19:11:05Z" level=warning msg="Could not retrieve PX node status" error="Node status not OK (STATUS_STORAGE_DOWN)\n"
@kube-bqg6g7nd0lnhi47g0f90-gen2nonprod-default-00000db8.iks.ibm portworx[65780]: time="2020-09-24T19:11:05Z" level=warning msg="503 Node status not OK (STATUS_STORAGE_DOWN)" Driver="Cluster API" ID=nodeHealth Request="Cluster API"
time="2020-09-24T19:11:15Z" level=warning msg="Could not retrieve PX node status" error="Node status not OK (STATUS_STORAGE_DOWN)\n"
@kube-bqg6g7nd0lnhi47g0f90-gen2nonprod-default-00000db8.iks.ibm portworx[65780]: time="2020-09-24T19:11:15Z" level=warning msg="503 Node status not OK (STATUS_STORAGE_DOWN)" Driver="Cluster API" ID=nodeHealth Request="Cluster API"
time="2020-09-24T19:11:25Z" level=warning msg="Could not retrieve PX node status" error="Node status not OK (STATUS_STORAGE_DOWN)\n"
@kube-bqg6g7nd0lnhi47g0f90-gen2nonprod-default-00000db8.iks.ibm portworx[65780]: time="2020-09-24T19:11:25Z" level=warning msg="503 Node status not OK (STATUS_STORAGE_DOWN)" Driver="Cluster API" ID=nodeHealth Request="Cluster API"
time="2020-09-24T19:11:35Z" level=warning msg="Could not retrieve PX node status" error="Node status not OK (STATUS_STORAGE_DOWN)\n"
@kube-bqg6g7nd0lnhi47g0f90-gen2nonprod-default-00000db8.iks.ibm portworx[65780]: time="2020-09-24T19:11:35Z" level=warning msg="503 Node status not OK (STATUS_STORAGE_DOWN)" Driver="Cluster API" ID=nodeHealth Request="Cluster API"
time="2020-09-24T19:11:45Z" level=warning msg="Could not retrieve PX node status" error="Node status not OK (STATUS_STORAGE_DOWN)\n"
@kube-bqg6g7nd0lnhi47g0f90-gen2nonprod-default-00000db8.iks.ibm portworx[65780]: time="2020-09-24T19:11:45Z" level=warning msg="503 Node status not OK (STATUS_STORAGE_DOWN)" Driver="Cluster API" ID=nodeHealth Request="Cluster API"
@kube-bqg6g7nd0lnhi47g0f90-gen2nonprod-default-00000db8.iks.ibm portworx[65780]: time="2020-09-24T19:11:55Z" level=warning msg="503 Node status not OK (STATUS_STORAGE_DOWN)" Driver="Cluster API" ID=nodeHealth Request="Cluster API"
time="2020-09-24T19:11:55Z" level=warning msg="Could not retrieve PX node status" error="Node status not OK (STATUS_STORAGE_DOWN)\n"
time="2020-09-24T19:12:05Z" level=warning msg="Could not retrieve PX node status" error="Node status not OK (STATUS_STORAGE_DOWN)\n"
@kube-bqg6g7nd0lnhi47g0f90-gen2nonprod-default-00000db8.iks.ibm portworx[65780]: time="2020-09-24T19:12:05Z" level=warning msg="503 Node status not OK (STATUS_STORAGE_DOWN)" Driver="Cluster API" ID=nodeHealth Request="Cluster API"
time="2020-09-24T19:12:15Z" level=warning msg="Could not retrieve PX node status" error="Node status not OK (STATUS_STORAGE_DOWN)\n"
@kube-bqg6g7nd0lnhi47g0f90-gen2nonprod-default-00000db8.iks.ibm portworx[65780]: time="2020-09-24T19:12:15Z" level=warning msg="503 Node status not OK (STATUS_STORAGE_DOWN)" Driver="Cluster API" ID=nodeHealth Request="Cluster API"