I have been at this for a while and can’t get Portworx Essentials installed on IBM Cloud ROKS (OpenShift 4.6).
Here is my spec:
# SOURCE: https://install.portworx.com/?operator=true&mc=false&kbver=1.19.0%2B263ee0d&oem=esse&user=0eeca59c-c7ab-11ea-a2c5-c24e499c7467&b=true&s=%2Fdev%2Fdm-1&m=eth0&d=eth0&c=px-cluster-6604ffbb-4120-4f25-8fbc-b4d803a96530&osft=true&stork=true&st=k8s&rsec=px-essential
kind: StorageCluster
apiVersion: core.libopenstorage.org/v1
metadata:
  name: px-cluster-6604ffbb-4120-4f25-8fbc-b4d803a96530
  namespace: kube-system
  annotations:
    portworx.io/install-source: "https://install.portworx.com/?operator=true&mc=false&kbver=1.19.0%2B263ee0d&oem=esse&user=0eeca59c-c7ab-11ea-a2c5-c24e499c7467&b=true&s=%2Fdev%2Fdm-1&m=eth0&d=eth0&c=px-cluster-6604ffbb-4120-4f25-8fbc-b4d803a96530&osft=true&stork=true&st=k8s&rsec=px-essential"
    portworx.io/is-openshift: "true"
    portworx.io/misc-args: "--oem esse"
spec:
  image: portworx/oci-monitor:2.7.0
  imagePullPolicy: Always
  imagePullSecret: px-essential
  kvdb:
    internal: true
  storage:
    devices:
    - /dev/dm-1
  network:
    dataInterface: eth0
    mgmtInterface: eth0
  secretsProvider: k8s
  stork:
    enabled: true
    args:
      webhook-controller: "false"
  autopilot:
    enabled: true
The current issue is that the px-cluster-* pods are running but not ready. Here is the event on one of the pods:
Warning Unhealthy 14s (x114 over 19m) kubelet Readiness probe failed: HTTP probe failed with statuscode: 503
Here are the last 50 lines of the pod log:
@kube-c2cose7w02fpblcs6i10-roksaiops05-default-00000378.iks.ibm portworx[283510]: time="2021-05-12T21:08:57Z" level=info msg="Made 1 pools"
@kube-c2cose7w02fpblcs6i10-roksaiops05-default-00000378.iks.ibm portworx[283510]: time="2021-05-12T21:08:57Z" level=info msg="Benchmarking drive /dev/sda"
time="2021-05-12T21:09:01Z" level=warning msg="Could not retrieve PX node status" error="Get http://127.0.0.1:17001/v1/cluster/nodehealth: dial tcp 127.0.0.1:17001: connect: connection refused"
@kube-c2cose7w02fpblcs6i10-roksaiops05-default-00000378.iks.ibm portworx[283510]: time="2021-05-12T21:09:07Z" level=info msg="fio: test: (g=0): rw=randread, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=128\nfio-2.2.10\nStarting 1 process\n\ntest: (groupid=0, jobs=1): err= 0: pid=1877: Wed May 12 21:09:07 2021\n read : io=129528KB, bw=12900KB/s, iops=3224, runt= 10041msec\n slat (usec): min=0, max=2647, avg=11.00, stdev=24.61\n clat (msec): min=1, max=379, avg=39.66, stdev=22.42\n lat (msec): min=1, max=379, avg=39.67, stdev=22.42\n clat percentiles (msec):\n | 1.00th=[ 4], 5.00th=[ 11], 10.00th=[ 19], 20.00th=[ 21],\n | 30.00th=[ 30], 40.00th=[ 31], 50.00th=[ 40], 60.00th=[ 41],\n | 70.00th=[ 50], 80.00th=[ 60], 90.00th=[ 61],95.00th=[ 70],\n | 99.00th=[ 110], 99.50th=[ 159], 99.90th=[ 221], 99.95th=[ 269],\n | 99.99th=[ 330]\n bw (KB /s): min=11402, max=24120, per=99.68%, avg=12858.35, stdev=2660.10\n lat (msec) : 2=0.07%, 4=1.46%, 10=3.46%, 20=12.86%, 50=52.58%\n lat (msec) : 100=28.47%, 250=1.04%, 500=0.06%\n cpu : usr=2.07%, sys=5.48%, ctx=4011, majf=0, minf=166\n IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=99.8%\n submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%\n complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.1%\n issued : total=r=32382/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0\n latency : target=0, window=0, percentile=100.00%, depth=128\n\nRun status group 0(all jobs):\n READ: io=129528KB, aggrb=12899KB/s, minb=12899KB/s, maxb=12899KB/s, mint=10041msec, maxt=10041msec\n\nDisk stats (read/write):\n sda: ios=31946/0, merge=32/0, ticks=1257442/0, in_queue=1260582, util=99.07%\n"
@kube-c2cose7w02fpblcs6i10-roksaiops05-default-00000378.iks.ibm portworx[283510]: time="2021-05-12T21:09:07Z" level=info msg="Storage pool WriteThroughput 12MB/s"
time="2021-05-12T21:09:11Z" level=warning msg="Could not retrieve PX node status" error="Get http://127.0.0.1:17001/v1/cluster/nodehealth: dial tcp 127.0.0.1:17001: connect: connection refused"
time="2021-05-12T21:09:21Z" level=warning msg="Could not retrieve PX node status" error="Get http://127.0.0.1:17001/v1/cluster/nodehealth: dial tcp 127.0.0.1:17001: connect: connection refused"
@kube-c2cose7w02fpblcs6i10-roksaiops05-default-00000378.iks.ibm portworx[283510]: time="2021-05-12T21:09:22Z" level=error msg="Unable to start internal kvdb on this node" err="failed in initializing drives on this node: Failed to format [-f --nodiscard /dev/sda]: ERROR: unable to open /dev/sda: Device or resource busy" fn=kvdb-provisioner.ProvisionKvdbWithoutLock id=5081f13f-81c2-4082-b5f5-f8b7b17a4776
@kube-c2cose7w02fpblcs6i10-roksaiops05-default-00000378.iks.ibm portworx[283510]: time="2021-05-12T21:09:22Z" level=error msg="failed to setup internal kvdb:failed to provision internal kvdb: failed in initializing drives on this node: Failed to format [-f --nodiscard /dev/sda]: ERROR: unable to open /dev/sda: Device or resource busy" func=InitAndBoot package=boot
@kube-c2cose7w02fpblcs6i10-roksaiops05-default-00000378.iks.ibm portworx[283510]: time="2021-05-12T21:09:22Z" level=error msg="Could not init boot manager" error="failed to setup internal kvdb: failed to provision internal kvdb: failed in initializing drives on this node: Failed to format [-f --nodiscard /dev/sda]: ERROR: unable to open /dev/sda: Device or resource busy"
@kube-c2cose7w02fpblcs6i10-roksaiops05-default-00000378.iks.ibm portworx[283510]: PXPROCS[INFO]: px daemon exited with code: 1
@kube-c2cose7w02fpblcs6i10-roksaiops05-default-00000378.iks.ibm portworx[283510]: 2021-05-12 21:09:23,628 INFO exited: pxdaemon (exit status 1; not expected)
@kube-c2cose7w02fpblcs6i10-roksaiops05-default-00000378.iks.ibm portworx[283510]: 2021-05-12 21:09:23,630 INFO spawned: 'pxdaemon' with pid 1888
@kube-c2cose7w02fpblcs6i10-roksaiops05-default-00000378.iks.ibm portworx[283510]: PX_STORAGE_IO_FLUSHER=yes
@kube-c2cose7w02fpblcs6i10-roksaiops05-default-00000378.iks.ibm portworx[283510]: Starting as an IOFlusher process : /usr/local/bin/start_pxcontroller_pxstorage.py
@kube-c2cose7w02fpblcs6i10-roksaiops05-default-00000378.iks.ibm portworx[283510]: Process with PID 1888, is a IO Flusher
@kube-c2cose7w02fpblcs6i10-roksaiops05-default-00000378.iks.ibm portworx[283510]: 2021-05-12 21:09:23,654 INFO reaped unknown pid 1768
@kube-c2cose7w02fpblcs6i10-roksaiops05-default-00000378.iks.ibm portworx[283510]: PXPROCS[INFO]: Started px-storage with pid 1922
@kube-c2cose7w02fpblcs6i10-roksaiops05-default-00000378.iks.ibm portworx[283510]: bash: connect: Connection refused
@kube-c2cose7w02fpblcs6i10-roksaiops05-default-00000378.iks.ibm portworx[283510]: bash: /dev/tcp/localhost/17006: Connection refused
@kube-c2cose7w02fpblcs6i10-roksaiops05-default-00000378.iks.ibm portworx[283510]: PXPROCS[INFO]: px-storage not started yet...sleeping
@kube-c2cose7w02fpblcs6i10-roksaiops05-default-00000378.iks.ibm portworx[283510]: 2021-05-12 21:09:25,215 INFO reaped unknown pid 1863
@kube-c2cose7w02fpblcs6i10-roksaiops05-default-00000378.iks.ibm portworx[283510]: PXPROCS[INFO]: Started px with pid 1939
@kube-c2cose7w02fpblcs6i10-roksaiops05-default-00000378.iks.ibm portworx[283510]: PXPROCS[INFO]: Started watchdog with pid 1940
@kube-c2cose7w02fpblcs6i10-roksaiops05-default-00000378.iks.ibm portworx[283510]: 2021-05-12_21:09:26: PX-Watchdog: Starting watcher
@kube-c2cose7w02fpblcs6i10-roksaiops05-default-00000378.iks.ibm portworx[283510]: 2021-05-12_21:09:26: PX-Watchdog: Waiting for px process to start
@kube-c2cose7w02fpblcs6i10-roksaiops05-default-00000378.iks.ibm portworx[283510]: 2021-05-12_21:09:26: PX-Watchdog: (pid 1939): Begin monitoring
@kube-c2cose7w02fpblcs6i10-roksaiops05-default-00000378.iks.ibm portworx[283510]: time="2021-05-12T21:09:27Z" level=info msg="Registering [kernel] as a volume driver"
@kube-c2cose7w02fpblcs6i10-roksaiops05-default-00000378.iks.ibm portworx[283510]: time="2021-05-12T21:09:27Z" level=info msg="Registered the Usage based Metering Agent...."
@kube-c2cose7w02fpblcs6i10-roksaiops05-default-00000378.iks.ibm portworx[283510]: time="2021-05-12T21:09:27Z" level=info msg="Setting log level to info(4)"
@kube-c2cose7w02fpblcs6i10-roksaiops05-default-00000378.iks.ibm portworx[283510]: time="2021-05-12T21:09:27Z" level=error msg="Cannot listen on UNIX socket: listen unix /run/docker/plugins/pxd.sock: bind: no such file or directory"
@kube-c2cose7w02fpblcs6i10-roksaiops05-default-00000378.iks.ibm portworx[283510]: time="2021-05-12T21:09:27Z" level=warning msg="Failed to start pxd-dummy: failed to listen on pxd.sock, ingnoring and continuing..."
@kube-c2cose7w02fpblcs6i10-roksaiops05-default-00000378.iks.ibm portworx[283510]: time="2021-05-12T21:09:27Z" level=info msg="read config from env var" func=init package=boot
@kube-c2cose7w02fpblcs6i10-roksaiops05-default-00000378.iks.ibm portworx[283510]: time="2021-05-12T21:09:27Z" level=info msg="read config from config.json" func=init package=boot
@kube-c2cose7w02fpblcs6i10-roksaiops05-default-00000378.iks.ibm portworx[283510]: time="2021-05-12T21:09:27Z" level=info msg="Alerts initialized successfullyfor this cluster"
@kube-c2cose7w02fpblcs6i10-roksaiops05-default-00000378.iks.ibm portworx[283510]: time="2021-05-12T21:09:27Z" level=info msg="Node is not yet initialized" func=setNodeInfo package=boot
@kube-c2cose7w02fpblcs6i10-roksaiops05-default-00000378.iks.ibm portworx[283510]: time="2021-05-12T21:09:27Z" level=info msg="Generated a new NodeID: 244996ec-254d-4389-b6aa-7abf1aad486c"
@kube-c2cose7w02fpblcs6i10-roksaiops05-default-00000378.iks.ibm portworx[283510]: time="2021-05-12T21:09:27Z" level=info msg="Found mgmt interface device:[eth0]"
@kube-c2cose7w02fpblcs6i10-roksaiops05-default-00000378.iks.ibm portworx[283510]: time="2021-05-12T21:09:27Z" level=info msg="Found data interface device:[eth0]"
@kube-c2cose7w02fpblcs6i10-roksaiops05-default-00000378.iks.ibm portworx[283510]: time="2021-05-12T21:09:28Z" level=info msg="Using interface device:[eth0] for management..."
@kube-c2cose7w02fpblcs6i10-roksaiops05-default-00000378.iks.ibm portworx[283510]: time="2021-05-12T21:09:28Z" level=info msg="Using interface device:[eth0] for data..."
@kube-c2cose7w02fpblcs6i10-roksaiops05-default-00000378.iks.ibm portworx[283510]: time="2021-05-12T21:09:28Z" level=info msg="Detected Machine Hardware Type as: xen (Virtual Machine)"
@kube-c2cose7w02fpblcs6i10-roksaiops05-default-00000378.iks.ibm portworx[283510]: time="2021-05-12T21:09:28Z" level=info msg="Bootstraping internal kvdb service." fn=kv-store.New id=244996ec-254d-4389-b6aa-7abf1aad486c
@kube-c2cose7w02fpblcs6i10-roksaiops05-default-00000378.iks.ibm portworx[283510]: 2021-05-12 21:09:29,109 INFO success: pxdaemon entered RUNNING state, process has stayed up for > than 5 seconds (startsecs)
time="2021-05-12T21:09:31Z" level=warning msg="Could not retrieve PX node status" error="Get http://127.0.0.1:17001/v1/cluster/nodehealth: dial tcp 127.0.0.1:17001: connect: connection refused"
time="2021-05-12T21:09:41Z" level=warning msg="Could not retrieve PX node status" error="Get http://127.0.0.1:17001/v1/cluster/nodehealth: dial tcp 127.0.0.1:17001: connect: connection refused"
@kube-c2cose7w02fpblcs6i10-roksaiops05-default-00000378.iks.ibm portworx[283510]: time="2021-05-12T21:09:45Z" level=warning msg="Locked for 15 seconds" Error="ConfigMap is locked" Function=Lock Module=ConfigMap Name=px-bootstrap-pxstoragecluster Owner=6f388ee6-818a-4301-8eb6-be34bdba973b
time="2021-05-12T21:09:51Z" level=warning msg="Could not retrieve PX node status" error="Get http://127.0.0.1:17001/v1/cluster/nodehealth: dial tcp 127.0.0.1:17001: connect: connection refused"
@kube-c2cose7w02fpblcs6i10-roksaiops05-default-00000378.iks.ibm portworx[283510]: time="2021-05-12T21:10:01Z" level=warning msg="Locked for 30 seconds" Error="ConfigMap is locked" Function=Lock Module=ConfigMap Name=px-bootstrap-pxstoragecluster Owner=5fc3109e-e137-40d3-8f42-4ca76bf98acf
time="2021-05-12T21:10:01Z" level=warning msg="Could not retrieve PX node status" error="Get http://127.0.0.1:17001/v1/cluster/nodehealth: dial tcp 127.0.0.1:17001: connect: connection refused"
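To make the failures easier to spot in a log like the one above, here is a minimal Python sketch that keeps only the level=error entries and prints their time, message, and cause. It assumes the key=value layout seen in these lines (quoted or bare values, with the cause under either err= or error=). The names parse_fields and error_lines are made up for this illustration and are not part of any Portworx tooling.

```python
import re

# One key=value pair; the value is either a double-quoted string
# (backslash escapes allowed) or a run of non-space characters.
PAIR = re.compile(r'(\w+)=("(?:[^"\\]|\\.)*"|\S+)')


def parse_fields(line):
    """Return the key=value fields of one log line as a dict, quotes stripped."""
    return {key: value.strip('"') for key, value in PAIR.findall(line)}


def error_lines(lines):
    """Yield (time, msg, cause) for every line logged at level=error."""
    for line in lines:
        fields = parse_fields(line)
        if fields.get("level") == "error":
            # The log above uses both err= and error= for the cause.
            cause = fields.get("err") or fields.get("error") or ""
            yield fields.get("time", ""), fields.get("msg", ""), cause


# Two lines copied from the log above (node-name prefix dropped).
sample = [
    'portworx[283510]: time="2021-05-12T21:08:57Z" level=info msg="Made 1 pools"',
    'portworx[283510]: time="2021-05-12T21:09:22Z" level=error '
    'msg="Could not init boot manager" '
    'error="failed to setup internal kvdb: failed to provision internal kvdb: '
    'failed in initializing drives on this node: '
    'Failed to format [-f --nodiscard /dev/sda]: '
    'ERROR: unable to open /dev/sda: Device or resource busy"',
]

for when, msg, cause in error_lines(sample):
    print(when, msg, cause, sep=" | ")
```

With the pod log saved to a file, the same function can be fed the open file object instead of the sample list. On the lines above it surfaces the repeated "unable to open /dev/sda: Device or resource busy" failures.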