Portworx pod stuck in CrashLoopBackOff

Hi,
I am trying to install Portworx Essentials/Enterprise on a Kubernetes cluster with a single worker node, and the Portworx pod is stuck in CrashLoopBackOff with the following error in the log:

Could not install portworx/px-essentials:2.6.3: Could not start container 43e67c09e84b2206396d056ce960a54b76783a04f4d0e9ca568aa83b0ea93789 [portworx/px-essentials:2.6.3]: Error response from daemon: OCI runtime create failed: container_linux.go:345: starting container process caused "process_linux.go:281: applying cgroup configuration for process caused "failed to write 0-55\n to cpuset.cpus: write /sys/fs/cgroup/cpuset/docker/cpuset.cpus: invalid argument"": unknown

Any idea what might be wrong here?
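
One thing that may be worth checking on the node: as far as I know, the kernel answers a write to cpuset.cpus with "invalid argument" when the list cannot be parsed or names CPUs that the top-level cpuset does not currently have (offline CPUs, for example). Below is a rough sketch that compares the requested range 0-55 from the error against what the host reports. It assumes cgroup v1 mounted at /sys/fs/cgroup/cpuset, as in the error message; the helper names expand_cpulist and missing_cpus are made up for illustration.

```shell
#!/bin/sh
# Expand a kernel CPU list such as "0-3,8" into one CPU number per line.
expand_cpulist() {
    echo "$1" | tr ',' '\n' | while IFS=- read -r lo hi; do
        [ -z "$lo" ] && continue
        # A single CPU ("8") has no upper bound, so reuse the lower one.
        seq "$lo" "${hi:-$lo}"
    done
}

# Print the CPUs that are in list $1 but not in list $2.
# Empty output means $1 is a subset of $2.
missing_cpus() {
    for cpu in $(expand_cpulist "$1"); do
        expand_cpulist "$2" | grep -qx "$cpu" || echo "$cpu"
    done
}

# The range Docker tried to write, taken from the error message.
wanted="0-55"

# Files that exist on a cgroup v1 host; unreadable ones are skipped.
for f in /sys/devices/system/cpu/online \
         /sys/fs/cgroup/cpuset/cpuset.cpus \
         /sys/fs/cgroup/cpuset/cpuset.effective_cpus; do
    [ -r "$f" ] || continue
    have=$(cat "$f")
    echo "$f = $have"
    echo "  CPUs in $wanted that are not listed there: $(missing_cpus "$wanted" "$have" | xargs)"
done
```

If any CPU numbers show up on the second line for one of the files, the 0-55 range is wider than what that file allows, which would be a lead to follow rather than a confirmed cause.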

You may want to try editing your DaemonSet (the PX spec YAML), changing

  - name: proc1nsmount
    mountPath: /host_proc/1/ns

  - name: proc1nsmount
    hostPath:
      path: /proc/1/ns

to

  - name: hostprocmount
    mountPath: /host_proc

  - name: hostprocmount
    hostPath:
      path: /proc

I am using Portworx 2.6 and already have the bottom one in the spec YAML file. I tried changing to proc1nsmount for both the mount and the volume definition and got the same exception.

Can you post your complete pod log and Portworx install spec file?

kubectl logs -n kube-system -l name=portworx --tail=99999

kubectl logs -n kube-system -l name=portworx --tail=99999 -c portworx
time=“2021-03-26T14:19:39Z” level=info msg=“Input arguments: /px-oci-mon -c px-cluster-085a9cc8-6622-4fbf-94a7-33d76b0a2870 -d eno1 -m eno1 -a -secret_type k8s -b --oem esse -x kubernetes”
time=“2021-03-26T14:19:39Z” level=info msg=“Updated arguments: /px-oci-mon -c px-cluster-085a9cc8-6622-4fbf-94a7-33d76b0a2870 -d eno1 -m eno1 -a -secret_type k8s -b -x kubernetes”
time=“2021-03-26T14:19:40Z” level=info msg=“OCI-Monitor computed version v2.6.3-ge63b7d82-dirty”
time=“2021-03-26T14:19:40Z” level=info msg=“REAPER: Starting …”
time=“2021-03-26T14:19:40Z” level=info msg=“Service handler initialized via as DBus{type:dbus,svc:portworx.service,id:0xc000202d40}”
time=“2021-03-26T14:19:40Z” level=info msg=“Activating REST server”
time=“2021-03-26T14:19:40Z” level=info msg=“Locating my container handler”
time=“2021-03-26T14:19:40Z” level=info msg=“Negotiated Docker API version: 1.32”
time=“2021-03-26T14:19:40Z” level=info msg=“Detected NetworkMode container:ef80d53c2ae71fc1d0964b906fcd3eed8fcab38bdcd200c7758d078ec58a5d00 → polling ef80d53c2ae7 for network settings”
time=“2021-03-26T14:19:40Z” level=info msg="> Using Docker as container handler"
time=“2021-03-26T14:19:40Z” level=info msg=“Detected NetworkMode container:ef80d53c2ae71fc1d0964b906fcd3eed8fcab38bdcd200c7758d078ec58a5d00 → polling ef80d53c2ae7 for network settings”
time=“2021-03-26T14:19:40Z” level=info msg=“Detected HostNetwork setting - will track portworx status via REST”
time=“2021-03-26T14:19:40Z” level=info msg=“Parsed Registry/Repo portworx from own image URN portworx/oci-monitor@sha256:1fac588c0702e7076ed897b9758d80df896da1eb1e864674b9bc54cd00ff7e5a”
time=“2021-03-26T14:19:40Z” level=info msg=“Detected custom registry server/repo – installing PX-OCI from portworx/px-essentials:2.6.3”
time=“2021-03-26T14:19:40Z” level=info msg=“Removed env variables: [AUTOPILOT_PORT AUTOPILOT_PORT_9628_TCP AUTOPILOT_PORT_9628_TCP_ADDR AUTOPILOT_PORT_9628_TCP_PORT AUTOPILOT_PORT_9628_TCP_PROTO AUTOPILOT_SERVICE_HOST AUTOPILOT_SERVICE_PORT AUTOPILOT_SERVICE_PORT_AUTOPILOT COREDNS_PORT COREDNS_PORT_53_TCP COREDNS_PORT_53_TCP_ADDR COREDNS_PORT_53_TCP_PORT COREDNS_PORT_53_TCP_PROTO COREDNS_PORT_53_UDP COREDNS_PORT_53_UDP_ADDR COREDNS_PORT_53_UDP_PORT COREDNS_PORT_53_UDP_PROTO COREDNS_PORT_9153_TCP COREDNS_PORT_9153_TCP_ADDR COREDNS_PORT_9153_TCP_PORT COREDNS_PORT_9153_TCP_PROTO COREDNS_SERVICE_HOST COREDNS_SERVICE_PORT COREDNS_SERVICE_PORT_DNS COREDNS_SERVICE_PORT_DNS_TCP COREDNS_SERVICE_PORT_METRICS METRICS_SERVER_PORT METRICS_SERVER_PORT_443_TCP METRICS_SERVER_PORT_443_TCP_ADDR METRICS_SERVER_PORT_443_TCP_PORT METRICS_SERVER_PORT_443_TCP_PROTO METRICS_SERVER_SERVICE_HOST METRICS_SERVER_SERVICE_PORT PATH PORTWORX_API_PORT PORTWORX_API_PORT_9001_TCP PORTWORX_API_PORT_9001_TCP_ADDR PORTWORX_API_PORT_9001_TCP_PORT PORTWORX_API_PORT_9001_TCP_PROTO PORTWORX_API_PORT_9020_TCP PORTWORX_API_PORT_9020_TCP_ADDR PORTWORX_API_PORT_9020_TCP_PORT PORTWORX_API_PORT_9020_TCP_PROTO PORTWORX_API_PORT_9021_TCP PORTWORX_API_PORT_9021_TCP_ADDR PORTWORX_API_PORT_9021_TCP_PORT PORTWORX_API_PORT_9021_TCP_PROTO PORTWORX_API_SERVICE_HOST PORTWORX_API_SERVICE_PORT PORTWORX_API_SERVICE_PORT_PX_API PORTWORX_API_SERVICE_PORT_PX_REST_GATEWAY PORTWORX_API_SERVICE_PORT_PX_SDK PORTWORX_SERVICE_PORT PORTWORX_SERVICE_PORT_9001_TCP PORTWORX_SERVICE_PORT_9001_TCP_ADDR PORTWORX_SERVICE_PORT_9001_TCP_PORT PORTWORX_SERVICE_PORT_9001_TCP_PROTO PORTWORX_SERVICE_PORT_9019_TCP PORTWORX_SERVICE_PORT_9019_TCP_ADDR PORTWORX_SERVICE_PORT_9019_TCP_PORT PORTWORX_SERVICE_PORT_9019_TCP_PROTO PORTWORX_SERVICE_PORT_9020_TCP PORTWORX_SERVICE_PORT_9020_TCP_ADDR PORTWORX_SERVICE_PORT_9020_TCP_PORT PORTWORX_SERVICE_PORT_9020_TCP_PROTO PORTWORX_SERVICE_PORT_9021_TCP PORTWORX_SERVICE_PORT_9021_TCP_ADDR 
PORTWORX_SERVICE_PORT_9021_TCP_PORT PORTWORX_SERVICE_PORT_9021_TCP_PROTO PORTWORX_SERVICE_SERVICE_HOST PORTWORX_SERVICE_SERVICE_PORT PORTWORX_SERVICE_SERVICE_PORT_PX_API PORTWORX_SERVICE_SERVICE_PORT_PX_KVDB PORTWORX_SERVICE_SERVICE_PORT_PX_REST_GATEWAY PORTWORX_SERVICE_SERVICE_PORT_PX_SDK PX_LIGHTHOUSE_PORT PX_LIGHTHOUSE_PORT_443_TCP PX_LIGHTHOUSE_PORT_443_TCP_ADDR PX_LIGHTHOUSE_PORT_443_TCP_PORT PX_LIGHTHOUSE_PORT_443_TCP_PROTO PX_LIGHTHOUSE_PORT_80_TCP PX_LIGHTHOUSE_PORT_80_TCP_ADDR PX_LIGHTHOUSE_PORT_80_TCP_PORT PX_LIGHTHOUSE_PORT_80_TCP_PROTO PX_LIGHTHOUSE_SERVICE_HOST PX_LIGHTHOUSE_SERVICE_PORT PX_LIGHTHOUSE_SERVICE_PORT_HTTP PX_LIGHTHOUSE_SERVICE_PORT_HTTPS STORK_SERVICE_PORT STORK_SERVICE_PORT_443_TCP STORK_SERVICE_PORT_443_TCP_ADDR STORK_SERVICE_PORT_443_TCP_PORT STORK_SERVICE_PORT_443_TCP_PROTO STORK_SERVICE_PORT_8099_TCP STORK_SERVICE_PORT_8099_TCP_ADDR STORK_SERVICE_PORT_8099_TCP_PORT STORK_SERVICE_PORT_8099_TCP_PROTO STORK_SERVICE_SERVICE_HOST STORK_SERVICE_SERVICE_PORT STORK_SERVICE_SERVICE_PORT_EXTENDER STORK_SERVICE_SERVICE_PORT_WEBHOOK TILLER_DEPLOY_PORT TILLER_DEPLOY_PORT_44134_TCP TILLER_DEPLOY_PORT_44134_TCP_ADDR TILLER_DEPLOY_PORT_44134_TCP_PORT TILLER_DEPLOY_PORT_44134_TCP_PROTO TILLER_DEPLOY_SERVICE_HOST TILLER_DEPLOY_SERVICE_PORT TILLER_DEPLOY_SERVICE_PORT_TILLER]”
time=“2021-03-26T14:19:40Z” level=info msg=“Preparing to download Portworx image…”
time=“2021-03-26T14:19:40Z” level=info msg=“REST: Changing install-state: ST_UNKNOWN → ST_INSTALL”
time=“2021-03-26T14:19:40Z” level=info msg=“Detected initial install”
time=“2021-03-26T14:19:40Z” level=info msg=“Detected imagePullPolicy Always”
time=“2021-03-26T14:19:40Z” level=info msg=“Attempting to retrieve latest portworx/px-essentials:2.6.3 image (pullPolicy Always)”
time=“2021-03-26T14:19:40Z” level=info msg=“Using anonymous Docker credentials”
time=“2021-03-26T14:19:40Z” level=info msg=“Skipping digest checks”
{“status”:“Pulling from portworx/px-essentials”,“id”:“2.6.3”}
{“status”:“Digest: sha256:fd278d0c0ec9c297dfd1ef1476d00fb5da1d74262b39b93e1db39369ce73de40”}
{“status”:“Status: Image is up to date for portworx/px-essentials:2.6.3”}
time=“2021-03-26T14:19:41Z” level=info msg=“Got requested Portworx image portworx/px-essentials:2.6.3 with digest sha256:fd278d0c0ec9c297dfd1ef1476d00fb5da1d74262b39b93e1db39369ce73de40”
time=“2021-03-26T14:19:41Z” level=info msg=“Installing/Upgrading Portworx OCI files (restart pending)”
time=“2021-03-26T14:19:41Z” level=info msg=“Cleaning up /opt/pwx/oci/inst-staging directory (if any)”
time=“2021-03-26T14:19:41Z” level=info msg=“Removing old container px-oci-installer (if any)”
time=“2021-03-26T14:19:41Z” level=info msg=“Creating container from image portworx/px-essentials:2.6.3”
time=“2021-03-26T14:19:41Z” level=info msg=“Starting container 46a47877d0769554a0f505bd5ec477316ade4d8e79e1e8dd4ebafcd31c48a055 [portworx/px-essentials:2.6.3]”
time=“2021-03-26T14:19:41Z” level=error msg=“Could not install portworx/px-essentials:2.6.3” error=“Could not start container 46a47877d0769554a0f505bd5ec477316ade4d8e79e1e8dd4ebafcd31c48a055 [portworx/px-essentials:2.6.3]: Error response from daemon: OCI runtime create failed: container_linux.go:345: starting container process caused “process_linux.go:281: applying cgroup configuration for process caused \“failed to write 0-55\\n to cpuset.cpus: write /sys/fs/cgroup/cpuset/docker/cpuset.cpus: invalid argument\””: unknown”
time=“2021-03-26T14:19:41Z” level=error msg=“Could not install portworx/px-essentials:2.6.3: Could not start container 46a47877d0769554a0f505bd5ec477316ade4d8e79e1e8dd4ebafcd31c48a055 [portworx/px-essentials:2.6.3]: Error response from daemon: OCI runtime create failed: container_linux.go:345: starting container process caused “process_linux.go:281: applying cgroup configuration for process caused \“failed to write 0-55\\n to cpuset.cpus: write /sys/fs/cgroup/cpuset/docker/cpuset.cpus: invalid argument\””: unknown”

Usage: /px-oci-mon [options]

options:
--endpoint ip:port Start REST service at specific endpoint
--dev Deploy PX-Developer rather than PX-Enterprise build
--sync Will issue sync operation before stopping/restarting the PX-OCI service
--drain-all Will drain ALL PX-dependent pods before upgrade (dfl. only managed nodes get drained)
--log Will use logfile instead of Docker-log
--ignore-preexec-fail Will ignore failed Pre/Post -exec scriplets
--disable-service-log Disables following portworx-service log
--max-log-lines <#> Specify number of log-lines when flushing portworx service logs
--cnthandler Force a container handler ( is docker or containerd)
--svcandler Force a service handler ( is dbus, dbusp, nssysd or nsups)
--debug Increase logs-verbosity to debug-level

  •                  Any additional options will be passed on to px-runc
    

NOTE that any options not explicitly listed above, will be passed directly to px-runc.
For details please see http://docs.portworx.com/runc

This is the link to the spec file:
https://install.portworx.com/?mc=false&kbver=1.17.0&oem=esse&user=db12953f-81e4-11eb-a2c5-c24e499c7467&b=true&m=eno2&d=eno2&c=px-cluster-085a9cc8-6622-4fbf-94a7-33d76b0a2868&stork=true&csi=true&lh=true&st=k8s
Below is the diff between the original spec file and my changes:

diff px-orig.yaml px-essential.yaml 
360c360
<             ["-c", "px-cluster-085a9cc8-6622-4fbf-94a7-33d76b0a2868", "-d", "eno2", "-m", "eno2", "-a", "-secret_type", "k8s", "-b", "--oem", "esse", 
---
>             ["-c", "px-cluster-085a9cc8-6622-4fbf-94a7-33d76b0a2870", "-d", "eno1", "-m", "eno1", "-a", "-secret_type", "k8s", "-b", "--oem", "esse", 
399c399
<             - name: procmount
---
>             - name: hostprocmount
400a401,402
>             - name: proc1nsmount
>               mountPath: /host_proc/1/ns
464c466
<         - name: procmount
---
>         - name: hostprocmount
466a469,471
>         - name: proc1nsmount
>           hostPath:
>             path: /proc/1/ns
568c573
<   replicas: 3
---
>   replicas: 1
1290c1295
<   replicas: 3
---
>   replicas: 1
1461c1466
<   replicas: 3
---
>   replicas: 1

FYI, based on the logs, it looks like you are hitting

I don't think these two are the same issue. What I hit is a failure writing to cpuset.cpus, while the link is about setting quotas. Anyway, I updated the kernel to version 5.4.108-0504108-generic, which is supposed to contain the bug fix, but I still get the same exception.

What is the Cgroup Driver setting in your Docker service definition? The link references having it set to systemd and needing it set to cgroupfs.

Just a thought.
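
In case it helps, here is a small sketch for reading the driver that the running Docker daemon actually reports, rather than relying on the service file. It assumes the usual "Cgroup Driver:" line in the text output of docker info; the function name cgroup_driver_from_info is made up for illustration.

```shell
#!/bin/sh
# Extract the "Cgroup Driver" value from `docker info` text read on stdin.
cgroup_driver_from_info() {
    sed -n 's/^[[:space:]]*Cgroup Driver:[[:space:]]*//p'
}

# Only query the daemon when the docker CLI is available on this host.
if command -v docker >/dev/null 2>&1; then
    docker info 2>/dev/null | cgroup_driver_from_info
fi
```

The same value should also be available directly with docker info --format '{{.CgroupDriver}}'. Whatever it prints is worth comparing with the cgroup driver the kubelet is configured to use, since the two are expected to match.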

There was no cgroup driver setting in the Docker service. I have changed it to use cgroupfs:

   CGroup: /system.slice/docker.service
       └─16373 /usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock --exec-opt native.cgroupdriver=cgroupfs

I still got the same exception.