OpenShift on Virtualized vSphere Failure

We have OpenShift installed on top of vSphere, and we were hoping to use Portworx to provision some block storage.

OpenShift 4.7
vSphere: 6.7.0.44000

Following the OpenShift Console install instructions, I always get stuck on the storage cluster initializing.

A few things I noticed. The YAML created while following the first step doesn’t use OpenShift’s default namespace (openshift-operators) for making the operator available to all projects, and instead tries to use kube-system. Additionally, when adding the YAML in step 3, you cannot use the generated YAML in its entirety. Not a huge deal, but it doesn’t work as written.

I tried creating the operator in the kube-system namespace, but the storage nodes are still stuck on initializing. Ultimately, the system events show an error: “Could not talk to Docker/Containerd/CRI: Could not initialize container handler - please ensure ‘/var/run/docker.sock’, ‘/run/containerd/containerd.sock’ or ‘/var/run/crio/crio.sock’ are mounted.”
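For what it’s worth, one way to check which runtime socket actually exists on a worker is an oc debug session (the node name is one of mine; the shell steps are a sketch):

$ oc debug node/dev-64jpw-worker-jrrsd
sh-4.4# chroot /host
sh-4.4# ls -l /var/run/docker.sock /run/containerd/containerd.sock /var/run/crio/crio.sock

OpenShift 4.x uses CRI-O as its container runtime, so only /var/run/crio/crio.sock should show up.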

Thanks for your feedback. We will evaluate it and correct the docs accordingly. Assuming you have already taken care of all the prerequisites per the documentation, can you please provide the details below so we can help further?

  1. As per this docs page, you should select the kube-system namespace, because other Portworx-related objects are created under this namespace for Portworx to install correctly.
  2. Can you please post the generated spec YAML that is not working for you?
  3. Let’s get the complete pod logs using the command below so we can see clearly what’s going on:
    kubectl logs -n kube-system -l name=portworx --tail=99999

I uninstalled the operator and started it again so I could grab some screenshots.

  1. The default namespace when installing the operator from OperatorHub isn’t kube-system.
    This is easily changed, but you can probably fix it so kube-system is the default. (A quick way to verify where the operator landed is just below.)
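A quick sanity check on where the operator actually landed, using the standard OLM resources (sub, og, and csv are the short names for Subscription, OperatorGroup, and ClusterServiceVersion; just a sketch):

$ oc get sub,og,csv -n kube-system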

Sorry, I’m going to try to break this up because the system thinks I’m posting links.

  2. Here’s a scraped version of the YAML the website created:
kind: StorageCluster
apiVersion: core.libopenstorage.org/v1
metadata:
  name: px-cluster
  namespace: kube-system
  annotations:
    portworx.io/install-source: "<redacted>"
    portworx.io/is-openshift: "true"
    portworx.io/misc-args: "--oem esse"
spec:
  image: portworx/oci-monitor:2.5.7
  imagePullPolicy: Always
  kvdb:
    internal: true
  cloudStorage:
    deviceSpecs:
    - type=thin,size=500
  secretsProvider: k8s
  stork:
    enabled: true
  userInterface:
    enabled: true
  autopilot:
    enabled: true
    providers:
    - name: default
      type: prometheus
      params:
        url: <redacted>
  monitoring:
    prometheus:
      exportMetrics: true
  featureGates:
    CSI: "true"
  env:
  - name: VSPHERE_INSECURE
    value: "true"
  - name: VSPHERE_USER
    valueFrom:
      secretKeyRef:
        name: px-vsphere-secret
        key: VSPHERE_USER
  - name: VSPHERE_PASSWORD
    valueFrom:
      secretKeyRef:
        name: px-vsphere-secret
        key: VSPHERE_PASSWORD
  - name: VSPHERE_VCENTER
    value: "<redacted>"
  - name: VSPHERE_VCENTER_PORT
    value: "443"
  - name: VSPHERE_DATASTORE_PREFIX
    value: "<redacted>"
  - name: VSPHERE_INSTALL_MODE
    value: "shared"
---
apiVersion: v1
kind: Secret
metadata:
  name: px-essential
  namespace: kube-system
data:
  px-essen-user-id: <redacted>
  px-osb-endpoint: <redacted>

As I mentioned, you have to remove the last few lines to drop the Secret object from the YAML, or you get an error.
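One more note on the spec above: it references a px-vsphere-secret that has to exist in kube-system before the cluster will come up. A minimal sketch of creating it (the values are placeholders, obviously not my real credentials):

$ oc create secret generic px-vsphere-secret -n kube-system \
    --from-literal=VSPHERE_USER='<vcenter-user>' \
    --from-literal=VSPHERE_PASSWORD='<vcenter-password>'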

  3. The log collection command you gave doesn’t work with OpenShift. :frowning:
$ kubectl logs -n kube-system -l name=portworx --tail=99999
error: expected 'logs (POD | TYPE/NAME) [CONTAINER_NAME]'.
POD or TYPE/NAME is a required argument for the logs command
See 'kubectl logs -h' for help and examples.

$ oc logs -n kube-system -l name=portworx --tail=99999
error: You must provide one or more resources by argument or filename.
Example resource specifications include:
   '-f rsrc.yaml'
   '--filename=rsrc.json'
   '<resource> <name>'
   '<resource>'

Here are some details on the pods that got spun up with the generated YAML:

$  oc get pods -o wide -n kube-system -l name=portworx
NAME               READY     STATUS             RESTARTS   AGE       IP               NODE                     NOMINATED NODE   READINESS GATES
px-cluster-f6dzw   1/2       CrashLoopBackOff   2          32s       192.168.15.134   dev-64jpw-worker-jrrsd   <none>           <none>
px-cluster-gp8nr   1/2       Error              2          32s       192.168.15.18    dev-64jpw-worker-bxcxd   <none>           <none>
px-cluster-z5vv6   1/2       CrashLoopBackOff   2          32s       192.168.15.136   dev-64jpw-worker-bc4md   <none>           <none>
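What did work for pulling logs was naming a pod directly. Each pod runs two containers, so a container has to be specified; I’m assuming the main one is named portworx:

$ oc logs -n kube-system px-cluster-gp8nr -c portworx --tail=99999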

Here are the logs from px-cluster-gp8nr:

time="2021-04-06T04:29:45Z" level=info msg="Input arguments: /px-oci-mon -c px-cluster -x kubernetes -b -s type=thin,size=500 -secret_type k8s -r 17001 --oem esse -marketplace_name OperatorHub"
time="2021-04-06T04:29:45Z" level=info msg="Updated arguments: /px-oci-mon -c px-cluster -x kubernetes -b -s type=thin,size=500 -secret_type k8s -r 17001 -marketplace_name OperatorHub"
time="2021-04-06T04:29:45Z" level=info msg="OCI-Monitor computed version v2.5.7-g88a03c4f-dirty"
time="2021-04-06T04:29:45Z" level=info msg="Service handler initialized via as DBus{type:dbus,svc:portworx.service,id:0xc00052fa20}"
time="2021-04-06T04:29:45Z" level=info msg="Activating REST server"
time="2021-04-06T04:29:45Z" level=info msg="Locating my container handler"
time="2021-04-06T04:29:45Z" level=info msg="> Attempt to use Docker as container handler failed" error="/var/run/docker.sock not a socket-file"
time="2021-04-06T04:29:45Z" level=info msg="> Attempt to use ContainerD as container handler failed" error="stat /run/containerd/containerd.sock: no such file or directory"
time="2021-04-06T04:29:45Z" level=info msg="> Attempt to use k8s-CRI as container handler failed" error="CRI-Socket path not specified"
time="2021-04-06T04:29:45Z" level=error msg="Could not instantiate container client" error="Could not initialize container handler"
time="2021-04-06T04:29:45Z" level=error msg="Could not talk to Docker/Containerd/CRI: Could not initialize container handler - please ensure '/var/run/docker.sock', '/run/containerd/containerd.sock' or '/var/run/crio/crio.sock' are mounted."

Usage: /px-oci-mon [options]

options:
   --endpoint <ip:port>   Start REST service at specific endpoint
   --dev                  Deploy PX-Developer rather than PX-Enterprise build
   --sync                 Will issue sync operation before stopping/restarting the PX-OCI service
   --drain-all            Will drain ALL PX-dependent pods before upgrade (dfl. only managed nodes get drained)
   --log <file>           Will use logfile instead of Docker-log
   --ignore-preexec-fail  Will ignore failed Pre/Post -exec scriplets
   --disable-service-log  Disables following portworx-service log
   --max-log-lines <#>    Specify number of log-lines when flushing portworx service logs
   --cnthandler <typ>     Force a container handler (<typ> is docker or containerd)
   --svcandler <typ>      Force a service handler (<typ> is dbus, dbusp, nssysd or nsups)
   --debug                Increase logs-verbosity to debug-level
   *                      Any additional options will be passed on to px-runc

NOTE that any options not explicitly listed above, will be passed directly to px-runc.
For details please see  <redacted because link limit>

It turned out that when I had generated the YAML, it called for an older version of the OCI monitor, which needed to be 2.6 for compatibility. It looks like the webpage now defaults to that version, so it’s unlikely others will hit this issue.
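For anyone who lands here with an older generated spec, the fix amounts to bumping the image line in the StorageCluster YAML; 2.6.0 is my guess at the exact tag, so use whatever 2.6.x the spec generator currently emits:

spec:
  image: portworx/oci-monitor:2.6.0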