Trying to install Portworx Essentials 2.5 on an OCP 4.5 cluster on AWS. It is failing with the following error:
$ oc exec $PX_POD -n kube-system -- /opt/pwx/bin/pxctl status
Defaulting container name to portworx.
Use 'oc describe pod/portworx-8fvf9 -n kube-system' to see all of the containers in this pod.
PX is not running on this host: Could not reach 'HealthMonitor'
List of last known failures:
Type ID Resource Severity Count LastSeen FirstSeen Description
NODE FileSystemDependency ip-10-0-171-59 ALARM 1 Oct 21 07:33:43 UTC 2020 Oct 21 07:33:43 UTC 2020 Failed to find patch fs dependency on remote site for kernel 4.18.0-193.24.1.el8_2.dt1.x86_64, exiting...
NODE PortworxMonitorSchedulerInitializationFailed ip-10-0-171-59 ALARM 1 Oct 21 07:02:59 UTC 2020 Oct 21 07:02:59 UTC 2020 Could not init scheduler 'kubernetes': Could not find my node in Kubernetes cluster: Get https://172.30.0.1:443/api/v1/nodes: dial tcp 172.30.0.1:443: connect: no route to host
command terminated with exit code 1
I don't see an option to attach a text file here; I tried the upload button but it says: "Sorry, the file you are trying to upload is not authorized (authorized extensions: jpg, jpeg, png, gif, heic, heif)."
Can you install the latest version, 2.6? It looks like 2.5.7 is failing to detect the latest kernel version. Can you give the latest 2.6 release, 2.6.1.3, a try? It has support for the latest kernel.
No, it's not a port issue, it's a kernel dependency issue: Failed to find patch fs dependency on remote site for kernel 4.18.0-193.24.1.el8_2.dt1.x86_64, exiting..
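Portworx needs a kernel module (px.ko) built for the exact kernel each node is running, so the first thing to confirm is what kernel the workers actually report. This is a generic Linux check, not Portworx-specific:

```shell
# Print the running kernel release on this node; run it on each worker
# (e.g. via 'oc debug node/<node>') and compare it against the kernels
# your Portworx version ships modules for.
uname -r
```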
Can you tell me which operator version you have installed?
It is still failing after I copied px.ko to all three worker nodes and restarted the Portworx service.
[core@ip-10-0-132-7 ~]$ ls -ltr /var/lib/osd/pxfs/latest/8.px.ko
-rw-r--r--. 1 root root 2391816 Oct 22 08:17 /var/lib/osd/pxfs/latest/8.px.ko
[core@ip-10-0-132-7 ~]$ date
Thu Oct 22 08:45:44 UTC 2020
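To confirm that the copied px.ko was actually loaded after the restart, you can check /proc/modules on the node. This is a generic Linux check; treating `px` as the module name is an assumption based on the px.ko filename:

```shell
# /proc/modules lists currently loaded kernel modules, one per line,
# with the module name in the first field.
if grep -q '^px ' /proc/modules; then
  echo "px module loaded"
else
  echo "px module not loaded"
fi
```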
Attaching logs for the same; they contain yesterday's logs as well.
Now PX volumes are getting created, but they are not stable; they detach instantly. Attaching all the logs.
PX_POD=$(oc get pods -l name=portworx -n kube-system -o jsonpath='{.items[0].metadata.name}')
$ oc exec $PX_POD -n kube-system -- /opt/pwx/bin/pxctl status
Defaulting container name to portworx.
Use 'oc describe pod/portworx-7pprp -n kube-system' to see all of the containers in this pod.
PX is not running on this host: Could not reach 'HealthMonitor'
List of last known failures:
Type ID Resource Severity Count LastSeen FirstSeen Description
NODE NodeStartFailure 7d9de820-b8b6-4171-a94a-6f2d812e9029 ALARM 1 Oct 23 09:10:56 UTC 2020 Oct 23 09:10:56 UTC 2020 Failed to start Portworx: failed in internal kvdb setup: failed to create a kvdb connection to peer internal kvdb nodes dial tcp 10.0.135.69:9019: connect: connection refused. Make sure peer kvdb nodes are healthy.
NODE KvdbConnectionFailed 7d9de820-b8b6-4171-a94a-6f2d812e9029 ALARM 1 Oct 23 09:10:50 UTC 2020 Oct 23 09:10:50 UTC 2020 Internal Kvdb: failed to create a kvdb connection to peer internal kvdb nodes dial tcp 10.0.135.69:9019: connect: connection refused. Make sure peer kvdb nodes are healthy.
NODE PortworxMonitorInstallFailed ip-10-0-151-22 ALARM 1 Oct 23 08:59:40 UTC 2020 Oct 23 08:59:40 UTC 2020 Could not finalize OCI install: Timeout
NODE PortworxMonitorSchedulerInitializationFailed ip-10-0-151-22 ALARM 1 Oct 23 08:54:35 UTC 2020 Oct 23 08:54:35 UTC 2020 Could not init scheduler 'kubernetes': Could not find my node in Kubernetes cluster: Get https://172.30.0.1:443/api/v1/nodes: dial tcp 172.30.0.1:443: connect: no route to host
command terminated with exit code 1
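The two kvdb alarms above point at TCP port 9019 on the peer node being unreachable. A quick probe from another worker can distinguish a closed port ("connection refused") from a blocked route (a timeout); this is a generic bash /dev/tcp check, with the peer IP taken from the error message above:

```shell
# Probe the internal kvdb port (9019) on the peer node reported in the
# error; a fast failure means "refused", a 3-second hang means no route
# or a firewall/security-group drop.
PEER=10.0.135.69
if timeout 3 bash -c "exec 3<>/dev/tcp/$PEER/9019" 2>/dev/null; then
  echo "kvdb port 9019 reachable"
else
  echo "kvdb port 9019 NOT reachable"
fi
```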
# The Portworx Operator will install pods only on nodes that have the label node-role.kubernetes.io/compute=true
I am labeling the nodes using a shell script:
WORKER_NODES=$(oc get nodes | grep worker | awk '{print $1}')
for wnode in $WORKER_NODES; do
  oc label nodes $wnode node-role.kubernetes.io/compute=true
done
Following ports are open for master and worker security groups:
aws ec2 authorize-security-group-ingress --group-id $WORKER_GROUP_ID --protocol tcp --port 17001-17020 --source-group $MASTER_GROUP_ID
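The rule above only covers 17001-17020 from the masters. The kvdb errors in the logs involve port 9019 between worker nodes, which falls in the 9001-9022 range Portworx documents for node-to-node traffic, so that range likely needs to be allowed within the worker security group as well. A sketch following the same pattern as the command above (assuming the same $WORKER_GROUP_ID variable):

```shell
# Allow Portworx node-to-node traffic (TCP 9001-9022, which includes the
# internal kvdb port 9019) between members of the worker security group.
aws ec2 authorize-security-group-ingress --group-id $WORKER_GROUP_ID \
  --protocol tcp --port 9001-9022 --source-group $WORKER_GROUP_ID
```

This cannot be verified without live AWS credentials; double-check the required port list against the Portworx prerequisites for your version.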