OnPrem Internal KVDB installation problem, connection refused on node's port 9019

Hi,
I tried installing Portworx Essentials OnPrem without internal kvdb, and it deployed successfully. Since that setup isn't recommended, I then tried installing with internal kvdb on a separate device (/dev/sdb). However, it couldn't start the service responsible for kvdb. All ports (9001-9022) are open. What could prevent the internal kvdb from starting?

Can we get more details about the failure?
Could you post the output of the following commands?

/opt/pwx/bin/pxctl status
/opt/pwx/bin/pxctl alerts show

Just to clarify, you first installed Portworx without internal kvdb? Did you provide an external etcd endpoint to Portworx in that case?

Thank you for your reply. Here is the output:
Status

  1. Failed to start Portworx: failed in internal kvdb setup: failed to create a kvdb connection to peer internal kvdb nodes [[http://10.1.11.29:9019]]: dial tcp 10.1.11.29:9019: connect: connection refused. Make sure peer kvdb nodes are healthy.
  2. Internal Kvdb: failed to create a kvdb connection to peer internal kvdb nodes [[http://10.1.11.29:9019]]: dial tcp 10.1.11.29:9019: connect: connection refused. Make sure peer kvdb nodes are healthy.

Alerts
PX is not running on this host

By "without internal kvdb" I mean I skipped specifying a separate device (/dev/sdb), so kvdb shared the same drive as the Portworx volume IO, which is not recommended.

Extra output from px-log-tail:
PXPROCS: Started watchdog with pid 17526
PX-Watchdog: Waiting for px process to start
PX-Watchdog: (pid 17429): Begin monitoring
level=info msg="Registered the Usage based Metering Agent…"
level=info msg="Registering [kernel] as a volume driver"
level=info msg="Setting log level to info(4)"
level=info msg="read config from env var" func=init package=boot
level=info msg="read config from config.json" func=init package=boot
level=info msg="Alerts initialized successfully for this cluster"
level=error msg="Could not init boot manager" error="failed in internal kvdb setup: failed to create a kvdb connection to peer internal kvdb nodes [[http://10.1.11.29:9019]]: dial tcp 10.1.11.29:9019: connect: connection refused. Make sure peer kvdb nodes are healthy."
PXPROCS: px daemon exited with code: 1
INFO exited: pxdaemon (exit status 1; not expected)
INFO spawned: 'pxdaemon' with pid 17475
INFO reaped unknown pid 17418
PXPROCS: Started px-storage with pid 17506
bash: connect: Connection refused
bash: /dev/tcp/localhost/9009: Connection refused
PXPROCS: px-storage not started yet…sleeping

Environment: Hyper-V VMs with MAC address spoofing enabled, Ubuntu Server 18.04, Kubernetes deployed by Kubespray.

Between the two installations did you wipe the Portworx cluster?
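
In the meantime, a quick check from another node can tell you whether anything is actually listening on the internal kvdb peer port. This is just a sketch using standard Linux tools (nc and ss), nothing Portworx-specific:

# From any other node: is the kvdb peer port on 10.1.11.29 reachable?
nc -zv 10.1.11.29 9019

# On node 10.1.11.29 itself: is anything listening on the PX API/kvdb ports?
sudo ss -tlnp | grep -E ':(9009|9019)'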

For a configuration change like adding a separate device for the internal kvdb, you will need to do the following:

  1. Uninstall Portworx. For uninstalling, follow this doc. Make sure you unlink your old PX-Essentials cluster.
  2. Re-install the Portworx cluster with the correct spec (see the sketch after this list for one way to confirm the kvdb device made it into the applied spec).
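
As a rough sanity check that the new spec actually carries the kvdb device, you can grep the applied spec for it. This assumes Portworx runs as a DaemonSet named portworx in kube-system, and that the generator's kd=/dev/sdb parameter ends up as a kvdb_dev argument; adjust names to your setup:

kubectl -n kube-system get ds portworx -o yaml | grep -i kvdb_dev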

Yes, I ran into that problem and have already solved it. The issue now is that there is no log output indicating it has successfully deployed the internal kvdb on the specified device (/dev/sdb). All three nodes' output is the same.
The spec: https://install.portworx.com/2.5?mc=false&kbver=1.18.3&oem=esse&user=1085a1d0-a328-11ea-97e6-f6e09c7a4e5e&b=true&s=%2Fdev%2Fsdc&j=auto&kd=%2Fdev%2Fsdb&c=px-cluster-02b69ee3-d853-4230-b41c-4fb82d2de69e&stork=true&lh=true&st=k8s

You are right. We have improved the CLI to display the currently used kvdb drive in an upcoming PX release. You will be able to see the device being used as part of

/opt/pwx/bin/pxctl status

Until then, to verify that the internal kvdb is using your device, you can run the following command:

kubectl exec -it <portworx-pod> -- blkid | grep kvdb

The device you provided should be listed in the above command's output. Portworx fingerprints the drives you provide, and in this case you should see a label called kvdbvol on the kvdb drive you specified (/dev/sdb).
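
If you have shell access to the node, an equivalent check directly on the host (plain blkid, filtering for the kvdbvol label mentioned above) would be something like:

sudo blkid | grep -i kvdbvol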

Unfortunately there was no output. The blkid output for the device is:
/dev/sdb: PTUUID="dc94207e-8466-9047-8049-17b8b2ec8d4e" PTTYPE="gpt"
Not assigned, as you can see. Maybe there is a manual way to solve this problem.

Do you have ssh access to this node? Can you post the complete blkid output here?

Here you go:
/dev/loop0: TYPE="squashfs"
/dev/loop1: TYPE="squashfs"
/dev/loop2: TYPE="squashfs"
/dev/sda1: UUID="280C-F93F" TYPE="vfat" PARTUUID="42e325da-da6b-4074-baf9-b0bffffba21a"
/dev/sda2: UUID="48a82b21-79cc-461e-819d-c6a25ded2733" TYPE="ext4" PARTUUID="9e9b5dcc-a792-4049-92d1-9468380428d5"
/dev/sda3: UUID="24563c14-cc8d-4f3d-8d07-fac27929d74d" TYPE="ext4" PARTUUID="d0752743-fe76-3649-b431-a5f3ad02f6e2"
/dev/sdc: LABEL="mdpoolid=0,pxpool=0,mdvol" UUID="31d1aeae-85ff-47f9-82b4-39a2d21aafa0" UUID_SUB="8629adf2-9ad3-4840-b37d-23ea846ef18a" TYPE="btrfs"
/dev/sda4: PARTUUID="90a443c6-4d8e-6142-9468-62449e40caf7"
/dev/sdb: PTUUID="dc94207e-8466-9047-8049-17b8b2ec8d4e" PTTYPE="gpt"

The above output indicates that this node is not completely initialized. Did you go through the uninstall process? Once you run the uninstall process you should not see any Portworx labels. The following command will clean up your Portworx installation:

curl -fsL https://install.portworx.com/px-wipe | bash

The following label should be removed after the uninstall:

/dev/sdc: LABEL="mdpoolid=0,pxpool=0,mdvol" UUID="31d1aeae-85ff-47f9-82b4-39a2d21aafa0" UUID_SUB="8629adf2-9ad3-4840-b37d-23ea846ef18a" TYPE="btrfs"
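
After the wipe, a quick sanity check on each node (plain blkid against the labels seen above) should come back empty, roughly like:

sudo blkid | grep -iE 'pxpool|mdvol|kvdbvol' || echo "no Portworx labels found"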

I did wipe the cluster between reinstallations. /dev/sdc is for Portworx volume IO; it is passed as

-s /dev/sdc

in the spec file.

If the cluster is still in the same state after the wipe and re-install, can you provide the logs from node 10.1.11.29? You can get the logs by running the following command on the node:

journalctl -lau portworx-output > px.log
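
If the file is large, a quick filter for the kvdb-related lines (plain grep, nothing Portworx-specific) can help narrow it down:

grep -iE 'kvdb|9019' px.log | tail -n 50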

I've wiped the px-cluster again, and this time it complained that partitions were still there from the previous install. I formatted /dev/sdb and /dev/sdc. Again, no effect, so I did:

sudo dd if=/dev/zero of=/dev/sdb bs=1M

sudo dd if=/dev/zero of=/dev/sdc bs=1M

After that I wiped the px-cluster again and generated a spec with no auto journaling on the devices. Finally, after the deploy, it came up and works like a charm. So the problem was that the px-wipe.sh script wasn't enough; an extra wipe of the disks was needed. Thank you for your help.
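
Glad it is working. For future readers: zeroing the whole device is one way to do it, but clearing just the leftover partition-table and filesystem signatures should also be enough. wipefs is a standard Linux utility, not a Portworx tool; a rough sketch, to be run only on devices you intend to hand to Portworx:

# Remove all filesystem and partition-table signatures from the leftover devices
sudo wipefs -a /dev/sdb
sudo wipefs -a /dev/sdc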