Internal KVDB considerations for cloud deployment

Hi,

We are considering using the internal KVDB for Portworx on an OpenShift cluster (cloud deployment).
I went through the doc here, which has pretty good info - Internal KVDB
I would like to discuss a few things in detail:

  1. Storage for KVDB
    We currently attach block vols to nodes for PX storage. Can KVDB use these attached block vols for persistent storage? The doc says it is not recommended to have KVDB consume storage from the PX pool. Does this mean a worker node should not be both a metadata node and a storage node?

  2. HA for KVDB

Through labelling, if we manage to spread the workers that have KVDB disks across 3 different Availability Zones (AZs), can we expect KVDB to be resilient to zonal worker failures (volumes have a replication factor greater than 1)?

  3. Worker upgrades / unexpected failures

The docs indicate backups are stored on the worker node. In our env, workers are replaced with new worker nodes during upgrades. Assuming the worst-case scenario that all the workers KVDB uses are down, is there a path to recovery?

  4. Worst-case scenario: we are unable to recover the PX cluster/KVDB
    In this case, if we manage to bring up a new installation of PX, would there be data loss? Assuming we have scheduled snapshots to cloud object storage, can we recover the volumes from cloudsnaps?
  1. Storage for KVDB - We currently attach block vols to nodes for PX storage. Can KVDB use these attached block vols for persistent storage? The doc says it is not recommended to have KVDB consume storage from the PX pool. Does this mean a worker node should not be both a metadata node and a storage node? - You can use a single drive for both metadata and storage in a non-prod env, but it is recommended to use separate drives for storage and KVDB.
  2. HA for KVDB - Through labelling, if we manage to spread the workers that have KVDB disks across 3 different Availability Zones (AZs), can we expect KVDB to be resilient to zonal worker failures (volumes have a replication factor greater than 1)? - Even if you don't label the nodes, the Portworx internal KVDB will be placed in different AZs automatically, provided your nodes span multiple AZs.
  3. Worker upgrades / unexpected failures - The docs indicate backups are stored on the worker node. In our env, workers are replaced with new worker nodes during upgrades. Assuming the worst-case scenario that all the workers KVDB uses are down, is there a path to recovery? - Yes, if at least one KVDB backup can be retrieved.
  4. Worst-case scenario: we are unable to recover the PX cluster/KVDB - In this case, if we manage to bring up a new installation of PX, would there be data loss? Assuming we have scheduled snapshots to cloud object storage, can we recover the volumes from cloudsnaps? - If you have cloud snapshots, we can retrieve the PX volumes.

Note: it is also recommended to use a separate drive (min 64 GB) for the internal KVDB in a prod env.
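
On the HA question, a rough way to reason about zone failures is to check whether a majority of KVDB members would remain if the busiest zone went down. The sketch below is only an illustration and assumes the internal KVDB needs a majority quorum, as an etcd-style cluster does. The function name and the zone names are hypothetical and not part of Portworx.

```python
from collections import Counter


def kvdb_survives_zone_loss(member_zones):
    """Return True if a majority of KVDB members remains after losing any one zone.

    member_zones: one zone name per KVDB member (hypothetical input,
    e.g. collected by hand from the node labels).
    Assumes majority quorum, as in an etcd-style cluster.
    """
    if not member_zones:
        return False
    total = len(member_zones)
    quorum = total // 2 + 1                      # majority of all members
    worst_loss = max(Counter(member_zones).values())  # members in the busiest zone
    return total - worst_loss >= quorum


# Three members spread over three zones: losing one zone leaves 2 of 3.
print(kvdb_survives_zone_loss(["az-1", "az-2", "az-3"]))  # True
# Two members in the same zone: losing that zone leaves 1 of 3.
print(kvdb_survives_zone_loss(["az-1", "az-1", "az-2"]))  # False
```

With three members, this only holds when each member sits in its own zone, which is why spreading the KVDB nodes across 3 AZs matters.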
