In environments where the internal KVDB is not isolated, does Stork preferentially place pods on nodes where the KVDB is not running?
In my test environment, sv00[6-9] are Portworx (2.12.0) storage nodes, and the internal KVDB runs on sv00[6-8].
root@kf2-0a-sv004:~/portworx# kubectl get node -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
kf2-0a-sv004 Ready control-plane 31d v1.24.0 172.31.0.4 <none> Ubuntu 20.04.5 LTS 5.4.0-131-generic containerd://1.6.8
kf2-0a-sv005 Ready control-plane 31d v1.24.0 172.31.0.5 <none> Ubuntu 20.04.5 LTS 5.4.0-131-generic containerd://1.6.8
kf2-0a-sv006 Ready <none> 19d v1.24.0 172.31.0.6 <none> Ubuntu 20.04.5 LTS 5.4.0-131-generic containerd://1.6.8
kf2-0a-sv007 Ready <none> 30d v1.24.0 172.31.0.7 <none> Ubuntu 20.04.5 LTS 5.4.0-131-generic containerd://1.6.8
kf2-0a-sv008 Ready <none> 30d v1.24.0 172.31.0.8 <none> Ubuntu 20.04.5 LTS 5.4.0-131-generic containerd://1.6.8
kf2-0a-sv009 Ready <none> 30d v1.24.0 172.31.0.9 <none> Ubuntu 20.04.5 LTS 5.4.0-132-generic containerd://1.6.8
root@kf2-0a-sv004:~/portworx# kubectl -n kube-system get storagenodes
NAME ID STATUS VERSION AGE
kf2-0a-sv006 030044f4-b323-441b-b425-18f855a86919 Online 2.12.0-02bd5b0 19d
kf2-0a-sv007 73ac8b63-1e1a-41d1-be5e-cf9c0061436e Online 2.12.0-02bd5b0 30d
kf2-0a-sv008 567afddc-38c0-4073-b716-316e2187fa00 Online 2.12.0-02bd5b0 30d
kf2-0a-sv009 15049660-23e2-4dc0-98c8-bf46d88722eb Online 2.12.0-02bd5b0 30d
root@kf2-0a-sv005:~# kubectl exec -n kube-system px-cluster-77b55e68-a3d6-4ecb-a23a-3c9425cfb7a1-85ttp -- /opt/pwx/bin/pxctl service kvdb members
Defaulted container "portworx" out of: portworx, csi-node-driver-registrar
Kvdb Cluster Members:
ID PEER URLs CLIENT URLs LEADER HEALTHY DBSIZE
567afddc-38c0-4073-b716-316e2187fa00 [http://portworx-2.internal.kvdb:9018] [http://172.31.0.8:9019] true true 2.8 MiB
030044f4-b323-441b-b425-18f855a86919 [http://portworx-3.internal.kvdb:9018] [http://172.31.0.6:9019] false true 2.8 MiB
73ac8b63-1e1a-41d1-be5e-cf9c0061436e [http://portworx-1.internal.kvdb:9018] [http://172.31.0.7:9019] false true 2.8 MiB
Then I created a Deployment with replicas: 30 and schedulerName: stork, backed by a PVC that uses the Portworx CSI StorageClass (px-csi-db).
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: px-check-fs-pvc
spec:
  accessModes:
    - ReadWriteMany
  volumeMode: Filesystem
  resources:
    requests:
      storage: 50Gi
  storageClassName: px-csi-db
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kube-registry
  labels:
    k8s-app: kube-registry
    kubernetes.io/cluster-service: "true"
spec:
  replicas: 30
  selector:
    matchLabels:
      k8s-app: kube-registry
  template:
    metadata:
      labels:
        k8s-app: kube-registry
        kubernetes.io/cluster-service: "true"
    spec:
      schedulerName: stork
      containers:
        - name: registry
          image: registry:2
          imagePullPolicy: Always
          resources:
            limits:
              cpu: 100m
              memory: 100Mi
          env:
            # Configuration reference: https://docs.docker.com/registry/configuration/
            - name: REGISTRY_HTTP_ADDR
              value: ":5000"
            - name: REGISTRY_HTTP_SECRET
              value: "Ple4seCh4ngeThisN0tAVerySecretV4lue"
            - name: REGISTRY_STORAGE_FILESYSTEM_ROOTDIRECTORY
              value: /var/lib/registry
          volumeMounts:
            - name: image-store
              mountPath: /var/lib/registry
          ports:
            - containerPort: 5000
              name: registry
              protocol: TCP
          livenessProbe:
            httpGet:
              path: /
              port: registry
          readinessProbe:
            httpGet:
              path: /
              port: registry
      volumes:
        - name: image-store
          persistentVolumeClaim:
            claimName: px-check-fs-pvc
            readOnly: false
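For reference, px-csi-db is the Portworx CSI StorageClass used by the PVC above. I have not pasted the exact object here, but it is roughly along these lines (the provisioner is the standard Portworx CSI one; the repl and io_profile values are illustrative, not a dump of my actual StorageClass):

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: px-csi-db
provisioner: pxd.portworx.com    # Portworx CSI provisioner
parameters:
  repl: "3"                 # illustrative: three replicas, which is why the volume spans three storage nodes
  io_profile: "db_remote"   # illustrative
allowVolumeExpansion: true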
When the volume's replica set consisted only of KVDB nodes (sv00[6-8]), the pods were placed evenly across them.
Also, when the Deployment's replica count was reduced, pods were terminated evenly across all of those nodes.
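I checked which nodes hold the volume's replicas with pxctl volume inspect against the PV bound to px-check-fs-pvc, roughly like this (the "Replica sets on nodes" section of the output lists the replica nodes):

# Look up the PV bound to the PVC, then inspect it from a Portworx pod
PV=$(kubectl get pvc px-check-fs-pvc -o jsonpath='{.spec.volumeName}')
kubectl exec -n kube-system px-cluster-77b55e68-a3d6-4ecb-a23a-3c9425cfb7a1-85ttp -- \
  /opt/pwx/bin/pxctl volume inspect "$PV"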
root@kf2-0a-sv004:~/portworx# kubectl get pods -o wide -l k8s-app=kube-registry | grep -c sv006
10
root@kf2-0a-sv004:~/portworx# kubectl get pods -o wide -l k8s-app=kube-registry | grep -c sv007
9
root@kf2-0a-sv004:~/portworx# kubectl get pods -o wide -l k8s-app=kube-registry | grep -c sv008
11
#
# change replicas 30 -> 15
#
root@kf2-0a-sv004:~/portworx# kubectl get pods -o wide -l k8s-app=kube-registry | grep -c sv008
6
root@kf2-0a-sv004:~/portworx# kubectl get pods -o wide -l k8s-app=kube-registry | grep -c sv007
4
root@kf2-0a-sv004:~/portworx# kubectl get pods -o wide -l k8s-app=kube-registry | grep -c sv006
5
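The per-node counts above come from repeated greps; the same check in one loop over the storage nodes (same label selector):

for n in sv006 sv007 sv008 sv009; do
  # print "<node>: <number of kube-registry pods on it>"
  echo -n "$n: "
  kubectl get pods -o wide -l k8s-app=kube-registry --no-headers | grep -c "$n"
done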
However, when the volume's replica set was sv007, sv008, and sv009, of which only sv007 and sv008 are KVDB nodes, the pods were preferentially placed on the non-KVDB node sv009.
(The Stork logs show a score of 100 for all three nodes.)
Also, when the Deployment's replica count was reduced, pods were preferentially terminated on sv009, which is not a KVDB node.
root@kf2-0a-sv004:~/portworx# kubectl get pods -o wide -l k8s-app=kube-registry | grep -c sv007
5
root@kf2-0a-sv004:~/portworx# kubectl get pods -o wide -l k8s-app=kube-registry | grep -c sv008
9
root@kf2-0a-sv004:~/portworx# kubectl get pods -o wide -l k8s-app=kube-registry | grep -c sv009
16
root@kf2-0a-sv004:~/portworx# kubectl logs -n kube-system stork-764546c9f4-mmrqp | grep -E "kube-registry-6586fd9889-25zzg" | more
...
time="2022-11-15T08:36:20Z" level=debug msg="{Host:kf2-0a-sv009 Score:5}" Namespace=default Owner=ReplicaSet/kube-registry-6586fd9889 PodName=kube-registry-6586fd9889-25zzg
time="2022-11-15T08:36:20Z" level=debug msg="{Host:kf2-0a-sv008 Score:100}" Namespace=default Owner=ReplicaSet/kube-registry-6586fd9889 PodName=kube-registry-6586fd9889-25zzg
time="2022-11-15T08:36:20Z" level=debug msg="{Host:kf2-0a-sv006 Score:100}" Namespace=default Owner=ReplicaSet/kube-registry-6586fd9889 PodName=kube-registry-6586fd9889-25zzg
time="2022-11-15T08:36:20Z" level=debug msg="{Host:kf2-0a-sv007 Score:100}" Namespace=default Owner=ReplicaSet/kube-registry-6586fd9889 PodName=kube-registry-6586fd9889-25zzg
...
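To pull the score lines for every kube-registry pod at once instead of one pod at a time, the same stork log can be filtered like this:

kubectl logs -n kube-system stork-764546c9f4-mmrqp | grep -E 'Host:.*Score' | grep kube-registry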
#
# change replicas 30 -> 15
#
root@kf2-0a-sv004:~/portworx# kubectl get pods -o wide -l k8s-app=kube-registry | grep -c sv007
9
root@kf2-0a-sv004:~/portworx# kubectl get pods -o wide -l k8s-app=kube-registry | grep -c sv008
6
root@kf2-0a-sv004:~/portworx# kubectl get pods -o wide -l k8s-app=kube-registry | grep -c sv009
0