In an environment where the KVDB is not isolated on dedicated nodes, does Stork preferentially place pods on nodes where the KVDB is not running?
In my test environment, sv00[6-9] are Portworx (version 2.12.0) storage nodes, and the internal KVDB is running on sv00[6-8].
root@kf2-0a-sv004:~/portworx# kubectl get node -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
kf2-0a-sv004 Ready control-plane 31d v1.24.0 172.31.0.4 <none> Ubuntu 20.04.5 LTS 5.4.0-131-generic containerd://1.6.8
kf2-0a-sv005 Ready control-plane 31d v1.24.0 172.31.0.5 <none> Ubuntu 20.04.5 LTS 5.4.0-131-generic containerd://1.6.8
kf2-0a-sv006 Ready <none> 19d v1.24.0 172.31.0.6 <none> Ubuntu 20.04.5 LTS 5.4.0-131-generic containerd://1.6.8
kf2-0a-sv007 Ready <none> 30d v1.24.0 172.31.0.7 <none> Ubuntu 20.04.5 LTS 5.4.0-131-generic containerd://1.6.8
kf2-0a-sv008 Ready <none> 30d v1.24.0 172.31.0.8 <none> Ubuntu 20.04.5 LTS 5.4.0-131-generic containerd://1.6.8
kf2-0a-sv009 Ready <none> 30d v1.24.0 172.31.0.9 <none> Ubuntu 20.04.5 LTS 5.4.0-132-generic containerd://1.6.8
root@kf2-0a-sv004:~/portworx# kubectl -n kube-system get storagenodes
NAME ID STATUS VERSION AGE
kf2-0a-sv006 030044f4-b323-441b-b425-18f855a86919 Online 2.12.0-02bd5b0 19d
kf2-0a-sv007 73ac8b63-1e1a-41d1-be5e-cf9c0061436e Online 2.12.0-02bd5b0 30d
kf2-0a-sv008 567afddc-38c0-4073-b716-316e2187fa00 Online 2.12.0-02bd5b0 30d
kf2-0a-sv009 15049660-23e2-4dc0-98c8-bf46d88722eb Online 2.12.0-02bd5b0 30d
root@kf2-0a-sv005:~# kubectl exec -n kube-system px-cluster-77b55e68-a3d6-4ecb-a23a-3c9425cfb7a1-85ttp -- /opt/pwx/bin/pxctl service kvdb members
Defaulted container "portworx" out of: portworx, csi-node-driver-registrar
Kvdb Cluster Members:
ID PEER URLs CLIENT URLs LEADER HEALTHY DBSIZE
567afddc-38c0-4073-b716-316e2187fa00 [http://portworx-2.internal.kvdb:9018] [http://172.31.0.8:9019] true true 2.8 MiB
030044f4-b323-441b-b425-18f855a86919 [http://portworx-3.internal.kvdb:9018] [http://172.31.0.6:9019] false true 2.8 MiB
73ac8b63-1e1a-41d1-be5e-cf9c0061436e [http://portworx-1.internal.kvdb:9018] [http://172.31.0.7:9019] false true 2.8 MiB
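As a cross-check, the KVDB nodes should also be selectable by node label, assuming Portworx applies the px/metadata-node=true label to internal-KVDB nodes in this setup:

kubectl get nodes -l px/metadata-node=true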
Then I created 30 pods with a Deployment that uses a Portworx CSI StorageClass, setting replicas: 30 and schedulerName: stork.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: px-check-fs-pvc
spec:
  accessModes:
    - ReadWriteMany
  volumeMode: Filesystem
  resources:
    requests:
      storage: 50Gi
  storageClassName: px-csi-db
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kube-registry
  labels:
    k8s-app: kube-registry
    kubernetes.io/cluster-service: "true"
spec:
  replicas: 30
  selector:
    matchLabels:
      k8s-app: kube-registry
  template:
    metadata:
      labels:
        k8s-app: kube-registry
        kubernetes.io/cluster-service: "true"
    spec:
      schedulerName: stork
      containers:
        - name: registry
          image: registry:2
          imagePullPolicy: Always
          resources:
            limits:
              cpu: 100m
              memory: 100Mi
          env:
            # Configuration reference: https://docs.docker.com/registry/configuration/
            - name: REGISTRY_HTTP_ADDR
              value: :5000
            - name: REGISTRY_HTTP_SECRET
              value: "Ple4seCh4ngeThisN0tAVerySecretV4lue"
            - name: REGISTRY_STORAGE_FILESYSTEM_ROOTDIRECTORY
              value: /var/lib/registry
          volumeMounts:
            - name: image-store
              mountPath: /var/lib/registry
          ports:
            - containerPort: 5000
              name: registry
              protocol: TCP
          livenessProbe:
            httpGet:
              path: /
              port: registry
          readinessProbe:
            httpGet:
              path: /
              port: registry
      volumes:
        - name: image-store
          persistentVolumeClaim:
            claimName: px-check-fs-pvc
            readOnly: false
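For reference, the volume's replica set mentioned below can be checked roughly like this, with pxctl run inside a Portworx pod as with the kvdb members command above (<portworx-pod> and <pv-name> are placeholders; the PV name is the volumeName bound to the PVC):

kubectl get pvc px-check-fs-pvc -o jsonpath='{.spec.volumeName}'
kubectl exec -n kube-system <portworx-pod> -- /opt/pwx/bin/pxctl volume inspect <pv-name>   # <...> are placeholders

The "Replica sets on nodes" section of the inspect output lists the nodes holding the replicas.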
When the volume's replica set consisted entirely of KVDB nodes (sv00[6-8]), the pods were placed evenly across them, and when the Deployment was scaled down, pods were terminated roughly evenly from all nodes.
root@kf2-0a-sv004:~/portworx# kubectl get pods -o wide -l k8s-app=kube-registry | grep -c sv006
10
root@kf2-0a-sv004:~/portworx# kubectl get pods -o wide -l k8s-app=kube-registry | grep -c sv007
9
root@kf2-0a-sv004:~/portworx# kubectl get pods -o wide -l k8s-app=kube-registry | grep -c sv008
11
#
# change replicas 30 -> 15
#
root@kf2-0a-sv004:~/portworx# kubectl get pods -o wide -l k8s-app=kube-registry | grep -c sv008
6
root@kf2-0a-sv004:~/portworx# kubectl get pods -o wide -l k8s-app=kube-registry | grep -c sv007
4
root@kf2-0a-sv004:~/portworx# kubectl get pods -o wide -l k8s-app=kube-registry | grep -c sv006
5
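(The per-node counts above come from repeated grep -c; they can also be gathered in one command, assuming the NODE column is the 7th field of kubectl get pods -o wide --no-headers:)

kubectl get pods -o wide --no-headers -l k8s-app=kube-registry | awk '{print $7}' | sort | uniq -c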
When the volume's replica set was sv007, sv008, and sv009, so that only some of the replica nodes were KVDB nodes (sv007 and sv008), the pods were preferentially placed on the non-KVDB node sv009.
(The Stork logs show a Score of 100 for all three nodes.)
Likewise, when the Deployment was scaled down, pods were preferentially terminated on sv009, the node that is not a KVDB node.
root@kf2-0a-sv004:~/portworx# kubectl get pods -o wide -l k8s-app=kube-registry | grep -c sv007
5
root@kf2-0a-sv004:~/portworx# kubectl get pods -o wide -l k8s-app=kube-registry | grep -c sv008
9
root@kf2-0a-sv004:~/portworx# kubectl get pods -o wide -l k8s-app=kube-registry | grep -c sv009
16
root@kf2-0a-sv004:~/portworx# kubectl logs -n kube-system stork-764546c9f4-mmrqp | grep -E "kube-registry-6586fd9889-25zzg" | more
...
time="2022-11-15T08:36:20Z" level=debug msg="{Host:kf2-0a-sv009 Score:5}" Namespace=default Owner=ReplicaSet/kube-registry-6586fd9889 PodName=kube-registry-6586fd9889-25zzg
time="2022-11-15T08:36:20Z" level=debug msg="{Host:kf2-0a-sv008 Score:100}" Namespace=default Owner=ReplicaSet/kube-registry-6586fd9889 PodName=kube-registry-6586fd9889-25zzg
time="2022-11-15T08:36:20Z" level=debug msg="{Host:kf2-0a-sv006 Score:100}" Namespace=default Owner=ReplicaSet/kube-registry-6586fd9889 PodName=kube-registry-6586fd9889-25zzg
time="2022-11-15T08:36:20Z" level=debug msg="{Host:kf2-0a-sv007 Score:100}" Namespace=default Owner=ReplicaSet/kube-registry-6586fd9889 PodName=kube-registry-6586fd9889-25zzg
...
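To compare the scores Stork logged for every pod of this ReplicaSet, not just the single pod above, the same log can be filtered on the owner, for example:

kubectl logs -n kube-system stork-764546c9f4-mmrqp | grep "ReplicaSet/kube-registry-6586fd9889" | grep Score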
#
# change replicas 30 -> 15
#
root@kf2-0a-sv004:~/portworx# kubectl get pods -o wide -l k8s-app=kube-registry | grep -c sv007
9
root@kf2-0a-sv004:~/portworx# kubectl get pods -o wide -l k8s-app=kube-registry | grep -c sv008
6
root@kf2-0a-sv004:~/portworx# kubectl get pods -o wide -l k8s-app=kube-registry | grep -c sv009
0