Thursday, December 12, 2019

How to use nodeSelector to constrain the Pod csi-controller-kdf-0 to run only on particular node(s)

Goal:

This article explains how to use nodeSelector to constrain the Pod csi-controller-kdf-0 so that it can run only on particular node(s).

Env:

MapR 6.1 (secured)
MapR CSI 1.0.0
Kubernetes Cluster in GKE

Use case:

For MapR CSI, we want the Pod from the StatefulSet "csi-controller-kdf" to run only on specific node(s).
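nodeSelector works by matching node labels: you attach a label to the target node(s) and set spec.nodeSelector in the Pod template, so the scheduler only considers nodes that carry that label. Below is a minimal, generic sketch using the hypothetical label disktype=ssd (the example label from the Kubernetes documentation in the References); the rest of this article applies the same idea to the MapR CSI controller:
apiVersion: v1
kind: Pod
metadata:
  name: example-pod
spec:
  nodeSelector:
    disktype: ssd            # only schedulable on nodes labeled disktype=ssd
  containers:
    - name: app
      image: nginx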

Solution:

1. List the current nodes in the Kubernetes cluster

$ kubectl get nodes
NAME                                                STATUS   ROLES    AGE   VERSION
gke-standard-cluster-1-default-pool-f6e6e4c1-45ql   Ready    <none>   22m   v1.13.11-gke.14
gke-standard-cluster-1-default-pool-f6e6e4c1-fbhp   Ready    <none>   22m   v1.13.11-gke.14
gke-standard-cluster-1-default-pool-f6e6e4c1-hzh5   Ready    <none>   22m   v1.13.11-gke.14
gke-standard-cluster-1-default-pool-f6e6e4c1-r20n   Ready    <none>   22m   v1.13.11-gke.14
gke-standard-cluster-1-default-pool-f6e6e4c1-xr3s   Ready    <none>   22m   v1.13.11-gke.14

For example, we want the Pod from the StatefulSet "csi-controller-kdf" to run only on node "gke-standard-cluster-1-default-pool-f6e6e4c1-hzh5".

2. Attach a label to this node

kubectl label nodes gke-standard-cluster-1-default-pool-f6e6e4c1-hzh5 for-csi-controller=true
Here the label key is "for-csi-controller" and the label value is "true".
Verify that the label is attached to that node:
$ kubectl get nodes -l for-csi-controller=true
NAME                                                STATUS   ROLES    AGE   VERSION
gke-standard-cluster-1-default-pool-f6e6e4c1-hzh5   Ready    <none>   34m   v1.13.11-gke.14
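You can also list all labels on each node with the --show-labels flag:
$ kubectl get nodes --show-labels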

3. Modify csi-maprkdf-v1.0.0.yaml

cp csi-maprkdf-v1.0.0.yaml csi-maprkdf-v1.0.0_modified.yaml
vi csi-maprkdf-v1.0.0_modified.yaml
Add the lines below at the bottom of the Pod template spec (at the same indentation level as "containers" and "volumes") in the definition of the StatefulSet "csi-controller-kdf":
      nodeSelector:
        for-csi-controller: "true"
A complete example of the modified StatefulSet "csi-controller-kdf" is:
kind: StatefulSet
apiVersion: apps/v1beta1
metadata:
  name: csi-controller-kdf
  namespace: mapr-csi
spec:
  serviceName: "kdf-provisioner-svc"
  replicas: 1
  template:
    metadata:
      labels:
        app: csi-controller-kdf
    spec:
      serviceAccount: csi-controller-sa
      containers:
        - name: csi-attacher
          image: quay.io/k8scsi/csi-attacher:v1.0.1
          args:
            - "--v=5"
            - "--csi-address=$(ADDRESS)"
          env:
            - name: ADDRESS
              value: /var/lib/csi/sockets/pluginproxy/csi.sock
          imagePullPolicy: "Always"
          volumeMounts:
            - name: socket-dir
              mountPath: /var/lib/csi/sockets/pluginproxy/
        - name: csi-provisioner
          image: quay.io/k8scsi/csi-provisioner:v1.0.1
          args:
            - "--provisioner=com.mapr.csi-kdf"
            - "--csi-address=$(ADDRESS)"
            - "--volume-name-prefix=mapr-pv"
            - "--v=5"
          env:
            - name: ADDRESS
              value: /var/lib/csi/sockets/pluginproxy/csi.sock
          imagePullPolicy: "Always"
          volumeMounts:
            - name: socket-dir
              mountPath: /var/lib/csi/sockets/pluginproxy/
        - name: csi-snapshotter
          image: quay.io/k8scsi/csi-snapshotter:v1.0.1
          imagePullPolicy: "Always"
          args:
            - "--snapshotter=com.mapr.csi-kdf"
            - "--csi-address=$(ADDRESS)"
            - "--snapshot-name-prefix=mapr-snapshot"
            - "--v=5"
          env:
            - name: ADDRESS
              value: /var/lib/csi/sockets/pluginproxy/csi.sock
          volumeMounts:
            - name: socket-dir
              mountPath: /var/lib/csi/sockets/pluginproxy/
        - name: liveness-probe
          image: quay.io/k8scsi/livenessprobe:v1.0.1
          imagePullPolicy: "Always"
          args:
            - "--v=5"
            - "--csi-address=$(ADDRESS)"
            - "--connection-timeout=60s"
            - "--health-port=9809"
          env:
            - name: ADDRESS
              value: /var/lib/csi/sockets/pluginproxy/csi.sock
          volumeMounts:
            - name: socket-dir
              mountPath: /var/lib/csi/sockets/pluginproxy/
        - name: mapr-kdfprovisioner
          image: maprtech/csi-kdfprovisioner:1.0.0
          imagePullPolicy: "Always"
          args:
            - "--nodeid=$(NODE_ID)"
            - "--endpoint=$(CSI_ENDPOINT)"
            - "-v=5"
          env:
            - name: NODE_ID
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
            - name: CSI_ENDPOINT
              value: unix://plugin/csi.sock
          ports:
          - containerPort: 9809
            name: healthz
            protocol: TCP
          livenessProbe:
            failureThreshold: 20
            httpGet:
              path: /healthz
              port: healthz
            initialDelaySeconds: 10
            timeoutSeconds: 3
            periodSeconds: 5
          volumeMounts:
            - name: socket-dir
              mountPath: /plugin
            - name: k8s-log-dir
              mountPath: /var/log/csi-maprkdf
            - name: timezone
              mountPath: /etc/localtime
              readOnly: true
      volumes:
        - name: socket-dir
          emptyDir: {}
        - name: k8s-log-dir
          hostPath:
            path: /var/log/csi-maprkdf
            type: DirectoryOrCreate
        - name: timezone
          hostPath:
            path: /etc/localtime
      nodeSelector:
        for-csi-controller: "true"
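Note: nodeSelector is the simplest way to pin a Pod to labeled nodes. The same constraint can also be written as a node affinity rule (see the Kubernetes documentation in the References); a sketch of the equivalent snippet, placed in the Pod template spec instead of nodeSelector, is:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: for-csi-controller
                    operator: In
                    values:
                      - "true"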

4. Create the StatefulSet "csi-controller-kdf" using the modified YAML when configuring MapR CSI

kubectl apply -f csi-maprkdf-v1.0.0_modified.yaml
The remaining steps to configure MapR CSI are unchanged.
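After applying, you can confirm that the StatefulSet exists and its single replica is ready:
$ kubectl get statefulset csi-controller-kdf -n mapr-csi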

5. Verify that the Pod "csi-controller-kdf-0" is running on that specific node

$ kubectl get pods -n mapr-csi -o wide | grep csi-controller-kdf-0
csi-controller-kdf-0       5/5     Running   0          56m   xx.xx.xx.4     gke-standard-cluster-1-default-pool-f6e6e4c1-hzh5   <none>           <none>
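The scheduling constraint is also recorded on the Pod itself; kubectl describe shows it in the Node-Selectors field:
$ kubectl describe pod csi-controller-kdf-0 -n mapr-csi | grep -i node-selector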

Disaster Recovery Test: 

1. Drain this specific node, evicting all Pods except those managed by DaemonSets.

$ kubectl drain gke-standard-cluster-1-default-pool-f6e6e4c1-hzh5 --ignore-daemonsets --delete-local-data
node/gke-standard-cluster-1-default-pool-f6e6e4c1-hzh5 already cordoned
WARNING: ignoring DaemonSet-managed Pods: kube-system/fluentd-gcp-v3.2.0-hzrq7, kube-system/prometheus-to-sd-jxhrm, mapr-csi/csi-nodeplugin-kdf-ssbxp
evicting pod "csi-controller-kdf-0"
evicting pod "kube-dns-79868f54c5-rggws"
pod/csi-controller-kdf-0 evicted
pod/kube-dns-79868f54c5-rggws evicted
node/gke-standard-cluster-1-default-pool-f6e6e4c1-hzh5 evicted
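The drained node is now cordoned, so its STATUS shows Ready,SchedulingDisabled:
$ kubectl get nodes gke-standard-cluster-1-default-pool-f6e6e4c1-hzh5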

2. Check whether the Pod "csi-controller-kdf-0" gets rescheduled on another node.

$ kubectl get pods --all-namespaces -o wide
NAMESPACE     NAME                                                           READY   STATUS    RESTARTS   AGE    IP            NODE                                                NOMINATED NODE   READINESS GATES
...
mapr-csi      csi-controller-kdf-0                                           0/5     Pending   0          16m    <none>        <none>                                              <none>           <none>
...
As we can see, the Pod "csi-controller-kdf-0" stays Pending and cannot be scheduled on any other node, because no other node carries the label "for-csi-controller=true".
This proves that the nodeSelector is working.
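The Pod's events explain why; kubectl describe shows a FailedScheduling warning similar to the following (exact wording varies by Kubernetes version):
$ kubectl describe pod csi-controller-kdf-0 -n mapr-csi | grep -i failedscheduling
  Warning  FailedScheduling  ...  0/5 nodes are available: 5 node(s) didn't match node selector.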

3. Mark the specific node schedulable again

kubectl uncordon gke-standard-cluster-1-default-pool-f6e6e4c1-hzh5
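The Pending Pod is picked up by the scheduler automatically once the node is schedulable again; you can watch the transition from Pending to Running:
$ kubectl get pods -n mapr-csi -w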

4. Verify that the Pod "csi-controller-kdf-0" is running on the specific node again

$ kubectl get pods --all-namespaces -o wide | grep -i csi-controller-kdf-0
mapr-csi      csi-controller-kdf-0                                           5/5     Running   0          17m    xx.xx.xx.5     gke-standard-cluster-1-default-pool-f6e6e4c1-hzh5   <none>           <none>

5. Verify that the mount point is working in the test Pod

$ kubectl exec -ti testpod -n testns -- ls -altr /mapr
total 6
drwxrwxrwt    3 5000     5000             1 Nov 25 11:17 kafka-streams
drwxrwxrwt    3 5000     5000             1 Nov 25 11:18 ksql
drwxrwxrwx    3 5000     5000             2 Dec  6 12:38 spark
drwxr-xr-x    1 root     root          4096 Dec 12 22:11 ..
drwxr-xr-x    5 5000     5000             3 Dec 12 23:45 .

References:

https://kubernetes.io/docs/concepts/configuration/assign-pod-node/
