Wednesday, December 4, 2019

How to mount a PersistentVolume for Static Provisioning using MapR CSI in GKE

Goal:

This article explains the detailed steps to mount a PersistentVolume for Static Provisioning using the MapR Container Storage Interface (CSI) in Google Kubernetes Engine (GKE).

Env:

MapR 6.1 (secured)
MapR CSI 1.0.0
Kubernetes Cluster in GKE

Use Case:

We have a secured MapR cluster (v6.1) and want to expose its storage to applications running in a Kubernetes cluster (GKE in this example).
In this example, we plan to expose a MapR volume named "mapr.apps" (mounted as /apps) to a sample POD in the Kubernetes cluster.
Inside the POD, it will be mounted as /mapr instead.

Solution:

1. Create a Kubernetes cluster named "standard-cluster-1" in GKE

You can use the GUI or gcloud commands.
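For example, a minimal gcloud sketch (the zone matches the one used in step 2; the node count is illustrative, so adjust it to your environment):
gcloud container clusters create standard-cluster-1 --zone us-central1-a --num-nodes 3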

2. Fetch the credentials for the Kubernetes cluster

gcloud container clusters get-credentials standard-cluster-1 --zone us-central1-a
After that, make sure "kubectl cluster-info" returns the correct cluster information.
This step configures kubectl to connect to the correct Kubernetes cluster.
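A quick way to double-check which context kubectl is currently using:
kubectl config current-context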

3. Bind cluster-admin role to Google Cloud user

kubectl create clusterrolebinding user-cluster-admin-binding --clusterrole=cluster-admin --user=xxx@yyy.com
Note: "xxx@yyy.com" is the your Google Cloud user.
Here we grant cluster admin role to the user to avoid any permission error in the next step when we create MapR CSI ClusterRole and ClusterRoleBinding. 

4. Download MapR CSI Driver custom resource definition

Please refer to the latest documentation: https://mapr.com/docs/home/CSIdriver/csi_downloads.html 
git clone https://github.com/mapr/mapr-csi
cd ./mapr-csi/deploy/kubernetes/
kubectl create -f csi-maprkdf-v1.0.0.yaml
The following Kubernetes objects are created:
namespace/mapr-csi created
serviceaccount/csi-nodeplugin-sa created
clusterrole.rbac.authorization.k8s.io/csi-nodeplugin-cr created
clusterrolebinding.rbac.authorization.k8s.io/csi-nodeplugin-crb created
serviceaccount/csi-controller-sa created
clusterrole.rbac.authorization.k8s.io/csi-attacher-cr created
clusterrolebinding.rbac.authorization.k8s.io/csi-attacher-crb created
clusterrole.rbac.authorization.k8s.io/csi-controller-cr created
clusterrolebinding.rbac.authorization.k8s.io/csi-controller-crb created
daemonset.apps/csi-nodeplugin-kdf created
statefulset.apps/csi-controller-kdf created

5. Verify the PODs/DaemonSet/StatefulSet are running under namespace "mapr-csi"

PODs:
$ kubectl get pods -n mapr-csi -o wide
NAME                       READY   STATUS    RESTARTS   AGE     IP              NODE                                                NOMINATED NODE   READINESS GATES
csi-controller-kdf-0       5/5     Running   0          5m58s   xx.xx.xx.1      gke-standard-cluster-1-default-pool-aaaaaaaa-1111   <none>           <none>
csi-nodeplugin-kdf-9gmqc   3/3     Running   0          5m58s   xx.xx.xx.2      gke-standard-cluster-1-default-pool-aaaaaaaa-2222   <none>           <none>
csi-nodeplugin-kdf-qhhbh   3/3     Running   0          5m58s   xx.xx.xx.3      gke-standard-cluster-1-default-pool-aaaaaaaa-3333   <none>           <none>
csi-nodeplugin-kdf-vrq4g   3/3     Running   0          5m58s   xx.xx.xx.4      gke-standard-cluster-1-default-pool-aaaaaaaa-4444   <none>           <none>
DaemonSet:
$ kubectl get DaemonSet -n mapr-csi
NAME                 DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
csi-nodeplugin-kdf   3         3         3       3            3           <none>          8m58s
StatefulSet:
$ kubectl get StatefulSet -n mapr-csi
NAME                 READY   AGE
csi-controller-kdf   1/1     9m42s

6. Create a test namespace named "testns" for future test PODs

kubectl create namespace testns

7. Create a Secret for MapR ticket

7.a Log on to the MapR cluster, and locate the ticket file using "maprlogin print", or generate a new ticket file using "maprlogin password".
For example, here we are using the "mapr" user's ticket file located at /tmp/maprticket_5000.
7.b Convert the ticket into its base64 representation and save the output.
cat /tmp/maprticket_5000 | base64
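Note: depending on your base64 implementation, the output may wrap across multiple lines. With GNU coreutils, you can disable wrapping so the output stays on a single line:
base64 -w 0 /tmp/maprticket_5000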
7.c Create a YAML file named "mapr-ticket-secret.yaml" for the Secret named "mapr-ticket-secret" in namespace "testns".
apiVersion: v1
kind: Secret
metadata:
  name: mapr-ticket-secret
  namespace: testns
type: Opaque
data:
  CONTAINER_TICKET: CHANGETHIS!
Note: "CHANGETHIS!" should be replaced by the output we saved in step 7.b. Make sure it is in a single line.
7.d Create this Secret.
kubectl create -f mapr-ticket-secret.yaml
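Alternatively, you can skip the manual base64 step and let kubectl encode the ticket file for you (a sketch, assuming the ticket path from step 7.a):
kubectl create secret generic mapr-ticket-secret -n testns --from-file=CONTAINER_TICKET=/tmp/maprticket_5000
Then verify that the Secret exists:
kubectl get secret mapr-ticket-secret -n testns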

8. Change the GKE default Storage Class

This is because the GKE default Storage Class, named "standard", provisions volumes dynamically.
If we do not change the default, the following steps will automatically create a new PV and bind it to our PVC instead of using the statically provisioned PV we are about to define.
8.a Confirm the default Storage Class is named "standard" in GKE.
$ kubectl get storageclass -o yaml
apiVersion: v1
items:
- allowVolumeExpansion: true
  apiVersion: storage.k8s.io/v1
  kind: StorageClass
  metadata:
    annotations:
      storageclass.kubernetes.io/is-default-class: "true"
    creationTimestamp: "2019-12-04T19:38:38Z"
    labels:
      addonmanager.kubernetes.io/mode: EnsureExists
      kubernetes.io/cluster-service: "true"
    name: standard
    resourceVersion: "285"
    selfLink: /apis/storage.k8s.io/v1/storageclasses/standard
    uid: ab77d472-16cd-11ea-abaf-42010a8000ad
  parameters:
    type: pd-standard
  provisioner: kubernetes.io/gce-pd
  reclaimPolicy: Delete
  volumeBindingMode: Immediate
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
8.b Create a YAML file named "my_storage_class.yaml" for Storage Class named "mysc".
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: mysc
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
8.c Create the Storage Class.
kubectl create -f my_storage_class.yaml
8.d Verify both Storage Classes.
$ kubectl get storageclass
NAME                 PROVISIONER                    AGE
mysc                 kubernetes.io/no-provisioner   8s
standard (default)   kubernetes.io/gce-pd           8h
8.e Change default Storage Class to "mysc".
kubectl patch storageclass mysc -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
kubectl patch storageclass standard -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"false"}}}'
8.f Verify both Storage Classes again.
$ kubectl get storageclass
NAME             PROVISIONER                    AGE
mysc (default)   kubernetes.io/no-provisioner   2m3s
standard         kubernetes.io/gce-pd           8h

9. Create a YAML file named "test-simplepv.yaml" for PersistentVolume (PV) named "test-simplepv"

apiVersion: v1
kind: PersistentVolume
metadata:
  name: test-simplepv
  namespace: testns
  labels:
    name: pv-simplepv-test
spec:
  storageClassName: mysc
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Delete
  capacity:
    storage: 1Gi
  csi:
    nodePublishSecretRef:
      name: "mapr-ticket-secret"
      namespace: "testns"
    driver: com.mapr.csi-kdf
    volumeHandle: mapr.apps
    volumeAttributes:
      volumePath: "/apps"
      cluster: "mycluster.cluster.com"
      cldbHosts: "mycldb.node.internal"
      securityType: "secure"
      platinum: "false"
Make sure the CLDB host can be reached from the Kubernetes cluster nodes, and note that the PV uses our own Storage Class "mysc".
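A quick reachability sanity check from a cluster node or a test POD (a sketch, assuming the default MapR CLDB port 7222 and that netcat is available):
nc -zv mycldb.node.internal 7222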
Create the PV:
kubectl create -f test-simplepv.yaml

10. Create a YAML file named "test-simplepvc.yaml" for PersistentVolumeClaim (PVC) named "test-simplepvc"

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: test-simplepvc
  namespace: testns
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1G
Create the PVC:
kubectl create -f test-simplepvc.yaml
Right now, the PVC should be in "Pending" status, which is expected: the "mysc" Storage Class uses volumeBindingMode WaitForFirstConsumer, so binding is delayed until a POD uses the PVC.
$ kubectl get pv -n testns
NAME            CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM   STORAGECLASS   REASON   AGE
test-simplepv   1Gi        RWO            Delete           Available           mysc                    11s

$ kubectl get pvc -n testns
NAME             STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
test-simplepvc   Pending                                      mysc           11s

11. Create a YAML file named "testpod.yaml" for a POD named "testpod"

apiVersion: v1
kind: Pod
metadata:
  name: testpod
  namespace: testns
spec:
  securityContext:
    runAsUser: 5000
    fsGroup: 5000
  containers:
  - name: busybox
    image: busybox
    args:
    - sleep
    - "1000000"
    resources:
      requests:
        memory: "2Gi"
        cpu: "500m"
    volumeMounts:
    - mountPath: /mapr
      name: maprcsi
  volumes:
    - name: maprcsi
      persistentVolumeClaim:
        claimName: test-simplepvc
Create the POD:
kubectl create -f testpod.yaml
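Watch the POD until it reaches "Running" status (if it is stuck in "ContainerCreating", see the Troubleshooting section below):
kubectl get pod testpod -n testns -w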

After that, both PV and PVC should be "Bound":
$ kubectl get pvc -n testns
NAME             STATUS   VOLUME          CAPACITY   ACCESS MODES   STORAGECLASS   AGE
test-simplepvc   Bound    test-simplepv   1Gi        RWO            mysc           82s

$ kubectl get pv -n testns
NAME            CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                   STORAGECLASS   REASON   AGE
test-simplepv   1Gi        RWO            Delete           Bound    testns/test-simplepvc   mysc                    89s

12. Log on to the POD to verify

kubectl exec -ti testpod -n testns -- /bin/sh
Then try to read and write:
/ $ mount -v |grep mapr
posix-client-basic on /mapr type fuse.posix-client-basic (rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other)
/ $ ls -altr /mapr
total 6
drwxrwxrwt    3 5000     5000             1 Nov 26 16:49 kafka-streams
drwxrwxrwt    3 5000     5000             1 Nov 26 16:49 ksql
drwxrwxrwx    3 5000     5000            15 Dec  4 17:10 spark
drwxr-xr-x    5 5000     5000             3 Dec  5 04:27 .
drwxr-xr-x    1 root     root          4096 Dec  5 04:40 ..
/ $ touch /mapr/testfile
/ $ rm /mapr/testfile
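To also verify the data round-trip (the file name and content here are just illustrative):
/ $ echo "hello from GKE" > /mapr/testfile
/ $ cat /mapr/testfile
hello from GKE
/ $ rm /mapr/testfile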

13. Clean up

kubectl delete -f testpod.yaml
kubectl delete -f test-simplepvc.yaml
kubectl delete -f test-simplepv.yaml
kubectl delete -f my_storage_class.yaml
kubectl delete -f mapr-ticket-secret.yaml
kubectl delete -f csi-maprkdf-v1.0.0.yaml

Common issues:

1. In step 4, creating the MapR CSI ClusterRoleBinding fails with the below error message:
user xxx@yyy.com (groups=["system:authenticated"]) is attempting to grant rbac permissions not currently held
This is because the Google Cloud user "xxx@yyy.com" does not have sufficient permissions.
One solution is to perform step 3, which grants the cluster-admin role to this user.
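To confirm which Google Cloud account you are currently authenticated as (the user that needs the binding):
gcloud config get-value account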

2. After the PV and PVC are created, the PVC is bound to a new PV named "pvc-...." instead of our PV named "test-simplepv".
For example:
$  kubectl get pvc -n testns
NAME             STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
test-simplepvc   Bound    pvc-e9a0f512-16f6-11ea-abaf-42010a8000ad   1Gi        RWO            standard       16m

$  kubectl get pv -n testns
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM                     STORAGECLASS   REASON   AGE
pvc-e9a0f512-16f6-11ea-abaf-42010a8000ad   1Gi        RWO            Delete           Bound       mapr-csi/test-simplepvc   standard                17m
test-simplepv                              1Gi        RWO            Delete           Available    
This is because GKE's default Storage Class "standard" dynamically provisioned a new PV and bound it to our PVC.
We can confirm this using the command below:
$  kubectl get pvc test-simplepvc -o=yaml -n testns
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    pv.kubernetes.io/bind-completed: "yes"
    pv.kubernetes.io/bound-by-controller: "yes"
    volume.beta.kubernetes.io/storage-provisioner: kubernetes.io/gce-pd
  creationTimestamp: "2019-12-05T00:33:52Z"
  finalizers:
  - kubernetes.io/pvc-protection
  name: test-simplepvc
  namespace: testns
  resourceVersion: "61729"
  selfLink: /api/v1/namespaces/testns/persistentvolumeclaims/test-simplepvc
  uid: e9a0f512-16f6-11ea-abaf-42010a8000ad
spec:
  accessModes:
  - ReadWriteOnce
  dataSource: null
  resources:
    requests:
      storage: 1G
  storageClassName: standard
  volumeMode: Filesystem
  volumeName: pvc-e9a0f512-16f6-11ea-abaf-42010a8000ad
status:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 1Gi
  phase: Bound
One solution is to perform step 8, which changes the GKE default Storage Class.
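After fixing the default Storage Class, delete and recreate the PVC so it can bind to the static PV (since the dynamically provisioned PV has reclaim policy "Delete", it is removed automatically along with its PVC):
kubectl delete -f test-simplepvc.yaml
kubectl create -f test-simplepvc.yaml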

Troubleshooting:

DaemonSet "csi-nodeplugin-kdf" has 3 kinds of containers:
[csi-node-driver-registrar liveness-probe mapr-kdfplugin]
StatefulSet "csi-controller-kdf" has 5 kinds of containers:
[csi-attacher csi-provisioner csi-snapshotter liveness-probe mapr-kdfprovisioner]

So we can view all of the container logs to see if there is any error.
For example:
kubectl logs csi-nodeplugin-kdf-vrq4g -c csi-node-driver-registrar -n mapr-csi
kubectl logs csi-nodeplugin-kdf-vrq4g -c liveness-probe -n mapr-csi
kubectl logs csi-nodeplugin-kdf-vrq4g -c mapr-kdfplugin -n mapr-csi

kubectl logs csi-controller-kdf-0 -c csi-provisioner -n mapr-csi
kubectl logs csi-controller-kdf-0 -c csi-attacher -n mapr-csi
kubectl logs csi-controller-kdf-0 -c csi-snapshotter -n mapr-csi
kubectl logs csi-controller-kdf-0 -c mapr-kdfprovisioner -n mapr-csi
kubectl logs csi-controller-kdf-0 -c liveness-probe -n mapr-csi
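If the test POD is stuck in "ContainerCreating", its events usually point to mount or ticket problems as well:
kubectl describe pod testpod -n testns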

Reference:

https://mapr.com/docs/home/CSIdriver/csi_overview.html
https://mapr.com/docs/home/CSIdriver/csi_installation.html
https://mapr.com/docs/home/CSIdriver/csi_example_static_provisioning.html
