diff --git a/README.md b/README.md index f640e2721..4f11a4002 100644 --- a/README.md +++ b/README.md @@ -14,7 +14,6 @@ CSI driver for provisioning Local PVs backed by ZFS and more. ZFS-LocalPV was declared GA in Dec 2020. Many users are running ZFS-LocalPV in production, see what our [adopters](./Adopters.md) are saying. - ## Project Tracker See [roadmap](https://github.com/orgs/openebs/projects/10), [e2e-wiki](https://github.com/openebs/zfs-localpv/wiki/ZFS-LocalPV-e2e-test-cases) and [e2e-test](https://github.com/openebs/e2e-tests/projects/7). @@ -23,8 +22,7 @@ See [roadmap](https://github.com/orgs/openebs/projects/10), [e2e-wiki](https://g ### Prerequisites -Before installing ZFS driver please make sure your Kubernetes Cluster -must meet the following prerequisites: +Before installing the ZFS-LocalPV driver please make sure your Kubernetes Cluster meets the following prerequisites: 1. all the nodes must have zfs utils installed 2. ZPOOL has been setup for provisioning the volume @@ -34,42 +32,39 @@ must meet the following prerequisites: ### Supported System -K8S : 1.20+ - -OS : Ubuntu, CentOS - -ZFS : 0.7, 0.8 +| Name | Version | +| :--- | :--- | +| K8S | 1.20+ | +| OS | Ubuntu, CentOS | +| ZFS | 0.7, 0.8 | Check the [features](./docs/features.md) supported for each k8s version. ### Setup -All the node should have zfsutils-linux installed. We should go to the -each node of the cluster and install zfs utils -``` +All the node should have zfsutils-linux installed. We should go to the each node of the cluster and install zfs utils : +```bash $ apt-get install zfsutils-linux ``` Go to each node and create the ZFS Pool, which will be used for provisioning the volumes. You can create the Pool of your choice, it can be striped, mirrored or raidz pool. -If you have the disk(say /dev/sdb) then you can use the below command to create a striped pool: - -``` -zpool create zfspv-pool /dev/sdb +If you have the disk(say /dev/sdb) then you can use the below command to create a striped pool : +```bash +$ zpool create zfspv-pool /dev/sdb ``` You can also create mirror or raidz pool as per your need. Check https://github.com/openzfs/zfs for more information. If you don't have the disk, then you can create the zpool on the loopback device which is backed by a sparse file. Use this for testing purpose only. -``` -truncate -s 100G /tmp/disk.img -zpool create zfspv-pool `sudo losetup -f /tmp/disk.img --show` +```bash +$ truncate -s 100G /tmp/disk.img +$ zpool create zfspv-pool `losetup -f /tmp/disk.img --show` ``` Once the ZFS Pool is created, verify the pool via `zpool status` command, you should see something like this : - -``` -$ sudo zpool status +```bash +$ zpool status pool: zfspv-pool state: ONLINE scan: none requested @@ -88,9 +83,8 @@ https://github.com/openebs/zfs-localpv/blob/HEAD/docs/faq.md#6-how-to-add-custom ### Installation -We can install the latest release of OpenEBS ZFS driver by running the following command. - -``` +We can install the latest release of OpenEBS ZFS driver by running the following command: +```bash $ kubectl apply -f https://openebs.github.io/charts/zfs-operator.yaml ``` @@ -99,39 +93,26 @@ We can also install it via kustomize using `kubectl apply -k deploy/yamls`, chec **NOTE:** If you are running a custom Kubelet location, or a Kubernetes distribution that uses a custom Kubelet location, the `kubelet` directory must be changed at all relevant places in the YAML powering the operator (both the `openebs-zfs-controller` and `openebs-zfs-node`). - For `microk8s`, we need to change the kubelet directory to `/var/snap/microk8s/common/var/lib/kubelet/`, we need to replace `/var/lib/kubelet/` with `/var/snap/microk8s/common/var/lib/kubelet/` at all the places in the operator yaml and then we can apply it on microk8s. - - For `k0s`, the default directory (`/var/lib/kubelet`) should be changed to `/var/lib/k0s/kubelet`. - - For `RancherOS`, the default directory (`/var/lib/kubelet`) should be changed to `/opt/rke/var/lib/kubelet`. -Verify that the ZFS driver Components are installed and running using below command : - -``` +Verify that the ZFS driver Components are installed and running using below command. Depending on number of nodes, you will see one zfs-controller pod and zfs-node daemonset running on the nodes : +```bash $ kubectl get pods -n kube-system -l role=openebs-zfs -``` - -Depending on number of nodes, you will see one zfs-controller pod and zfs-node daemonset running -on the nodes. - -``` NAME READY STATUS RESTARTS AGE openebs-zfs-controller-0 5/5 Running 0 5h28m openebs-zfs-node-4d94n 2/2 Running 0 5h28m openebs-zfs-node-gssh8 2/2 Running 0 5h28m openebs-zfs-node-twmx8 2/2 Running 0 5h28m - ``` -Once ZFS driver is installed we can provision a volume. - +Once ZFS driver is installed and running we can provision a volume. ### Deployment #### 1. Create a Storage class -``` -$ cat sc.yaml - +```yaml apiVersion: storage.k8s.io/v1 kind: StorageClass metadata: @@ -145,10 +126,10 @@ parameters: provisioner: zfs.csi.openebs.io ``` -The storage class contains the volume parameters like recordsize(should be power of 2), compression, dedup and fstype. You can select what are all -parameters you want. In case, zfs properties paramenters are not provided, the volume will inherit the properties from the ZFS Pool. +The storage class contains the volume parameters like recordsize(should be power of 2), compression, dedup and fstype. You can select what are all parameters you want. In case, zfs properties paramenters are not provided, the volume will inherit the properties from the ZFS Pool. + The *poolname* is the must argument. It should be noted that *poolname* can either be the root dataset or a child dataset e.g. -``` +```yaml poolname: "zfspv-pool" poolname: "zfspv-pool/child" ``` @@ -157,14 +138,15 @@ Also the dataset provided under `poolname` must exist on *all the nodes* with th ##### ext2/3/4 or xfs or btrfs as FsType -If we provide fstype as ext2/3/4 or xfs or btrfs, the driver will create a ZVOL, which is a blockdevice carved out of ZFS Pool. -This blockdevice will again formatted as corresponding filesystem(ext2/3/4 or xfs). In this way applications will get desired filesystem. -Here, in this case there will be a filesystem layer on top of ZFS filesystem, and applications may not get the optimal performance. -The sample storage class for ext4 fstype is provided below :- +If we provide fstype as one of ext2/3/4 or xfs or btrfs, the driver will create a ZVOL, which is a blockdevice carved out of ZFS Pool. +This blockdevice will be formatted with corresponding filesystem before it's used by the driver. -``` -$ cat sc.yaml +> **Note** +> This means there will be a filesystem layer on top of ZFS volume, and applications may not get optimal performance. + +A sample storage class for `ext4` fstype is provided below : +```yaml apiVersion: storage.k8s.io/v1 kind: StorageClass metadata: @@ -180,15 +162,13 @@ provisioner: zfs.csi.openebs.io Here please note that we are providing `volblocksize` instead of `recordsize` since we will create a ZVOL, for which we can choose the blocksize with which we want to create the block device. Here, please note that for ZFS, volblocksize should be power of 2. -##### zfs as FsType +##### ZFS as FsType -In case if we provide "zfs" as the fstype, the zfs driver will create ZFS DATASET in the ZFS Pool, which is the zfs filesystem. -Here, there will not be any extra layer between application and storage, and applications can get the optimal performance. -The sample storage class for zfs fstype is provided below :- +In case if we provide "zfs" as the fstype, the ZFS driver will create ZFS DATASET in the ZFS Pool, which is the ZFS filesystem. Here, there will not be any extra layer between application and storage, and applications can get the optimal performance. -``` -$ cat sc.yaml +The sample storage class for ZFS fstype is provided below : +```yaml apiVersion: storage.k8s.io/v1 kind: StorageClass metadata: @@ -209,7 +189,7 @@ Here please note that we are providing `recordsize` which will be used to create If ZFS pool is available on certain nodes only, then make use of topology to tell the list of nodes where we have the ZFS pool available. As shown in the below storage class, we can use allowedTopologies to describe ZFS pool availability on nodes. -``` +```yaml apiVersion: storage.k8s.io/v1 kind: StorageClass metadata: @@ -235,18 +215,20 @@ The above storage class tells that ZFS pool "zfspv-pool" is available on nodes z Please note that the provisioner name for ZFS driver is "zfs.csi.openebs.io", we have to use this while creating the storage class so that the volume provisioning/deprovisioning request can come to ZFS driver. ##### Scheduler - + The ZFS driver has its own scheduler which will try to distribute the PV across the nodes so that one node should not be loaded with all the volumes. Currently the driver supports two scheduling algorithms: VolumeWeighted and CapacityWeighted, in which it will try to find a ZFS pool which has less number of volumes provisioned in it or less capacity of volume provisioned out of a pool respectively, from all the nodes where the ZFS pools are available. To know about how to select scheduler via storage-class See [this](https://github.com/openebs/zfs-localpv/blob/HEAD/docs/storageclasses.md#storageclass-with-k8s-scheduler). -Once it is able to find the node, it will create a PV for that node and also create a ZFSVolume custom resource for the volume with the NODE information. The watcher for this ZFSVolume -CR will get all the information for this object and creates a ZFS dataset(zvol) with the given ZFS property on the mentioned node. +Once it is able to find the node, it will create a PV for that node and also create a ZFSVolume custom resource for the volume with the NODE information. The watcher for this ZFSVolume CR will get all the information for this object and creates a ZFS dataset(zvol) with the given ZFS property on the mentioned node. The scheduling algorithm currently only accounts for either the number of ZFS volumes or total capacity occupied from a zpool and does not account for other factors like available cpu or memory while making scheduling decisions. + So if you want to use node selector/affinity rules on the application pod, or have cpu/memory constraints, kubernetes scheduler should be used. To make use of kubernetes scheduler, you can set the `volumeBindingMode` as `WaitForFirstConsumer` in the storage class. + This will cause a delayed binding, i.e kubernetes scheduler will schedule the application pod first and then it will ask the ZFS driver to create the PV. -The driver will then create the PV on the node where the pod is scheduled. -``` +The driver will then create the PV on the node where the pod is scheduled : + +```yaml apiVersion: storage.k8s.io/v1 kind: StorageClass metadata: @@ -263,14 +245,13 @@ volumeBindingMode: WaitForFirstConsumer ``` Please note that once a PV is created for a node, application using that PV will always get scheduled to that particular node only, as PV will be sticky to that node. + The scheduling algorithm by ZFS driver or kubernetes will come into picture only during the deployment time. Once the PV is created, the application can not move anywhere as the data is there on the node where the PV is. -#### 2. Create the PVC - -``` -$ cat pvc.yaml +#### 2. Create a PVC +```yaml kind: PersistentVolumeClaim apiVersion: v1 metadata: @@ -286,15 +267,15 @@ spec: Create a PVC using the storage class created for the ZFS driver. Here, the allocated volume size will be rounded off to the nearest Mi or Gi notation, check the [faq](./docs/faq.md#7-why-the-zfs-volume-size-is-different-than-the-reqeusted-size-in-pvc) section for more details. -If we are using the immediate binding in the storageclass then we can check the kubernetes resource for the corresponding zfs volume, other wise in late binding case, we can check the same after pod has been scheduled. +If we are using the immediate binding in the storageclass then we can check the kubernetes resource for the corresponding ZFS volume, otherwise in late binding case, we can check the same after pod has been scheduled : -``` +```bash $ kubectl get zv -n openebs NAME ZPOOL NODE SIZE STATUS FILESYSTEM AGE pvc-34133838-0d0d-11ea-96e3-42010a800114 zfspv-pool zfspv-node1 4294967296 Ready zfs 4s ``` -``` +```bash $ kubectl describe zv pvc-34133838-0d0d-11ea-96e3-42010a800114 -n openebs Name: pvc-34133838-0d0d-11ea-96e3-42010a800114 Namespace: openebs @@ -324,24 +305,22 @@ Status: Events: ``` -The ZFS driver will create a ZFS dataset(or zvol as per fstype in the storageclass) on the node zfspv-node1 for the mentioned ZFS pool and the dataset name will same as PV name. -Go to the node zfspv-node1 and check the volume :- +The ZFS driver will create a ZFS dataset (or zvol as per fstype in the storageclass) on the node zfspv-node1 for the mentioned ZFS pool and the dataset name will same as PV name. -``` +Go to the node zfspv-node1 and check the volume : + +```bash $ zfs list NAME USED AVAIL REFER MOUNTPOINT zfspv-pool 444K 362G 96K /zfspv-pool zfspv-pool/pvc-34133838-0d0d-11ea-96e3-42010a800114 96K 4.00G 96K legacy - ``` #### 3. Deploy the application Create the deployment yaml using the pvc backed by ZFS-LocalPV storage. -``` -$ cat fio.yaml - +```yaml apiVersion: v1 kind: Pod metadata: @@ -370,44 +349,44 @@ by the application for reading/writting the data and space is consumed from the ZFS Volume Property can be changed like compression on/off can be done by just simply editing the kubernetes resource for the corresponding zfs volume by using below command : -``` -kubectl edit zv pvc-34133838-0d0d-11ea-96e3-42010a800114 -n openebs +```bash +$ kubectl edit zv pvc-34133838-0d0d-11ea-96e3-42010a800114 -n openebs ``` You can edit the relevant property like make compression on or make dedup on and save it. This property will be applied to the corresponding volume and can be verified using below command on the node: -``` -zfs get all zfspv-pool/pvc-34133838-0d0d-11ea-96e3-42010a800114 +```bash +$ zfs get all zfspv-pool/pvc-34133838-0d0d-11ea-96e3-42010a800114 ``` #### 5. Deprovisioning for deprovisioning the volume we can delete the application which is using the volume and then we can go ahead and delete the pv, as part of deletion of pv this volume will also be deleted from the ZFS pool and data will be freed. -``` +```bash $ kubectl delete -f fio.yaml pod "fio" deleted $ kubectl delete -f pvc.yaml persistentvolumeclaim "csi-zfspv" deleted ``` -

CAUTION:

- -Follow below practice while running kernel ZFS along with cStor on the same set of nodes -- Disable zfs-import-scan.service service that will avoid importing all pools by scanning all the available devices in the system, disabling scan service will avoid importing pools that are not created by kernel. Disabling scan service will not cause harm since zfs-import-cache.service is enabled and it is the best way to import pools by looking at cache file during boot time. -```sh -sudo systemctl stop zfs-import-scan.service -sudo systemctl disable zfs-import-scan.service -``` -- Always maintain upto date /etc/zfs/zpool.cache while performing operations any day2 operations on zfs pools(zpool set cachefile=/etc/zfs/zpool.cache ). - -Note: Following above two step kernel ZFS will not import the pools created by cStor +> ***Warning*** +> If you are running running kernel ZFS and cStor on the same set of nodes, the following two points are best practice: +> +> Disable zfs-import-scan.service service that will avoid importing all pools by scanning all the available devices in the system, disabling scan service will avoid importing pools that are not created by kernel. Disabling scan service will not cause harm since zfs-import-cache.service is enabled and it is the best way to import pools by looking at cache file during boot time. +> ```bassh +> $ systemctl stop zfs-import-scan.service +> $ systemctl disable zfs-import-scan.service +> ``` +> +> Always maintain upto date /etc/zfs/zpool.cache while performing operations any day2 operations on zfs pools(zpool set cachefile=/etc/zfs/zpool.cache ). +> +> Note: After performing the above steps, the kernel ZFS will not import pools created by cStor -Features ---- +## Features - [x] Access Modes - [x] ReadWriteOnce