Commit Graph

82 Commits

Author SHA1 Message Date
Mehran Kholdi 877e90e034 Expose volume stats as prometheus metrics
This should help in:

- Keeping track of deleted PVs with `Retain` policy
- Detecting disk overprovisioning
2021-07-05 00:00:10 +04:30
Mehran Kholdi 2b6a0a33b8 Refactor: Extract utility functions out of metrics module 2021-07-04 23:15:50 +04:30
Mehran Kholdi c651f69e9c Specifiy fs type in mount commands 2021-07-04 23:15:50 +04:30
Mehran Kholdi 2fb84efb6d Neat: reformat code using black 2021-07-02 20:31:34 +04:30
Mehran Kholdi 7717264801 Update CSI proto to 1.5.0 2021-07-02 20:31:34 +04:30
Mehran Kholdi e585684502 Release 0.5.0 2021-07-01 23:48:23 +04:30
Mehran Kholdi 6d8c7738f3 Do not create volumes smaller than 16MiB
XFS fails in formatting the volume with the following error:

```
agsize (2560 blocks) too small, need at least 4096 blocks
```
2021-07-01 23:48:23 +04:30
Mehran Kholdi eff26e8c3e Drop support for k8s <1.19
So that we can:
* Rely on existence of newer features
* Update external components' images
2021-07-01 23:48:23 +04:30
Mehran Kholdi c454a51ccd Nit: cleanup e2e test scripts 2021-07-01 23:48:23 +04:30
Mehran Kholdi 4d6d83c24a Support xfs filesystem 2021-07-01 22:34:20 +04:30
Mehran Kholdi 7c7e8eb4ce btrfs: Change default subvol upon creation
The default root subvol comes with its own limitations and it might be
better off changing the default subvol upon creation. This should also
let us create hidden subvols that may be used for storing snapshots,
without exposing them to the end-user.
2021-07-01 22:34:20 +04:30
Mehran Kholdi c11646e08c btrfs: Mount with `flushoncommit` flag
Not an expert on this, but my understanding is that without this flag,
outages will result in a state that despite being consistent, most
applications are not mature enough to handle. Namely, we ran benchmarks
that reproduced appearance of zero-length files upon sudden poweroffs.

Databases should be fine since they know well about the guarantees the
filesystem must provide, but not applications are databases. So let's
play safe about this.

See:
- https://thunk.org/tytso/blog/2009/03/12/delayed-allocation-and-the-zero-length-file-problem/
- https://github.com/Zygo/bees/issues/68#issuecomment-403262059
2021-06-26 03:51:40 +04:30
Mehran Kholdi d52f8ffbe0 ext4: Do not reserve free space for root user upon creation
PVCs are data volumes most of the times, and reserving space for system
tasks is probably unnecessary.

The user can still modify a specific PVC's reserved blocks through the
`tune2fs` command.
2021-06-26 03:51:40 +04:30
Mehran Kholdi 87e78705b1 Report "available" space rather than "free" space in volume stats
These two numbers may differ, and having the wrong number may result in
a volume having no useable space, while the metrics suggest it does.
2021-06-26 03:51:40 +04:30
Mehran Kholdi 1cd4ca3d1f Refactor: code cleanup 2021-06-26 02:51:02 +04:30
Mehran Kholdi 89de295293 Fix race condition by making the scrub function idempotent
Under certain situtations, a race condition could lead to pvc deletion
tasks getting stuck in a failing state.
2021-06-26 02:50:39 +04:30
Mehran Kholdi 8db829ed6e Update dependencies 2021-06-26 01:14:00 +04:30
Mehran Kholdi fd2e59929b Fix bug with online resizing btrfs filesystems having non-default subvol
```
Command 'losetup -c /dev/loop0[/default]' returned non-zero exit status 1.
```
2021-03-01 13:45:25 +03:30
Mehran Kholdi 5dc8afc0a6 Fix bug that was preventing btrfs filesystems from being resized 2021-03-01 08:38:31 +03:30
Mehran Kholdi d203eba5a9 Release 0.4.4 2021-02-26 17:56:47 +03:30
Mehran Kholdi 5edcdff216 Fix #5: Actually delete PVC image files 2021-02-26 16:10:10 +03:30
Mehran Kholdi 8bbb30a2e1 Release 0.4.3 2021-02-13 02:40:42 +03:30
Hanieh Marvi bd68bd6e64 Fix typo 2021-02-13 02:03:04 +03:30
Hanieh Marvi ba7f4c1b7f Remove requests from tasks
So pods do not stay in pending state because of lack of resources.
2021-02-13 02:03:04 +03:30
Hanieh Marvi 8424536588 Set resources for sidecar container 2021-02-13 02:03:04 +03:30
Mehran Kholdi ab50217ea5 Release 0.4.2 2021-01-16 04:01:22 +03:30
Mehran Kholdi b4faf9d7cb Expose volume metrics through gRPC calls rather than metrics endpoint 2021-01-16 03:58:08 +03:30
Mehran Kholdi c58dd14bf7 Extract blockdevice-to-filesystem logic from rawfile servicer
Summary: So that it's possible to use it with any other blockdevice provider.

Test Plan: N/A

Reviewers: sina_rad, h.marvi, mhyousefi, s.afshari

Differential Revision: https://phab.hamravesh.ir/D870
2021-01-16 03:58:08 +03:30
Mehran Kholdi 01a35354b6 Fix a bug where broken symlinks where not being cleaned up
See: https://docs.python.org/3/library/pathlib.html#pathlib.Path.exists
"Note If the path points to a symlink, exists() returns whether the symlink points to an existing file or directory."
2021-01-16 03:45:09 +03:30
Mehran Kholdi c2110108cb Change conditions upon which e2e test are run 2020-11-28 04:50:30 +03:30
Mehran Kholdi 9bafb101ac Remove liveness probes 2020-11-28 04:50:11 +03:30
Mehran Kholdi 05c661165f Fix ci setup script
So that it does not explicitly depend on travis
2020-11-08 01:46:08 +03:30
Mehran Kholdi b88fd0cfdf Release 0.4.1 2020-09-11 20:45:17 +04:30
Mehran Kholdi 23c7912977 Update chart's default values 2020-09-11 20:44:56 +04:30
Mehran Kholdi a2cf384d4f Make logs less noisy 2020-09-11 20:44:40 +04:30
Mehran Kholdi 6fde8e0271 Update external csi sidecar containers 2020-09-11 20:44:29 +04:30
Mehran Kholdi 848d87453f Change default provisioner name from `rawfile.hamravesh.com` to `rawfile.csi.openebs.io` 2020-08-15 01:36:05 +04:30
Mehran Kholdi 13a16e70f6 Use semantic versioning 2020-08-14 20:11:07 +04:30
Mehran Kholdi c895312131 Implement `GET_VOLUME_STATS` capability 2020-07-18 09:50:05 +04:30
Mehran Kholdi 83e184c58a Remove redundant ARGs in dockerfile 2020-07-18 09:46:51 +04:30
Mehran Kholdi 77862b85e2 Support custom fsTypes
Test Plan:
- Deploy using `feature-fstype` image tag
- Create the following storage class:
```
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: btrfs-sc
parameters:
  fsType: btrfs
provisioner: rawfile.hamravesh.com
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
```
- Create and use a pvc backed by the new storage class
- Exec into the pod and verify that the mounted volume is a `btrfs` filesystem indeed

Reviewers: h.marvi, sina_rad, mhyousefi, bghadiri

Reviewed By: h.marvi, mhyousefi, bghadiri

Differential Revision: https://phab.hamravesh.ir/D833
2020-07-18 09:46:05 +04:30
Mehran Kholdi 0095c0e83a Implement metadata schema migration
Summary: Trying to add custom `fsType`s, new metadata fields need to be stored. We need a mechanism to migrate existing volume metadata.

Test Plan:
- Install the chart using an older image tag like `36fc480`
- Create and use a pvc
- Verify that the volume's metadata file, located at `/var/csi/rawfile/pvc-.../disk.meta` does not contain the `schema_version` field
- Upgrade the chart to use the image tag `feature-schema-migration`
- Wait until all node pods are upgraded
- Verify that the volume's metadata file contains the new `schema_version` field

Reviewers: bghadiri, h.marvi, mhyousefi, sina_rad

Reviewed By: bghadiri, h.marvi, mhyousefi

Differential Revision: https://phab.hamravesh.ir/D832
2020-07-18 09:41:24 +04:30
Mehran Kholdi d606f8f064 [skip-e2e] Use slugified branch name as docker image tag 2020-07-13 21:18:09 +04:30
Mehran Kholdi 36fc480d28 [skip-e2e] Use openebs's docker image in helm chart 2020-07-13 20:05:40 +04:30
Mehran Kholdi 7a4d90a315 Run e2e tests in parallel when possible 2020-07-11 23:14:13 +04:30
Mehran Kholdi ab407f3349 Setup CI: Build, run e2e tests, and push images to docker hub 2020-07-11 23:14:13 +04:30
Mehran Kholdi 16f92da467 Add "Motivation" section to README 2020-07-11 19:03:23 +04:30
fossabot ee4d9ae660 Add license scan report and status
Signed off by: fossabot <badges@fossa.com>
2020-07-11 19:03:04 +04:30
Mehran Kholdi 8b1be18a15 Hotfix: Make side effects idempotent 2020-06-14 05:10:32 +04:30
Mehran Kholdi d1c0d49cf0 Support online volume expansion
Summary:
Online volume expansion is a 2 phase process:

1. The backing storage, in this case the raw file, needs to be resized. (i.e. `truncate -s`)
2. The node should be notified, so that it can both refresh its device capacity (i.e. `losetup -c`) and resize the filesystem (`resize2fs`) accordingly.

Although in our case both steps could be performed on the node itself, for the sake of following the semantics of how volume expansion works, we perform step 1 from the controller, and step 2 from the node.

Also, the `external-resizer` component is added which watches for PVC size updates, and notifies the CSI controller about it.

Test Plan:
Setup:
- Deploy
- Create a rawfile-backed pvc, and attach a Deployment to it
- Keep an eye on `rawfile` pod logs in `kube-system` namespace to see if any errors pop out during all scenarios

Scenario 1:
- Increase the size of the pvc
- Exec into the pod and verify that the volume is resized indeed (using `df`)

Scenario 2:
- Decrease deployment's replica to 0
- Increase the size of the pvc. Wait for a couple of minutes.
- Increase deployment's replica to 1
- Exec into the pod and verify that the volume is resized indeed.

Reviewers: bghadiri, mhyousefi, h.marvi, sina_rad

Reviewed By: bghadiri, mhyousefi, sina_rad

Differential Revision: https://phab.hamravesh.ir/D817
2020-06-14 03:35:17 +04:30