Summary: To support custom `fsType`s, new metadata fields need to be stored. We need a mechanism to migrate existing volume metadata.
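A rough sketch of the intended migration mechanism, assuming JSON-encoded metadata (the function and constant names below are illustrative, not the actual implementation):

```python
import json

LATEST_SCHEMA_VERSION = 1  # assumed value, for illustration only

def migrate_metadata(meta: dict) -> dict:
    # Volumes created before this change carry no schema_version field,
    # so treat them as version 0 and upgrade from there.
    if meta.get("schema_version", 0) < LATEST_SCHEMA_VERSION:
        meta["schema_version"] = LATEST_SCHEMA_VERSION
        # future migrations (e.g. filling in a default fsType) chain here
    return meta

def load_meta(path: str) -> dict:
    # Migrate on read and persist the result, so existing volumes are
    # upgraded the first time the new driver touches them.
    with open(path) as f:
        meta = migrate_metadata(json.load(f))
    with open(path, "w") as f:
        json.dump(meta, f)
    return meta
```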
Test Plan:
- Install the chart using an older image tag like `36fc480`
- Create and use a PVC
- Verify that the volume's metadata file, located at `/var/csi/rawfile/pvc-.../disk.meta`, does not contain the `schema_version` field
- Upgrade the chart to use the image tag `feature-schema-migration`
- Wait until all node pods are upgraded
- Verify that the volume's metadata file contains the new `schema_version` field
Reviewers: bghadiri, h.marvi, mhyousefi, sina_rad
Reviewed By: bghadiri, h.marvi, mhyousefi
Differential Revision: https://phab.hamravesh.ir/D832
Summary:
Online volume expansion is a two-phase process:
1. The backing storage, in this case the raw file, needs to be resized (e.g. via `truncate -s`).
2. The node should be notified, so that it can both refresh its device capacity (`losetup -c`) and resize the filesystem (`resize2fs`) accordingly.
Although in our case both steps could be performed on the node itself, to follow the semantics of how CSI volume expansion works, we perform step 1 from the controller and step 2 from the node.
Also, the `external-resizer` component is added, which watches for PVC size updates and notifies the CSI controller about them.
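A condensed sketch of the two halves, using the commands named above (the function names and the CSI plumbing around them are illustrative):

```python
import subprocess

def controller_expand_volume(img_file: str, new_size_bytes: int) -> None:
    # Phase 1, on the controller: grow the backing raw file.
    subprocess.run(["truncate", "-s", str(new_size_bytes), img_file], check=True)

def node_expand_volume(loop_device: str) -> None:
    # Phase 2, on the node: make the loop device re-read the backing
    # file's new size, then grow the filesystem to fill it.
    subprocess.run(["losetup", "-c", loop_device], check=True)
    subprocess.run(["resize2fs", loop_device], check=True)
```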
Test Plan:
Setup:
- Deploy
- Create a rawfile-backed PVC and attach a Deployment to it
- Keep an eye on the `rawfile` pod logs in the `kube-system` namespace to see whether any errors pop up during the scenarios below
Scenario 1:
- Increase the size of the PVC
- Exec into the pod and verify that the volume has indeed been resized (using `df`)
Scenario 2:
- Scale the Deployment's replicas down to 0
- Increase the size of the PVC. Wait for a couple of minutes.
- Scale the Deployment's replicas back up to 1
- Exec into the pod and verify that the volume has indeed been resized.
Reviewers: bghadiri, mhyousefi, h.marvi, sina_rad
Reviewed By: bghadiri, mhyousefi, sina_rad
Differential Revision: https://phab.hamravesh.ir/D817
Test Plan:
- Check `:9100/metrics` for the existence of inode-related metrics
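The inode numbers presumably come from `statvfs` on the volume's mount point; a sketch of the counters involved (the dictionary keys are illustrative, not the exported metric names):

```python
import os

def inode_stats(mountpoint: str) -> dict:
    # statvfs exposes total and free inode counts for the filesystem.
    st = os.statvfs(mountpoint)
    return {
        "inodes_total": st.f_files,
        "inodes_free": st.f_ffree,
        "inodes_used": st.f_files - st.f_ffree,
    }
```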
Reviewers: h.marvi, mhyousefi
Reviewed By: h.marvi
Differential Revision: https://phab.hamravesh.ir/D816
Summary:
Formerly, we updated the metrics ourselves every 15 seconds. Doing it manually caused a couple of issues:
- Outdated metrics after a one-time crash
- Metrics remaining exposed for deleted PVs
Instead of fixing the bugs individually, I preferred to do it the right way. As per the `python-prometheus` docs:
> Sometimes it is not possible to directly instrument code, as it is not in your control. This requires you to proxy metrics from other systems. To do so you need to create a custom collector...
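A minimal sketch of that custom-collector pattern with the Prometheus Python client (the metric name and the `list_volumes` discovery helper are hypothetical):

```python
from prometheus_client.core import GaugeMetricFamily, REGISTRY

class VolumeStatsCollector:
    def collect(self):
        # Called on every scrape: metrics are computed on the fly, so a
        # deleted PV simply stops appearing, and a one-time crash cannot
        # leave stale values behind.
        g = GaugeMetricFamily(
            "rawfile_volume_used_bytes",  # illustrative metric name
            "Bytes used on the rawfile-backed volume",
            labels=["volume"],
        )
        for vol in list_volumes():  # hypothetical discovery helper
            g.add_metric([vol.name], vol.used_bytes)
        yield g

REGISTRY.register(VolumeStatsCollector())
```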
Test Plan:
- Deploy on a cluster with existing rawfile PVs
- Send a request to `:9100/metrics` and assert that metrics are exposed
- Delete a PV, and assert that its metrics disappear
Reviewers: h.marvi, bghadiri, sina_rad, mhyousefi
Reviewed By: h.marvi, bghadiri, sina_rad
Differential Revision: https://phab.hamravesh.ir/D815
Summary: Before this, we mounted the raw file directly on the mount point. This revision implements the `STAGE_UNSTAGE_VOLUME` capability: the volume is first mounted at a staging path, and then `bind`-mounted to the actual target path. This way, loopback devices can be freed up when they are not needed.
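A condensed sketch of the stage/publish split (the CSI plumbing is omitted and the helper names are illustrative):

```python
import subprocess

def node_stage_volume(img_file: str, staging_path: str) -> str:
    # Attach the raw file to a free loop device and mount it once per node.
    loop_device = subprocess.run(
        ["losetup", "--find", "--show", img_file],
        capture_output=True, text=True, check=True,
    ).stdout.strip()
    subprocess.run(["mount", loop_device, staging_path], check=True)
    return loop_device

def node_publish_volume(staging_path: str, target_path: str) -> None:
    # Bind-mount the staged filesystem into the pod; no extra loop device.
    subprocess.run(["mount", "--bind", staging_path, target_path], check=True)

def node_unstage_volume(staging_path: str, loop_device: str) -> None:
    # Unmounting and detaching here is what frees the loop device
    # once no pod on the node uses the volume anymore.
    subprocess.run(["umount", staging_path], check=True)
    subprocess.run(["losetup", "--detach", loop_device], check=True)
```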
Test Plan:
- Create a PVC, and use it inside a pod
- Run `losetup -l` on the node running the pod, and assert the creation of a loop device
- Delete the pod, but not the pvc
- Run `losetup -l` on the same node, and assert the removal of the loop device
Reviewers: h.marvi, bghadiri
Differential Revision: https://phab.hamravesh.ir/D806