1
0
forked from k-space/kube
kube/CLUSTER.md

171 lines
8.2 KiB
Markdown
Raw Normal View History

# Kubernetes cluster
Kubernetes hosts run on [PVE Cluster](https://wiki.k-space.ee/en/hosting/proxmox). Hosts are listed in Ansible [inventory](ansible/inventory.yml).
## `kubectl`
- Authorization [ACLs](cluster-role-bindings.yml)
- [Troubleshooting `no such host`](#systemd-resolved-issues)
Authenticate to auth.k-space.ee:
```bash
kubectl krew install oidc-login
mkdir -p ~/.kube
cat << EOF > ~/.kube/config
apiVersion: v1
clusters:
- cluster:
certificate-authority-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUMvakNDQWVhZ0F3SUJBZ0lCQURBTkJna3Foa2lHOXcwQkFRc0ZBREFWTVJNd0VRWURWUVFERXdwcmRXSmwKY201bGRHVnpNQjRYRFRJeU1EVXdNakEzTXpVMU1Wb1hEVE15TURReU9UQTNNelUxTVZvd0ZURVRNQkVHQTFVRQpBeE1LYTNWaVpYSnVaWFJsY3pDQ0FTSXdEUVlKS29aSWh2Y05BUUVCQlFBRGdnRVBBRENDQVFvQ2dnRUJBS2J2CjY3UFlXVHJMc3ZCQTZuWHUvcm55SlVhNnppTnNWTVN6N2w4ekhxM2JuQnhqWVNPUDJhN1RXTnpUTmZDanZBWngKTmlNbXJya1hpb2dYQWpVVkhSUWZlYm81TFIrb0JBOTdLWlcrN01UMFVJRXBuWVVaaTdBRHlaS01vcEJFUXlMNwp1SlU5UDhnNUR1T29FRHZieGJSMXFuV1JZRXpteFNmSFpocllpMVA3bFd4emkxR243eGRETFZaMjZjNm0xR3Y1CnViRjZyaFBXK1JSVkhiQzFKakJGeTBwRXdhYlUvUTd0Z2dic0JQUjk5NVZvMktCeElBelRmbHhVanlYVkJ3MjEKU2d3ZGI1amlpemxEM0NSbVdZZ0ZrRzd0NTVZeGF3ZmpaQjh5bW4xYjhUVjkwN3dRcG8veU8zM3RaaEE3L3BFUwpBSDJYeDk5bkpMbFVGVUtSY1A4Q0F3RUFBYU5aTUZjd0RnWURWUjBQQVFIL0JBUURBZ0trTUE4R0ExVWRFd0VCCi93UUZNQU1CQWY4d0hRWURWUjBPQkJZRUZKNnZKeVk1UlJ1aklQWGxIK2ZvU3g2QzFRT2RNQlVHQTFVZEVRUU8KTUF5Q0NtdDFZbVZ5Ym1WMFpYTXdEUVlKS29aSWh2Y05BUUVMQlFBRGdnRUJBQ04zcGtCTVM3ekkrbUhvOWdTZQp6SzdXdjl3bXlCTVE5Q3crQXBSNnRBQXg2T1VIN0d1enc5TTV2bXNkYjkrYXBKMHBlZFB4SUg3YXZ1aG9SUXNMCkxqTzRSVm9BMG9aNDBZV3J3UStBR0dvdkZuaWNleXRNcFVSNEZjRXc0ZDRmcGl6V3d0TVNlRlRIUXR6WG84V2MKNFJGWC9xUXNVR1NWa01PaUcvcVVrSFpXQVgyckdhWXZ1Tkw2eHdSRnh5ZHpsRTFSUk56TkNvQzVpTXhjaVRNagpackEvK0pqVEFWU2FuNXZnODFOSmthZEphbmNPWmEwS3JEdkZzd1JJSG5CMGpMLzh3VmZXSTV6czZURU1VZUk1ClF6dU01QXUxUFZ4VXZJUGhlMHl6UXZjWDV5RlhnMkJGU3MzKzJBajlNcENWVTZNY2dSSTl5TTRicitFTUlHL0kKY0pjPQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCg==
server: https://master.kube.k-space.ee:6443
name: kubernetes
contexts:
- context:
cluster: kubernetes
user: oidc
name: default
current-context: default
kind: Config
preferences: {}
users:
- name: oidc
user:
exec:
apiVersion: client.authentication.k8s.io/v1beta1
args:
- oidc-login
- get-token
- --oidc-issuer-url=https://auth.k-space.ee/
- --oidc-client-id=passmower.kubelogin
- --oidc-use-pkce
- --oidc-extra-scope=profile,email,groups
- --listen-address=127.0.0.1:27890
command: kubectl
env: null
provideClusterInfo: false
EOF
# Test it:
kubectl get nodes # opens browser for authentication
```
### systemd-resolved issues
```sh
Unable to connect to the server: dial tcp: lookup master.kube.k-space.ee on 127.0.0.53:53: no such host
```
```
Network → VPN → `IPv4` → Other nameservers (Muud nimeserverid): `172.21.0.1`
Network → VPN → `IPv6` → Other nameservers (Muud nimeserverid): `2001:bb8:4008:21::1`
Network → VPN → `IPv4` → Search domains (Otsingudomeenid): `kube.k-space.ee`
Network → VPN → `IPv6` → Search domains (Otsingudomeenid): `kube.k-space.ee`
```
## Cluster formation
Created Ubuntu 22.04 VM-s on Proxmox with local storage.
Added some ARM64 workers by using Ubuntu 22.04 server on Raspberry Pi.
After machines have booted up and you can reach them via SSH:
```
# Disable Ubuntu caching DNS resolver
systemctl disable systemd-resolved.service
systemctl stop systemd-resolved
rm -fv /etc/resolv.conf
cat > /etc/resolv.conf << EOF
nameserver 1.1.1.1
nameserver 8.8.8.8
EOF
# Disable multipathd as Longhorn handles that itself
systemctl mask multipathd snapd
systemctl disable --now multipathd snapd bluetooth ModemManager hciuart wpa_supplicant packagekit
# Permit root login
sed -i -e 's/PermitRootLogin no/PermitRootLogin without-password/' /etc/ssh/sshd_config
systemctl reload ssh
cat ~ubuntu/.ssh/authorized_keys > /root/.ssh/authorized_keys
userdel -f ubuntu
apt-get install -yqq linux-image-generic
apt-get remove -yq cloud-init linux-image-*-kvm
```
On master:
```
kubeadm init --token-ttl=120m --pod-network-cidr=10.244.0.0/16 --control-plane-endpoint "master.kube.k-space.ee:6443" --upload-certs --apiserver-cert-extra-sans master.kube.k-space.ee --node-name master1.kube.k-space.ee
```
For the `kubeadm join` command specify FQDN via `--node-name $(hostname -f)`.
Set AZ labels:
```
for j in $(seq 1 9); do
for t in master mon worker storage; do
kubectl label nodes ${t}${j}.kube.k-space.ee topology.kubernetes.io/zone=node${j}
done
done
```
After forming the cluster add taints:
```bash
for j in $(seq 1 9); do
kubectl label nodes worker${j}.kube.k-space.ee node-role.kubernetes.io/worker=''
done
for j in $(seq 1 4); do
kubectl taint nodes mon${j}.kube.k-space.ee dedicated=monitoring:NoSchedule
kubectl label nodes mon${j}.kube.k-space.ee dedicated=monitoring
done
for j in $(seq 1 4); do
kubectl taint nodes storage${j}.kube.k-space.ee dedicated=storage:NoSchedule
kubectl label nodes storage${j}.kube.k-space.ee dedicated=storage
done
```
For `arm64` nodes add suitable taint to prevent scheduling non-multiarch images on them:
```bash
kubectl taint nodes worker9.kube.k-space.ee arch=arm64:NoSchedule
```
For door controllers:
```
for j in ground front back; do
kubectl taint nodes door-${j}.kube.k-space.ee dedicated=door:NoSchedule
kubectl label nodes door-${j}.kube.k-space.ee dedicated=door
kubectl taint nodes door-${j}.kube.k-space.ee arch=arm64:NoSchedule
done
```
To reduce wear on storage:
```
echo StandardOutput=null >> /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
systemctl daemon-reload
systemctl restart kubelet
```
## Technology mapping
Our self-hosted Kubernetes stack compared to AWS based deployments:
| Hipster startup | Self-hosted hackerspace | Purpose |
|-------------------|-------------------------------------|---------------------------------------------------------------------|
| AWS ALB | Traefik | Reverse proxy also known as ingress controller in Kubernetes jargon |
| AWS AMP | Prometheus Operator | Monitoring and alerting |
| AWS CloudTrail | ECK Operator | Log aggregation |
| AWS DocumentDB | MongoDB Community Operator | Highly available NoSQL database |
| AWS EBS | Longhorn | Block storage for arbitrary applications needing persistent storage |
| AWS EC2 | Proxmox | Virtualization layer |
| AWS ECR | Harbor | Docker registry |
| AWS EKS | kubeadm | Provision Kubernetes master nodes |
| AWS NLB | MetalLB | L2/L3 level load balancing |
| AWS RDS for MySQL | MySQL Operator | Provision highly available relational databases |
| AWS Route53 | Bind and RFC2136 | DNS records and Let's Encrypt DNS validation |
| AWS S3 | Minio Operator | Highly available object storage |
| AWS VPC | Calico | Overlay network |
| Dex | Passmower | ACL mapping and OIDC provider which integrates with GitHub/Samba |
| GitHub Actions | Drone | Build Docker images |
| GitHub | Gitea | Source code management, issue tracking |
| GitHub OAuth2 | Samba (Active Directory compatible) | Source of truth for authentication and authorization |
| Gmail | Wildduck | E-mail |