kube/CLUSTER.md
rasmus 66034d2463 docs: mega refactor
Also bunch of edits at wiki.k-space.ee
2024-07-30 10:51:34 +03:00

Kubernetes cluster

The Kubernetes hosts run on the PVE (Proxmox VE) cluster. The hosts are listed in the Ansible inventory.

kubectl

Authenticate to auth.k-space.ee:

kubectl krew install oidc-login
mkdir -p ~/.kube

cat << EOF > ~/.kube/config
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUMvakNDQWVhZ0F3SUJBZ0lCQURBTkJna3Foa2lHOXcwQkFRc0ZBREFWTVJNd0VRWURWUVFERXdwcmRXSmwKY201bGRHVnpNQjRYRFRJeU1EVXdNakEzTXpVMU1Wb1hEVE15TURReU9UQTNNelUxTVZvd0ZURVRNQkVHQTFVRQpBeE1LYTNWaVpYSnVaWFJsY3pDQ0FTSXdEUVlKS29aSWh2Y05BUUVCQlFBRGdnRVBBRENDQVFvQ2dnRUJBS2J2CjY3UFlXVHJMc3ZCQTZuWHUvcm55SlVhNnppTnNWTVN6N2w4ekhxM2JuQnhqWVNPUDJhN1RXTnpUTmZDanZBWngKTmlNbXJya1hpb2dYQWpVVkhSUWZlYm81TFIrb0JBOTdLWlcrN01UMFVJRXBuWVVaaTdBRHlaS01vcEJFUXlMNwp1SlU5UDhnNUR1T29FRHZieGJSMXFuV1JZRXpteFNmSFpocllpMVA3bFd4emkxR243eGRETFZaMjZjNm0xR3Y1CnViRjZyaFBXK1JSVkhiQzFKakJGeTBwRXdhYlUvUTd0Z2dic0JQUjk5NVZvMktCeElBelRmbHhVanlYVkJ3MjEKU2d3ZGI1amlpemxEM0NSbVdZZ0ZrRzd0NTVZeGF3ZmpaQjh5bW4xYjhUVjkwN3dRcG8veU8zM3RaaEE3L3BFUwpBSDJYeDk5bkpMbFVGVUtSY1A4Q0F3RUFBYU5aTUZjd0RnWURWUjBQQVFIL0JBUURBZ0trTUE4R0ExVWRFd0VCCi93UUZNQU1CQWY4d0hRWURWUjBPQkJZRUZKNnZKeVk1UlJ1aklQWGxIK2ZvU3g2QzFRT2RNQlVHQTFVZEVRUU8KTUF5Q0NtdDFZbVZ5Ym1WMFpYTXdEUVlKS29aSWh2Y05BUUVMQlFBRGdnRUJBQ04zcGtCTVM3ekkrbUhvOWdTZQp6SzdXdjl3bXlCTVE5Q3crQXBSNnRBQXg2T1VIN0d1enc5TTV2bXNkYjkrYXBKMHBlZFB4SUg3YXZ1aG9SUXNMCkxqTzRSVm9BMG9aNDBZV3J3UStBR0dvdkZuaWNleXRNcFVSNEZjRXc0ZDRmcGl6V3d0TVNlRlRIUXR6WG84V2MKNFJGWC9xUXNVR1NWa01PaUcvcVVrSFpXQVgyckdhWXZ1Tkw2eHdSRnh5ZHpsRTFSUk56TkNvQzVpTXhjaVRNagpackEvK0pqVEFWU2FuNXZnODFOSmthZEphbmNPWmEwS3JEdkZzd1JJSG5CMGpMLzh3VmZXSTV6czZURU1VZUk1ClF6dU01QXUxUFZ4VXZJUGhlMHl6UXZjWDV5RlhnMkJGU3MzKzJBajlNcENWVTZNY2dSSTl5TTRicitFTUlHL0kKY0pjPQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCg==
    server: https://master.kube.k-space.ee:6443
  name: kubernetes
contexts:
- context:
    cluster: kubernetes
    user: oidc
  name: default
current-context: default
kind: Config
preferences: {}
users:
- name: oidc
  user:
    exec:
      apiVersion: client.authentication.k8s.io/v1beta1
      args:
      - oidc-login
      - get-token
      - --oidc-issuer-url=https://auth.k-space.ee/
      - --oidc-client-id=passmower.kubelogin
      - --oidc-use-pkce
      - --oidc-extra-scope=profile,email,groups
      - --listen-address=127.0.0.1:27890
      command: kubectl
      env: null
      provideClusterInfo: false
EOF

# Test it:
kubectl get nodes # opens browser for authentication
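
If authentication does not start, a quick sanity check of the generated file can rule out a truncated heredoc. The sketch below is only an illustration: `check_kubeconfig` is a hypothetical helper name, and it merely greps for the three values used in the config above.

```shell
# Hypothetical helper: report which expected settings are missing from a kubeconfig.
# Returns 0 only when the API server URL, OIDC issuer and client ID are all present.
check_kubeconfig() {
  local config="${1:-$HOME/.kube/config}"
  local missing=0
  local needle
  for needle in \
    'server: https://master.kube.k-space.ee:6443' \
    '--oidc-issuer-url=https://auth.k-space.ee/' \
    '--oidc-client-id=passmower.kubelogin'
  do
    # -F: fixed string, -e: needed because some needles start with "--"
    if ! grep -qF -e "$needle" "$config"; then
      echo "missing: $needle" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Run `check_kubeconfig` with no arguments to inspect `~/.kube/config`, or pass another path as the first argument.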

systemd-resolved issues

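If kubectl fails with an error like the following, the local systemd-resolved stub (127.0.0.53) cannot resolve the cluster hostname:

Unable to connect to the server: dial tcp: lookup master.kube.k-space.ee on 127.0.0.53:53: no such host

Set the nameservers and search domains on the VPN connection in the network settings:
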
Network → VPN → `IPv4` → Other nameservers (Muud nimeserverid): `172.21.0.1`
Network → VPN → `IPv6` → Other nameservers (Muud nimeserverid): `2001:bb8:4008:21::1`
Network → VPN → `IPv4` → Search domains (Otsingudomeenid): `kube.k-space.ee`
Network → VPN → `IPv6` → Search domains (Otsingudomeenid): `kube.k-space.ee`
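
The same settings can probably be applied from a terminal. The sketch below assumes the VPN is a NetworkManager connection; the connection name is a placeholder you must supply, and `set_vpn_dns` is a hypothetical helper. With `DRY_RUN=1` it only prints the `nmcli` command.

```shell
# Assumption: the VPN is managed by NetworkManager; $1 is its connection name.
# DRY_RUN=1 prints the command instead of running it.
set_vpn_dns() {
  local conn="$1"
  local run=
  [ "${DRY_RUN:-0}" = 1 ] && run=echo
  $run nmcli connection modify "$conn" \
    ipv4.dns 172.21.0.1 ipv4.dns-search kube.k-space.ee \
    ipv6.dns 2001:bb8:4008:21::1 ipv6.dns-search kube.k-space.ee
}
```

For example, `DRY_RUN=1 set_vpn_dns my-vpn` shows what would be changed. After a real run, reconnect the VPN so the new DNS settings take effect.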

Cluster formation

Created Ubuntu 22.04 VMs on Proxmox with local storage. Added some ARM64 workers running Ubuntu 22.04 Server on Raspberry Pi.

Once the machines have booted and are reachable via SSH:

# Disable Ubuntu caching DNS resolver
systemctl disable systemd-resolved.service
systemctl stop systemd-resolved
rm -fv /etc/resolv.conf
cat > /etc/resolv.conf << EOF
nameserver 1.1.1.1
nameserver 8.8.8.8
EOF

# Disable multipathd as Longhorn handles that itself, along with other unneeded services
systemctl mask multipathd snapd
systemctl disable --now multipathd snapd bluetooth ModemManager hciuart wpa_supplicant packagekit

# Permit root login
sed -i -e 's/PermitRootLogin no/PermitRootLogin without-password/' /etc/ssh/sshd_config
systemctl reload ssh
cat ~ubuntu/.ssh/authorized_keys > /root/.ssh/authorized_keys
userdel -f ubuntu
apt-get install -yqq linux-image-generic
apt-get remove -yq cloud-init linux-image-*-kvm
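
The sed expression in the root login step only rewrites a line that reads exactly `PermitRootLogin no`; a commented-out directive or one with a different value is left unchanged. A minimal check, assuming the setting lives in the main config file rather than an included snippet, could look like this (`root_login_permitted` is a hypothetical helper name):

```shell
# Hypothetical check: succeed only if the file has an active directive that
# permits key-based root login. "without-password" is the older spelling of
# "prohibit-password"; both are accepted here.
root_login_permitted() {
  grep -Eq '^[[:space:]]*PermitRootLogin[[:space:]]+(without-password|prohibit-password)' \
    "${1:-/etc/ssh/sshd_config}"
}
```

Running `root_login_permitted || echo "PermitRootLogin not set as expected"` before deleting the `ubuntu` user helps avoid locking yourself out.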

On master:

kubeadm init --token-ttl=120m --pod-network-cidr=10.244.0.0/16 --control-plane-endpoint "master.kube.k-space.ee:6443" --upload-certs --apiserver-cert-extra-sans master.kube.k-space.ee --node-name master1.kube.k-space.ee

For the kubeadm join command, specify the FQDN via --node-name $(hostname -f).
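
As an illustration, a worker join command might be assembled as below. The token and CA certificate hash are placeholders; on the master, `kubeadm token create --print-join-command` prints real values. `join_command` is a hypothetical helper that only prints the command.

```shell
# Print (not run) a kubeadm join command for a worker node.
# $1 = bootstrap token, $2 = CA cert hash (hex, without the "sha256:" prefix),
# $3 = node name, defaulting to this host's FQDN.
join_command() {
  local token="$1"
  local ca_hash="$2"
  local node_name="${3:-$(hostname -f)}"
  printf 'kubeadm join master.kube.k-space.ee:6443 --token %s --discovery-token-ca-cert-hash sha256:%s --node-name %s\n' \
    "$token" "$ca_hash" "$node_name"
}
```

Review the printed command, then run it as root on the new node.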

Set AZ labels:

for j in $(seq 1 9); do
  for t in master mon worker storage; do
    kubectl label nodes ${t}${j}.kube.k-space.ee topology.kubernetes.io/zone=node${j}
  done
done
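
The loop above tries every combination of role and index, so kubectl reports an error for each node name that does not exist. A possible alternative, sketched here with hypothetical helper names, derives the zone from the trailing number of each existing node's hostname:

```shell
# Map a node FQDN to its zone: worker3.kube.k-space.ee -> node3.
# Fails (returns 1) for names without a trailing number, e.g. door-ground.
zone_for_node() {
  local host="${1%%.*}"            # strip the domain: worker3
  local index="${host##*[!0-9]}"   # keep only the trailing digits: 3
  [ -n "$index" ] || return 1
  echo "node${index}"
}

# Label every existing node that has a numeric suffix (requires cluster access).
label_zones() {
  local node zone
  kubectl get nodes -o name | while read -r node; do
    node="${node#node/}"
    zone=$(zone_for_node "$node") || continue
    kubectl label node "$node" "topology.kubernetes.io/zone=${zone}"
  done
}
```

`label_zones` is only defined here, not called; invoke it once the nodes have joined.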

After forming the cluster, add role labels and taints:

for j in $(seq 1 9); do
  kubectl label nodes worker${j}.kube.k-space.ee node-role.kubernetes.io/worker=''
done

for j in $(seq 1 4); do
  kubectl taint nodes mon${j}.kube.k-space.ee dedicated=monitoring:NoSchedule
  kubectl label nodes mon${j}.kube.k-space.ee dedicated=monitoring
done

for j in $(seq 1 4); do
  kubectl taint nodes storage${j}.kube.k-space.ee dedicated=storage:NoSchedule
  kubectl label nodes storage${j}.kube.k-space.ee dedicated=storage
done
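
Workloads meant for these dedicated nodes then need a matching toleration, and a node selector if they should run only there. The sketch below generates a patch for that; `toleration_patch` is a hypothetical helper, and the namespace and deployment names in the usage line are placeholders.

```shell
# Emit a JSON patch body that pins a workload to nodes labelled
# dedicated=<role> and tolerates the dedicated=<role>:NoSchedule taint.
toleration_patch() {
  local role="$1"
  cat <<EOF
{"spec":{"template":{"spec":{
  "nodeSelector":{"dedicated":"$role"},
  "tolerations":[{"key":"dedicated","operator":"Equal","value":"$role","effect":"NoSchedule"}]
}}}}
EOF
}
```

Usage, assuming a deployment named `example` in namespace `monitoring`: `kubectl -n monitoring patch deployment example --type=merge -p "$(toleration_patch monitoring)"`. A merge patch replaces any existing tolerations list rather than appending to it.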

For arm64 nodes, add a suitable taint to prevent non-multiarch images from being scheduled on them:

kubectl taint nodes worker9.kube.k-space.ee arch=arm64:NoSchedule

For door controllers:

for j in ground front back; do
  kubectl taint nodes door-${j}.kube.k-space.ee dedicated=door:NoSchedule
  kubectl label nodes door-${j}.kube.k-space.ee dedicated=door
  kubectl taint nodes door-${j}.kube.k-space.ee arch=arm64:NoSchedule
done

To reduce wear on storage, discard the kubelet's standard output:

echo StandardOutput=null >> /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
systemctl daemon-reload
systemctl restart kubelet
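
An alternative sketch keeps the override in its own drop-in instead of appending to the kubeadm-managed file, so the change stays separate from it. The file name `20-quiet.conf` and the helper name are assumptions, not something the cluster currently uses.

```shell
# Write a separate systemd drop-in that discards kubelet's stdout.
# $1 = drop-in directory (defaults to the kubelet service's drop-in directory).
write_kubelet_quiet_dropin() {
  local dir="${1:-/etc/systemd/system/kubelet.service.d}"
  mkdir -p "$dir"
  printf '[Service]\nStandardOutput=null\n' > "$dir/20-quiet.conf"
}
```

After calling `write_kubelet_quiet_dropin` as root, run `systemctl daemon-reload` and `systemctl restart kubelet` as above.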

Technology mapping

Our self-hosted Kubernetes stack compared to AWS-based deployments:

| Hipster startup | Self-hosted hackerspace | Purpose |
|---|---|---|
| AWS ALB | Traefik | Reverse proxy, also known as ingress controller in Kubernetes jargon |
| AWS AMP | Prometheus Operator | Monitoring and alerting |
| AWS CloudTrail | ECK Operator | Log aggregation |
| AWS DocumentDB | MongoDB Community Operator | Highly available NoSQL database |
| AWS EBS | Longhorn | Block storage for arbitrary applications needing persistent storage |
| AWS EC2 | Proxmox | Virtualization layer |
| AWS ECR | Harbor | Docker registry |
| AWS EKS | kubeadm | Provision Kubernetes master nodes |
| AWS NLB | MetalLB | L2/L3 level load balancing |
| AWS RDS for MySQL | MySQL Operator | Provision highly available relational databases |
| AWS Route53 | Bind and RFC2136 | DNS records and Let's Encrypt DNS validation |
| AWS S3 | Minio Operator | Highly available object storage |
| AWS VPC | Calico | Overlay network |
| Dex | Passmower | ACL mapping and OIDC provider which integrates with GitHub/Samba |
| GitHub Actions | Drone | Build Docker images |
| GitHub | Gitea | Source code management, issue tracking |
| GitHub OAuth2 | Samba (Active Directory compatible) | Source of truth for authentication and authorization |
| Gmail | Wildduck | E-mail |