Kubespray v1.28 · waji // devops notes

Before proceeding, we need HAProxy and Keepalived to load-balance the API servers of our 3 control plane nodes

I will be setting up a Kubernetes cluster with 3 control plane nodes using Kubespray, on Ubuntu 20.04 LTS servers running on my mini PC

IP range for K8s Nodes & HAProxy Nodes

192.168.219.245 → Master 1 (also HAProxy)

192.168.219.246 → Master 2 (also HAProxy)

192.168.219.247 → Master 3 (also HAProxy)

KeepAlived VIP → 192.168.219.250

With Ubuntu installed on each node and the APT packages updated, we continue from here

Installing Haproxy & Keepalived

$ sudo apt install haproxy keepalived

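HAProxy binds to the VIP on every node, but only the current VRRP master actually holds that address. To let HAProxy start on the other nodes too, we allow binding to non-local addresses in the sysctl.conf file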

$ echo 'net.ipv4.ip_nonlocal_bind=1' | sudo tee -a /etc/sysctl.conf
net.ipv4.ip_nonlocal_bind=1

## Apply sysctl settings
$ sudo sysctl --system

Configure /etc/haproxy/haproxy.cfg

global
  log /dev/log  local0
  log /dev/log  local1 notice
  daemon

defaults
  mode          tcp
  log           global
  option        tcplog
  option        dontlognull
  option        dontlog-normal
  option        log-health-checks
  retries       3
  timeout http-request      10s
  timeout queue             1m
  timeout connect           10s
  timeout client            1m
  timeout server            1m
  timeout http-keep-alive   10s
  timeout check             10s
  maxconn                   3000

frontend stats
    bind 192.168.219.250:1936
    mode http
    log  global
    maxconn 10
    stats enable
    stats hide-version
    stats refresh 10s
    stats show-node
    stats show-desc Statistics for Kubernetes cluster
    stats auth admin:password
    stats uri /stats

listen api-server-6443
    bind 192.168.219.250:6443
    mode tcp
    balance roundrobin
    server master01 192.168.219.245:6443 check inter 1s
    server master02 192.168.219.246:6443 check inter 1s
    server master03 192.168.219.247:6443 check inter 1s

A simpler setup (not tested, though; its addresses come from a different network, so swap in the VIP and master IPs from above)

frontend k8s-api
    bind 192.168.219.40:6443
    bind 127.0.0.1:6443
    mode tcp
    option tcplog
    timeout client 300000
    default_backend k8s-api

backend k8s-api
    mode tcp
    option tcplog
    option tcp-check
    timeout server 300000
    balance roundrobin
    default-server inter 10s downinter 5s rise 2 fall 2 slowstart 60s maxconn 250 maxqueue 256 weight 100

    server apiserver1 192.168.219.41:6443 check
    server apiserver2 192.168.219.42:6443 check
    server apiserver3 192.168.219.43:6443 check
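Before enabling the service, it can help to let HAProxy parse the file first. A minimal sketch, assuming the default config path from above: `haproxy -c -f <file>` runs HAProxy in check mode, so it only validates the configuration and exits. The wrapper function name `validate_haproxy_cfg` is my own and not part of HAProxy.

```shell
#!/usr/bin/env bash
# validate_haproxy_cfg [FILE]
# Runs HAProxy in check mode (-c) against FILE (-f) without starting it.
# Returns 2 if the haproxy binary is not installed on this machine.
validate_haproxy_cfg() {
  local cfg=${1:-/etc/haproxy/haproxy.cfg}
  if ! command -v haproxy >/dev/null 2>&1; then
    echo "haproxy is not installed; nothing to check" >&2
    return 2
  fi
  haproxy -c -f "$cfg"
}

# Usage on each master (not executed here):
#   validate_haproxy_cfg && sudo systemctl enable --now haproxy
```

If the check fails, HAProxy prints the offending line numbers, which is quicker to read than digging through the journal after a failed start.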

Then we need to enable haproxy

$ sudo systemctl enable --now haproxy
Synchronizing state of haproxy.service with SysV service script with /lib/systemd/systemd-sysv-install.
Executing: /lib/systemd/systemd-sysv-install enable haproxy

$ sudo systemctl status haproxy
● haproxy.service - HAProxy Load Balancer
     Loaded: loaded (/lib/systemd/system/haproxy.service; enabled; vendor preset: enabled)
     Active: active (running) since Wed 2024-03-13 15:43:10 UTC; 6min ago

Now we need to set up VRRP

/etc/keepalived/keepalived.conf

Be sure to check the network interface name as well; in my case it was enp1s0 instead of eth0

For master01

global_defs {
    process_names
    # check that the script can only be edited by root
    enable_script_security
    # systemctl does only work with root
    script_user root
    # dynamic_interfaces
    vrrp_version 3
    # after switching to MASTER state 5 gratuitous ARPs (GARP) are sent and
    # after 5 seconds another 5 GARPs are sent. (For the switches to update
    # the ARP table)
    # the following option disables the second batch of 5 GARPs (as this
    # is not necessary with modern switches)
    vrrp_min_garp true

    # disables non compliant features (e.g. unicast_peers)
    # vrrp_strict

    # default and assigned by IANA for VRRP
    # vrrp_multicast_group4 224.0.0.18
    # optimization option for advanced use
    # max_auto_priority
}
vrrp_script chk_haproxy {
    # Note: use su -c to check if that command works under the
    #       keepalived_script user.
    # simple way of checking haproxy process:
    # script "/usr/bin/killall -0 haproxy"
    # the more intelligent way of checking the haproxy process
    script "/usr/bin/systemctl is-active --quiet haproxy"
    fall 2                               # 2 fails required for failure
    rise 2                               # 2 OKs required to consider the
                                         #   process up after failure
    interval 5                           # check every 5 seconds
    weight 51                            # add 51 points when rc=0
}
vrrp_instance VI_1 {
    state MASTER                # MASTER on master01, BACKUP on master02 and master03
    interface enp1s0              # network interface to monitor
    virtual_router_id 1         # unique, same across peers
    priority 100                # most relevant for electing master (on the master,
                                #   higher than on the other machines)
    advert_int 1                # specify the advertisement interval in seconds
    # check that the network interface is up
    track_interface {
        enp1s0 weight 50
    }
    # check that haproxy is up
    track_script {
        chk_haproxy
    }
    #authentication {            # non compliant but maybe good with unicast
    #    auth_type PASS
    #    auth_pass pass4kee      # 8 characters
    #}
    # Defaults to primary ip on the interface.
    # Does not really matter as the answer is received anyways with multicast.
    # You can hide the location of VRRPD by changing this source IP address
    #mcast_src_ip 192.168.123.123
    # unicast is not VRRP-compliant (which is why vrrp_strict stays disabled above)
    unicast_src_ip 192.168.219.245 # the IP address of this machine
    # the IP address of peer machines
    unicast_peer {
        192.168.219.246
        192.168.219.247            
    }
    virtual_ipaddress {
        192.168.219.250/24
    }
}
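The weights above are what make failover work, so here is the arithmetic spelled out. This is a sketch under the assumption (as I understand keepalived's behaviour) that a positive weight is added to the base priority while its tracked script or interface is healthy. The function name `effective_priority` is made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: effective VRRP priority = base priority + every positive weight
# whose tracked check is currently passing (assumed keepalived behaviour).
HAPROXY_WEIGHT=51   # weight of vrrp_script chk_haproxy
IFACE_WEIGHT=50     # weight of track_interface enp1s0

# effective_priority BASE HAPROXY_UP IFACE_UP   (1 = up, 0 = down)
effective_priority() {
  local prio=$1
  if [ "$2" -eq 1 ]; then prio=$(( prio + HAPROXY_WEIGHT )); fi
  if [ "$3" -eq 1 ]; then prio=$(( prio + IFACE_WEIGHT )); fi
  echo "$prio"
}

echo "master01, all healthy:  $(effective_priority 100 1 1)"
echo "master02, all healthy:  $(effective_priority 60 1 1)"
echo "master01, haproxy down: $(effective_priority 100 0 1)"
```

With base priorities 100 and 60, this gives 201 and 161 when everything is healthy, and 150 for master01 once HAProxy stops. Since 150 is below 161, master02 takes over the VIP. That only works because the script weight (51) is larger than the gap between the base priorities (40).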

For master02

vrrp_instance VI_1 {
    state BACKUP                # MASTER on master01, BACKUP on master02 and master03
    interface enp1s0              # network interface to monitor
    virtual_router_id 1         # unique, same across peers
    priority 60
   
## Skipped the rest as it is the same, apart from unicast_src_ip (this node's own IP)
    unicast_peer {
        192.168.219.245
        192.168.219.247            
    }
    virtual_ipaddress {
        192.168.219.250/24
    }
}

For master03

vrrp_instance VI_1 {
    state BACKUP                # MASTER on master01, BACKUP on master02 and master03
    interface enp1s0              # network interface to monitor
    virtual_router_id 1         # unique, same across peers
    priorit
    
## Skipped the rest as it is the same, apart from unicast_src_ip (this node's own IP)
    unicast_peer {
        192.168.219.245
        192.168.219.246            
    }
    virtual_ipaddress {
        192.168.219.250/24
    }
}

Enable and check status

$ sudo systemctl enable --now keepalived
Synchronizing state of keepalived.service with SysV service script with /lib/systemd/systemd-sysv-install.
Executing: /lib/systemd/systemd-sysv-install enable keepalived

$ sudo systemctl status keepalived
● keepalived.service - Keepalive Daemon (LVS and VRRP)
     Loaded: loaded (/lib/systemd/system/keepalived.service; enabled; vendor preset: enabled)
     Active: active (running) since Wed 2024-03-13 16:01:06 UTC; 1s ago

So now we can actually check if VRRP is working correctly by the following test

# Check where .250 is first
# master 01
$ ip --brief add
lo               UNKNOWN        127.0.0.1/8 ::1/128
enp1s0           UP             192.168.219.245/24 192.168.219.250/24 fe80::24d:8bff:fe61:10a5/64

# master 02
$ ip --brief add
lo               UNKNOWN        127.0.0.1/8 ::1/128
enp1s0           UP             192.168.219.246/24 fe80::294:4fff:fe68:ae/64

# Master 03
$ ip --brief add
lo               UNKNOWN        127.0.0.1/8 ::1/128
enp1s0           UP             192.168.219.247/24 fe80::24d:8bff:fe61:10d7/64


# Stop keepalived from master01
# Master 02
$ ip --brief add
lo               UNKNOWN        127.0.0.1/8 ::1/128
enp1s0           UP             192.168.219.246/24 192.168.219.250/24 fe80::294:4fff:fe68:ae/64

# Stop keepalived from master02
# Master 03
$ ip --brief add
lo               UNKNOWN        127.0.0.1/8 ::1/128
enp1s0           UP             192.168.219.247/24 192.168.219.250/24 fe80::24d:8bff:fe61:10d7/64

Seems good. Now let's continue with preparing Kubespray

I have a bastion server ready on 192.168.219.101, which already has Python and the Kubespray requirements installed

$ python3 -V
Python 3.10.12



## To install requirements and initialize
$ git clone https://github.com/kubernetes-sigs/kubespray.git
$ cd kubespray
$ pip install -r requirements.txt

## Need to make our custom configs later so copy the sample inventory
$ cp -rfp inventory/sample inventory/mycluster

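Now we edit our hosts.yml file. Note that the version below is the single control plane layout I ended up with after the 3 master install failed (described further down); the ping output that follows is from the original inventory, which listed master01 to master03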

$ vi inventory/mycluster/hosts.yml

all:
  hosts:
    master01:
      ansible_host: 192.168.219.245
      ip: 192.168.219.245
      access_ip: 192.168.219.245
    worker01:
      ansible_host: 192.168.219.246
      ip: 192.168.219.246
      access_ip: 192.168.219.246
    worker02:
      ansible_host: 192.168.219.247
      ip: 192.168.219.247
      access_ip: 192.168.219.247
  children:
    kube_control_plane:
      hosts:
        master01:
    kube_node:
      hosts:
        worker01:
        worker02:
    etcd:
      hosts:
        master01:
    k8s_cluster:
      children:
        kube_control_plane:
        kube_node:
    calico_rr:
      hosts: {}

We can verify we can talk with our nodes using the above hosts.yml file

$ ansible all -i inventory/mycluster/hosts.yml -m ping -u waji -k
SSH password:
[WARNING]: Skipping callback plugin 'ara_default', unable to load
master01 | SUCCESS => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/bin/python3"
    },
    "changed": false,
    "ping": "pong"
}
master03 | SUCCESS => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/bin/python3"
    },
    "changed": false,
    "ping": "pong"
}
master02 | SUCCESS => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/bin/python3"
    },
    "changed": false,
    "ping": "pong"
}

Also, before we continue with the Kubespray settings and installation, we need to set up passwordless SSH

For that purpose,

$ ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/home/waji/.ssh/id_rsa):
Created directory '/home/waji/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/waji/.ssh/id_rsa
Your public key has been saved in /home/waji/.ssh/id_rsa.pub
The key fingerprint is:
SHA256:Qafe7kh70ayI8GM1pgUhiWCbNZ09pXys4N7y9wVXBPc waji@DESKTOP-LAJ2REG
The key's randomart image is:
+---[RSA 3072]----+
|.o +..o o..  ..o |
|. = +oo+oo    o .|
| o   o ++o     .E|
|    . o.oo    .  |
|     . oS .+ .   |
|    o . =.. =    |
|     = B.o.o .   |
|      O..=o .    |
|     . o+.o.     |
+----[SHA256]-----+

## Copy this key to all nodes
$ ssh-copy-id ma01
$ ssh-copy-id ma02
$ ssh-copy-id ma03

Test the ansible ad-hoc ping command from before again without -k option to verify passwordless SSH is working

$ ansible all -i inventory/mycluster/hosts.yml -m ping
[WARNING]: Skipping callback plugin 'ara_default', unable to load
master03 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}
master01 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}
master02 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}

Now let's configure Kubespray

## Edit kube version to install
sed -i -r 's/^kube_version: .*/kube_version: v1.28.4/' inventory/mycluster/group_vars/k8s_cluster/k8s-cluster.yml

## Edit API-server's endpoint to our VRRP VIP
sed -i -r 's/## apiserver_loadbalancer_domain_name: .*/apiserver_loadbalancer_domain_name: "192.168.219.250"/' inventory/mycluster/group_vars/all/all.yml
sed -i -r 's/# loadbalancer_apiserver:/loadbalancer_apiserver:/' inventory/mycluster/group_vars/all/all.yml
sed -i -r 's/#\s*address: .*/  address: 192.168.219.250/' inventory/mycluster/group_vars/all/all.yml
sed -i -r 's/#\s*port: .*/  port: 6443/' inventory/mycluster/group_vars/all/all.yml
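Because the `address:` and `port:` patterns are not anchored to the load balancer section, it is worth trying the substitutions on a scratch copy before touching the real file. The sketch below runs the same four expressions against a small excerpt and prints the result. The excerpt is my assumption of what the commented-out section in all.yml looks like, and the real file may differ between Kubespray releases.

```shell
#!/usr/bin/env bash
# Hypothetical excerpt modeled on the commented-out load balancer section
# of all.yml; not copied from a specific Kubespray release.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
## External LB example config
## apiserver_loadbalancer_domain_name: "elb.some.domain"
# loadbalancer_apiserver:
#   address: 1.2.3.4
#   port: 1234
EOF

# Same four substitutions as above, written to stdout instead of using -i.
patched=$(sed -r \
  -e 's/## apiserver_loadbalancer_domain_name: .*/apiserver_loadbalancer_domain_name: "192.168.219.250"/' \
  -e 's/# loadbalancer_apiserver:/loadbalancer_apiserver:/' \
  -e 's/#\s*address: .*/  address: 192.168.219.250/' \
  -e 's/#\s*port: .*/  port: 6443/' \
  "$tmp")
rm -f "$tmp"
echo "$patched"
```

On the real file, running `grep -nE '#\s*(address|port):' inventory/mycluster/group_vars/all/all.yml` first shows every line the last two expressions would rewrite, so any unintended match is visible before the in-place edit.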

For etcd, we have 3 deployment modes

host, docker and kubeadm

Docker mode is deprecated as far as I know. Host mode runs etcd as a systemd service, and kubeadm mode runs etcd as a static pod

We can define this in all.yml

echo "etcd_kubeadm_enabled: true" >> inventory/mycluster/group_vars/all/all.yml

## Also enable monitoring
cat << EOF >> inventory/mycluster/group_vars/etcd.yml
## Settings for etcd deployment type
etcd_deployment_type: kubeadm
etcd_metrics_service_labels:
  k8s-app: etcd
  app.kubernetes.io/managed-by: Kubespray
  app: kube-prometheus-stack-kube-etcd
  release: prometheus-stack
EOF

For CNI, we have many options such as

flannel, calico, cilium and weave

I will be using flannel, as I want something simpler this time

sed -i -r 's/^(kube_network_plugin:).*/\1 flannel/g' inventory/mycluster/group_vars/k8s_cluster/k8s-cluster.yml

Additionally, I want to edit the Service and Pod CIDR ranges

sed -i '/kube_service_addresses:/s/.*/kube_service_addresses: 10.96.0.0\/18/' inventory/mycluster/group_vars/k8s_cluster/k8s-cluster.yml
sed -i '/^#/!s/^kube_pods_subnet:.*/kube_pods_subnet: 172.31.0.0\/16/' inventory/mycluster/group_vars/k8s_cluster/k8s-cluster.yml
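The Service range, the Pod range and the node LAN should not overlap each other, otherwise routing inside the cluster gets ambiguous. A rough sanity check in plain bash, assuming the node network is 192.168.219.0/24 (inferred from the node IPs above). The helper names `ip2int`, `cidr_range` and `cidrs_overlap` are made up for this sketch.

```shell
#!/usr/bin/env bash
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip2int() {
  local a b c d
  IFS=. read -r a b c d <<< "$1"
  echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Print "first last" (as integers) for a CIDR such as 10.96.0.0/18.
cidr_range() {
  local ip=${1%/*} bits=${1#*/}
  local size=$(( 1 << (32 - bits) ))
  local base=$(( $(ip2int "$ip") / size * size ))
  echo "$base $(( base + size - 1 ))"
}

# Succeed (exit status 0) if the two CIDRs share at least one address.
cidrs_overlap() {
  local s1 e1 s2 e2
  read -r s1 e1 <<< "$(cidr_range "$1")"
  read -r s2 e2 <<< "$(cidr_range "$2")"
  (( s1 <= e2 && s2 <= e1 ))
}

nodes=192.168.219.0/24      # assumed node LAN
services=10.96.0.0/18       # kube_service_addresses
pods=172.31.0.0/16          # kube_pods_subnet
for pair in "$nodes $services" "$nodes $pods" "$services $pods"; do
  read -r left right <<< "$pair"
  if cidrs_overlap "$left" "$right"; then
    echo "OVERLAP: $left and $right"
  else
    echo "ok: $left and $right"
  fi
done
```

For the three ranges used here, every pair is reported as ok. The same check is handy when picking ranges on a network that already uses 10.x or 172.x addresses.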

Finally, we need to check the firewall, load the br_netfilter and overlay kernel modules on boot, turn off SELinux and disable swap

Ubuntu does not enable a firewall by default and does not use SELinux, so I will just disable swap and load the modules

## Load br_netfilter and overlay on boot
$ ansible all -i inventory/mycluster/hosts.yml -m copy -a 'dest=/etc/modules-load.d/k8s.conf content="br_netfilter\noverlay\n"' -b -v
Using /home/waji/kubespray/ansible.cfg as config file
[WARNING]: Skipping callback plugin 'ara_default', unable to load
master02 | CHANGED => {
    "changed": true,
    "checksum": "4e930f0397e1b134b08ec046c1abde0ed190bb08",
    "dest": "/etc/modules-load.d/k8s.conf",
    "gid": 0,
    "group": "root",
    "md5sum": "d417ba7a530e342d3c303d735498d0d6",
    "mode": "0644",
    "owner": "root",
    "size": 21,
    "src": "/home/waji/.ansible/tmp/ansible-tmp-1710349192.4430883-79337-186833797041384/source",
    "state": "file",
    "uid": 0
}
master03 | CHANGED => {
    "changed": true,
    "checksum": "4e930f0397e1b134b08ec046c1abde0ed190bb08",
    "dest": "/etc/modules-load.d/k8s.conf",
    "gid": 0,
    "group": "root",
    "md5sum": "d417ba7a530e342d3c303d735498d0d6",
    "mode": "0644",
    "owner": "root",
    "size": 21,
    "src": "/home/waji/.ansible/tmp/ansible-tmp-1710349192.4532826-79338-232871764589651/source",
    "state": "file",
    "uid": 0
}
master01 | CHANGED => {
    "changed": true,
    "checksum": "4e930f0397e1b134b08ec046c1abde0ed190bb08",
    "dest": "/etc/modules-load.d/k8s.conf",
    "gid": 0,
    "group": "root",
    "md5sum": "d417ba7a530e342d3c303d735498d0d6",
    "mode": "0644",
    "owner": "root",
    "size": 21,
    "src": "/home/waji/.ansible/tmp/ansible-tmp-1710349192.4424303-79336-212603643092991/source",
    "state": "file",
    "uid": 0
}


## Disable swap

$ ansible all -i inventory/mycluster/hosts.yml -m shell -a "swapoff -a && sed -i '/swap/d' /etc/fstab" -b -v
Using /home/waji/kubespray/ansible.cfg as config file
[WARNING]: Skipping callback plugin 'ara_default', unable to load
master03 | CHANGED | rc=0 >>

master02 | CHANGED | rc=0 >>

master01 | CHANGED | rc=0 >>

We will now do a reboot before we run the kubespray ansible playbook

$ ansible all -i inventory/mycluster/hosts.yml -m ansible.builtin.reboot -b -v
Using /home/waji/kubespray/ansible.cfg as config file
[WARNING]: Skipping callback plugin 'ara_default', unable to load
master02 | CHANGED => {
    "changed": true,
    "elapsed": 139,
    "rebooted": true
}
master03 | CHANGED => {
    "changed": true,
    "elapsed": 151,
    "rebooted": true
}
master01 | CHANGED => {
    "changed": true,
    "elapsed": 151,
    "rebooted": true
}

Finally, we will run the ansible playbook to install k8s by kubespray. It will take a while.

ansible-playbook --flush-cache -i inventory/mycluster/hosts.yml --become --become-user=root cluster.yml

We can always reset a cluster using

ansible-playbook --flush-cache -i inventory/mycluster/hosts.yml --become --become-user=root reset.yml

The install failed with the 3 master HA layout, so I went with a single control plane and two workers instead

192.168.219.245 → Master 1

192.168.219.246 → Worker 1

192.168.219.247 → Worker 2

After a successful install, we can use the kubectl command to talk to our Kubernetes API

$ sudo su - root

# kubectl get no
NAME       STATUS   ROLES           AGE   VERSION
master01   Ready    control-plane   18h   v1.28.4
worker01   Ready    <none>          18h   v1.28.4
worker02   Ready    <none>          18h   v1.28.4

Now we can copy the kubeconfig to the non-root user and also setup kubectl autocompletion & alias

# mkdir -p /home/waji/.kube
# cp .kube/config /home/waji/.kube/config
# chown -R waji: /home/waji/.kube

$ vi ~/.bashrc
...
## Add the below
alias k=kubectl
source <(kubectl completion bash)
complete -o default -F __start_kubectl k

$ source ~/.bashrc

$ k get no
NAME       STATUS   ROLES           AGE   VERSION
master01   Ready    control-plane   18h   v1.28.4
worker01   Ready    <none>          18h   v1.28.4
worker02   Ready    <none>          18h   v1.28.4

Now let's label our unlabeled worker nodes

$ k label node worker01 node-role.kubernetes.io/worker=worker
$ k label node worker02 node-role.kubernetes.io/worker=worker

$ k get no
NAME       STATUS   ROLES           AGE   VERSION
master01   Ready    control-plane   18h   v1.28.4
worker01   Ready    worker          18h   v1.28.4
worker02   Ready    worker          18h   v1.28.4

We can check our important kubernetes components deployed as well

$ k get po -n kube-system
NAME                               READY   STATUS    RESTARTS   AGE
coredns-77f7cc69db-cplhj           1/1     Running   0          71s
coredns-77f7cc69db-nqssp           1/1     Running   0          18h
dns-autoscaler-8576bb9f5b-rlggf    1/1     Running   0          61s
etcd-master01                      1/1     Running   0          18h
kube-apiserver-master01            1/1     Running   1          18h
kube-controller-manager-master01   1/1     Running   2          18h
kube-flannel-jqhjs                 1/1     Running   0          18h
kube-flannel-nvh9s                 1/1     Running   0          18h
kube-flannel-vbltt                 1/1     Running   0          18h
kube-proxy-2spzr                   1/1     Running   0          18h
kube-proxy-86bq7                   1/1     Running   0          18h
kube-proxy-f758f                   1/1     Running   0          18h
kube-scheduler-master01            1/1     Running   1          18h
nginx-proxy-worker01               1/1     Running   0          18h
nginx-proxy-worker02               1/1     Running   0          18h
nodelocaldns-f2mq7                 1/1     Running   0          18h
nodelocaldns-hg46s                 1/1     Running   0          18h
nodelocaldns-pghfr                 1/1     Running   0          18h

Now we run a test workload to check that our cluster is actually working without any issues

cat << EOF > busybox-test.yaml
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: busybox-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: busybox
  template:
    metadata:
      labels:
        app: busybox
    spec:
      containers:
      - name: busybox
        image: docker.io/library/busybox:latest
        imagePullPolicy: IfNotPresent
        args:
        - sleep
        - "3600"
EOF

$ k apply -f busybox-test.yaml
deployment.apps/busybox-deployment created

$ k get po
NAME                                  READY   STATUS    RESTARTS   AGE
busybox-deployment-54f6c864b6-2fvzz   1/1     Running   0          42s

Cool, we are good to go now!

Issues Faced

When running the Kubespray playbook, the following error can occur depending on the ansible-core version

TASK [kubernetes/preinstall : Stop if either kube_control_plane or kube_node group is empty] ***
FAILED! => {"msg": "The conditional check 'groups.get('kube_control_plane')' failed. The error was: Conditional is marked as unsafe, and cannot be evaluated."}

To solve this there are 2 options:

1. Remove this task from roles/kubernetes/preinstall/tasks/0040-verify-settings.yml
2. Change that: "groups.get('{{ item }}')" to that: "groups.get( item )"

I personally used 2.
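For reference, the edit can also be done with sed. The sketch below applies it to a scratch file so the effect is visible. The YAML excerpt is reconstructed from the error message and the fix described above, not copied from the repository, so treat the surrounding lines as an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical excerpt of the failing task (reconstructed, not verbatim).
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
- name: Stop if either kube_control_plane or kube_node group is empty
  assert:
    that: "groups.get('{{ item }}')"
  with_items:
    - kube_control_plane
    - kube_node
EOF

# Replace the templated lookup with a plain variable reference.
fixed=$(sed -E "s/groups\.get\('\{\{ item \}\}'\)/groups.get( item )/" "$tmp")
rm -f "$tmp"
echo "$fixed"
```

To apply it for real, the same expression could be run with `-i` against roles/kubernetes/preinstall/tasks/0040-verify-settings.yml inside the kubespray checkout, ideally after a `git diff` to confirm only that one line changes.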

For more info and reference: https://github.com/kubernetes-sigs/kubespray/issues/10688