K8s Installation/Process

  1. A detailed manual procedure to install K8s software and PKC data content.
  2. Incrementally document the installation process in PKC, and gradually migrate manual steps into automated processes driven by Jenkins, Ansible, and Terraform. These actions should be hyperlinked in the corresponding PKC pages.


Installation Outline

After watching many videos on installing Kubernetes, the following outline is extracted from Edureka's video tutorial. One should also have browsed through this book[1] at least once.

The suggested initial configuration is to start with one master node and one worker node (Kubernetes Cluster minimal configuration).

The master node must have at least 2 CPU cores and 2 GB of memory.
The worker node (slave) must have at least 1 CPU core and 2 GB of memory.

If one needs to add more machines, one can do so after the above two machines are working.

Ideally, set up the domain names of these two nodes with their public IPv4 addresses.

Set the right Security Groups

{{#lst:Input/K8s Installation/Security Groups|Security Groups}}

Procedure for both Master and Slave

To automate this repeated process on many machines, consider using Ansible or Terraform. The usual initial action is to update the package index, but make sure that you switch to superuser mode and stay in that role throughout most of the installation.

sudo su
apt update

Swap Space must be turned off

Then, turn off swap space

swapoff -a

The /etc/fstab file must be edited to remove (or comment out) the line entry that specifies a swap partition or swap file. It is possible that no such swap line exists (as observed on Ubuntu 20.04).

nano /etc/fstab
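
If one prefers a non-interactive edit, the swap entry can be commented out with sed and the change verified afterwards; this is a minimal sketch, assuming the fstab swap line carries the word swap in its type field:

 # comment out any fstab line whose fields include "swap"
 sed -i '/\sswap\s/ s/^/#/' /etc/fstab
 # verify: swapon should print nothing, and free should report 0 swap
 swapon --show
 free -h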

Install Net-Tools

Before inspecting the IP addresses of these machines, you will need to first install Net-Tools.

apt install net-tools

Afterwards, run the instruction:

ifconfig

This will show some information, such as follows:

eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 9001
        inet 172.31.20.148  netmask 255.255.240.0  broadcast 172.31.31.255
        inet6 fe80::5d:ddff:fea4:9331  prefixlen 64  scopeid 0x20<link>
        ether 02:5d:dd:a4:93:31  txqueuelen 1000  (Ethernet)
        RX packets 14905  bytes 20660726 (20.6 MB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 2247  bytes 270976 (270.9 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 198  bytes 17010 (17.0 KB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 198  bytes 17010 (17.0 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
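
Since net-tools is considered legacy on recent Ubuntu releases, the same addresses can also be read with the pre-installed iproute2 tools; a minimal equivalent:

 # show IPv4 addresses for all interfaces
 ip -4 addr show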

Update Hostname, Hosts, and Setup Static IP

Then, update the hostname file at the /etc/hostname location:

nano /etc/hostname

Change the hostname to kmaster on the master node and to worker1 on the worker node. The string stored in /etc/hostname will eventually show up in the command-line interface. Execute the command hostname -F /etc/hostname and the new hostname will appear in the terminal prompt, looking like root@kmaster: or root@worker1:, after re-login or after starting a new bash shell.
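
On systemd-based Ubuntu, hostnamectl is a one-step alternative that writes /etc/hostname for you; a sketch using the node names chosen above:

 # run on the master node
 hostnamectl set-hostname kmaster
 # run on the worker node
 hostnamectl set-hostname worker1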

Static IP address

Notice the inet value: 172.31.20.148. This is the static, private IP address that remains usable even after one reboots the machine. One may set up the static IP address here:

nano /etc/network/interfaces

Then, put in the following textual content:

auto enp0s8
iface enp0s8 inet static
address <IP-Address-of-masters and slaves>

In the above-mentioned example, <IP-Address-of-masters and slaves> is 172.31.20.148. Note that this step needs to be executed on each master and slave/worker node.
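
A slightly fuller stanza usually also carries the netmask (taken from the ifconfig output above) and a default gateway; the gateway below is a hypothetical placeholder for your subnet's router, and the interface name should match whatever ifconfig actually reports (eth0 in the sample output above):

 auto enp0s8
 iface enp0s8 inet static
     address 172.31.20.148
     netmask 255.255.240.0
     # hypothetical gateway; substitute the router of your subnet
     gateway 172.31.16.1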

Statically defined host names

After setting up the network interfaces, one needs to set up a static look up table for the /etc/hosts file:

nano /etc/hosts


Then, put in the following textual content: Note that this step needs to be executed for each master and slave/worker node.

127.0.0.1 localhost
172.31.28.100 worker1
172.31.20.148 kmaster

# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts

Note that one may put in multiple entries at once; in this case, two entries, worker1 and kmaster, are registered in the /etc/hosts file.
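
To confirm that the new names resolve before rebooting, one may ping each entry once:

 ping -c 1 kmaster
 ping -c 1 worker1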

At this time, one may issue the command reboot, to restart the machine.

reboot

After rebooting, the machines should show their new host names in the command line. More importantly, use ifconfig to check that the locally-defined IP address has remained stable.

Install SSH Server and Docker

If an SSH server is not available, it can be set up by:

sudo apt-get install openssh-server

Afterwards, one may install Docker:

sudo apt-get update
sudo apt-get install -y docker.io
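
Before continuing, it is worth confirming that the Docker daemon is installed and will start at boot; a quick sanity check:

 # enable the service at boot and start it now, then print the version
 sudo systemctl enable --now docker
 docker --version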

Then, one must make sure that curl and related packages are available:

sudo apt-get update && sudo apt-get install -y apt-transport-https curl

Then, switch to root and use curl to add the Kubernetes package signing key:

sudo -su root 
curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -

kubeadm, kubelet and kubectl

Then, one needs to add the Kubernetes repository to the APT sources list:

cat <<EOF >/etc/apt/sources.list.d/kubernetes.list
deb http://apt.kubernetes.io/ kubernetes-xenial main
EOF

Then, run the following two commands:

apt-get update
apt-get install -y kubelet kubeadm kubectl

The repository-file creation and the update step above can also be compressed into one instruction:

echo "deb http://apt.kubernetes.io/ kubernetes-xenial main" \
| sudo tee -a /etc/apt/sources.list.d/kubernetes.list \
&& sudo apt-get update
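
To keep these components from being upgraded unintentionally by later apt-get upgrade runs (a recommendation that also appears in the official kubeadm installation guide), the packages can be pinned:

 apt-mark hold kubelet kubeadm kubectl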

Update the kubeadm.conf file

The following file must be edited with an extra data entry:

nano /etc/systemd/system/kubelet.service.d/10-kubeadm.conf

In this file, append the following entry as the last line.

Environment="cgroup-driver=systemd/cgroup-driver=cgroupfs"

Then, switch back to the regular user with Ctrl+D.

Updated [31-Aug-2021]
During an attempt to deploy Kubernetes on the IONOS cloud service, sudo kubeadm init did not work. The failure occurs when starting the kubelet systemd service. Inspecting it with sudo journalctl -xeu kubelet displays the result below.

Aug 31 20:35:21 kmaster kubelet[16902]: I0831 20:35:21.834359   16902 docker_service.go:242] "Hairpin mode is set" hairpinMode=hairpin-veth
Aug 31 20:35:21 kmaster kubelet[16902]: I0831 20:35:21.834492   16902 cni.go:239] "Unable to update cni config" err="no networks found in /etc/cni/net.d"
Aug 31 20:35:21 kmaster kubelet[16902]: I0831 20:35:21.836740   16902 cni.go:239] "Unable to update cni config" err="no networks found in /etc/cni/net.d"
Aug 31 20:35:21 kmaster kubelet[16902]: I0831 20:35:21.836809   16902 docker_service.go:257] "Docker cri networking managed by the network plugin" networkPluginName="cni"
Aug 31 20:35:21 kmaster kubelet[16902]: I0831 20:35:21.836928   16902 cni.go:239] "Unable to update cni config" err="no networks found in /etc/cni/net.d"
"Failed to run kubelet" err="failed to run Kubelet: misconfiguration: kubelet cgroup driver: \"systemd\" is different from docker cgroup driver: \"cgroupfs\"

An article found on GitHub points out that the cgroup driver used to run containers inside the Docker service (cgroupfs) is not compatible with the kubelet's systemd driver, despite the cgroup-driver instruction already placed in the file /etc/systemd/system/kubelet.service.d/10-kubeadm.conf (please see above).

Another article, found at https://www.unixcloudfusion.in/2021/08/solved-kubelet-isnt-running-or.html, shows that in this particular case one needs to create a file instructing Docker to use the systemd cgroup driver; the commands are below.

sudo vi /etc/docker/daemon.json

Insert the following lines:

{
 "exec-opts": ["native.cgroupdriver=systemd"]
}

Restart Docker and check its status:

systemctl restart docker
systemctl status docker
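
To confirm that Docker has actually picked up the new driver, one may query the daemon directly; it should now report systemd rather than cgroupfs:

 docker info --format '{{.CgroupDriver}}'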

Then, redo the kubeadm init command and proceed with the remaining commands to complete the installation. Note that this step needs to be performed on the worker nodes as well.

Only do this for the Master Node

It is time to use the kubeadm init command. Note that the output of kubeadm init should be recorded for other workers to join the cluster.

sudo kubeadm init

To add more nodes later, or to retrieve the join command again, run the command below on the master node.

kubeadm token create --print-join-command

Or one may specify more parameters using this instruction:

sudo kubeadm init --pod-network-cidr=<depends on calico or flannel pod network> --apiserver-advertise-address=<ip-address-of-master>

An example based on Calico Pod Network (192.168.0.0/16) is shown here:

 sudo kubeadm init --pod-network-cidr=192.168.0.0/16 --apiserver-advertise-address=172.31.20.148

An example based on Flannel Pod Network (10.244.0.0/16) is shown here:

 sudo kubeadm init --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=172.31.20.148

After the Kubernetes master node has been successfully initialized, one must run the following three instructions before proceeding:

 mkdir -p $HOME/.kube
 sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
 sudo chown $(id -u):$(id -g) $HOME/.kube/config
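
At this point, a quick sanity check confirms that kubectl can reach the API server:

 kubectl cluster-info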

The reverse operation of kubeadm init is kubeadm reset in root mode.

If the internal IP was specified incorrectly, the process will get stuck. Then you need to reset in root mode. (It is not yet confirmed whether to clean the whole .kube directory or only the .kube/config file.)
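
A hedged sketch of that recovery sequence, assuming the whole .kube directory should be removed (the note above is not certain on this point):

 sudo kubeadm reset
 # removing the entire directory is an assumption; deleting only
 # $HOME/.kube/config may be sufficient
 rm -rf $HOME/.kube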

Pod network based on Calico (or Flannel)

First download the calico.yaml file: (The following instruction is different from the original video.)

curl https://docs.projectcalico.org/manifests/calico.yaml -O

Use the kubectl apply command.

kubectl apply -f calico.yaml

An alternative Kubernetes networking substrate is Flannel. The URL to get the YAML file is here:

curl https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml -O

Then you may just perform the kubectl apply command to install the network substrate (note that the downloaded file is named kube-flannel.yml):

kubectl apply -f kube-flannel.yml
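
Whichever network is chosen, its pods should reach the Running state before the nodes can report Ready; progress can be watched with:

 kubectl get pods --all-namespaces -w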

Kubernetes Dashboard deployed on Master Node

To avoid running the Dashboard on worker nodes, the following command needs to be launched before any worker node joins the cluster. (Note that the hyperlink in the original video no longer works; the following instruction was tested as of July 21, 2021.)

kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.2.0/aio/deploy/recommended.yaml

A service account must be created to make the dashboard available:

kubectl create serviceaccount dashboard -n default 

Then, create the dashboard-admin cluster role binding:

kubectl create clusterrolebinding dashboard-admin -n default --clusterrole=cluster-admin --serviceaccount=default:dashboard

Then, one needs to get the secret key to be pasted into the web-based interface of Kubernetes Dashboard.

kubectl get secret $(kubectl get serviceaccount dashboard -o jsonpath="{.secrets[0].name}") -o jsonpath="{.data.token}" | base64 --decode

This will generate a long token string, something like this (yours will be different, since it is dynamically generated):

eyJhbGciOiJSUzI1NiIsImtpZCI6IjRPcQhLUU54QXE1R1dSdnhsZmdMaUxMN0NMdk8wZ1ZEUjhUMVFLZkdVaEkifQ.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJkZWZhdWx0Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZWNyZXQubmFtZSI6ImRhc2hib2FyZC10b2tlbi1wd3FwNSIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VydmljZS1hY2NvdW50Lm5hbWUiOiJkYXNoYm9hcmQiLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlcnZpY2UtYWNjb3VudC51aWQiOiJhOGRkMDE0OC1jNDQzLTRkODctYTVjOC01MjJmYmRjZGMxMGQiLCJzdWIiOiJzeXN0ZW06c2VydmljZWFjY291bnQ6ZGVmYXVsdDpkYXNoYm9hcmQifQ.jbjvJgnLCMhsWbdg_7A83PJZ3sLskS9MPTFEZGvUr8eC-I0tosPXgMBkiWgoAgPcFNpYjXWRN3Ia66vYTEHAc0kqSDsZAMbUP48pszBQR0InPk_7tt7kQn3Scx6FEhkRbxVXaiVqYaafxLdOQlbAP_Xz9KOTjq2L-RU0Pxf83FsAITpJbTVmX7oz_trZSEeP1knqjnnKKn3ppYoYWpAwD-FDDXlPwqvHidxLB-Db9rbxkVGSI2yAibtW6mgldStEC9uv64zBpleUVMIw7ys6a9LxuHuVNg29oZrWaIL7qi6rvvQIqvzcneRRUSC2E7pV-Jl7x80leyRL8SRUiKhBsg

To allow access to this dashboard, the following command must be run:

kubectl proxy
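
With the proxy running, the dashboard deployed above is normally reachable at the standard service-proxy path (for the kubernetes-dashboard namespace used by the v2.2.0 manifest):

 http://localhost:8001/api/v1/namespaces/kubernetes-dashboard/services/https:kubernetes-dashboard:/proxy/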


Only at the Slave

At the end, run this command on the worker (slave) node so that it joins the cluster.

sudo kubeadm join 172.31.20.148:6443 --token h5c0bs.nt4vtekd1eb7qupd --discovery-token-ca-cert-hash sha256:088a39a92611a81f18f16bd6ab6b1b438a7015b68392fa8559487c9240c1d1b6

Notes on re-joining the worker nodes

In the event that a worker node stops working, or one needs to reset and rejoin a worker node, the first step is to reset the kubeadm state on the worker node by issuing the command below:

kubeadm reset

Then, carefully read the output, which will indicate which folders should be examined and removed using the root account. Once all the folders and files are correctly removed, find the join command by printing it on the master node:

kubeadm token create --print-join-command

This will display the join command to run on the worker node so that it joins the master; the output will look like this:

kubeadm join 172.31.20.148:6443 --token ouaq5t.fscjirohg150p3ns --discovery-token-ca-cert-hash sha256:088a39a92611a81f18f16bd6ab6b1b438a7015b68392fa8559487c9240c1d1b6

Once the command has run correctly, check the result on the master node:

kubectl get nodes

which will display all nodes in the Kubernetes cluster.

Check Installation Results

Check the final result on the master node; all the nodes need to be in the Ready status:

kubectl get nodes

NAME      STATUS   ROLES                  AGE    VERSION
kmaster   Ready    control-plane,master   121m   v1.21.3
worker1   Ready    <none>                 36s    v1.21.3

To see all the pods, use the following instruction:

kubectl get pods -o wide --all-namespaces

To let a worker leave the cluster, run

kubeadm reset

on the worker node, and then run the following on the master node:

kubectl delete node worker1
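
A more graceful variant, commonly recommended before deletion, drains the workloads off the node first; note that --delete-emptydir-data applies to newer kubectl releases (older ones use --delete-local-data):

 kubectl drain worker1 --ignore-daemonsets --delete-emptydir-data
 kubectl delete node worker1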

Kubernetes on Ubuntu by Edureka

A complete 6-hour lecture series can be found here: https://www.youtube.com/watch?v=UWg3ORRRF60

Some online installation tutorials

  1. Video/Installing Kubernetes
  2. https://blog.alexellis.io/kubernetes-in-10-minutes/
  3. https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm/
  4. https://computingforgeeks.com/deploy-kubernetes-cluster-on-ubuntu-with-kubeadm/
  5. https://blog.knoldus.com/how-to-install-kubernetes-on-ubuntu-20-04-kubeadm-and-minikube/
  6. Video/Setup Kubernetes on AWS

References

  1. Arundel, John; Domingus, Justin (2019). Cloud Native DevOps with Kubernetes. O'Reilly Media. ISBN 978-1-492-04076-7. 

Related Pages

K8s_Installation