Table of Contents
- What is Kubernetes?
- Kubernetes Architecture
- Manage K8s Components - kubectl & Config File
- Kubernetes Installation Steps
- Preparing Servers
- Working with kubectl
- Kubernetes Networking
- Finishing Cluster Bootstrapping...
- Labels & Selectors + Service Networking
- Make application accessible from outside the cluster
- NodePort Service Type
- Loadbalancer Service Type
- External Access with Ingress
- Setup Ingress
- Users & Permissions
- User & Groups in Kubernetes
- Other Authorization Modes
- Certificates in Kubernetes
- Certificates API
- Create User Account
- Service Account & permissions
- Troubleshooting In Kubernetes
- Multiple Containers in a Pod
- Init and Sidecar Containers
- Add Sidecar and Init Containers
- Exposing Pod and Cluster Vars to Containers
- Data Persistence
- ConfigMap & Secret
- Resource Requests & Limits
- What are Resource Requests & Limits?
- Configure Requests & Limits
- Node Affinity, Taints & Tolerations
- Assigning Pods to Nodes (Part 1)
- Assigning Pods to Nodes (Part 2)
- Taints & Tolerations
- Inter-Pod Affinity
- Health Checks - Readiness and Liveness Probes
- What are Liveness & Readiness Probes?
- Configure Liveness & Readiness Probes
- Deployment Update Strategies - Rolling Update
- Owners and dependents
- ETCD Backup & Restore
- Kubernetes REST API
- Upgrade Kubernetes Cluster
- How Cluster Upgrade works?
- Demo: Upgrade Cluster
- Manage multiple clusters with Contexts
- K8s Certificate Management
- Secure cluster - Network Policies
- Pro Tips
- Open source container orchestration tool - Developed by Google
- Helps to manage containerized applications on different deployment environments
Need for a container orchestration tool
- Trend from Monolith to Microservice
- Increase usage of containers
- Demand for a proper way of managing those hundreds of containers
- High Availability or no downtime
- Scalability or high performance
- Disaster recovery - Backup and restore
Worker Machine in K8s Cluster
- Each Node has multiple Pods on it
- Worker Nodes do the actual work
- 3 Processes must be installed on every Node
- Container runtime
- Kubelet
- Kubelet interacts with both the container and node
- Kubelet starts the Pod with a container inside
- Kube Proxy
- Kube Proxy forwards the requests
So, how do you interact with this cluster?
- How to:
- Schedule Pods?
- Monitor?
- Re-schedule/Re-start Pods?
- Join a new Worker?
- etc
- Managing processes are done by Master Nodes (The Control Plane)
- 4 processes run on every control plane node
- Api Server
- Cluster gateway
- Acts as the gatekeeper for authentication!
- Only 1 entrypoint into the cluster
- Api Server is load balanced
- Scheduler
- Scheduler: Where to put the Pod? The Scheduler only decides on which Node the new Pod should be scheduled
- The Kubelet actually starts the Pod
- Controller Manager
Controller Manager: Detects cluster state changes
- etcd
- Etcd is the cluster brain
- Cluster changes get stored here
- Application data is NOT stored in etcd!
- Distributed storage across all master nodes
- Api Server
  - A very powerful way that enables interaction with the cluster
  - Limitations when using those commands
- Best practice: Use a K8s Configuration file
  - Write the component configuration in a file
  - Apply the configuration file with kubectl
  - Multiple K8s components in 1 file, applied with 1 apply command
  - Update K8s components: edit the config file and apply it again
  - Delete K8s components with the config file
- Imperative
- Telling Kubernetes WHAT to do
- We operate directly on live objects
- Declarative
- Telling Kubernetes WHAT we want as the end result in the config file
- We operate on object configuration files
- We don't define the operations, these are automatically detected by kubectl
Which one to use?
- Imperative
- Practical when testing
- Or for quick one-off tasks
- Or when just getting started
- Declarative
- History of configurations
- Infrastructure as Code in Git Repo
- Collaboration and review processes possible
- More transparent
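A quick sketch of the difference in practice (the deployment name and image here are just examples):

# Imperative: operate directly on live objects with explicit commands
kubectl create deployment my-nginx --image=nginx
kubectl scale deployment my-nginx --replicas=3

# Declarative: describe the desired end state in a file; kubectl figures out the operations
kubectl apply -f my-nginx-deployment.yaml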
- On Control Plane & Worker:
  - Container Runtime (run as a regular Linux process)
  - Kubelet (run as a regular Linux process)
  - Kube Proxy (Pod)
- Only on Control Plane:
  - Api Server (Pod)
  - Scheduler (Pod)
  - Controller Manager (Pod)
  - ETCD (Pod)
- Master components are deployed as Pods
- Pods are deployed by master components:
  - Send a request to the API Server
  - Scheduler decides where to place the Pod
  - Pod data is stored in the etcd store
- How to schedule the Master Pods then? (The Egg and Chicken Problem!)
Static Pods
- Are managed directly by the kubelet daemon
- Without control plane
- Regular Pod Scheduling
API Server gets the request --> Scheduler: which Node? --> Kubelet: schedules the Pod
- Static Pod Scheduling
Kubelet: schedules Pod
How does that work?
- Kubelet watches a specific location on the Node it is running on:
  /etc/kubernetes/manifests
- It schedules the Pod when it finds a Pod manifest in that folder
- Why is it called a static Pod?
- How is it different?
  - Kubelet (NOT the Controller Manager) watches static Pods and restarts them if they fail
  - Pod names are suffixed with the node hostname
- First step when installing a K8s cluster:
- Generate static Pods manifests
- Put those config files into the correct folder
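For illustration, a minimal sketch of what such a static Pod manifest could look like (file name and Pod name are made up); the kubelet picks it up from /etc/kubernetes/manifests and keeps it running:

# /etc/kubernetes/manifests/static-web.yaml (example)
apiVersion: v1
kind: Pod
metadata:
  name: static-web
spec:
  containers:
    - name: web
      image: nginx
      ports:
        - containerPort: 80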
Everything needs a certificate...
How does it work?
- Generate self-signed CA certificate for Kubernetes (cluster root CA)
- Sign all client and server certificates with it
- Certificates are stored in:
/etc/kubernetes/pki
- Each component gets a certificate, signed by the same certificate authority
  - Proof of the component's identity and that it is part of the same cluster
- Generate a self-signed CA certificate for the whole Kubernetes cluster (cluster root CA)
- Sign all client and server certificates with it:
  - Server certificate for the API server endpoint
  - Client certificate for scheduler and controller manager
  - Server certificate for Etcd and Kubelet
  - Client certificate for the API Server to talk to Kubelet and Etcd
  - Client certificate for the Kubelet to authenticate to the API Server
Public Key Infrastructure
- Governs the issuance of certificates to:
- Protect sensitive data
- Provide unique digital identities for applications, users and devices
- Secure end-to-end communication
For a K8s cluster, we need to do all the steps above, plus provide some other configuration details.
But this is complex and time-consuming when done manually.
- Kubeadm
  - Toolkit for bootstrapping a best-practices K8s cluster
  - Provides fast paths for creating a K8s cluster
  - Performs the actions necessary to get a minimum viable cluster
  - It cares only about bootstrapping, not about provisioning machines
- All servers:
  sudo swapoff -a
- All servers: add all the server IPs and corresponding names to the /etc/hosts file
  sudo vim /etc/hosts
  e.g.:
  172.31.44.88 master
  172.31.44.219 worker1
  172.31.37.5 worker2
- All servers:
  sudo hostnamectl set-hostname <corresponding name, e.g. master>
- All servers:

cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF
sudo modprobe overlay
sudo modprobe br_netfilter
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
EOF
sudo sysctl --system

- All servers:

sudo apt update
sudo apt install -y containerd
sudo mkdir -p /etc/containerd
containerd config default | sudo tee /etc/containerd/config.toml
sudo systemctl restart containerd
service containerd status
- Kubelet
  - Does things like starting pods and containers
  - Component that runs on all the machines in your cluster
- Kubeadm
  - Command line tool to initialize the cluster
- Kubectl
  - Command line tool to talk to the cluster
- All servers:

sudo apt-get update
sudo apt-get install -y apt-transport-https ca-certificates curl
sudo curl -fsSLo /usr/share/keyrings/kubernetes-archive-keyring.gpg https://packages.cloud.google.com/apt/doc/apt-key.gpg
echo "deb [signed-by=/usr/share/keyrings/kubernetes-archive-keyring.gpg] https://apt.kubernetes.io/ kubernetes-xenial main" | sudo tee /etc/apt/sources.list.d/kubernetes.list

NOTE-1: Kubelet, Kubeadm and Kubectl MUST ALL be the SAME VERSION...
NOTE-2: To see all available versions, use: apt-cache madison kubeadm
- All servers:

sudo apt-get update
sudo apt-get install -y kubelet=<VERSION> kubeadm=<VERSION> kubectl=<VERSION>
sudo apt-mark hold kubelet kubeadm kubectl
- preflight: Checks to validate the system state before making any changes
- certs: Generates a self-signed CA to set up identities for each component in the cluster
- kubeconfig: Writes kubeconfig files in `/etc/kubernetes`
- Master:
  sudo kubeadm init
  Generated files and folders:
  - /etc/kubernetes
  - /etc/kubernetes/manifests/*
  - /var/lib/kubelet
  - /var/lib/kubelet/pki
  - /var/lib/kubelet/config.yaml
- Now we can only interact with kubectl this way:
  sudo kubectl get nodes --kubeconfig=/etc/kubernetes/admin.conf
  But it's not an efficient way. So we have three options:
  1. File passed with the --kubeconfig flag
  2. KUBECONFIG environment variable
  3. File located in the $HOME/.kube/config folder
  Option 1 is not efficient because you would have to pass this flag every time...
  Option 2 is not efficient either because it only works in the current session...
  But option number 3 is great and actually a best practice...
- Create the .kube folder
  mkdir -p ~/.kube
- Copy admin.conf to this folder
  sudo cp -i /etc/kubernetes/admin.conf ~/.kube/config
- Change the owner of this file to ourselves
  sudo chown $(id -u):$(id -g) ~/.kube/config
- Now everything is good, and we don't have to use sudo or --kubeconfig
  kubectl get nodes
What is a Namespace?
- Organise resources in namespaces
- Virtual cluster inside a cluster ("Cluster inside a Cluster")
4 Namespaces by default:
- kube-system
- kube-public
- kube-node-lease
- default
Create Namespace:
- Command-line
kubectl create namespace <MY_NAMESPACE>
- Configuration File
apiVersion: ...
metadata:
  name: ...
  namespace: <MY_NAMESPACE>
- Resource grouped in Namespace
- Conflicts: Many teams, same application
- Resource Sharing:
- Staging and Development
- Blue/Green Deployment
- Access and Resource Limits on Namespace
- Each team has its own, isolated environment
- Limit: CPU, RAM, Storage per NS
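As a rough sketch (namespace name and numbers are made up), such per-Namespace limits are typically enforced with a ResourceQuota object:

apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-a-quota
  namespace: team-a
spec:
  hard:
    requests.cpu: "4"
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi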
Use Cases when to use Namespace
- Structure your components
- Avoid conflicts between teams
- Share services between different environments
- Access and Resource Limits on Namespace Level
Some other note
- You can't access most resources from another Namespace
- There are some components which can't be created within a Namespace
  - They live globally in the cluster
  - You can't isolate them
- You can get a list of all of those with this command:
  kubectl api-resources --namespaced=false
Now that we know what Namespaces are and How they are used, It's time to jump into these question:
- What Namespaces do we have in our cluster?
- What Pods are running in those Namespaces?
kubectl get ns

default:
- For your applications, when you don't create a specific ns
- The default Namespace is used when executing kubectl commands
- To list resources in another Namespace:
  kubectl get pods -n kube-system

kube-system:
- Control Plane Pods are located in the kube-system ns

kube-public:
- It contains a single ConfigMap object, cluster-info, that aids discovery and security bootstrap (basically, contains the CA for the cluster and such). This object is readable without authentication.

kube-node-lease:
- It makes node heartbeats significantly cheaper from both a scalability and a performance perspective.
- Why is the Pod abstraction useful? - Container vs. Pod
- When are multiple containers necessary?
- How do containers communicate in a Pod?
Every Pod has a unique IP address.
IP address reachable from all other Pods in K8s cluster.
- Container Port Mapping WITHOUT Pods:
  Bind the host port to the application port in the container (5432:5432)
- Own IP address
- Own Network namespace
- Virtual Ethernet Connection
- Pod is a Host
- No conflicts...
If you change the container runtime in k8s, e.g. from containerd to docker, K8s configuration would stay the same...
Multiple containers in a Pod
- Helper or sidecar application to your main application
- Called side-car containers
- Containers can talk via localhost and port

Pause container in each Pod:
- Also called sandbox container
- Reserves and holds the network namespace (netns)
- Enables communication between containers
Networking WITHIN PODS
- No built-in solution
- Expects you to implement a networking solution
- But imposes fundamental requirements on any implementation to be pluggable into Kubernetes:
  - Every Pod gets its own unique IP address
  - Pods on the same Node can communicate with that IP address
  - Pods on a different Node can communicate with that IP address without NAT (Network Address Translation)
- k8s doesn't care about the exact IP address
Many networking solutions exist which implement this model
- Each Node gets an IP address from the IP range of the VPC
- Pods are isolated with their own private network
- On each Node a private network with a different IP range is created (a bridge, via a CNI plugin, e.g. Weave Net)
  - IP address ranges should not overlap!
- The bridge enables Pod communication on the same Node
How do we make sure each Node gets a different set of IP addresses? We need to ensure unique IPs --> because K8s
doesn't care!
- CNI Plugin! (e.g. Weave Net)
- Each Node gets an equal subset of this IP range
Virtual Private Networks with own sets of IP addresses
- They can't talk directly, because of the private isolated networks
- Pods can communicate via Gateways
- The Network Plugin creates 1 large Pod Network
  - Why? All Nodes are in the same network
  - They can talk directly via their IPs
- Each Node can access Pods via the virtual pod network on its Node
- K8s requirements for CNI Plugins
- Every Pod gets its own unique IP address
- Pods on same Node can communicate with that IP address
- Pods on a different Node can communicate with that IP address without NAT (Network Address Translation)
So how to manage thousands of Nodes? We need a more automated & scalable solution...
- CNI Plugins solved this!

Now we should do something about the master node status, because currently it is in the NotReady state... also
the coredns Pods can't become Ready because of our node state... so we should implement a networking solution to
achieve these goals... the solution is installing a CNI Plugin... In this scenario, we are going with Weave Net
- Note: ONLY after this step can we add Worker Nodes to our cluster...
Pod Network IP Address Range should not overlap with Node IP Address Range --> VPC IP != Nodes IP
- The default range that Weave Net would like to use is 10.32.0.0/12 - a 12-bit prefix, where all addresses start with the bit pattern 000010100010, or in decimal everything from 10.32.0.0 through 10.47.255.255.
- Before installing Weave Net, you should make sure the following ports are not blocked by your firewall: TCP 6783 and UDP 6783/6784. For more details, see the docs.
- Master:
  wget "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')" -O weave.yaml
- You can change the default IP range:
  - Master:
    vim weave.yaml
    # Then find:
    # containers:
    #   - name: weave
    #     command:
    #       - /home/weave/launch.sh
    # And add e.g. `- --ipalloc-range=100.32.0.0/12` below the first command...
- Master:
  kubectl apply -f weave.yaml

Be sure to check the IP addresses of the Pods and Nodes in the kube-system namespace...
We already installed:
- containerd
- kubeadm
- kubelet
- kubectl

Now let's join the worker nodes...
- NOTE: A bidirectional trust needs to be established:
  - Discovery (the Node trusts the K8s Control Plane)
  - TLS bootstrap (the K8s Control Plane trusts the Node)
- Master:
  kubeadm token create --print-join-command

Paste the output of this command on the Worker Node(s). Note that you can reuse this output as many times as you want.
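The printed join command usually looks roughly like this (the IP, token and hash below are placeholder/example values):

sudo kubeadm join 172.31.44.88:6443 --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:<hash-of-the-cluster-ca-cert>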
- preflight
- run join pre-flight checks
- kubelet-start
- write kubelet settings, certificates and (re)starting the kubelet
- Now that the workers are added to the cluster, let's check a weave-net Pod's logs...
  kubectl logs -n kube-system pod/weave-net-4xtwz -c weave

As you can see, there was a connection error that says the weave-net Pods cannot reach each other.
We need to open a port on each server for weave-nets.
- They listen on port 6783
- To check the weave-net status:
  kubectl exec -n kube-system weave-net-4xtwz -c weave -- /home/weave/weave --local status
- And now you can deploy a test application:
  kubectl run test --image=nginx
Now let's deploy an Nginx Deployment with 2 Pods and a test Service for it
-
Nginx Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx
          ports:
            - containerPort: 80
-
Service
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
  labels:
    app: nginx
    svc: nginx-test
spec:
  selector:
    app: nginx
  ports:
    - protocol: TCP
      port: 8080
      targetPort: 80
- Important commands:
  - Check the IP addresses:
    kubectl get all -o wide
  - Check the IP + Endpoints:
    kubectl describe svc nginx-service
  - Check Endpoints:
    kubectl get ep
  - Check every label:
    kubectl get XXX --show-labels
  - Filter based on labels:
    kubectl logs/get/etc -l <label>=<value>
We can scale our application up/down by editing the deployment.yaml file of that deployment. But for quickly testing something:
kubectl scale deployment nginx-deployment --replicas=5

We can track our changes in the cluster via the --record flag...
The --record flag is valid for most of the K8s commands (scale/create/etc.):
kubectl scale deployment nginx-deployment --replicas=5 --record

Then you can see it via:
- Command line:
  kubectl rollout history deployment nginx-deployment
- In annotations:
  kubectl get deployments.apps nginx-deployment -o yaml | less
- We will curl the Nginx Service from inside that Pod
- No need for a Deployment Configuration File:
  kubectl run test-nginx-svc --image=nginx
- Then we want to get an interactive terminal:
  kubectl exec -it test-nginx-svc -- bash
- Then we want to curl the nginx app via its Service:
- IP:PORT
  curl http://<SERVICE-IP>:8080
- DNS
  curl http://nginx-service:8080
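If the test Pod were running in a different namespace than the Service, the short name would not resolve; assuming the default cluster domain, the fully qualified name follows the <service>.<namespace>.svc.cluster.local pattern:

curl http://nginx-service.default.svc.cluster.local:8080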
DNS Server is:
- Nameserver in the K8s cluster
- Manages the list of Service Names and their IP addresses
- All Pods point to this nameserver
- The DNS Server in Kubernetes is: CoreDNS
- CoreDNS Pods run in the kube-system namespace
- 2 Replicas in the kube-system namespace
- In a production cluster, a minimum of 2 replicas
- Check the logs of CoreDNS:
  kubectl logs -n kube-system -l k8s-app=kube-dns
kubectl run -it test-nginx-svc --image=nginx -- bash

Inside that:
cat /etc/resolv.conf

The result would look like:
nameserver 10.96.0.10 # IP address of CoreDNS
search default.svc.cluster.local svc.cluster.local cluster.local
options ndots:5
- kubelet automatically creates the /etc/resolv.conf file for each Pod
- You can see the cluster DNS IP by:
  sudo cat /var/lib/kubelet/config.yaml
- As we already know, the Pod IP addresses come from the CNI. The Api-server, Etcd, Kube-Proxy, Scheduler, and Controller-Manager IP addresses come from the Server/Node IP address
- But how about the Services in K8s (ClusterIP type)?
ClusterIP Type:
- Default Service Type
- Exposes the Service on a cluster-internal IP
- Service only reachable from within the cluster
- IP address range is defined in the Kube API Server configuration
If we check the API Server configuration, we can see the --service-cluster-ip-range=10.96.0.0/12 option in the command section, a
CIDR notation IP range from which to assign Service cluster IPs:
sudo vim /etc/kubernetes/manifests/kube-apiserver.yaml
- See all default configurations:
# $ kubeadm config print init-defaults
apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
  - groups:
      - system:bootstrappers:kubeadm:default-node-token
    token: abcdef.0123456789abcdef
    ttl: 24h0m0s
    usages:
      - signing
      - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 1.2.3.4
  bindPort: 6443
nodeRegistration:
  criSocket: unix:///var/run/containerd/containerd.sock
  imagePullPolicy: IfNotPresent
  name: node
  taints: null
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta3
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: { }
dns: { }
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: k8s.gcr.io
kind: ClusterConfiguration
kubernetesVersion: 1.24.0
networking:
  dnsDomain: cluster.local
  serviceSubnet: 10.96.0.0/12
scheduler: { }
- You can configure the Kube API Server with many different options:
  1. When bootstrapping the cluster via kubeadm init --service-cidr <IP Range>
  2. Change kube-apiserver directly (the kubelet periodically scans the configuration for changes):
     sudo vim /etc/kubernetes/manifests/kube-apiserver.yaml
- Note that with option number 2, you are going to get the "The connection to the server IP:6443 was refused - did you specify the right host or port?" error for a while, so you have to wait a couple of minutes for kube-apiserver to start again...
- The new CIDR block only applies to newly created Services, which means old Services remain in the old CIDR block. For testing:
  kubectl create service clusterip test-cidr-block --tcp 80:80
Then Check the newly created Service.
Our goal here is to create a service that balances connections to two different deployments. You might use this as a simplistic way to run two versions of your apps in parallel.
In the real world, you'll likely use a 3rd party load balancer to provide advanced blue/green or canary-style
deployments. Still, this assignment will help further understand how service selectors are used to find pods to use
as service endpoints.
For simplicity, version 1 of our application will use the NGINX image, and version 2 will use the Apache image. They
both listen on port 80 by default.
When we connect to the service, we expect to see some requests served by NGINX and some by Apache.
Let's create a clusterIP service
apiVersion: v1
kind: Service
metadata:
labels:
app: custom-lb
name: custom-lb
spec:
type: ClusterIP
# The selector of that service will need to match the pods created by both deployments.
selector:
svc: clb
ports:
- protocol: TCP
port: 8080
      targetPort: 80

Then create our Nginx Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: nginx
name: nginx
spec:
replicas: 1
selector:
matchLabels:
app: nginx
template:
metadata:
# We will need to change the deployment specification to add an extra label (svc: clb) to be used solely by the service.
labels:
app: nginx
svc: clb
spec:
containers:
- image: nginx
name: nginx
ports:
        - containerPort: 80

Then create our Apache Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: apache
name: apache
spec:
replicas: 1
selector:
matchLabels:
app: apache
template:
metadata:
# We will need to change the deployment specification to add an extra label (svc: clb) to be used solely by the service.
labels:
app: apache
svc: clb
spec:
containers:
- image: httpd
name: httpd
ports:
        - containerPort: 80

Apply the files, then use the curl command to show responses from both NGINX and Apache.
Note: you won't see a perfect round-robin, i.e., NGINX/Apache/NGINX/Apache, etc., but on average, Apache and NGINX
should serve approximately 50% of the requests each.
- The "pod-to-pod network" or "pod network":
- Provides communication between pods and nodes
- Is generally implemented with CNI plugins
- The "pod-to-service network":
- Provides internal communication and load balancing
- Is generally implemented with
kube-proxy
- Network policies:
- Provide firewalling and isolation
- Can be bundled with the "pod network" or provided by another component
- Inbound traffic can be handled by multiple components:
- Something like
kube-proxy(forNodePortservices) (Don't worry. We're going to go throughNodePortin the next section) - Load balancers (ideally, connected to the pod network)
- Something like
- It is possible to use multiple pod networks in parallel (with "meta-plugins" like CNI-Genie or Multus)
ClusterIP = Internal Service
NodePort = External Service
apiVersion: v1
kind: Service
metadata:
name: nginx-service
labels:
app: nginx
svc: nginx-test
spec:
type: NodePort # ClusterIP is default, thats why we didnt need to specify the type before
selector:
app: nginx
ports:
- protocol: TCP
port: 8080
targetPort: 80
      nodePort: 30000

- Internal Service accessible on Service IP address + Port
- NodePort also creates a ClusterIP internal Service
- NodePort opens a port on each Worker Node
- External traffic has access to a fixed port on each Worker Node!
- Range: 30000-32767
- Sometimes, it's the only available option for external traffic (e.g. most clusters deployed with kubeadm or on-premises)
NodePort Service Accessibility:
- NodePort accessible from outside the cluster
- On the IP address of the Node + the NodePort
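For example, assuming one of the Worker Nodes is reachable from your machine, the Service above could be tested with (the node IP is a placeholder):

curl http://<WORKER-NODE-IP>:30000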
At this point, our application is accessible from outside, but it has a couple of downsides:
- Not user friendly
- Insecure & messy!
- It is OK for testing
- Better Alternative: Loadbalancer Service Type
apiVersion: v1
kind: Service
metadata:
name: nginx-service
labels:
app: nginx
svc: nginx-test
spec:
type: LoadBalancer # The configuration of Loadbalancer is exactly like NodePort except this part...
selector:
app: nginx
ports:
- protocol: TCP
port: 8080
targetPort: 80
      nodePort: 30000 # Notice we still have a nodePort!

How the Loadbalancer type works:
- Loadbalancer outside of the K8s cluster
- Which accepts traffic as the entrypoint
- Loadbalances to 1 of the Worker Nodes on the NodePort
- Then it gets forwarded to the ClusterIP
- Accessible at its own IP address & port
- The Loadbalancer is not created inside the cluster
- Who is responsible for creating the loadbalancer?
  Cloud providers, which offer Kubernetes managed services (EKS, AKS, GKE, etc)
- But, I have a self-managed K8s cluster:
  - No Loadbalancer out-of-the-box... You need to create your own Loadbalancer
  - Create a Loadbalancer for each Service
Some notes:
- When using managed K8s Service, like EKS:
- Cloud native loadbalancer will be provisioned for your Service
- Here we are going to create a loadbalancer from scratch on AWS
- From the EC2 management console go to Load Balancing / Load Balancers
- Choose Application Load Balancer
- Give it some name (doesn't actually matter!)
- In the Availability Zones section, we should select the zones that our Worker Nodes are in
- Then you must select at least 2 subnets (very specific to AWS)
- In Configure Routing (this is the part where we decide where the Loadbalancer forwards the requests that it gets):
  - Select a name
  - In the port section, we give it the nodePort from the Service configuration (in our case 30000)
- In Register Targets, we should specify which instances these are exactly. In our case these are the Worker Nodes (ONLY the Worker Nodes, NOT the master), and then Register
- After a while your Loadbalancer will go from the Provisioning to the Active state
Now, how do we access the Loadbalancer?
- A Loadbalancer has an IP address, but it also has a Domain Name

Now if you paste the Loadbalancer Domain Name into the browser, you should still see the "Welcome to nginx!" page
- Self-Managed K8s cluster: Create Loadbalancer yourself
- Managed K8s cluster: Creates Loadbalancers automatically
- Loadbalancer Disadvantages:
  - Many Loadbalancers that all become entrypoints
  - Configure a Domain Name for each
  - Each Loadbalancer exposes a new NodePort
  - Each Loadbalancer increases the cloud bill
  - Configure everything outside the cluster
- Wouldn't it be better:
  - Having this as part of the K8s cluster?
  - Configure secure connections
  - Loadbalancing to different services
- Ingress
  - A K8s component
  - Configures Routing
  - Configures https
- Ingress is deployed and available inside the cluster
- We need to expose it either as NodePort or Loadbalancer
  - 1 NodePort or Loadbalancer, which is the single entrypoint
- You need an implementation of Ingress, which is the Ingress Controller
- Evaluates and processes Ingress rules
- Manages redirection
- Entrypoint to cluster
- Many third-party implementations (Default:
K8s Nginx Ingress Controller)
- Cloud Service provider:
- Out-of-the-box K8s solution
- own virtualized Loadbalancer
- Advantage: You don't have to implement Loadbalancer yourself
- Bare Metal:
- You need to configure some kind of entrypoint
- Either inside of cluster or outside as separate server (Software or Hardware solution)
- Separate server
- Public IP address and open ports
- Entrypoint to cluster
No server in K8s cluster is accessible from outside!
- Important Notes:
  - Data keys need to be tls.crt and tls.key
  - Values are file contents, NOT file paths/locations
  - The Secret component must be in the same namespace as the Ingress component
We deploy our Ingress Controller via Helm charts...
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm repo update
helm install [RELEASE_NAME] ingress-nginx/ingress-nginx

Now check this out:
helm ls # Check for the chart that we just deployed
kubectl get pod # Check for the ingress controller pod
kubectl get svc # Check for 2 ingress services (ClusterIP & Loadbalancer)

Now you may wonder: why a Loadbalancer Service type?
Now we should Configure a LB again... Check this section
At this point if you check the LB IP address, you should get a Nginx Error (504 Gateway Time-out OR 404 Not Found).
It is because we didn't configure it yet...
So as a next step: Configure Ingress Controller to route the traffic
- The Ingress component is like a configuration piece for the Ingress Controller
- We define HTTP rules, which the Ingress Controller fulfills
- The K8s way to configure our routing logic
- Change the ingress-service back to a ClusterIP Service
  - Only accessible internally!
- The Ingress Controller is the only entrypoint of the K8s cluster
First, let's generate an Ingress template:
kubectl create ingress my-app-ingress --rule=host/path=service:port --dry-run=client -o yaml > my-ingress.yaml

Open the file and clean it up... After this, you should have something like this:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: my-app-ingress
spec:
rules:
- host: host
http:
paths:
- backend:
service:
name: service
port:
name: port
path: /path
      pathType: Exact

Ingress Rules:
Each HTTP rule contains:
- An optional host
- A list of paths, each of which has an associated backend
- A backend is a combination of Service and port names

Path Types:
- Exact: Matches the URL path exactly and with case sensitivity
- Prefix: Matches based on a URL path prefix split by /
We should change this template to:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: my-app-ingress
namespace: default
spec:
ingressClassName: nginx
rules:
- host: www.kube.com # The Loadbalancer Domain (MUST be domain, NOT IP address)
http:
paths:
- pathType: Exact
backend:
service:
name: nginx-service
port:
number: 8080
    path: /my-app

- Apply it:
  kubectl apply -f my-ingress.yaml
- Notes: If you get a ...Failed calling webhook "validate.nginx.ingress.kubernetes.io...": error:
  - See the API:
    kubectl get validatingwebhookconfigurations.admissionregistration.k8s.io
  - Edit the manifest file:
    kubectl edit validatingwebhookconfigurations.admissionregistration.k8s.io
  - Change the failurePolicy:
    # Change the failurePolicy: Fail to:
    failurePolicy: Ignore
- And now everything should be fine:
  kubectl get ingress
- The Ingress component is in the same namespace as our internal Service
Now if you paste your URL in the browser, you should see the Nginx result (Instead of the errors that we got
before...)
- Note: If you are on bare-metal, you should also add the IP and URL to your /etc/hosts --> e.g. 10.152.183.110 www.kube.com

Now let's change our my-ingress.yaml file to:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: my-ingress
namespace: default
spec:
ingressClassName: nginx
rules:
- host: www.kube.com
http:
paths:
- pathType: Exact
backend:
service:
name: nginx-service
port:
number: 8080
    path: /my-app

First check the described ingress:
kubectl describe ingress my-ingress

As you can see, if we now check the URL we get an error, but if we head over to URL/my-app we get 404 Not Found...
This means traffic is now forwarded to the nginx-service, but my nginx doesn't know how to handle it...
So, now we want to handle the issue that our main page/root (URL) doesn't respond. Just add the following lines
under the metadata section:
annotations:
  nginx.ingress.kubernetes.io/rewrite-target: /

- If your app can handle the path, then you don't need this rewrite
Now if you head over to URL/my-app, we can see the Nginx welcome page
If you check the pods, you can see that we have only 1 ingress nginx controller,
In Production, you should have 1 replica per Worker Node
Lecture Overview
- How
AuthenticationandAuthorizationworks in Kubernetes - How to configure users, groups, and their permissions
- Authorization with
Role Based Access Control (RBAC). - Which K8s resources to use to define permissions in the cluster
-
Role:
- With the Role component, you can define namespaced permissions
- Bound to a specific Namespace
- What resources in that namespace can you access? (Pod, Deployment, Service, ...)
- What actions can you do with this resource? ("list", "get", "update", "delete", ...)
"list","get""update","delete", ...) - Define Resources and Access Permissions
- No information on WHO gets these permissions
- How to attach Role Definition to a person or team?
-
RoleBinding:
- Link ("Bind") a Role to a User or Group
- All members of the Group get permissions defined in the Role
How about K8s Admins?
- Managing Namespaces in a cluster
- Configuring cluster-wide Volumes
- ClusterRole:
  - Defines resources and permissions cluster-wide
- ClusterRoleBinding:
- Link ("Bind") a ClusterRole to a User or Group
How do we create Users and Groups?
- Kubernetes doesn't manage Users natively
- Admins can choose from different authentication strategies
- No Kubernetes Objects exist for representing regular user accounts
- Static Token File
- Certificates
- 3rd Party Identity Service
- Admins configure an external source
- The API Server handles authentication of all the requests
- The API Server uses one of these configured authentication methods
- Example for Tokens (users.csv):
  aWpoZGZzaXNoaWZsa2pldWwK, user_1, u1001,group1
  aWpoZGZzaXNoaWZoc2Roc2Ru, user_2, u1002,group2
  aWpoZGZzaXNoaWZzc3Nzc3MK, user_3, u1005,group3
- Pass the token file via the --token-auth-file=/users.csv command option:
  kube-apiserver --token-auth-file=/users.csv [other options]
Admins:
- Manually create certificates for users
- Or configure LDAP as the authentication source
How about Authorization for Applications
- Applications are inside the cluster and outside the cluster
- e.g.:
- Monitoring Apps, which collects metrics from other apps within the cluster (Prometheus)
- CI/CD Server deploying apps inside the cluster (Jenkins)
- Infrastructure Provisioning Tools configuring the cluster (Terraform)
- We also want Apps to have only the permissions they need!
- ServiceAccount: K8s component that represents an Application as a User
  - Link a ServiceAccount to a Role with a RoleBinding (e.g., the CI/CD example)
  - Link a ServiceAccount to a ClusterRole with a ClusterRoleBinding (e.g., Monitoring Apps)

Role Configuration File
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
namespace: my-app
name: developer
rules:
- apiGroups:
- ""
resources:
- pods
verbs:
- get
- list
- watch
resourceNames: # Optional
- "myapp"
- apiGroups:
- ""
resources:
- secrets
verbs:
- get
resourceNames: # Optional
- "myapp"apigroups:""Indicates the core API groupresources: K8s components like Pods, Deployments, etc.verbs:- The actions on a resource
-
get,list(read-only) orcreate,update(read-write)
resourceNames: Define access to only certain pods in that namespace
RoleBinding Configuration File
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: jane-developer-binding
subjects:
- apiGroup: rbac.authorization.k8s.io
kind: User
name: jane
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: developer
ClusterRole Configuration File
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: cluster-admin
rules:
- apiGroups:
- ""
resources:
- nodes
verbs:
- get
- list
- create
- delete
    - update

- Define access for cluster-wide resources
- Define access for namespace resources
- Create Role, ClusterRole, etc. just like any other Kubernetes component:
  kubectl apply -f <ROLE-MANIFEST.YAML>
- View components with the get and describe commands:
  kubectl get roles
  kubectl describe role <NAME>
- Kubectl provides an auth can-i subcommand
  - To quickly check if the current user can perform a given action:
    kubectl auth can-i create deployments --namespace dev
- Admins can also check the permissions of other users
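For example, an admin can impersonate another user with the --as flag (the user and namespace here are placeholders):

kubectl auth can-i list pods --as jane --namespace my-app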
Layers of Security
Let's say our Jenkins application from outside the cluster wants to connect to the cluster:
- API Server checks if Jenkins User is authenticated
- Is Jenkins allowed to connect to the cluster at all?
- You can enable multi-authentication methods at once
- RBAC is one of the multiple Authorization Modes
- With RBAC: Role, ClusterRole, and Bindings are checked
We already saw that when we initialized our cluster, kubeadm generated certificates automatically and stored them:
- Server Certificate in /etc/kubernetes/pki (apiserver.crt & apiserver.key)
- ETCD Certificate in /etc/kubernetes/pki/etcd

Kubeadm init also generated a Client Certificate for other services that talk to the API Server.
Who signed those Certificates?
As we said before, K8s allows you to configure these external sources,
but you have to manage them yourself. In this section, we talk about configuring User Authentication
via Certificates.
Process of signing a Client Certificate:
- Create Key-Pair
- Generate
Certificate Signing Request (CSR) - Send CSR using K8s Certificate API
- K8s signs certificate for you (
PendingState) - K8s admin approves the certificate
In this overview:
- User Account
  - Create a client key with openssl
  - Create a CertificateSigningRequest for the key
  - Approve the CertificateSigningRequest
  - Get the signed certificate
  - Permit to create, delete, and update K8s resources
  - Validate user permission
- Service Account
- Create a Service Account for Jenkins
- Give Permissions to create, delete, and update K8s resources
- Create a client key (Private) with openssl
openssl genrsa -out dev-tom.key 2048
- Create CertificateSigningRequest for key
openssl req -new -key dev-tom.key -subj "/CN=tom" -out dev-tom.csr
- dev-tom-csr.yaml

apiVersion: certificates.k8s.io/v1
kind: CertificateSigningRequest
metadata:
  name: dev-tom
spec:
  request: BASE64-VALUE-OF-THE-CSR-FILE # generated with the command below
  signerName: kubernetes.io/kube-apiserver-client
  expirationSeconds: 86400 # one day
  usages:
    - client auth

- cat dev-tom.csr | base64 | tr -d "\n"
- Docs
- Now apply it with kubectl apply -f dev-tom-csr.yaml and check it:
  kubectl get csr
kubectl certificate approve dev-tom
kubectl get csr dev-tom -o yaml

Under the status section, you see a certificate key whose value is base64-encoded.
Just copy it and:
echo 'BASE64 CODED CERTIFICATE' | base64 --decode > dev-tom.crt

First check the Kubernetes control plane address with kubectl cluster-info, then:
kubectl \
--server https://192.168.220.121:6443 \ # The address from above command
--certificate-authority /etc/kubernetes/pki/ca.crt \
--client-certificate dev-tom.crt \
--client-key dev-tom.key \
  get pod

However, it will fail after executing because we already have a config file in the ~/.kube folder...
Move the ~/.kube/config file somewhere else and execute the command again...
This time it will execute correctly but shows a Forbidden message since the new user doesn't have any Role. (
Authenticated, but no permissions yet!)
- First copy the config to change its content:
  cp config dev-tom.conf
- Then open the file and change all the kubernetes-admin entries to dev-tom
- Then, under the user section, you have two keys: client-certificate-data & client-key-data that we have to change & configure. For that, we have two options (we'll see them both):
  - Reference files
  - Include base64-encoded content
# [ ... ]
users:
- name: dev-tom
user:
client-certificate: dev-tom.crt
      client-key: dev-tom.key

Now kubectl works fine: kubectl --kubeconfig dev-tom.conf get pod
- Now we should give all 3 files (dev-tom.conf, dev-tom.crt, dev-tom.key) to Tom
But isn't it better if we give Tom only one file?!
- Encode dev-tom.crt:
  base64 dev-tom.crt | tr -d "\n"
- Encode dev-tom.key:
  base64 dev-tom.key | tr -d "\n"
- Change the dev-tom.conf like so:

# [...]
users:
  - name: dev-tom
    user:
      client-certificate-data: BASE64 VALUE OF THE dev-tom.crt
      client-key-data: BASE64 VALUE OF THE dev-tom.key
Now everything is working like a charm, and you only have to give dev-tom.conf to Tom.
Alternatively, you can rename dev-tom.conf to config and place it in the ~/.kube folder, and then use the kubectl
command line as before (kubectl get pod)
-
Recap
- User created in the cluster
- Certificate signed by K8s CA
- We have a valid Kubeconfig
- But we don't have permissions to do anything! In this section, we want to give permissions to CRUD common resources in all namespaces. So, in this section we:
  - Create a ClusterRole
  - Create a ClusterRoleBinding
-
First we use an autogenerated template (
dev-cr.yaml) and change it a little bit:
  kubectl create clusterrole dev-cr --verb=get,list,create,update,delete --resource=deployments.apps,pods --dry-run=client -o yaml > dev-cr.yaml

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: dev-cr
rules: # List of "rules"
  - apiGroups: # Rules are defined per apiGroup
      - ""
    resources:
      - pods # Remember: resources are always defined as the "plural of the kind"
      - services
    verbs: ["*"] # All kinds of actions
  - apiGroups:
      - apps
    resources:
      - deployments
      - statefulsets
    verbs:
      - get
      - list
      - create
- be sure to check the docs for all
the
apiGroups. - List of Kubernetes RBAC rule verbs
- be sure to check the docs for all
the
-
Apply this file with
cluster-adminkubeconfigkubectl --kubeconfig config apply -f dev-cr.yaml
-
Check it
kubectl --kubeconfig config get clusterrole
kubectl --kubeconfig config describe clusterrole dev-cr
-
First we use an autogenerated template (
dev-crb.yaml):
  kubectl create clusterrolebinding dev-crb --clusterrole=dev-cr --user=tom --dry-run=client -o yaml > dev-crb.yaml

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: dev-crb
roleRef: # reference an existing role
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: dev-cr
subjects: # reference an existing User, Group or ServiceAccount
  - apiGroup: rbac.authorization.k8s.io
    kind: User
    name: tom
-
Apply this file with
cluster-adminkubeconfigkubectl --kubeconfig config apply -f dev-crb.yaml
-
Check it
kubectl --kubeconfig config get clusterrolebinding
kubectl --kubeconfig config describe clusterrolebinding dev-crb
-
And now, at this point, we can see all the Pods, but NOT node, for example...
kubectl get pods # Returns all the pods
kubectl get nodes # Returns a "Forbidden" error
- auth can-i: kubectl subcommand for quickly querying the API authorization layer
  kubectl --kubeconfig config auth can-i get pods --as tom
kubectl --kubeconfig config auth can-i create pods --as tom
kubectl --kubeconfig config auth can-i get nodes --as tom
K8s distinguishes between a user account and a service account.
- Service Accounts provide an identity for processes that run in a Pod
- Assign permissions to that Service Account via Role and ClusterRole (Bindings)
- Create a ServiceAccount (jenkins-sa.yaml):
  kubectl create serviceaccount jenkins --dry-run=client -o yaml > jenkins-sa.yaml

apiVersion: v1
kind: ServiceAccount
metadata:
  name: jenkins
- Apply this file. Based on your Kubernetes version, you should consider:
  - If your Kubernetes version is before 1.24, the ServiceAccount will automatically create a Token (Secret) for you... this means after running kubectl describe serviceaccount jenkins you should see a Token...
  - But if your Kubernetes version is greater than or equal to 1.24, you should do some extra steps:
- Create a Secret manually for your ServiceAccount:

apiVersion: v1
kind: Secret
type: kubernetes.io/service-account-token
metadata:
  name: jenkins
  annotations:
    kubernetes.io/service-account.name: "jenkins"
- Apply this secret in the same namespace that you applied the jenkins-sa.yaml in, and you are good to go...
Sources
-
NOTE: in the rest of this tutorial, we are going with option number 2. so I assume your Kubernetes version is greater or equal to
1.24.X. If not, it's OK! You can still follow along.
-
Now we want that generated token! for that:
kubectl get secret <SECRET-NAME> -o yaml
-
Save this token variable inside a
$token:token=<THE-TOKEN>
- If your
version < 1.24then you have to firstbase64 --decodethe token and after that save it in$token
- If your
-
Test the connection
kubectl --server <ADDRESS> --certificate-authority /etc/kubernetes/pki/ca.crt --token $token get pod
At this point you can see everything is correct, but still we don't have permissions...
-
Create a configuration file for jenkins (
jenkins.conf)apiVersion: v1 clusters: - cluster: certificate-authority-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUMvakNDQWVhZ0F3SUJBZ0lCQURBTkJna3Foa2lHOXcwQkFRc0ZBREFWTVJNd0VRWURWUVFERXdwcmRXSmwKY201bGRHVnpNQjRYRFRJeU1EVXpNREEzTVRVMU1Wb1hEVE15TURVeU56QTNNVFUxTVZvd0ZURVRNQkVHQTFVRQpBeE1LYTNWaVpYSnVaWFJsY3pDQ0FTSXdEUVlKS29aSWh2Y05BUUVCQlFBRGdnRVBBRENDQVFvQ2dnRUJBTWNQClBsQ1lLb0Yxb2Z1cmxtc01zSXFxc1FXeEd2Z0xnemlDRFNIcDN3S2NrZVBUREE2VW9vZDRmUDNTM21oR25HS1cKNHhRSHdDRlovWjlkMWRuNHVYMjJNN1NQbzRZS2xlNFkzMUlTRmgxM0lCUnJZM0FrN1lkaVo0UlVYamxjWVhrWQo5ZFdyWUYvSlc0MXdsd1RFemZseEsyeXY1bjlnaTgzaVNNN1ZOcXRWRFkvbVZuQ3YwRXNRN1NiTVlnZUI4aG9XCmpGRVozRUxZeDVJQXZ5dmF2K2pxOTJ1bjJ3bmcwWE4yK04rSkxyY0pJTmhrQkRvRC9HMWpmYlVFR3dqeUZGTkoKejROcEhRYnpwZGRDazk4RHFHRkR2L1lPVExoRW16NWpZdktMYmZqeWJoSUF3YkI0UktlM1E4YkZjcm5WVndkcQpxa0FYY1Y3M04xQ2lnckRiMnFzQ0F3RUFBYU5aTUZjd0RnWURWUjBQQVFIL0JBUURBZ0trTUE4R0ExVWRFd0VCCi93UUZNQU1CQWY4d0hRWURWUjBPQkJZRUZLK0M2ZXM1cWc1ejA4Qy9VaEI2Nm1iSW5OdlJNQlVHQTFVZEVRUU8KTUF5Q0NtdDFZbVZ5Ym1WMFpYTXdEUVlKS29aSWh2Y05BUUVMQlFBRGdnRUJBR296WG9DdEJJOE5zd2g4WGxTVwppMEJMWlR4bUpBOU5UQXc2Q29HNDJiMVZoeGh6b2J3NnZDUHdqZnpqMGRXalBMQ0NaRnVXSzM0UTViNkdNWk4rCm5pdG5uOWhvYjFlbW9SVk5OZnBGYUJZOS8zdFZJWlRNTVFQd3JFaDRaNWYrQ3ZMTHpBM0szak1GZFN5U1hWTXMKeFpGYkdIM3hVb2tNbEJHdmJ6VnF0L2s4OGh2Vi91RTVNOFN4NkJpb3pMa2lGY2I1SlpSalA5Z053d01VMHBFeQpSeDQrSmp5elFxQkVCLzRRZzdoZVJNU0NXdDdZNjlNTzc2ZlA3YVhzRDkrd2tpR0tDa3hFeHBBTzE3YTI5ZkE3CkVrZ2VDSWRXVjExUk5JaGo3VkFidXRuV0p1Y0E4TkRFQlkyaHhDZUd6bzE4VFFGbjhDZUhQUEEvdFduZVNibFoKcG9JPQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCg== server: https://192.168.220.121:6443 name: kubernetes contexts: - context: cluster: kubernetes user: jenkins name: jenkins@kubernetes current-context: jenkins@kubernetes kind: Config preferences: {} users: - name: jenkins user: token: <TOKEN>
- If your version < 1.24, then you have to first base64 --decode the token and after that save it in $token
-
You can access the cluster via a configuration file
kubectl --kubeconfig jenkins.conf get pod
At this point, Jenkins is Authenticated, But no permissions.
In this section, we give permissions only to a specific namespace. We use Role for namespaced permissions.
- Security Best Practice: Give the least privilege...
Create Role
- Create a role (cicd-role.yaml):
  kubectl create role cicd-role --verb=create,update,list --resource=deployments.apps,services --dry-run=client -o yaml > cicd-role.yaml

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: cicd-role
  namespace: default # Optional
rules:
  - apiGroups:
      - ""
    resources:
      - services
    verbs:
      - create
      - update
      - list
  - apiGroups:
      - apps
    resources:
      - deployments
    verbs:
      - create
      - update
      - list
Apply this file.
Create RoleBinding
- Create a RoleBinding (cicd-binding.yaml):
  kubectl create rolebinding cicd-binding --role=cicd-role --serviceaccount=default:jenkins --dry-run=client -o yaml > cicd-binding.yaml

apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: cicd-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: cicd-role
subjects:
  - kind: ServiceAccount
    name: jenkins
    namespace: default
Apply this file, then check the jenkins permission.
- Check the permissions
kubectl auth can-i watch deployment --as system:serviceaccount:default:jenkins -n default
kubectl auth can-i create service --as system:serviceaccount:default:jenkins -n my-app
- Is the Pod running?
  kubectl get pod POD_NAME
- Is the Pod registered with the Service?
- Is the Service forwarding the request?
  kubectl get ep
  kubectl describe svc SERVICE_NAME
- Is the Service accessible?
  nc SERVICE_IP SERVICE_PORT
  ping SERVICE_NAME
- Check application logs:
  kubectl logs POD_NAME
- Check Pod status and recent events:
  kubectl describe pod POD_NAME
These were just Simple paths of troubleshooting Pod and Application
Pod Network is different from a Cluster Node Network.
- Execute troubleshooting commands from within a Pod
- Common Docker images with Unix utilities...
- Busybox provides several Unix utilities in a single executable file, e.g. ifconfig, nslookup, netstat, ping, etc.
- Run a single BusyBox Pod:
  kubectl run debug-pod --image=busybox

After running this command, your Pod status will change from Running to Completed..., and you can not execute
kubectl exec -it debug-pod -- sh in it... But why is that?!
- Containers execute a specific task
- Run MySQL database...
- Start web application
- Synchronize data...
- The container exits once the task is done
- Container lives as long as a process inside lives
Where do you see what process a container starts?
- Dockerfile --> CMD command
  - Specifies the instruction that is to be executed when a Docker container starts
- When we look at the busybox Dockerfile we see that the default CMD is sh, so if no terminal is attached, it exits...
So, How to keep the busybox container alive? Here are 2 Options:
- Start busybox in interactive mode (-it):
  kubectl run debug-pod --image=busybox -it
  - Now, the busybox Pod will always be in the Running state. Even if we exit, we can reattach to the container via kubectl exec -it debug-pod -- sh
- Define command and args in the Configuration File...
- Overwrite the Dockerfile command?
  docker run <image_name> <command>
  - This will overwrite the CMD instruction
- Not overwrite, but pass parameters?
  - like sh my-script.sh
  - Not possible with the CMD instruction
  - The Entrypoint instruction allows appending other commands
- CMD
- Main purpose of CMD is to provide defaults for an executing container.
- ENTRYPOINT
- Preferred for the executable that should always run.
- allow users to append other commands.
- CMD & ENTRYPOINT
- ENTRYPOINT = defines the process that starts in the container.
- CMD = provides default arguments for the ENTRYPOINT instruction.
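To make the distinction concrete, a small sketch with the busybox image (its default CMD is sh and it has no ENTRYPOINT):

docker run busybox              # runs the default CMD: sh (exits immediately when no terminal is attached)
docker run busybox echo hello   # the trailing arguments replace CMD; an ENTRYPOINT, if defined, would still run first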
How does this map to Kubernetes?
- ENTRYPOINT is the command.
- CMD is the args.

apiVersion: v1
kind: Pod
metadata:
  name: busybox-pod
spec:
  containers:
    - image: busybox
      name: busybox-container
      command: ["sh"] # ENTRYPOINT
      args: ["-c", "while true; do echo Hello-World!; sleep 5; done"] # CMD
-
Overwrite commands using K8s configuration
- No need for adjustment in the Docker Image
- Flexibility overwrite in Pod configuration
Execute commands in the Pod environment without entering the Pod.
kubectl exec -it busybox-pod -- sh -c "while true; do echo hello; sleep 2; done"

- BASH or SHELL?
- bash is a superset of shell
- bash has more functionality & more elegant syntax
- Some images don't have bash available
- In previous sections, we learned Debugging inside the Pod
- Now, we are going to Debug cluster components using kubectl
- Exact information in a digestible way
- Get a massive list of attributes about our nodes:
kubectl get no -o json
But this is too much! well, Kubectl uses JSONPath expressions to filter on specific fields in the
JSON object and format the output...
- -o wide: Standard output with additional information.
- -o yaml: Output a YAML formatted object.
- -o json: Output a JSON formatted object.
- JsonPath is a query language for JSON
- Similar to XPath for XML
- Kubectl uses JSONPath expressions to filter on specific fields
- Get the first pod name:
  kubectl get pod -o jsonpath='{.items[0].metadata.name}'
- Get all pod names:
  kubectl get pod -o jsonpath='{.items[*].metadata.name}'
- Iterate over the names:
  kubectl get pod -o jsonpath='{range .items[*]}{.metadata.name}{"\n"}{end}'
- Get names + IP addresses:
  kubectl get pod -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.podIP}{"\n"}{end}'
- We can write custom scripts for JSONPath expressions
- Add columns using the custom-columns output format:
  kubectl get pod -o custom-columns=POD_NAME:.metadata.name,POD_IP:.status.podIP,CREATED_AT:.status.startTime
POD_NAME                                    POD_IP         CREATED_AT
nginx-deployment-74d589986c-6bqv4           10.1.116.82    2022-07-02T10:10:48Z
nginx-deployment-74d589986c-gpfm7           10.1.116.105   2022-07-02T10:10:48Z
ingress-nginx-controller-6587d85c87-tzwpd   10.1.116.91    2022-07-09T05:19:27Z
-
In this section we learned:
- Some useful ways of debugging
- Find information more efficiently
What if one of our worker nodes gets into NotReady status?! This problem is most commonly for kubelet, So,
Check the Kubelet process.
- Check the kubelet status:
  service kubelet status
- Check extended logs of the kubelet service with journalctl:
  journalctl -u kubelet
- Check where the kubelet is:
  which kubelet
- Open the kubelet configuration (check the path via service kubelet status):
  sudo vim /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
# Note: This dropin only works with kubeadm and kubelet v1.11+
[Service]
Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf"
Environment="KUBELET_CONFIG_ARGS=--config=/var/lib/kubelet/config.yaml"
# This is a file that "kubeadm init" and "kubeadm join" generates at runtime, populating the KUBELET_KUBEADM_ARGS variable dynamically
EnvironmentFile=-/var/lib/kubelet/kubeadm-flags.env
# This is a file that the user can use for overrides of the kubelet args as a last resort. Preferably, the user should use
# the .NodeRegistration.KubeletExtraArgs object in the configuration files instead. KUBELET_EXTRA_ARGS should be sourced from this file.
EnvironmentFile=-/etc/default/kubelet
ExecStart=
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS
- In the ExecStart section, check that the /usr/bin/kubelet path is the same path as you got from which kubelet
- Reload the service:
  sudo systemctl daemon-reload
  sudo systemctl restart kubelet
- Check the kubelet service again:
  service kubelet status
If you can not connect to the cluster: Check the ~/.kube/config file:
- Check if the CA Certificate is correct:
  echo 'CA-CERTIFICATE' | base64 --decode
  sudo cat /etc/kubernetes/pki/ca.crt
  - These two values should be the same.
- Check if the server endpoint is correct
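A possible way to compare them (the temporary file path is just an example):

# Extract the CA from the kubeconfig and diff it against the cluster CA
kubectl config view --raw -o jsonpath='{.clusters[0].cluster.certificate-authority-data}' | base64 --decode > /tmp/kubeconfig-ca.crt
sudo diff /tmp/kubeconfig-ca.crt /etc/kubernetes/pki/ca.crt && echo "CA matches"
# The server endpoint in ~/.kube/config should match the control plane address shown by:
kubectl cluster-info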
Great! Now developers have access to the cluster and know how to debug it. but let's say they have some scripts that are NOT part of the main application, e.g., Updating the cache, Doing authentication tasks, collecting logs, etc., or Scripts that run "before" each app start-up, or Preparing the environment before app start-up...
- What is the best way to deploy these scripts?
- How to do it in Kubernetes?
- You can have multiple containers inside the Pod
- Main and Helper application
- The container providing helper functionality is called the sidecar container
- Usually operates asynchronously
- Can talk to each other using localhost (WithOUT DNS, IP, etc.)
- Can share data
- e.g.:
- Set Environment Variables
- System Checks
- Wait for service to be available
- Run once in the beginning and exits
- the Main container starts afterward
- Init containers are used to initialize something inside your Pod

apiVersion: v1
kind: Pod
metadata:
  name: myapp-pod
spec:
  containers:
    - name: myapp-container
      image: busybox:1.8
      command: ['sh', '-c', 'echo the app is running!']
  initContainers:
    - name: init-myservice
      image: busybox:1.28
      command: ['sh', '-c', 'COMMAND']
- Create a file and apply it
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx
          ports:
            - containerPort: 80
        - name: log-sider
          image: busybox
          command: ['sh', '-c', "while true; do echo sync app logs; sleep 20; done"]
- Checks these things out:
- Pods and their creation
- the pod logs (when you have multiple containers, you always need to specify the container!)
- Adding an Init container to the previous file:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx
          ports:
            - containerPort: 80
        - name: log-sider
          image: busybox
          command: ['sh', '-c', "while true; do echo sync app logs; sleep 20; done"]
      initContainers:
        - name: mydb-available
          image: busybox
          command: ['sh', '-c', "until nslookup mydb-service; do echo waiting for database; sleep 4; done"]
- Check that the pods remain in the Init state...
- Create a service and see that the Pods state will change!
kubectl create svc clusterip mydb-service --tcp=80:80
Let's say you need some data about your application's Pod or the K8s environment, e.g. to add Pod information as metadata to logs, such as:
- Pod IP
- Pod Namespace
- Service Account of the Pod
- But how do you access this information?
- All Pod information can be made available in the config file
- There are two ways to expose Pod fields to a running container:
- Environment Variables
- Volume Files
apiVersion: apps/v1
kind: Deployment
metadata:
name: nginx-deployment-env
labels:
app: nginx
spec:
replicas: 3
selector:
matchLabels:
app: nginx
template:
metadata:
labels:
app: nginx
spec:
containers:
- name: nginx
image: nginx
ports:
- containerPort: 80
- name: log-sider
image: busybox
command: [ 'sh', '-c' ]
args:
- while true; do
echo sync app logs;
printenv POD_NAME POD_IP POD_SERVICE_ACCOUNT;
sleep 20;
done;
env:
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: POD_IP
valueFrom:
fieldRef:
fieldPath: status.podIP
- name: POD_SERVICE_ACCOUNT
valueFrom:
fieldRef:
fieldPath: spec.serviceAccountName

Introduction to Kubernetes Volumes: how do we persist data in K8s using volumes?
- Persistent Volume
- Persistent Volume Claim
- Storage Class
- Storage that doesn't depend on the pod lifecycle.
- Storage must be available on all nodes.
- Storage needs to survive even if cluster crashes.
- Persistent Volume is more like A cluster resource. (like CPU & RAM)
- Create via a YAML file
kind: PersistentVolume
spec: defines e.g. how much storage
- Needs actual physical storage, like:
- Cloud Storage
- NFS server
- Local disk
Where does this Storage come from, and who makes it available to the cluster?
- What Type of Storage do you need?
- You need to create and manage them by yourself
- Think of storage as an external plugin to your cluster
-
NFS Storage
apiVersion: v1 kind: PersistentVolume metadata: name: pv-name spec: capacity: storage: 5Gi # How much volumeMode: Filesystem accessModes: # Additional params, like access - ReadWriteOnce persistentVolumeReclaimPolicy: Recycle storageClassName: slow mountOptions: - hard - nfsvers=4.1 nfs: # Nfs parameters path: /tmp server: 172.17.0.2
-
Google Cloud
apiVersion: v1 kind: PersistentVolume metadata: name: test-volumes labels: failure-domain.beta.kubernetes.io/zone: us-centrall-a__us-centrall-b spec: capacity: storage: 400Gi # How much accessModes: - ReadWriteOnce gcePersistentDisk: # Google Cloud Parametes pdName: my-data-disk fsType: ext4
-
Local Storage
apiVersion: v1 kind: PersistentVolume metadata: name: test-pv spec: capacity: storage: 100Gi volumeMode: Filesystem accessModes: - ReadWriteOnce persistentVolumeReclaimPolicy: Delete storageClassName: local-storage local: path: /mnt/disks/ssd1 nodeAffinity: required: nodeSelectorTerms: - matchExpressions: - key: kubernetes.io/hostname operator: In
-
Depending on storage type, spec attributes are different.
- PV outside of namespaces
- Accessible to the whole cluster
- Each Volume type has its use cases!
- Local volume types violate 2. and 3. requirement for data persistence:
- Being tied to 1 specific node
- Surviving cluster crashes
- For DB persistence, use remote storage!
- Who creates the Persistent Volumes, and when?
- K8s Administrator and K8s User
- PVs are resources that need to exist BEFORE the Pod that depends on them is created...
- Admin provisions storage resources.
- Creates the PV components from these storage backends.
- The application has to claim the Persistent Volume
- We use that PVC in the Pod configuration
- PVC

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-name
spec:
  storageClassName: manual
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
- Pod

apiVersion: v1
kind: Pod
metadata:
  name: mypod
spec:
  containers:
    - name: myapp
      image: nginx
      volumeMounts:
        - mountPath: "/var/www/html"
          name: mypd
  volumes:
    - name: mypd
      persistentVolumeClaim:
        claimName: pvc-name # The name of the PVC
- Pod requests the volume through the PV claim
- Claim tries to find a volume in the cluster
- Volume has the actual storage backend
- NOTE: Claims must be in the same namespace!
Then:
- Volume is mounted into the Pod
- Volume is mounted into the Container
- Admins provisions storage resource
- User creates claim to PV
- Local volumes
- Not created via PV and PVC
- Managed by Kubernetes
- Configuration file for your pod
- Certificate file for your pod
- Create ConfigMap and/or Secret component
- Mount that into your pod/container
apiVersion: v1
kind: Pod
metadata:
  name: mypod
spec:
  containers:
    - name: myapp
      image: busybox
      volumeMounts:
        - name: config-dir
          mountPath: /etc/config
  volumes:
    - name: config-dir
      configMap:
        name: bb-configmap
- Admins configure storage
- Create PV
- K8s user claim PV using PVC
- The Storage Class is a 3rd K8s component, which makes the process more efficient
- The SC provisions PVs dynamically, when a PVC claims storage...
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: storage-class-name
# The storage backend is defined in the SC component via the "provisioner" attribute
# Each storage backend has its own provisioner
#   `internal` provisioners are prefixed with 'kubernetes.io'
#   `external` provisioners
provisioner: kubernetes.io/aws-ebs
parameters: # Configure 'parameters' for the storage we want to request for the PV
  type: io1
  iopsPerGB: "10"
  fsType: ext4
- Requested by a PersistentVolumeClaim

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mypvc
spec:
  storageClassName: storage-class-name # The SC name
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi
Storage Class usage
- Pod claims storage via PVC
- PVC requests storage from SC
- SC creates PV that meets the needs of the Claim
Where is the data persisted?
- Remote Storage Volumes
- Cloud-storage
- Remote-storage
- local Volumes
- Data is persisted on the K8s Node
- You should use remote storage, especially in production!
- Remote storage systems persist data independent of the K8s Node, where the data originated.
hostPath Volume
- Simple configuration
- But use it only for single-node testing. For multi-node clusters, use the `local` volume type instead.
- Also, `hostPath` volumes carry many security risks, and it's best practice to avoid using `hostPath` when possible
- NOTE: This is one of the volume types tested in the CKA Exam!
Lecture Overview
- Create PV
- Data is stored on the local machine
- Create PVC
- Create a Deployment to use Volume
apiVersion: v1
kind: PersistentVolume
metadata:
name: pv-data
spec:
hostPath: # Type of PV
path: "/mnt/data" # Depending on the type, the attributes differ
capacity: # Capacity
storage: 10Gi # A PV will have a specific storage capacity
accessModes:
- ReadWriteOnce

- Volume Plugins
- ReadWriteOnce
- ReadWriteMany
- ReadOnlyMany
- ReadWriteOncePod
- Docs
- NOTE: Different resource providers support specific modes
Apply this file and check the PV:
kubectl get pv

- It is now in Available status.
- A volume will be in one of the following phases:
  - Available: a free resource that is not yet bound to a claim
  - Bound: the volume is bound to a claim
  - Released: the claim has been deleted, but the cluster has not yet reclaimed the resource
  - Failed: the volume has failed its automatic reclamation
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: mysql-data-pvc
spec:
resources:
requests:
storage: 5Gi # the specification doesn't need to match 100%, but it must be most fitting option
accessModes:
- ReadWriteOnce

- In the PVC, we define how much storage, access mode, etc. we need
- The K8s Control Plane looks for a PV that satisfies the claim's requirements
- When a suitable PV is found, it binds the claim to the volume. Otherwise, it will stay unbound.
- NOTE: Different volumes may be available in the cluster
apiVersion: apps/v1
kind: Deployment
metadata:
name: my-db
labels:
app: my-db
spec:
replicas: 1
selector:
matchLabels:
app: my-db
template:
metadata:
labels:
app: my-db
spec:
containers:
- name: mysql
image: mysql:8.0
env:
- name: MYSQL_ROOT_PASSWORD
value: mypwd
volumeMounts:
- name: db-data
mountPath: "/var/lib/mysql"
volumes:
- name: db-data
persistentVolumeClaim:
claimName: mysql-data-pvc

- Check inside the container, the Node path, etc.
- NOTE: When Pod gets scheduled to another worker Node, the data won't be available!
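A couple of hedged verification commands (the pod name is a placeholder; the paths come from the manifests above):

```bash
# inside the MySQL container, the data lands in the mounted path
kubectl exec -it <my-db-pod-name> -- ls /var/lib/mysql
# on the worker node where the Pod runs, the same files appear under the hostPath
ls /mnt/data
```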
- Logging Use Case
- Can't access each other's file system
- Doesn't need to be persisted
- Caching Use Case
- Storage can be shared
- No persistence necessary
emptyDir Volume
- Suitable for multi-container Pods
- All containers in the Pod can read and write the same files in the emptyDir Volume
- EmptyDir volume is initially empty
- Is first created when a Pod is assigned to a Node and exists as long as that Pod is running on the Node
- When Pod is removed from a Node, the Data is deleted permanently
- Data inside the emptyDir is mounted into the containers file system
apiVersion: apps/v1
kind: Deployment
metadata:
name: nginx-deployment-vol
labels:
app: nginx
spec:
replicas: 1
selector:
matchLabels:
app: nginx
template:
metadata:
labels:
app: nginx
spec:
containers:
- name: nginx
image: nginx
ports:
- containerPort: 80
command: [ 'sh', '-c' ]
args:
- while true; do
echo "$(date) INFO some app date" >> /var/log/myapp.log;
sleep 5;
done;
volumeMounts:
- name: log-data
mountPath: /var/log
- name: log-sider
image: busybox
command: [ 'sh', '-c' ]
args:
- tail -f /var/sidecar/myapp.log;
volumeMounts:
- name: log-data
mountPath: /var/sidecar
volumes:
- name: log-data
emptyDir: { }

kubectl logs pod/nginx-deployment-vol-7b7bf97fdf-gl96t -c log-sider

What is the best way to create and pass external configuration in K8s?
- ConfigMap
- Secret
Both allow you to decouple environment-specific configuration from your container images
- ConfigMap: regular non-confidential data, like a database URL
- Secret: used to store sensitive data (the built-in security mechanism is not enabled by default!)
Pods can consume ConfigMaps or Secrets in 2 different ways:
- As individual values --> Using Environment Variables
- As configuration files --> Using Volumes
Steps:
- Create ConfigMap component
- Create Secret component
- Pass Data to the Pod using Environment Variables
apiVersion: v1
kind: ConfigMap
metadata:
name: myapp-config
data:
db_host: mysql_service

apiVersion: v1
kind: Secret
metadata:
name: myapp-secret
type: Opaque # There are many types
data: # Values must be base64-encoded strings
username: bXl1c2Vy # See the command below
password: bXlwd2Q=
- To encode values in base64 format:
echo -n "myuser" | base64
- K8s Secrets are not encrypted by default, only base64-encoded.
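A hedged imperative alternative that produces an equivalent Secret (the values mirror the manifest above; kubectl does the base64 encoding for you):

```bash
kubectl create secret generic myapp-secret \
  --from-literal=username=myuser \
  --from-literal=password=mypwd \
  --dry-run=client -o yaml
```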
apiVersion: apps/v1
kind: Deployment
metadata:
name: my-app
labels:
app: my-app
spec:
selector:
matchLabels:
app: my-app
replicas: 1
template:
metadata:
labels:
app: my-app
spec:
containers:
- name: my-app
image: busybox
command: [ 'sh', '-c', "printenv MYSQL_USER MYSQL_PWD MYSQL_SERVER; sleep 10" ]
env:
- name: MYSQL_USER
valueFrom:
secretKeyRef:
name: myapp-secret
key: username
- name: MYSQL_PWD
valueFrom:
secretKeyRef:
name: myapp-secret
key: password
- name: MYSQL_SERVER
valueFrom:
configMapKeyRef:
name: myapp-config
key: db_host

- Secret and ConfigMap resources must already exist when creating the Deployment resource.
Often applications use a whole configuration file rather than just individual values
- Create the configuration file via ConfigMap & Secret
- Pass it to the Pod configuration via a Volume
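The declarative manifests below embed the file content directly; an imperative alternative that generates an equivalent ConfigMap from an existing local file (a sketch; mysql.conf is assumed to exist in the current directory):

```bash
kubectl create configmap myapp-config-file --from-file=mysql.conf --dry-run=client -o yaml
```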
apiVersion: v1
kind: ConfigMap
metadata:
name: myapp-config-file
data:
mysql.conf: |
[mysqld]
port=3306
socket=/tmp/mysql.sock
key_buffer_size=16M
max_allowed_packet=128M

apiVersion: v1
kind: Secret
metadata:
name: myapp-secret-file
type: Opaque
data:
secret.file: |
"base64" ENCODED FILE VALUE VIA --> base64 FILE | tr -d "\n"- configMap & Secret are K8s Volume Types
- Review
- Define Volumes on Pod level
- Mount Volume into container
apiVersion: apps/v1
kind: Deployment
metadata:
name: my-db
labels:
app: my-db
spec:
replicas: 1
selector:
matchLabels:
app: my-db
template:
metadata:
labels:
app: my-db
spec:
containers:
- name: mysql
image: busybox:1.28
command: [ 'sh', '-c', "cat /mysql/db-config/mysql.conf; /mysql/db-secret/secret.file; sleep 20" ]
volumeMounts:
- name: db-config
mountPath: /mysql/db-config
- name: db-secret
mountPath: /mysql/db-secret
readOnly: true
volumes:
- name: db-config
configMap:
name: myapp-config-file
- name: db-secret
secret:
secretName: myapp-secret-file

Apply the file and check its logs!
- When updating configMap or secret, Pods don't get the up-to-date data automatically.
- For that, you have to restart the Pods:
kubectl rollout restart deployment/my-db
- In this section:
- Understand what Resource Requests & Limits are
- Understand what Resource
-
2 Types of resources:
- CPU
- Memory
-
Configure resource requests for each container.
- The request is what the container is guaranteed to get
- K8s Scheduler uses this to figure out where to run the Pods
Why are resource limits so important?
- A container may consume more than its requested resources.
- If not limited, a container could consume all of the Node's resources.
Configure resource limits for each container.
- Make sure container never goes above a certain value.
- Container is only allowed to go up to the limit.
apiVersion: apps/v1
kind: Deployment
metadata:
name: my-app-resource
labels:
app: nginx
spec:
replicas: 1
selector:
matchLabels:
app: nginx
template:
metadata:
labels:
app: nginx
spec:
containers:
- name: my-app
image: nginx
resources:
requests:
memory: "64Mi" # Memory resources are defined in "bytes"
cpu: "250m" # CPU resources are defined in "millicores"
limits:
memory: "128Mi"
cpu: "500m"
- name: log-sider
image: busybox
command: [ 'sh', '-c', "while true; do echo sync app logs; sleep 20; done" ]
resources:
requests:
memory: "128Mi"
cpu: "100m"
limits:
memory: "256Mi"
cpu: "200m"- Best practice: Keep CPU and request at
1or below - If you put values larger than your biggest Node, your Pod will never be scheduled!
Which Pods have Resource Requests * Limits set?
kubectl get pod -o jsonpath="{range .items[*]}{.metadata.name}{.spec.containers[*].resources}{'\n'}{end}"

- Pods that have no requests set are evicted first
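To see how much of a Node's capacity is already requested (useful when sizing requests and limits), a hedged one-liner:

```bash
kubectl describe nodes | grep -A 8 "Allocated resources"
```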
- Pods are automatically scheduled on one of the worker Nodes
- Scheduler decides intelligently, where to place the Pod
- In some cases, we want may want to decide ourselves
- It's the simplest form of selecting a Node
apiVersion: v1
kind: Pod
metadata:
name: nginx
spec:
containers:
- image: nginx
name: nginx
nodeName: worker1

But what if we have:
-
Dynamic Node Names?
- So you don't know the names beforehand
- Common in cloud environments
-
Not enough resources?
-
Node Selector
- Attach label to the Node
- Add nodeSelector field to Pod configuration
-
Get all the Node labels
kubectl get node --show-labels
-
Add label (type=cpu) to one our nodes
kubectl label node worker2 type=cpu
-
Double-check the node labels again
apiVersion: v1
kind: Pod
metadata:
name: nginx
spec:
containers:
- image: nginx
name: nginx
nodeSelector:
type: cpu

- You can delete the label like so:
kubectl label node worker2 type-
- nodeSelector
- More flexibility over nodename
- If not enough resources, Pods won't be scheduled
- More flexible expressions
nodeAffinity:
- Similar to
nodeSelector, but affinity language is more expressive. - Match labels more flexible with logical operators
InNot InExistsDoesNotExistsGtLt
- You can define multiple rules
- Currently, 2 types of node affinity:
- "hard" =
required- Specified rules MUST be met
- Similar to
nodeSelector
- "soft" =
preferred- Specified rules are preferences
- The scheduler will try to enforce the rules; if it fails, the Pod is scheduled somewhere else
- "hard" =
- Refer to documentation for the syntax
- But as always it's important to understand the concept or what you are defining
- Node must have:
- Label key: kubernetes.io/e2e-az-name
- Label value: e2e-az1 or e2e-az2
spec: affinity: nodeAffinity: requiredDuringSchedulingIgnoredDuringExecution: nodeSelectorTerms: - matchExpressions: - key: kubernetes.io/e2e-az-name operator: In values: - e2e-az1 - e2e-az2
- Node should have:
- Label key: another-node-label-key
- Label value: another-node-label-value
spec: affinity: nodeAffinity: preferredDuringSchedulingIgnoredDuringExecution: nodeSelectorTerms: - matchExpressions: - key: another-node-label-key operator: In values: - another-node-label-value
Operators:
Exists:- Match Nodes that have a specific label defined
- Value doesn't matter
DoesNotExist
In:- Match Nodes that have a specific label (
key=value) defined - Value does matter
- Match Nodes that have a specific label (
Not In- Do NOT have specific label
key=value
- Do NOT have specific label
Gt- Greater than
Lt- Less than
Our Rule should be:
- MUST: Linux Nodes
- PREFERRED: type=cpu
apiVersion: v1
kind: Pod
metadata:
name: with-node-affinity
spec:
containers:
- image: nginx
name: myapp
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: kubernetes.io/os
operator: In
values:
- linux
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 1
preference:
matchExpressions:
- key: type
operator: In
values:
- cpu

- Properties of Pods
- nodeName, nodeSelector, nodeAffinity
- Configure Nodes... Which Pods they should accept to be scheduled on?
- e.g. Worker1 & Worker2 should repel my-app Pods
- Master Node also have Taints
- See the
Taintsattributekubectl describe node master - See that the
Taintsis set only on Master Nodeskubectl describe node | grep Taints
- Who set the taints on Master Nodes?
- When bootstrapping
kubeadmsets the taint
- When bootstrapping
But we can see there are a bunch of Pods there! How did the Pods land on the Control Plane?
- Taints
- Applied to Nodes
- Toleration
- Applied to Pods
- Allow the pods to schedule onto Nodes with matching taints
- From "No Tolerations" to "Has a matching toleration"
- So, Taints and Tolerations work together
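For reference, taints themselves are added to and removed from Nodes with kubectl taint (a sketch; the node name and key/value are placeholders):

```bash
# add a taint that repels Pods without a matching toleration
kubectl taint nodes worker1 app=blue:NoSchedule
# remove it again (note the trailing dash)
kubectl taint nodes worker1 app=blue:NoSchedule-
```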
- Check the existing Pods Toleration on master nodes
kubectl describe pod -n kube-system etcd-master
- They all return
Tolerations: :NoExecute op=Exists
Let's configure a Toleration!
apiVersion: v1
kind: Pod
metadata:
name: pod-with-toleration
spec:
containers:
- image: nginx
name: nginx
tolerations:
- effect: NoExecute
operator: Exists
nodeName: master # Why do we add this? See below:

- Without nodeName: master, there is no guarantee that the Pod is scheduled on the Control Plane
- Scheduler selects between all Nodes
- We should guarantee that it's scheduled there via
Node SelectororNode NameorNode Affinity
Let's say we have a dynamic infrastructure where Nodes get added and removed based on workload, and we have a
log-collector that collects logs on 5 Master Nodes. If we only specify a nodeSelector so that it gets
scheduled on Master Nodes, all replicas might end up on that 1 Control Plane Node! But we wanted
1 replica per Node...
- Allows you to constrain, which Nodes your Pod is eligible to be scheduled, based on labels on Pods that are already running on the Node
- In other words: This Pod should not run on worker1 if worker1 is already running one or more Pods with a specific label.
- In our example: Don't run Pod on a Node that already runs this Pod's replica
- Imagine: Our app works with etcd only
- We only want to run the app on those Nodes that run etcd
- Inter-Pod Affinity
- Schedule our Pods only on Nodes that have etcd running on them
- Inter-Pod Anti-Affinity
- And which do not have another replica of my-app already running
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: myapp
name: myapp-deployment
namespace: kube-system
spec:
replicas: 1
selector:
matchLabels:
app: myapp
template:
metadata:
labels:
app: myapp
spec:
containers:
- image: nginx
name: myapp-container
tolerations:
- effect: "NoSchedule" # k describe node master | grep Taint
operator: "Exists"
nodeSelector:
type: master
affinity:
podAntiAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchExpressions:
- key: app
operator: In
values:
- "myapp"
topologyKey: "kubernetes.io/hostname"
podAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchExpressions:
- key: tier
operator: In
values:
- "control-plane"
topologyKey: "kubernetes.io/hostname"- Node Affinity
- Node Labels
- Inter-Pod Affinity
- Pod Labels
- Similar to
DaemonSet- DaemonSet schedules 1 replica on each Worker Node
We covered a lot of things:
-
Inter-Pod (Anti)-Affinity
-
Node (Anti)-Affinity
-
Node Selector
-
Node Name
-
Taints & Tolerations
-
NOTE: Do NOT overuse these scheduling constraints!
- K8s manages its resources intelligently
- Restart Pods when it crashes
- OK, Pod runs, but how about the application inside the Pod?
- We need to let Kubernetes know which state our application is in, so K8s can automatically restart it
3 types of checking the health status:
- Exec probes: the kubelet executes the specified command to check health.
- TCP probes: the kubelet makes a probe connection from the node, not from inside the pod.
- HTTP probes: the kubelet sends an HTTP request to the specified path and port.

Liveness Probe
- K8s knows the Pod state (
Liveness Probe) - Liveness Probe for Application state
- Health checks after container started
- But what about the starting up process?
- Let's K8s know if application is ready to receive traffic
- Without readiness probe, K8s assumes the app is ready to receive traffic as soon as the container starts
- Configuration is very similar to liveness probe.
- Both check applications availability
- Readiness Probe
- During application startup
- Liveness Probe
- While application is running
apiVersion: v1
kind: Pod
metadata:
name: myapp-health-probes
spec:
containers:
- image: nginx
name: myapp-container
ports:
- containerPort: 80
readinessProbe:
tcpSocket:
port: 80
initialDelaySeconds: 10 # Tell kubelet that it should wait 10 seconds before performing the first probe
periodSeconds: 5 # Do it every X seconds
livenessProbe:
tcpSocket:
port: 80
initialDelaySeconds: 5
periodSeconds: 15

- When updating the Deployment, what happens to its Pods?
- Are the old Pods removed first and then new ones started?
- Application Downtime?
- Deployment
- Creates an application rollout
- ReplicaSet
- Created automatically in the background
- Ensures that a specified number of Pod replicas are running at any given time
________________ ________________ __________
/ Deployment \ ------> / ReplicaSet \ ------> / Pods \
----------------- ------------------ ------------
- We work with Deployment
- K8s creates ReplicaSet in background
- ReplicaSet creates Pods
- In which order do Pods get removed and new ones created?
- Recreate Strategy: All existing Pods are killed before new ones are created (But it cause the Application Downtime ❌)
- Rolling Update Strategy: The Deployment updates Pods in a rolling update fashion (No Application Downtime ✅)
- ReplicaSet and its Pods are linked.
- Name of ReplicaSet:
[Deployment-Name]-[Random-string]
If you describe one of your existing Deployments. you should see:
$ kubectl describe deployment nginx-deployment
[...]
StrategyType: RollingUpdate
MinReadySeconds: 0
RollingUpdateStrategy: 25% max unavailable, 25% max surge
[...]

- You can specify how many Pods to update at once
- Instead of deleting/creating Pods one by one
- You may want to update 5 at once
- Max Unavailable: specifies the max number of Pods that can be unavailable during the update process
- Max Surge: specifies the max number of Pods that can be created over the desired number of Pods
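A hedged example of tuning these two values on an existing Deployment (the name is reused from the earlier examples):

```bash
kubectl patch deployment nginx-deployment -p \
  '{"spec":{"strategy":{"rollingUpdate":{"maxSurge":1,"maxUnavailable":0}}}}'
```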
- When a Deployment Rollout is triggered, a new Deployment revision is created
- NOTE: A new revision is only created when the Deployment's Pod template is changed
kubectl rollout history deployment <deployment-name>
$ kubectl rollout history deployment nginx-deployment
deployment.apps/nginx-deployment
REVISION CHANGE-CAUSE # Here we have 5 revision
1 <none>
2 <none>
4 <none>
5         <none>

- Roll back to a previous revision
kubectl rollout undo deployment/nginx-deployment --to-revision=0 # The revision to rollback to. Default to 0 (last revision) - Check status
kubectl rollout status deployment/nginx-deployment
- Owners and Dependents
- Some objects are created by other objects (example:
Podscreated byreplicaSets, themselves created byDeployments) - When an owner object is deleted, its dependents are deleted (this is the default behavior; it can be changed)
- We can delete a dependent directly if we want (but generally, the owner will recreate another right away)
- An object can have multiple owners
- The owners are recorded in the field
ownerReferencesin themetadatablock - Let's create a deployment running nginx
kubectl create deployment owner-example --image=nginx --replicas 3
- Check its Pods
kubectl get pods -l app=owner-example -o yaml
[...] ownerReferences: - apiVersion: apps/v1 blockOwnerDeletion: true controller: true # These pods are owned by a ReplicaSet named owner-example-xxxxxxxxxx. kind: ReplicaSet name: owner-example-c466f59dd uid: 96a164d6-1a9b-4ec8-8d07-ac9b3bb3278e resourceVersion: "296451" uid: a9d312f4-23d6-4efa-b75c-73178b3e6486 [...]
- This is a good opportunity to review the custom-columns output!
- Let's show all Pods with their owners:
kubectl get pod -o custom-columns=\ NAME:.metadata.name,\ OWNER-KIND:".metadata.ownerReferences[0].kind",\ OWNER-NAME:".metadata.ownerReferences[0].name"
- When deleting an object through the API, three policies are
available (Docs):
foreground(API call returns after all dependents are deleted)background(API call returns immediately; dependents are scheduled for deletion)orphan(the dependents are not deleted)
- When deleting an object with
kubectl, this is selected with--cascade:--cascade=backgrounddeletes all dependent objects (default)--cascade=orphanorphans dependent objects
- It is removed from the list of owners of its dependents
- If, for one of these dependents, the list of owners becomes empty ...
- If the policy is
orphan, the object stays - Otherwise, the object is deleted
- If the policy is
We are going to delete the Deployment and replicaSet that we created without deleting the corresponding Pods!
- Delete the Deployment
kubectl delete deployment owner-example --cascade=orphan
- Delete the replicaSet
kubectl delete rs -l app=owner-example --cascade=orphan
- Check that the pods are still here
kubectl get pods
- If we remove an owner and explicitly instruct the API to orphan dependents
- If we change the labels on a dependent, so that it's not selected anymore (e.g. change the
app: owner-examplein the pods) - If a deployment tool that we're using does these things for us
- If there is a serious problem within API machinery or other components (i.e. "this should not happen")
- We're going to output all pods in
JSONformat - Then we will use
jqto keep only the ones without an owner - And we will display their name
- List all pods that do not have an owner
kubectl get pod -o json | jq -r " .items[] | select(.metadata.ownerReferences|not) | .metadata.name"
Now that we can list orphan pods, deleting them is easy
- Add
| xargs kubectl delete podto the previous command:kubectl get pod -o json | jq -r " .items[] | select(.metadata.ownerReferences|not) | .metadata.name" | xargs kubectl delete pod
- etcd: A distributed, reliable key-value store
- Periodically backing up the etcd cluster data is important
- What is stored inside etcd?
- Kubernetes cluster configuration and state data such as the number of pods, their state, namespace, etc. It also stores Kubernetes API objects and service discovery details.
- What is not stored inside etcd?
- Application Data is not stored in it
- Reminder: Storage configured with Persistent Volume for application data
- How to create an etcd Backup?
- Etcdctl is a command line tool for interacting with etcd server
- Install etcdctl
- Create backup with etcdctl
- Install etcd-client client
sudo apt install etcd-client
# ETCDCTL_API=3 is the API version etcdctl uses to talk to etcd; check it with "etcdctl version"
ETCDCTL_API=3 etcdctl snapshot save /tmp/etcd-backup-new.db

However, it gives us an error (Error: context deadline exceeded), because we need to authenticate to the etcd
server!
- API server also connects to etcd server
$ sudo cat /etc/kubernetes/manifests/kube-apiserver.yaml [...] - --etcd-cafile=/etc/kubernetes/pki/etcd/ca.crt - --etcd-certfile=/etc/kubernetes/pki/apiserver-etcd-client.crt - --etcd-keyfile=/etc/kubernetes/pki/apiserver-etcd-client.key [...]
- Check the
etcd.yamlfile too$ sudo cat /etc/kubernetes/manifests/etcd.yaml | grep "/etc/kubernetes/pki" - --cert-file=/etc/kubernetes/pki/etcd/server.crt - --key-file=/etc/kubernetes/pki/etcd/server.key - --peer-cert-file=/etc/kubernetes/pki/etcd/peer.crt - --peer-key-file=/etc/kubernetes/pki/etcd/peer.key - --peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt - --trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt - mountPath: /etc/kubernetes/pki/etcd path: /etc/kubernetes/pki/etcd
-
Save the snapshot
sudo ETCDCTL_API=3 etcdctl snapshot save /tmp/etcd-backup-new.db \ --cacert /etc/kubernetes/pki/etcd/ca.crt \ --cert /etc/kubernetes/pki/etcd/server.crt \ --key /etc/kubernetes/pki/etcd/server.key
-
Check the status
$ sudo ETCDCTL_API=3 etcdctl snapshot status /tmp/etcd-backup-new.db --write-out=table +----------+----------+------------+------------+ | HASH | REVISION | TOTAL KEYS | TOTAL SIZE | +----------+----------+------------+------------+ | d8d0da24 | 7220348 | 874 | 1.9 MB | +----------+----------+------------+------------+
Now that you have backed up etcd, here are a few extra steps:
- Save snapshot file safely
- Encrypt the snapshot files
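For example, one hedged option using gpg (assuming gpg is installed on the node):

```bash
gpg --symmetric /tmp/etcd-backup-new.db   # produces etcd-backup-new.db.gpg and prompts for a passphrase
```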
-
How about:
- Multiple replicas?
- Manage etcd independent of K8s cluster?
-
See the location of etcd
$ sudo cat /etc/kubernetes/manifests/etcd.yaml [ ... ] - hostPath: path: /var/lib/etcd type: DirectoryOrCreate name: etcd-data
- Use Remote Storage outside K8s cluster
- AWS, Google Cloud, etc.
- On-Premises Storage
- Run etcd outside K8s cluster
- Instead of running etcd on Master Nodes, run them outside cluster
- More complex, but options you should consider
We lose all cluster configurations!
How can we restore the data?
-
Create Restore Point from backup
sudo ETCDCTL_API=3 etcdctl snapshot restore /tmp/etcd-backup-new.db --data-dir /var/lib/etcd-backup
-
We have to tell etcd to use new location
sudo vim /etc/kubernetes/manifests/etcd.yaml
- hostPath: path: /var/lib/etcd-backup # Changed this ONLY! type: DirectoryOrCreate name: etcd-data
- Kubelet restarts static Pods automatically
- This may take a while. Restart any components (e.g. kube-scheduler, kube-controller-manager, kubelet) to ensure they don't rely on stale data!
We need two things to connect to API:
- location of cluster (= Endpoint of API Server)
- Credentials to authenticate
- First way: access using kubectl
- with
kubeconfig--> Till now we were used this method (Default method)
- with
- Second way: Directly accessing the REST API
- with
curl,wget,browser
- with
We will discuss the second way in this chapter
-
Kube proxy:
- Kubectl will run in proxy mode
- It uses the stored apiserver location and verifies the identity of the API server using a self-signed cert
-
Start the proxy
kubectl proxy --port 8080 & -
check the API Server
curl http://localhost:8080/api/v1
- Another way is without kubectl proxy
- We need to pass authentication ourselves
- In this lecture, we execute the script as a user with limited permissions
- Create a ServiceAccount
kubectl create sa myscript
- Add secret to sa
apiVersion: v1
kind: Secret
type: kubernetes.io/service-account-token
metadata:
  name: myscript
  annotations:
    kubernetes.io/service-account.name: "myscript"
- Create Role
apiVersion: rbac.authorization.k8s.io/v1 kind: Role metadata: name: script-role rules: - apiGroups: - apps resources: - deployments verbs: - get - list - delete - apiGroups: - "" resources: - pods - services verbs: - get - list - delete
- Create RoleBinding
kubectl create rolebinding script-role-binding --role script-role --serviceaccount default:myscript
- Get ServiceAccount Token
kubectl get secrets myscript -o yaml | grep token: - Decode it
echo <TOKEN> | base64 --decode | tr -d "\n"
- Save it to
TOKENvariableTOKEN=<VALUE>
- Get and save the server endpoint
kubectl config view | grep server: SERVER=<VALUE FROM ABOVE COMMAND>
-
Curl!
curl -X GET $SERVER/api --header "Authorization: Bearer $TOKEN" --cacert /etc/kubernetes/pki/ca.crt
-
Only if you are on one of the nodes of the cluster
curl -X GET $SERVER/api --header "Authorization: Bearer $TOKEN" --insecure
-
Based on the action, you need a different HTTP method:
- GET: Querying data
- POST: Creating resources
- PATCH: Partially updating resources
- PUT: Replacing resources
- DELETE: Deleting resources
https://kube-api-server:8080/api/???

- What is the endpoint for:
- Listing Pods, Services, Namespaces etc.
- Reading a specific Deployment, Volume etc.
- Creating Deployment, Service, Role etc.
- Updating a Volume, Service, Service Account etc.
- Reading logs of specific Pod
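A couple of hedged examples against the core API group, reusing the $SERVER and $TOKEN variables from above (the pod name is a placeholder):

```bash
# list Pods in the default namespace
curl -X GET $SERVER/api/v1/namespaces/default/pods \
  --header "Authorization: Bearer $TOKEN" --cacert /etc/kubernetes/pki/ca.crt
# read the logs of a specific Pod
curl -X GET $SERVER/api/v1/namespaces/default/pods/<pod-name>/log \
  --header "Authorization: Bearer $TOKEN" --cacert /etc/kubernetes/pki/ca.crt
```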
- API Groups
- Part of the Kubernetes API
- Different K8s resources belong to different API groups
- e.g.
core --> REST path: /api/v1, apps --> REST path: /apis/apps/v1, etc.
- List all deployments in
defaultNamespacecurl -X GET $SERVER/apis/apps/v1/namespaces/default/deployments --header "Authorization: Bearer $TOKEN" --cacert /etc/kubernetes/pki/ca.crt
- Get specific Deployment
curl -X GET $SERVER/apis/apps/v1/namespaces/default/deployments/nginx-deployment --header "Authorization: Bearer $TOKEN" --cacert /etc/kubernetes/pki/ca.crt
How about writing an application that interacts with the K8s API?
- Instead of simple shell scripts, use a programmatic language
- K8s officially supports client libraries for Go, Python, JavaScript, etc.
- With client libraries, you do not need to implement the API calls and request/response types yourself
- Unofficial client libraries for other programming language
- Docs
In this chapter, we learn how to upgrade the K8s cluster with minimum app downtime...
How Cluster Upgrade works?
- Upgrade Control Plane
- Cons
- Control Plane processes not available
- No application downtime
- Management functionalities not available
- Crashed Pods won't be rescheduled
- Solution
- Have more than 1 Control Plane Node
- Upgrade each Node one by one
- Cons
- Upgrade Worker Nodes
- There are different components on Control Plane and Worker Nodes
- kube-apiserver
- kube-scheduler
- controller-manager
- ...
- What components are we upgrading?
How are they versioned?
- kube-apiserver must have the newest version
- kubectl: 1 minor version later/earlier
- Recommended: upgrade everything to the same version
The way you upgrade the cluster depends on how you initially deployed it...
- Same versioning as K8s components
- Upgrade
kubeadmtool - With
kubeadm, upgrade all Controll Plane components and renew cluster certificateskubeadm upgrade apply 1.24.0
- Drain Nodes to remove all Pods
kubeadm drain master- Will evict all Pods safely
- Will be marked as
unschedulable
- Upghrade
kubeletandkubectl - Change Master back to
schedulablekubectl uncordon master
- Upgrade
kubeadmkubeadm upgrade apply 1.24.0
- Execute
kubeadmto upgrade allkubeletconfiguration - Drain the Nodes
- Pods will be rescheduled on other Nodes
- Upgrade
kubelet - Change Worker Node back to scheduled
- Best practices
- Upgrade Worker Nodes one by one!
- At least 2 Worker Nodes
- At least 2 Pod Replicas
- No application downtime
kubectl drain worker1- Marked As unscheduled
- Evict all Pods
kubectl cordon worker1CORDON: Only mark unschedulable- Existing Pods will stay on Node
kubectl uncordon worker1UNCORDON: Marks Node schedulable- Existing Pods will stay on Node
When and how often should you upgrade?
- Use Case: Fix in later K8s version, which affects your cluster
- Best Practice: You should always keep your cluster up-to-date
- Kubernetes supports only up to recent 3 versions
- Recommended: Upgrade 1 version at a time
In this chapter, we will go through updating the K8s cluster with all the best practices...
Be sure to check the Upgrading kubeadm clusters docs
- Switch to sudo mode
sudo -i
- See all the available version:
apt-cache madison kubeadm
- Upgrade kubeadm:
# replace x in 1.24.x-00 with the latest patch version apt-get update && \ apt-get install -y --allow-change-held-packages kubeadm=1.24.x-00
- Verify that the download works and has the expected version:
kubeadm version
-
Verify the upgrade plan:
kubeadm upgrade plan
-
Choose a version to upgrade to, and run the appropriate command.
# replace x with the patch version you picked for this upgrade sudo kubeadm upgrade apply v1.24.x -
Important Note: The above command (
kubeadm upgrade apply) is only run on the FIRST master!For the other control plane nodes:
Same as the first control plane node but use:
sudo kubeadm upgrade nodeinstead of:
sudo kubeadm upgrade apply
Now if you get all pods in kube-system namespace, you'll see all the static Pods has restarted...
But if you get your node (kubectl get node) you'll see that your node version is still the old version...
- Prepare the node for maintenance by marking it unschedulable and evicting the workloads:
# replace <node-to-drain> with the name of your node you are draining kubectl drain <node-to-drain> --ignore-daemonsets
Now if you get the nodes (kubectl get node), you'll see Ready/SchedulingDisable in status section.
-
Upgrade the kubelet and kubectl:
sudo -i
-
Switch to sudo mode
# replace x in 1.24.x-00 with the latest patch version apt-get update && \ apt-get install -y --allow-change-held-packages kubelet=1.24.x-00 kubectl=1.24.x-00
-
Restart the kubelet:
systemctl daemon-reload systemctl restart kubelet
- Bring the node back online by marking it schedulable:
# replace <node-to-drain> with the name of your node kubectl uncordon <node-to-drain>
Basically go through all over this process again and again!
In this chapter we will cover: How to use kubectl to switch between multiple clusters?
Let's say we have multiple clusters to administer, so we have multiple kubeconfig files.
But it's not efficient to pass the --kubeconfig option every time with our kubectl commands!
Access multiple clusters using Contexts
- Define all the clusters and users in the 1
kubeconfigfile - Define a context for each cluster
- We can switch between clusters using these contexts
- No need to specify the kubeconfig file each time
-
In a
kubeconfigfile, we have:- List of K8s clusters
- List of K8s users
- Names to reference them inside the kubeconfig file
- And also we have
Context
-
Context
- Combination of which user should access which cluster
- Or "Use the credentials of the kubernetes-admin user to access the kubernetes cluster"
- We interact with it via either:
- Update kubeconfig manually
- or Use
kubectl configcommands
- Lets say we have a
~/.kube/configfile taht:- Multiple clusters
- Multiple users
- Multiple contexts
- Which context does it use?
current-context
- How to switch the context?
kubectl config use-context <CONTEXT-NAME>
- Display list of contexts
kubectl config get-contexts
- Display the
current-context: kubectl config current-context
Let's say we have 3 clusters; for two of them we have an admin, but for the third one we want to add a user
___________________________ _____________________________ ______________________________
| stage-admin@staging | | dev-admin@development | | my-script@development |
| stage-admin ---> staging | | dev-admin ---> development | | my-script ---> development |
|__________________________| |____________________________| |____________________________|
- Add a user under
usersusers: - name: my-script user: token: xxxxxxx
- Add a context under
contexts
contexts:
- context:
cluster: development
user: my-script
name: my-script@development- Each context consists actually 3 components
- cluster
- user
- namespace
- By default, the
defaultnamespace is configured - Other than
defaultnamespace, we need to define them
Let's say most of the time we work with 1 specific namespace (other than default), and it's kind of
annoying to pass --namespace to every kubectl command...
-
Switch default namespace
kubectl config set-context --current --namespace kube-system
-
Now check the
~/.kube/configfilecontexts: - context: cluster: kubernetes namespace: kube-system # Just added! user: kubernetes-admin name: kubernetes-admin@kubernetes
In this chapter, we'll cover:
- When does the cluster certificates expire?
- Renew certificates, if needed...
We already know that all certificates have been generated by kubeadm...
-
Check the certs expirations
sudo kubeadm certs check-expiration
-
Clientcertificates expire after1year -
CAcertificates expire after10year -
Check certs with detail
sudo openssl x509 -in /etc/kubernetes/pki/ca.crt -text -noout
-
Check certs with detail and grep the expiration
sudo openssl x509 -in /etc/kubernetes/pki/ca.crt -text -noout | grep Validity -A2
- Kubeadm renews certificates during the cluster upgrade
- You should actually upgrade regularly, So you don't have to manually renew it, kubeadm will take care of it
You can renew 1 certificate or all at once...
- Renew
apiserver: sudo kubeadm certs renew apiserver
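You can also renew all certificates in one go:

```bash
sudo kubeadm certs renew all
# restart the control plane static Pods afterwards so they pick up the new certificates
```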
- By default: All communication (to and from Pods) is allowed
- With Network Policies you can control traffic flow at the IP address and port level
Network Policy: Who can talk to whom
Network Policy
- The rules are defined with a NetworkPolicy resource in a K8s manifest
- It configure the
CNIapplication - The Network plugin implements the Network policies
- NOTE: Not all Network Plugins support Network Policies! e.g.
- Flannel does not support network policies
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: np-data
namespace: my-app
spec:
podSelector: # Which application?
# For which Pod replicas do this policies apply?
# If empty, then all Pods in defined namespace
matchLabels:
app: mysql
policyTypes: # Which rule type?
# Incoming rules: Ingress
# Outgoing rules: Egress
- Ingress
ingress:
# Each rule allows traffic which matches both the "from" and "ports" section
- from:
- podSelector:
matchLabels:
app: backend
ports: # Restrict port
- protocol: TCP
port: 3306
# List of Ingress rules

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: np-backend
namespace: my-app
spec:
podSelector:
matchLabels:
app: backend
policyTypes:
- Egress
# Restrict backends outgoing traffic to mysql and redis
# Ingress: "from" which apps?
# Egress: "to" which apps?
egress:
- to:
- podSelector:
matchLabels:
app: mysql
ports:
- protocol: TCP
port: 3306

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: np-backend
namespace: my-app
spec:
podSelector:
matchLabels:
app: backend
policyTypes:
- Egress
egress:
- to: # First rule
- podSelector:
matchLabels:
app: mysql
namespaceSelector:
matchLabels:
kubernetes.io/metadata.name: database # Match namespace via Labels
ports:
- protocol: TCP
port: 3306
- to: # Second rule
- podSelector:
matchLabels:
app: redis
namespaceSelector:
matchLabels:
kubernetes.io/metadata.name: database
ports:
- protocol: TCP
port: 6379

Be careful with the syntax...
- podSelector
  - Match Pods with label app: mysql AND that are in the database namespace (both selectors in one item = AND)
  - podSelector:
      matchLabels:
        app: mysql
    namespaceSelector:
      matchLabels:
        kubernetes.io/metadata.name: database
- Match Pods with label
- namespaceSelector
- Match Pods with label
app" mysqlin local namespace - Match Pods in
databasenamespaces - Its a
OR- podSelector: matchLabels: app: mysql - namespaceSelector: matchLabels: kubernetes.io/metadata.name: database
- Match Pods with label
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: np-backend
namespace: my-app
spec:
podSelector:
matchLabels:
app: backend
policyTypes:
- Ingress
- Egress
ingress:
[ ... ]
egress:
[ ... ]apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: np-backend
namespace: my-app
spec:
podSelector: { } # Match all pods
policyTypes:
- Ingress

- With this you can create a default isolation policy for a namespace
- No ingress attribute = block all incoming traffic
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: np-backend
namespace: my-app
spec:
podSelector: { }
policyTypes:
- Ingress
ingress:
- { } # Allows all traffic

- With this you can allow all traffic to all pods in a namespace
- By default: all traffic is allowed in all namespaces
In this section:
- Create 3 Deployments (
frontend,backend,DB) - Create and configure network policies
- Create a
myappnamespacekubectl create ns myapp
- Set a
myapp namespace as our default namespace: kubectl config set-context --current --namespace=myapp
-
frontend.yaml
apiVersion: apps/v1 kind: Deployment metadata: name: frontend namespace: myapp labels: app: frontend spec: selector: matchLabels: app: frontend replicas: 2 template: metadata: labels: app: frontend spec: containers: - name: node image: node:16-alpine command: - sh - -c - sleep 3000 ports: - containerPort: 3000
-
backend.yaml
apiVersion: apps/v1 kind: Deployment metadata: name: backend namespace: myapp spec: replicas: 2 selector: matchLabels: app: backend template: metadata: labels: app: backend spec: containers: - name: nginx image: nginx:1.21-alpine ports: - containerPort: 80
-
database.yaml
apiVersion: apps/v1 kind: Deployment metadata: name: database namespace: myapp labels: app: database spec: replicas: 2 selector: matchLabels: app: database template: metadata: labels: app: database spec: containers: - name: redis image: redis:6-alpine ports: - containerPort: 6379
-
Apply them one one by one
kubectl apply -f frontend.yaml kubectl apply -f backend.yaml kubectl apply -f database.yaml
-
First get all Pods ip
# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP frontend-7b9546d8f-fqwg9 1/1 Running 0 33m 10.42.0.63 frontend-7b9546d8f-9kbqp 1/1 Running 0 33m 10.42.0.62 database-5648cc7db9-5476q 1/1 Running 0 30m 10.42.0.64 database-5648cc7db9-5cvbl 1/1 Running 0 30m 10.42.0.65 backend-6645fb55c5-zq8r4 1/1 Running 0 2m20s 10.42.0.67 backend-6645fb55c5-g7bn9 1/1 Running 0 2m20s 10.42.0.66 -
✅
backendtodatabasekubectl exec backend-6645fb55c5-zq8r4 -- sh -c 'nc -v 10.42.0.64 6379' # 10.42.0.64 (10.42.0.64:6379) open
-
✅
frontendtodatabasekubectl exec frontend-7b9546d8f-fqwg9 -- sh -c 'nc -v 10.42.0.64 6379' # 10.42.0.64 (10.42.0.64:6379) open
-
✅
databasetobackendkubectl exec database-5648cc7db9-5476q -- sh -c 'nc -v 10.42.0.66 80' # 10.42.0.66 (10.42.0.66:80) open
- Frontend Policy
Egress: Allow traffic only tobackendPod
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: np-frontend
namespace: myapp
spec:
podSelector:
matchLabels:
app: frontend
policyTypes:
- Egress
egress:
- to:
- podSelector:
matchLabels:
app: backend
ports:
- protocol: TCP
port: 80- DB Policy
Ingress: Allow traffic only frombackendPodEgress: Block any outgoing traffic
Egress means only outgoing traffic that the Pod itself initiates
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: np-database
namespace: myapp
spec:
podSelector:
matchLabels:
app: database
policyTypes:
- Ingress # Limit who can talk to 'DB': 'backend'
- Egress # Limit who 'DB' can talk to: 'no one'
ingress:
- from:
- podSelector:
matchLabels:
app: backend
ports:
- protocol: TCP
port: 6379

- Apply the files
- See the
networkpolicieskubectl get networkpolicy
- ✅
backendtoDBkubectl exec backend-6645fb55c5-zq8r4 -- sh -c 'nc -v 10.42.0.65 6379' # 10.42.0.65 (10.42.0.65:6379) open
- ❌
frontendtoDBkubectl exec frontend-7b9546d8f-fqwg9 -- sh -c 'nc -v 10.42.0.65 6379' # command terminated with exit code 1
- ✅
frontendtobackendkubectl exec frontend-7b9546d8f-fqwg9 -- sh -c 'nc -v 10.42.0.67 80' # 10.42.0.67 (10.42.0.67:80) open
- ❌
DBtobackendkubectl exec -it database-5648cc7db9-5476q -- sh -c 'nc -v 10.42.0.67 80' # command terminated with exit code 1
- Quickly create resources using imperative commands
- Faster and preparing config files
- Get used to using
--helpoption- Save you a lot of time
- Quickly look up instead of using UI
- You don't have to memorize any commands
- Generate boilerplate manifests
kubectl create service clusterip myservice --tcp=80:80 --dry-run=client -o yaml > myservice.yaml
- Kubectl
- Create an alias
alias k=kubectl
- Dry Run
- Save command in variable
export do="--dry-run=client -o yaml"
k create svc clusterip myservice --tcp=80:80 $do > myservice.yaml
- You can NOT edit all specification of existing Pods
- You can't add/remove containers
- You can't add volumes
- You need to delete your Deployment and re-apply updated Deployment manifest
kubectl edit ...: When exit, you will get an error
- In the event an error occurs while updating, a temporary file will be created
kubectl replace --force -f /tmp/file.yaml   # deletes and re-creates the object from the edited temp file
- Scale Deployments up and down
kubectl scale --replicas=3 deployment/mysql
- Filter resources
- Display all Nodes, which don't have taints
NoSchedule - Display all
readyNodes - Display all Pods that have Resource Requests set
- List all Pods running on
Worker1
- Display all Nodes, which don't have taints
- Display resource usage Pods or Nodes
kubectl top <POD_NAME> --containers
- Switching default namespace
kubectl config set-context --current --namespace=some-name-space
- You are NOT working with root user. e.g.
sudo apt update - Switch to the root user, if you have more to do (
sudo -i)
- Be careful of session switches
- Pay Attention: which server and which user?
- At the beginning of each question: Command to switch to right cluster
- Pay Attention ti the environment you are in
- Kubeconfig file?
- Cluster context?
During the exam, you only have access to the Kubernetes Official Documentation
- Everything can be found here
- Learn how to work with the docs
- Use as the only source
- You can crete a simple pod and go inside and from there, curl Services via
IP:PORT&Service-name:PORT
kubectl run -it --rm test-nginx-svc --image=nginx -- bash

- IP:PORT
curl http://<SERVICE-IP>:8080
- DNS
curl http://nginx-service:8080

If you couldn't curl your service via Service-name:PORT, then you probably have a DNS issue...
Service Name Resolution Problems?
- Check CoreDNS Pods are running and accessible?
- Check CoreDNS logs
- Check this section
For your time sake:
alias k=kubectl

Imperative vs. Declarative:
- Imperative (Kubectl Command):
- Testing
- Creating K8s objects temporarily...
- Declarative (Configuration Files)
- Creating permanent components...
- History of configurations
kubectl create service clusterip test-cidr-block --tcp 80:80 --dry-run=client -o yaml > test-svc.yaml
# OR
kubectl create deployment test-nginx --image=nginx --port=80 --replicas=3 --dry-run=client -o yaml > test-nginx.yaml

But remember: you have to clean up this file a little, e.g. remove creationTimestamp or status






