Kubernetes Tutorial for Beginners

1. Introduction to Kubernetes

What is Kubernetes?

  • Definition: Kubernetes is an open-source container orchestration platform designed to automate the deployment, scaling, and management of containerized applications.
  • Origin: Developed by Google, based on their experience running production workloads at scale.
  • Current status: Maintained by the Cloud Native Computing Foundation (CNCF).

Problems Kubernetes Solves

  1. Complex Container Management: As applications grow, managing hundreds or thousands of containers becomes challenging.
  2. High Availability: Ensures applications remain accessible even if some containers or nodes fail.
  3. Scalability: Easily scale applications up or down based on demand.
  4. Disaster Recovery: Facilitates backup and restore processes for containerized applications.
  5. Efficient Resource Utilization: Optimizes the use of underlying hardware resources.

Features of Container Orchestration Tools

  1. Automated Scheduling: Places containers on nodes based on resource requirements and constraints.
  2. Self-healing: Automatically restarts failed containers or replaces and reschedules containers when nodes die.
  3. Horizontal Scaling: Can scale applications in or out manually or automatically based on CPU, memory usage, or other metrics.
  4. Load Balancing: Distributes network traffic to ensure stable and reliable application performance.
  5. Rolling Updates and Rollbacks: Allows for updates to applications without downtime and enables quick rollbacks if issues occur.
  6. Service Discovery: Automatically detects new services and integrates them into the application environment.
  7. Secret and Configuration Management: Manages sensitive information and application configurations securely.

2. Main Kubernetes Components

Node & Pod

  • Node:
  • A physical or virtual machine in the Kubernetes cluster.
  • Can be a worker node that runs application workloads or a control plane (master) node that manages the cluster.
  • Components: kubelet, container runtime, kube-proxy.
  • Pod:
  • Smallest deployable unit in Kubernetes.
  • Contains one or more containers.
  • Shared storage and network resources.
  • Ephemeral by nature (can be created, destroyed, and recreated dynamically).
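
A minimal Pod manifest as a sketch (the name and nginx image are illustrative):

apiVersion: v1
kind: Pod
metadata:
  name: nginx-pod            # illustrative name
  labels:
    app: nginx
spec:
  containers:
    - name: nginx
      image: nginx:1.25      # illustrative image tag
      ports:
        - containerPort: 80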

Service & Ingress

  • Service:
  • Provides a stable endpoint to access a group of pods.
  • Types: ClusterIP, NodePort, LoadBalancer.
  • Enables load balancing and service discovery within the cluster.
  • Ingress:
  • Manages external access to services within a cluster.
  • Provides HTTP/HTTPS routing, SSL termination, and name-based virtual hosting.
  • Requires an Ingress controller to function.

ConfigMap & Secret

  • ConfigMap:
  • Stores non-sensitive configuration data as key-value pairs.
  • Can be consumed by pods as environment variables, command-line arguments, or configuration files.
  • Secret:
  • Similar to ConfigMap but designed for sensitive data (e.g., passwords, tokens, keys).
  • Base64-encoded by default (not encrypted).
  • Can be encrypted at rest by enabling encryption for Secret data in etcd (or by using an external secrets manager).
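
A minimal sketch of both objects (names and values are illustrative; the Secret value is only base64-encoded, not encrypted):

apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config                  # illustrative name
data:
  database_host: mongodb-service
---
apiVersion: v1
kind: Secret
metadata:
  name: app-secret                  # illustrative name
type: Opaque
data:
  db-password: cGFzc3dvcmQ=         # base64 of "password" (example only)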

Volumes

  • Attaches storage to pods so containers can share and persist data.
  • Types:
  • emptyDir: Temporary storage tied to the pod lifecycle.
  • hostPath: Mounts a file or directory from the host node’s filesystem.
  • PersistentVolume (PV) and PersistentVolumeClaim (PVC): For durable storage.
  • Persistent volume types keep data across container restarts and pod rescheduling.
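
As a short sketch, an emptyDir volume mounted into a container (names are illustrative); PV and PVC are covered in section 11:

apiVersion: v1
kind: Pod
metadata:
  name: cache-pod              # illustrative name
spec:
  containers:
    - name: app
      image: nginx             # illustrative image
      volumeMounts:
        - name: cache
          mountPath: /cache
  volumes:
    - name: cache
      emptyDir: {}             # deleted when the pod is removed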

Deployment & StatefulSet

  • Deployment:
  • Manages the lifecycle of stateless applications.
  • Supports declarative updates, rolling updates, and rollbacks.
  • Ensures the desired number of pod replicas are running.
  • StatefulSet:
  • Manages stateful applications.
  • Provides guarantees about the ordering and uniqueness of pods.
  • Supports stable persistent storage and network identifiers.
  • Useful for applications that require stable hostnames or need to persist data (e.g., databases).

3. Kubernetes Architecture

Worker Nodes

  • Run the actual application workloads.
  • Components:
  1. Kubelet: Primary node agent, ensures containers are running in a pod.
  2. Container Runtime: Software responsible for running containers (e.g., containerd, CRI-O; Docker Engine via cri-dockerd).
  3. Kube-proxy: Maintains network rules and performs connection forwarding.

Master Nodes

  • Manage the overall state of the cluster.
  • Components:
  1. API Server: Front-end for the Kubernetes control plane, exposes the Kubernetes API.
  2. Scheduler: Assigns pods to nodes based on resource availability and constraints.
  3. Controller Manager: Runs controller processes (e.g., node controller, replication controller).
  4. etcd: Distributed key-value store that stores all cluster data.

API Server

  • Central management entity of the Kubernetes cluster.
  • All communication between components goes through the API server.
  • Validates and processes RESTful requests, updating the state in etcd accordingly.
  • Serves as a frontend for the cluster’s shared state.

Scheduler

  • Watches for newly created pods with no assigned node and selects a node for them to run on.
  • Considers factors such as:
  • Individual and collective resource requirements
  • Hardware/software/policy constraints
  • Affinity and anti-affinity specifications
  • Data locality

Controller Manager

  • Runs various controllers that regulate the state of the cluster.
  • Examples of controllers:
  • Node Controller: Notices and responds when nodes go down.
  • Replication Controller: Maintains the correct number of pods for each replication controller object.
  • Endpoints Controller: Populates the Endpoints object (joins Services & Pods).
  • Service Account & Token Controllers: Create default accounts and API access tokens for new namespaces.

etcd – The Cluster Brain

  • Consistent and highly-available key-value store.
  • Stores all cluster data, including:
  • Job scheduling info
  • Pod details
  • State information
  • API objects
  • Supports watch operations, allowing components to be notified of changes.
  • Critical for maintaining cluster state and facilitating leader election in HA setups.

4. Minikube and kubectl – Local Setup

What is Minikube?

  • A tool that allows you to run Kubernetes locally.
  • Sets up a single-node Kubernetes cluster on your machine.
  • Ideal for learning, development, and testing purposes.
  • Supports most Kubernetes features.

What is kubectl?

  • Command-line tool for interacting with Kubernetes clusters.
  • Allows you to deploy applications, inspect and manage cluster resources.
  • Works with both local (e.g., Minikube) and remote Kubernetes clusters.

Install Minikube and kubectl

  1. Install Minikube:
  • Download and install based on your operating system (Windows, macOS, Linux).
  • Requires a hypervisor (e.g., VirtualBox, Hyper-V) or container runtime (e.g., Docker).
  • Example installation commands are sketched after this list.
  2. Install kubectl:
  • Can be installed via package managers or direct download.
  • Example installation commands are sketched after this list.
  • Verify installation with kubectl version --client.
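
Typical installation commands for Linux (amd64) and macOS with Homebrew; consult the official documentation for other platforms and the latest instructions:

# Minikube (Linux amd64)
curl -LO https://storage.googleapis.com/minikube/releases/latest/minikube-linux-amd64
sudo install minikube-linux-amd64 /usr/local/bin/minikube

# kubectl (Linux amd64)
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl

# macOS with Homebrew
brew install minikube kubectl

# Verify
minikube version
kubectl version --client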

Create and start a Minikube cluster

  1. Start Minikube: minikube start
  2. Check cluster status: minikube status
  3. Access Kubernetes dashboard: minikube dashboard
  4. Stop Minikube: minikube stop
  5. Delete Minikube cluster: minikube delete

5. Main kubectl Commands – Kubernetes CLI

Get status of different components

  • kubectl get nodes: List all nodes in the cluster
  • kubectl get pods: List all pods in the current namespace
  • kubectl get services: List all services
  • kubectl get deployments: List all deployments
  • kubectl get <resource> -n <namespace>: Get resources in a specific namespace
  • kubectl get all: List all resources in the current namespace

Create a pod/deployment

  • kubectl create deployment <name> --image=<image>: Create a deployment
  • kubectl run <pod-name> --image=<image>: Create a pod directly
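
For example, with nginx as an assumed image:

kubectl create deployment nginx-depl --image=nginx
kubectl get deployments
kubectl get pods          # the pod name is derived from the deployment and ReplicaSet names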

Layers of abstraction

  • Explain the relationship between:
  1. Deployment
  2. ReplicaSet
  3. Pod
  • Show how to view each layer: kubectl get deployments, kubectl get replicasets, kubectl get pods

Change the pod/deployment

  • kubectl edit deployment <name>: Edit deployment configuration
  • kubectl scale deployment <name> --replicas=<number>: Scale the number of replicas
  • kubectl set image deployment/<name> <container-name>=<new-image>: Update container image

Debugging pods

  • kubectl logs <pod-name>: View pod logs
  • kubectl describe pod <pod-name>: Get detailed information about a pod
  • kubectl exec -it <pod-name> -- /bin/bash: Get an interactive shell in a pod

Delete pod/deployment

  • kubectl delete pod <pod-name>: Delete a specific pod
  • kubectl delete deployment <deployment-name>: Delete a deployment
  • kubectl delete -f <file-name.yaml>: Delete resources defined in a YAML file

CRUD by applying configuration file

  • kubectl apply -f <file-name.yaml>: Create or update resources defined in a YAML file
  • kubectl get -f <file-name.yaml>: Get resources defined in a YAML file
  • kubectl delete -f <file-name.yaml>: Delete resources defined in a YAML file

6. Kubernetes YAML Configuration File

3 parts of a Kubernetes config file

  1. Metadata: Information about the object (name, labels, etc.)
  2. Specification: Desired state for the object
  3. Status: Actual state of the object (automatically generated and updated by Kubernetes)

Format of configuration file

apiVersion: <version>
kind: <resource-type>
metadata:
  name: <resource-name>
  labels:
    key: value
spec:
  # Resource-specific configuration

Blueprint for pods (template)

  • Explain how the template section in a Deployment or StatefulSet YAML defines the pod configuration.
  • Show an example of a Deployment YAML with a pod template.
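
A sketch of a Deployment whose template section serves as the pod blueprint (name and nginx image are illustrative):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx              # must match the pod template labels
  template:                   # blueprint for the pods this Deployment creates
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:1.25   # illustrative image tag
          ports:
            - containerPort: 80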

Connecting services to deployments and pods

  • Label: Key-value pairs attached to objects
  • Selector: Used by services to identify which pods to route traffic to
  • Port: Specifies which port the service should use

Example:

apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  selector:
    app: my-app
  ports:
    - protocol: TCP
      port: 80
      targetPort: 9376

Demo

  • Walk through creating a simple application deployment and service using YAML files.
  • Show how to apply the YAML files and verify the resources are created correctly.
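
Assuming the manifests are saved as deployment.yaml and service.yaml (file names are illustrative):

kubectl apply -f deployment.yaml
kubectl apply -f service.yaml
kubectl get deployments
kubectl get pods -o wide
kubectl describe service my-service    # the Endpoints field should list the pod IPs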

7. Demo Project: MongoDB and MongoExpress

Deploying MongoDB and Mongo Express

  • Overview of the components needed for this demo project.

MongoDB Pod

  • YAML configuration for MongoDB deployment.
  • Explanation of necessary environment variables and volume mounts.
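
A sketch of the MongoDB Deployment, assuming the official mongo image and a Secret named mongodb-secret (defined in the next subsection); volume mounts are omitted for brevity:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: mongodb-deployment
  labels:
    app: mongodb
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mongodb
  template:
    metadata:
      labels:
        app: mongodb
    spec:
      containers:
        - name: mongodb
          image: mongo                 # official image; tag left implicit here
          ports:
            - containerPort: 27017
          env:
            - name: MONGO_INITDB_ROOT_USERNAME
              valueFrom:
                secretKeyRef:
                  name: mongodb-secret   # Secret from the next subsection
                  key: mongo-root-username
            - name: MONGO_INITDB_ROOT_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: mongodb-secret
                  key: mongo-root-password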

Secret

  • Creating a Secret for MongoDB credentials.
  • YAML configuration for the Secret.
  • How to reference the Secret in the MongoDB deployment.
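
A sketch of the Secret (names are illustrative; values must be base64-encoded, e.g. with echo -n 'username' | base64):

apiVersion: v1
kind: Secret
metadata:
  name: mongodb-secret
type: Opaque
data:
  mongo-root-username: dXNlcm5hbWU=    # base64 of "username" (example only)
  mongo-root-password: cGFzc3dvcmQ=    # base64 of "password" (example only)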

MongoDB Internal Service

  • YAML configuration for the internal MongoDB service.
  • Explanation of why an internal service is used.
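
A sketch of the internal service (the default type, ClusterIP, keeps it reachable only from inside the cluster; names are illustrative):

apiVersion: v1
kind: Service
metadata:
  name: mongodb-service
spec:
  selector:
    app: mongodb
  ports:
    - protocol: TCP
      port: 27017
      targetPort: 27017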

Deployment Service and Config Map

  • YAML configuration for Mongo Express deployment.
  • Creating a ConfigMap for Mongo Express configuration.
  • Referencing the ConfigMap and Secret in the deployment.
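
A sketch of the ConfigMap and of how the Mongo Express container references it together with the Secret (object names are illustrative; ME_CONFIG_* are the standard mongo-express environment variables):

apiVersion: v1
kind: ConfigMap
metadata:
  name: mongodb-configmap
data:
  database_url: mongodb-service    # name of the internal MongoDB service

# Referencing the ConfigMap and Secret in the Mongo Express container spec:
env:
  - name: ME_CONFIG_MONGODB_SERVER
    valueFrom:
      configMapKeyRef:
        name: mongodb-configmap
        key: database_url
  - name: ME_CONFIG_MONGODB_ADMINUSERNAME
    valueFrom:
      secretKeyRef:
        name: mongodb-secret
        key: mongo-root-username
  - name: ME_CONFIG_MONGODB_ADMINPASSWORD
    valueFrom:
      secretKeyRef:
        name: mongodb-secret
        key: mongo-root-password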

Mongo Express External Service

  • YAML configuration for the external Mongo Express service.
  • Explanation of why an external service is used.
  • Discussion on different service types (ClusterIP, NodePort, LoadBalancer).
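
A sketch of the external service (port 8081 is the Mongo Express default; the node port is an illustrative choice from the 30000-32767 range):

apiVersion: v1
kind: Service
metadata:
  name: mongo-express-service
spec:
  selector:
    app: mongo-express
  type: LoadBalancer        # on Minikube, reachable via minikube service or minikube tunnel
  ports:
    - protocol: TCP
      port: 8081
      targetPort: 8081
      nodePort: 30000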

Step-by-step deployment process

  1. Create the Secret
  2. Create the MongoDB deployment
  3. Create the MongoDB internal service
  4. Create the ConfigMap
  5. Create the Mongo Express deployment
  6. Create the Mongo Express external service
  7. Access the Mongo Express web interface
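
Assuming one YAML file per component (file names are illustrative), the sequence looks like this:

kubectl apply -f mongodb-secret.yaml
kubectl apply -f mongodb-deployment.yaml
kubectl apply -f mongodb-service.yaml
kubectl apply -f mongodb-configmap.yaml
kubectl apply -f mongo-express-deployment.yaml
kubectl apply -f mongo-express-service.yaml
# On Minikube, open the external service in a browser:
minikube service mongo-express-service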

8. Organizing Your Components with Kubernetes Namespaces

What is a Namespace?

  • A virtual cluster within a Kubernetes cluster
  • Used to organize and isolate resources in multi-user environments
  • Provides a scope for names (resource names must be unique within a namespace)

4 Default Namespaces

  1. default: The default namespace for resources without a specified namespace
  2. kube-system: For system components and add-ons
  3. kube-public: Readable by everyone (including unauthenticated clients); reserved for public cluster data such as cluster-info
  4. kube-node-lease: Holds Lease objects associated with each node for node heartbeat purposes

Create a Namespace

  • Using kubectl: kubectl create namespace <namespace-name>
  • Using YAML:
  apiVersion: v1
  kind: Namespace
  metadata:
    name: <namespace-name>

Why use Namespaces? 4 Use Cases

  1. Resource organization: Group related resources together
  2. Resource isolation: Separate resources for different teams or projects
  3. Access control: Apply different permissions to different namespaces
  4. Resource quotas: Limit resource usage per namespace

Characteristics of Namespaces

  • Cannot be nested within each other
  • Each Kubernetes resource can only be in one namespace
  • Some resources (like nodes and persistent volumes) are not namespaced

Create Components in Namespaces

  • Specify the namespace in the metadata section of the resource YAML:
  metadata:
    name: <resource-name>
    namespace: <namespace-name>
  • Use the -n flag with kubectl: kubectl apply -f <file.yaml> -n <namespace-name>

Change Active Namespace

  • Use a tool like kubens to switch between namespaces
  • Set the namespace for a request using the --namespace flag
  • Modify the kubeconfig file to change the default namespace
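
Without extra tooling, the default namespace of the current context can be changed with kubectl itself (the namespace name is illustrative):

kubectl config set-context --current --namespace=my-namespace
kubectl config view --minify | grep namespace:    # verify the active namespace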

9. Kubernetes Ingress Explained

What is Ingress? External Service vs. Ingress

  • Ingress: API object that manages external access to services in a cluster
  • Provides HTTP/HTTPS routing, SSL termination, and name-based virtual hosting
  • Compared to exposing each service with its own LoadBalancer, a single Ingress can route traffic to many services, which is more flexible and cost-effective

Example YAML Config Files for External Service and Ingress

  • Show example YAML for a LoadBalancer service
  • Show example YAML for an Ingress resource
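
A sketch of an Ingress resource using the networking.k8s.io/v1 API (host, service name, and port are illustrative):

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: myapp-ingress
spec:
  rules:
    - host: myapp.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: myapp-internal-service   # a ClusterIP service
                port:
                  number: 8080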

Internal Service Configuration for Ingress

  • Explain how Ingress works with ClusterIP services
  • Show example YAML for a ClusterIP service used with Ingress

How to configure Ingress in your cluster?

  • Install an Ingress controller (e.g., Nginx, Traefik)
  • Create Ingress resources to define routing rules

What is Ingress Controller?

  • Implementation of Ingress
  • Typically runs as pods within the cluster
  • Examples: Nginx Ingress Controller, Traefik, HAProxy

Environment on which your cluster is running

  • Cloud provider considerations (managed Kubernetes services often provide native Ingress solutions)
  • Bare metal considerations (may require manual setup of Ingress controller and external load balancer)

Demo: Configure Ingress in Minikube

  • Enable Ingress addon in Minikube
  • Create sample applications and services
  • Create and test Ingress rules
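
A typical flow for the Minikube demo (file name and host are illustrative):

minikube addons enable ingress          # deploys the NGINX Ingress controller
kubectl get pods -n ingress-nginx       # verify the controller is running (older Minikube versions used kube-system)
kubectl apply -f myapp-ingress.yaml
kubectl get ingress                     # note the assigned address
# For local testing, map the host to that address, e.g. in /etc/hosts:
#   <ingress-address>  myapp.example.com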

Ingress Default Backend

  • Handles requests that don’t match any rules
  • How to set up and customize the default backend

Routing Use Cases

  • Path-based routing
  • Host-based routing
  • TLS/SSL termination

Configuring TLS Certificate

  • Using Kubernetes Secrets to store TLS certificates
  • Configuring Ingress to use TLS
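
A sketch of a TLS Secret and the corresponding tls section of an Ingress (names and host are illustrative; the certificate and key go in base64-encoded):

apiVersion: v1
kind: Secret
metadata:
  name: myapp-tls-secret
type: kubernetes.io/tls
data:
  tls.crt: <base64-encoded certificate>
  tls.key: <base64-encoded private key>

# In the Ingress spec:
spec:
  tls:
    - hosts:
        - myapp.example.com
      secretName: myapp-tls-secret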

10. Helm – Package Manager

Package Manager and Helm Charts

  • Helm: The package manager for Kubernetes
  • Helm Charts: Packages of pre-configured Kubernetes resources (templates plus default values)

Templating Engine

  • Using Go templates in Helm
  • Creating dynamic Kubernetes manifests

Use Cases for Helm

  1. Managing complex applications
  2. Sharing applications
  3. Managing releases and rollbacks

Helm Chart Structure

mychart/
  Chart.yaml
  values.yaml
  templates/
  charts/

Values injection into template files

  • Defining default values in values.yaml
  • Overriding values during installation or upgrade
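
A minimal sketch of values injection, using the chart layout above (names and values are illustrative):

# values.yaml
replicaCount: 2
image:
  name: nginx
  tag: "1.25"

# templates/deployment.yaml (excerpt)
spec:
  replicas: {{ .Values.replicaCount }}
  template:
    spec:
      containers:
        - name: {{ .Chart.Name }}
          image: "{{ .Values.image.name }}:{{ .Values.image.tag }}"

# Override the defaults at install or upgrade time:
helm install my-release ./mychart --set replicaCount=3
helm upgrade my-release ./mychart -f custom-values.yaml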

Release Management / Tiller (Helm Version 2!)

  • Note: Tiller was removed in Helm 3
  • Brief explanation of the architectural changes in Helm 3

11. Persisting Data in Kubernetes with Volumes

The need for persistent storage & storage requirements

  • Stateful applications in Kubernetes
  • Challenges with pod lifecycle and data persistence

Persistent Volume (PV)

  • Definition and purpose
  • Types of Persistent Volumes
  • Reclaim policies

Local vs Remote Volume Types

  • Local storage: hostPath, local
  • Remote storage: NFS, cloud provider volumes (e.g., AWS EBS, Azure Disk)

Who creates the PV and when?

  • Static provisioning
  • Dynamic provisioning with Storage Classes

Persistent Volume Claim (PVC)

  • Definition and purpose
  • Relationship between PV and PVC
  • How pods use PVCs
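
A sketch of a claim and a pod that mounts it (names are illustrative; the referenced storage class must exist in the cluster):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: standard      # Minikube's default StorageClass; environment-specific
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: app-pod
spec:
  containers:
    - name: app
      image: nginx                # illustrative image
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: my-pvc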

Levels of volume abstractions

  1. Pod
  2. PVC
  3. PV
  4. Storage backend

ConfigMap and Secret as volume types

  • Mounting ConfigMaps and Secrets as volumes
  • Use cases and examples

Storage Class (SC)

  • Automating storage provisioning
  • Defining different classes of storage
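
A sketch of a StorageClass, assuming the AWS EBS CSI driver as the provisioner (on Minikube the provisioner would be k8s.io/minikube-hostpath instead):

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-storage
provisioner: ebs.csi.aws.com        # environment-specific
parameters:
  type: gp3
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer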

12. Deploying Stateful Apps with StatefulSet

What is StatefulSet? Difference of stateless and stateful applications

  • Definition of StatefulSet
  • Characteristics of stateful applications
  • When to use StatefulSets vs Deployments

Deployment of stateful and stateless apps

  • Comparing Deployment and StatefulSet resources
  • Example use cases for each

Deployment vs StatefulSet

  • Ordered pod creation and deletion
  • Stable network identities
  • Persistent storage handling

Pod Identity

  • How StatefulSets provide unique identities to pods
  • Importance of pod identity in stateful applications

Scaling database applications: Master and Worker Pods

  • Example of a stateful database application
  • Handling master-slave replication

Pod state, Pod Identifier

  • How StatefulSets maintain pod state
  • Naming conventions for StatefulSet pods

2 Pod endpoints

  • Headless service for StatefulSets
  • Individual DNS names for pods
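
A minimal sketch of a StatefulSet paired with a headless service (names and the mongo image are illustrative); pods are created in order as mongo-0, mongo-1, mongo-2, and each gets its own DNS entry through the headless service:

apiVersion: v1
kind: Service
metadata:
  name: mongo-headless
spec:
  clusterIP: None              # headless: no virtual IP, per-pod DNS records instead
  selector:
    app: mongo
  ports:
    - port: 27017
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mongo
spec:
  serviceName: mongo-headless
  replicas: 3
  selector:
    matchLabels:
      app: mongo
  template:
    metadata:
      labels:
        app: mongo
    spec:
      containers:
        - name: mongo
          image: mongo         # illustrative image
          ports:
            - containerPort: 27017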

13. Kubernetes Services Explained

What is a Service in Kubernetes and when do we need it?

  • Definition of Kubernetes Service
  • Service types and their use cases

ClusterIP Services

  • Internal communication within the cluster
  • How ClusterIP works

Service Communication

  • How services discover and route traffic to pods
  • Role of kube-proxy

Multi-Port Services

  • Configuring services with multiple ports
  • Use cases for multi-port services
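
A sketch of a Service exposing two ports (when more than one port is defined, each must be named; values are illustrative):

apiVersion: v1
kind: Service
metadata:
  name: my-multi-port-service
spec:
  selector:
    app: my-app
  ports:
    - name: http
      protocol: TCP
      port: 80
      targetPort: 8080
    - name: metrics
      protocol: TCP
      port: 9090
      targetPort: 9090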

Headless Services

  • Definition and use cases
  • How they differ from regular services

NodePort Services

  • Exposing services externally using node ports
  • Configuration and limitations

LoadBalancer Services

  • Cloud provider integration
  • How LoadBalancer services work
  • Considerations for on-premises clusters