Kubernetes Tutorial for Beginners

1. Introduction to Kubernetes

What is Kubernetes?

Definition: Kubernetes is an open-source container orchestration platform designed to automate the deployment, scaling, and management of containerized applications.
Origin: Developed by Google, based on their experience running production workloads at scale.
Current status: Maintained by the Cloud Native Computing Foundation (CNCF).

Problems Kubernetes Solves

Complex Container Management: As applications grow, managing hundreds or thousands of containers becomes challenging.
High Availability: Ensures applications remain accessible even if some containers or nodes fail.
Scalability: Easily scale applications up or down based on demand.
Disaster Recovery: Facilitates backup and restore processes for containerized applications.
Efficient Resource Utilization: Optimizes the use of underlying hardware resources.

Features of Container Orchestration Tools

Automated Scheduling: Places containers on nodes based on resource requirements and constraints.
Self-healing: Automatically restarts failed containers or replaces and reschedules containers when nodes die.
Horizontal Scaling: Can scale applications in or out manually or automatically based on CPU, memory usage, or other metrics.
Load Balancing: Distributes network traffic to ensure stable and reliable application performance.
Rolling Updates and Rollbacks: Allows for updates to applications without downtime and enables quick rollbacks if issues occur.
Service Discovery: Gives each service a stable DNS name and virtual IP so that applications can find each other without hard-coded addresses.
Secret and Configuration Management: Manages sensitive information and application configurations securely.

2. Main Kubernetes Components

Node & Pod

Node:
A physical or virtual machine in the Kubernetes cluster.
Can be a worker node that runs applications or a master node (called a control plane node in current Kubernetes terminology) that manages the cluster.
Components: kubelet, container runtime, kube-proxy.
Pod:
Smallest deployable unit in Kubernetes.
Contains one or more containers.
Shared storage and network resources.
Ephemeral by nature (can be created, destroyed, and recreated dynamically).
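
As a rough illustration of the points above, the following sketch defines one pod with two containers that share an emptyDir volume. The pod name, container names, image tag, and file path are assumptions chosen for the example, not part of the original material.

```yaml
# Illustrative Pod: two containers sharing one volume (all names are hypothetical)
apiVersion: v1
kind: Pod
metadata:
  name: shared-volume-demo
  labels:
    app: shared-volume-demo
spec:
  volumes:
    - name: shared-data
      emptyDir: {}              # lives only as long as the pod does
  containers:
    - name: writer
      image: busybox:1.36
      # Appends a timestamp to a file on the shared volume every 5 seconds
      command: ["sh", "-c", "while true; do date >> /data/out.txt; sleep 5; done"]
      volumeMounts:
        - name: shared-data
          mountPath: /data
    - name: reader
      image: busybox:1.36
      # Follows the same file through its own mount of the shared volume
      command: ["sh", "-c", "touch /data/out.txt; tail -f /data/out.txt"]
      volumeMounts:
        - name: shared-data
          mountPath: /data
```

If this is applied with kubectl apply -f, the output of kubectl logs shared-volume-demo -c reader should show the lines written by the other container. Deleting the pod discards the emptyDir data.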

Service & Ingress

Service:
Provides a stable endpoint to access a group of pods.
Types: ClusterIP (default), NodePort, LoadBalancer, ExternalName.
Enables load balancing and service discovery within the cluster.
Ingress:
Manages external access to services within a cluster.
Provides HTTP/HTTPS routing, SSL termination, and name-based virtual hosting.
Requires an Ingress controller to function.

ConfigMap & Secret

ConfigMap:
Stores non-sensitive configuration data as key-value pairs.
Can be consumed by pods as environment variables, command-line arguments, or configuration files.
Secret:
Similar to ConfigMap but designed for sensitive data (e.g., passwords, tokens, keys).
Base64 encoded by default (not encrypted).
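Can be encrypted at rest in etcd, but this is not enabled by default in upstream Kubernetes; it has to be configured on the API server (or provided by the managed Kubernetes service).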
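
A minimal sketch of both object types follows. The object names, keys, and values are placeholders invented for illustration. The Secret uses stringData, which accepts plain text and is stored base64-encoded by the API server.

```yaml
# Illustrative ConfigMap: non-sensitive settings as key-value pairs
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config            # hypothetical name
data:
  LOG_LEVEL: "info"
  database_url: "mongodb-service"
---
# Illustrative Secret: sensitive values, written as plain text via stringData
apiVersion: v1
kind: Secret
metadata:
  name: app-secret            # hypothetical name
type: Opaque
stringData:
  api-token: "replace-me"     # placeholder, never commit real credentials
```

Reading the Secret back with kubectl get secret app-secret -o yaml shows the value under data in base64 form, which anyone with read access can decode.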

Volumes

Provides storage for the containers in a pod; whether the data persists depends on the volume type.
Types:
emptyDir: Temporary storage tied to pod lifecycle.
hostPath: Mounts a file or directory from the host node’s filesystem.
PersistentVolume (PV) and PersistentVolumeClaim (PVC): For more durable storage.
All volume types survive container restarts; only persistent volumes reliably keep data when a pod is deleted or rescheduled to another node.

Deployment & StatefulSet

Deployment:
Manages the lifecycle of stateless applications.
Supports declarative updates, rolling updates, and rollbacks.
Ensures the desired number of pod replicas are running.
StatefulSet:
Manages stateful applications.
Provides guarantees about the ordering and uniqueness of pods.
Supports stable persistent storage and network identifiers.
Useful for applications that require stable hostnames or persist data.

3. Kubernetes Architecture

Worker Nodes

Run the actual application workloads.
Components:

Kubelet: Primary node agent, ensures containers are running in a pod.
Container Runtime: Software responsible for running containers (e.g., containerd, CRI-O; since Kubernetes 1.24, Docker Engine requires the cri-dockerd adapter).
Kube-proxy: Maintains network rules and performs connection forwarding.

Master Nodes (Control Plane)

Manage the overall state of the cluster.
Components:

API Server: Front-end for the Kubernetes control plane, exposes the Kubernetes API.
Scheduler: Assigns pods to nodes based on resource availability and constraints.
Controller Manager: Runs controller processes (e.g., node controller, replication controller).
etcd: Distributed key-value store that stores all cluster data.

API Server

Central management entity of the Kubernetes cluster.
All communication between components goes through the API server.
Validates and processes RESTful requests, updating the state in etcd accordingly.
Serves as a frontend for the cluster’s shared state.

Scheduler

Watches for newly created pods with no assigned node and selects a node for them to run on.
Considers factors such as:
Individual and collective resource requirements
Hardware/software/policy constraints
Affinity and anti-affinity specifications
Data locality
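
The factors above map onto fields in the pod spec. The sketch below is one possible combination; the label disktype: ssd, the pod name, and the resource figures are assumptions for illustration only.

```yaml
# Illustrative Pod showing inputs the scheduler takes into account
apiVersion: v1
kind: Pod
metadata:
  name: scheduling-demo
  labels:
    app: scheduling-demo
spec:
  # Constraint: only nodes carrying this (hypothetical) label qualify
  nodeSelector:
    disktype: ssd
  affinity:
    # Anti-affinity: prefer nodes that do not already run a pod with the same label
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          podAffinityTerm:
            labelSelector:
              matchLabels:
                app: scheduling-demo
            topologyKey: kubernetes.io/hostname
  containers:
    - name: app
      image: nginx:1.25
      resources:
        # Requests are what the scheduler reserves on the chosen node
        requests:
          cpu: "250m"
          memory: "128Mi"
        # Limits are enforced at runtime, not used for placement
        limits:
          cpu: "500m"
          memory: "256Mi"
```

If no node carries the disktype=ssd label, the pod stays in Pending, and kubectl describe pod scheduling-demo reports the reason in its events.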

Controller Manager

Runs various controllers that regulate the state of the cluster.
Examples of controllers:
Node Controller: Notices and responds when nodes go down.
Replication Controller: Maintains the correct number of pods for each replication controller object.
Endpoints Controller: Populates the Endpoints object (joins Services & Pods).
Service Account & Token Controllers: Create default accounts and API access tokens for new namespaces.

etcd – The Cluster Brain

Consistent and highly-available key-value store.
Stores all cluster data, including:
Job scheduling info
Pod details
State information
API objects
Supports watch operations, allowing components to be notified of changes.
Critical for maintaining cluster state and facilitating leader election in HA setups.

4. Minikube and kubectl – Local Setup

What is Minikube?

A tool that allows you to run Kubernetes locally.
Sets up a single-node Kubernetes cluster on your machine.
Ideal for learning, development, and testing purposes.
Supports most Kubernetes features.

What is kubectl?

Command-line tool for interacting with Kubernetes clusters.
Allows you to deploy applications and to inspect and manage cluster resources.
Works with both local (e.g., Minikube) and remote Kubernetes clusters.

Install Minikube and kubectl

Install Minikube:

Download and install based on your operating system (Windows, macOS, Linux).
Requires a hypervisor (e.g., VirtualBox, Hyper-V) or container runtime (e.g., Docker).
Installation command examples for different OS.

Install kubectl:

Can be installed via package managers or direct download.
Installation command examples for different OS.
Verify installation with kubectl version --client (without --client, kubectl also tries to contact a cluster).

Create and start a Minikube cluster

Start Minikube: minikube start
Check cluster status: minikube status
Access Kubernetes dashboard: minikube dashboard
Stop Minikube: minikube stop
Delete Minikube cluster: minikube delete

5. Main kubectl Commands – Kubernetes CLI

Get status of different components

kubectl get nodes: List all nodes in the cluster
kubectl get pods: List all pods in the current namespace
kubectl get services: List all services
kubectl get deployments: List all deployments
kubectl get <resource> -n <namespace>: Get resources in a specific namespace
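kubectl get all: List the most common resource types (pods, services, deployments, replicasets, and similar) in the current namespace; ConfigMaps, Secrets, and Ingresses are not included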

Create a pod/deployment

kubectl create deployment <name> --image=<image>: Create a deployment
kubectl run <pod-name> --image=<image>: Create a pod directly

Layers of abstraction

Explain the relationship between:

Deployment
ReplicaSet
Pod

Show how to view each layer: kubectl get deployments, kubectl get replicasets, kubectl get pods

Change the pod/deployment

kubectl edit deployment <name>: Edit deployment configuration
kubectl scale deployment <name> --replicas=<number>: Scale the number of replicas
kubectl set image deployment/<name> <container-name>=<new-image>: Update container image

Debugging pods

kubectl logs <pod-name>: View pod logs
kubectl describe pod <pod-name>: Get detailed information about a pod
kubectl exec -it <pod-name> -- /bin/bash: Get an interactive shell in a pod (use /bin/sh if the image does not ship bash)

Delete pod/deployment

kubectl delete pod <pod-name>: Delete a specific pod
kubectl delete deployment <deployment-name>: Delete a deployment
kubectl delete -f <file-name.yaml>: Delete resources defined in a YAML file

CRUD by applying configuration file

kubectl apply -f <file-name.yaml>: Create or update resources defined in a YAML file
kubectl get -f <file-name.yaml>: Get resources defined in a YAML file
kubectl delete -f <file-name.yaml>: Delete resources defined in a YAML file

6. Kubernetes YAML Configuration File

3 parts of a Kubernetes config file

Metadata: Information about the object (name, labels, etc.)
Specification: Desired state for the object
Status: Actual state of the object (automatically generated and updated by Kubernetes)

Format of configuration file

apiVersion: <version>
kind: <resource-type>
metadata:
  name: <resource-name>
  labels:
    key: value
spec:
  # Resource-specific configuration

Blueprint for pods (template)

Explain how the template section in a Deployment or StatefulSet YAML defines the pod configuration.
Show an example of a Deployment YAML with a pod template.
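
One way such a Deployment could look is sketched below. Everything under spec.template is the pod blueprint: it has its own metadata and spec, exactly like a standalone Pod. The names, labels, and image are illustrative assumptions.

```yaml
# Illustrative Deployment with a pod template (names and image are hypothetical)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-demo
  labels:
    app: web-demo
spec:
  replicas: 2
  # The selector must match the labels set in the template below
  selector:
    matchLabels:
      app: web-demo
  template:
    # From here on, this is the blueprint for each pod
    metadata:
      labels:
        app: web-demo
    spec:
      containers:
        - name: web
          image: nginx:1.25
          ports:
            - containerPort: 80
```

After kubectl apply -f, the commands kubectl get deployments, kubectl get replicasets, and kubectl get pods show the three layers created from this single file.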

Connecting services to deployments and pods

Label: Key-value pairs attached to objects
Selector: Used by services to identify which pods to route traffic to
Port: port is the port the service itself listens on; targetPort is the container port that traffic is forwarded to

Example:

apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  selector:
    app: my-app
  ports:
    - protocol: TCP
      port: 80
      targetPort: 9376

Demo

Walk through creating a simple application deployment and service using YAML files.
Show how to apply the YAML files and verify the resources are created correctly.

7. Demo Project: MongoDB and Mongo Express

Deploying MongoDB and Mongo Express

Overview of the components needed for this demo project.

MongoDB Pod

YAML configuration for MongoDB deployment.
Explanation of necessary environment variables and volume mounts.

Secret

Creating a Secret for MongoDB credentials.
YAML configuration for the Secret.
How to reference the Secret in the MongoDB deployment.
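
The sketch below shows one possible shape for these two files. The object names and key names are assumptions; the environment variable names are the ones the official mongo image reads on first start. The example credentials are deliberately trivial placeholders.

```yaml
# Illustrative Secret holding MongoDB root credentials
apiVersion: v1
kind: Secret
metadata:
  name: mongodb-secret
type: Opaque
data:
  mongo-root-username: dXNlcm5hbWU=   # base64 of "username" (placeholder)
  mongo-root-password: cGFzc3dvcmQ=   # base64 of "password" (placeholder)
---
# Illustrative MongoDB Deployment that reads the credentials from the Secret
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mongodb-deployment
  labels:
    app: mongodb
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mongodb
  template:
    metadata:
      labels:
        app: mongodb
    spec:
      containers:
        - name: mongodb
          image: mongo:6.0            # tag chosen for the example
          ports:
            - containerPort: 27017
          env:
            - name: MONGO_INITDB_ROOT_USERNAME
              valueFrom:
                secretKeyRef:
                  name: mongodb-secret
                  key: mongo-root-username
            - name: MONGO_INITDB_ROOT_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: mongodb-secret
                  key: mongo-root-password
```

The Secret has to exist before the Deployment's pods start, otherwise the containers cannot be created until it appears.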

MongoDB Internal Service

YAML configuration for the internal MongoDB service.
Explanation of why an internal service is used.
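
A minimal sketch of such an internal service, assuming the MongoDB pods carry the label app: mongodb (a hypothetical label for this example). Leaving out type gives a ClusterIP service, which is reachable only from inside the cluster; that is usually what a database needs.

```yaml
# Illustrative internal (ClusterIP) Service for MongoDB
apiVersion: v1
kind: Service
metadata:
  name: mongodb-service       # other pods can reach the database under this DNS name
spec:
  selector:
    app: mongodb              # assumed label on the MongoDB pods
  ports:
    - protocol: TCP
      port: 27017             # port exposed by the Service
      targetPort: 27017       # port the mongod container listens on
```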

Mongo Express Deployment and ConfigMap

YAML configuration for Mongo Express deployment.
Creating a ConfigMap for Mongo Express configuration.
Referencing the ConfigMap and Secret in the deployment.
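
A possible version of these files is sketched below. It assumes a Secret named mongodb-secret with the keys mongo-root-username and mongo-root-password, and a MongoDB service named mongodb-service; all of these names are hypothetical. The ME_CONFIG_* variable names follow the mongo-express image documentation as commonly used, and newer image versions may prefer a single connection URL instead, so check the image's documentation for the version in use.

```yaml
# Illustrative ConfigMap with the database address
apiVersion: v1
kind: ConfigMap
metadata:
  name: mongodb-configmap
data:
  database_url: mongodb-service   # assumed name of the internal MongoDB Service
---
# Illustrative Mongo Express Deployment using the ConfigMap and the Secret
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mongo-express
  labels:
    app: mongo-express
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mongo-express
  template:
    metadata:
      labels:
        app: mongo-express
    spec:
      containers:
        - name: mongo-express
          image: mongo-express      # pin a specific tag in real use
          ports:
            - containerPort: 8081
          env:
            - name: ME_CONFIG_MONGODB_ADMINUSERNAME
              valueFrom:
                secretKeyRef:
                  name: mongodb-secret
                  key: mongo-root-username
            - name: ME_CONFIG_MONGODB_ADMINPASSWORD
              valueFrom:
                secretKeyRef:
                  name: mongodb-secret
                  key: mongo-root-password
            - name: ME_CONFIG_MONGODB_SERVER
              valueFrom:
                configMapKeyRef:
                  name: mongodb-configmap
                  key: database_url
```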

Mongo Express External Service

YAML configuration for the external Mongo Express service.
Explanation of why an external service is used.
Discussion on different service types (ClusterIP, NodePort, LoadBalancer).
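
As a sketch, an external service for Mongo Express could look like this, assuming the pods carry the label app: mongo-express. The service name and the nodePort value are illustrative; nodePort must fall inside the cluster's NodePort range, which is 30000-32767 by default.

```yaml
# Illustrative external Service for Mongo Express
apiVersion: v1
kind: Service
metadata:
  name: mongo-express-service
spec:
  type: LoadBalancer          # also allocates a NodePort and a ClusterIP
  selector:
    app: mongo-express        # assumed label on the Mongo Express pods
  ports:
    - protocol: TCP
      port: 8081              # port exposed by the Service
      targetPort: 8081        # port the container listens on
      nodePort: 30000         # port opened on every node
```

On Minikube no cloud load balancer is provisioned, so the external IP stays pending; minikube service mongo-express-service opens the application through the node port instead.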

Step-by-step deployment process

Create the Secret
Create the MongoDB deployment
Create the MongoDB internal service
Create the ConfigMap
Create the Mongo Express deployment
Create the Mongo Express external service
Accessing the Mongo Express web interface

8. Organizing Your Components with Kubernetes Namespaces

What is a Namespace?

A virtual cluster within a Kubernetes cluster
Used to organize and isolate resources in multi-user environments
Provides a scope for names (resource names must be unique within a namespace)

4 Default Namespaces

default: The default namespace for resources without a specified namespace
kube-system: For system components and add-ons
kube-public: Readable by all clients, including unauthenticated ones; mostly reserved for cluster usage, for resources that should be visible across the whole cluster
kube-node-lease: Holds Lease objects associated with each node for node heartbeat purposes

Create a Namespace

Using kubectl: kubectl create namespace <namespace-name>
Using YAML:

  apiVersion: v1
  kind: Namespace
  metadata:
    name: <namespace-name>

Why use Namespaces? 4 Use Cases

Resource organization: Group related resources together
Resource isolation: Separate resources for different teams or projects
Access control: Apply different permissions to different namespaces
Resource quotas: Limit resource usage per namespace
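
The resource quota use case can be sketched as follows. The namespace team-a and all numbers are assumptions for illustration; the namespace must already exist.

```yaml
# Illustrative ResourceQuota limiting what one namespace may consume
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-a-quota
  namespace: team-a           # hypothetical namespace
spec:
  hard:
    pods: "10"                # at most 10 pods in this namespace
    requests.cpu: "4"         # sum of CPU requests across all pods
    requests.memory: 8Gi      # sum of memory requests across all pods
    limits.cpu: "8"
    limits.memory: 16Gi
```

Once a quota covers CPU or memory, pods in that namespace must declare the corresponding requests and limits, or the API server rejects them.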

Characteristics of Namespaces

Cannot be nested within each other
Each Kubernetes resource can only be in one namespace
Some resources (like nodes and persistent volumes) are not namespaced

Create Components in Namespaces

Specify the namespace in the metadata section of the resource YAML:

  metadata:
    name: <resource-name>
    namespace: <namespace-name>

Use the -n flag with kubectl: kubectl apply -f <file.yaml> -n <namespace-name>

Change Active Namespace

Use a tool like kubens to switch between namespaces
Set the namespace for a request using the --namespace flag
Modify the kubeconfig file to change the default namespace, e.g. with kubectl config set-context --current --namespace=<namespace-name>

9. Kubernetes Ingress Explained

What is Ingress? External Service vs. Ingress

Ingress: API object that manages external access to services in a cluster
Provides HTTP/HTTPS routing, SSL termination, and name-based virtual hosting
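Compared to LoadBalancer services, Ingress is more flexible and usually cheaper: one entry point can route to many services by host or path, instead of provisioning one load balancer per service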

Example YAML Config Files for External Service and Ingress

Show example YAML for a LoadBalancer service
Show example YAML for an Ingress resource
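
The two approaches could be sketched side by side as below. The application label app: myapp, the host myapp.example.com, the port 8080, and the service names are all placeholders. The Ingress assumes a ClusterIP service named myapp-internal-service exists and that an Ingress controller is installed.

```yaml
# Option 1 (illustrative): expose the app directly with a LoadBalancer Service
apiVersion: v1
kind: Service
metadata:
  name: myapp-external-service
spec:
  type: LoadBalancer
  selector:
    app: myapp
  ports:
    - protocol: TCP
      port: 8080
      targetPort: 8080
---
# Option 2 (illustrative): route by hostname through an Ingress
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: myapp-ingress
spec:
  rules:
    - host: myapp.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: myapp-internal-service   # assumed ClusterIP Service
                port:
                  number: 8080
```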

Internal Service Configuration for Ingress

Explain how Ingress works with ClusterIP services
Show example YAML for a ClusterIP service used with Ingress
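
A minimal sketch of the internal service an Ingress rule would point at. The name, label, and port are hypothetical; what matters is that the Ingress backend refers to the service by metadata.name and to one of its ports.

```yaml
# Illustrative ClusterIP Service used as an Ingress backend
apiVersion: v1
kind: Service
metadata:
  name: myapp-internal-service   # name referenced from the Ingress backend
spec:
  type: ClusterIP                # reachable only inside the cluster
  selector:
    app: myapp                   # assumed label on the application pods
  ports:
    - protocol: TCP
      port: 8080                 # port referenced from the Ingress backend
      targetPort: 8080
```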

How to configure Ingress in your cluster?

Install an Ingress controller (e.g., Nginx, Traefik)
Create Ingress resources to define routing rules

What is Ingress Controller?

The component that reads Ingress resources and actually performs the routing they describe
Typically runs as pods within the cluster
Examples: Nginx Ingress Controller, Traefik, HAProxy

Environment on which your cluster is running

Cloud provider considerations (managed Kubernetes services often provide native Ingress solutions)
Bare metal considerations (may require manual setup of Ingress controller and external load balancer)

Demo: Configure Ingress in Minikube

Enable Ingress addon in Minikube
Create sample applications and services
Create and test Ingress rules

Ingress Default Backend

Handles requests that don’t match any rules
How to set up and customize the default backend

Routing Use Cases

Path-based routing
Host-based routing
TLS/SSL termination
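
Path-based and host-based routing can be combined in a single Ingress, roughly as sketched here. Hostnames, paths, service names, and ports are invented for the example.

```yaml
# Illustrative Ingress combining path-based and host-based rules
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: routing-demo
spec:
  rules:
    # Path-based: one host, different paths go to different services
    - host: myapp.example.com
      http:
        paths:
          - path: /analytics
            pathType: Prefix
            backend:
              service:
                name: analytics-service   # hypothetical Service
                port:
                  number: 3000
          - path: /shopping
            pathType: Prefix
            backend:
              service:
                name: shopping-service    # hypothetical Service
                port:
                  number: 8080
    # Host-based: a second hostname goes to its own service
    - host: api.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: api-service         # hypothetical Service
                port:
                  number: 8080
```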

Configuring TLS Certificate

Using Kubernetes Secrets to store TLS certificates
Configuring Ingress to use TLS
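
A sketch of the two pieces involved, with placeholder names throughout. The Secret must be of type kubernetes.io/tls, must use the keys tls.crt and tls.key, and must live in the same namespace as the Ingress. The certificate values are left as placeholders on purpose.

```yaml
# Illustrative TLS Secret (values are placeholders, not real data)
apiVersion: v1
kind: Secret
metadata:
  name: myapp-tls
  namespace: default
type: kubernetes.io/tls
data:
  tls.crt: <base64-encoded certificate>
  tls.key: <base64-encoded private key>
---
# Illustrative Ingress that terminates TLS using that Secret
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: myapp-tls-ingress
  namespace: default
spec:
  tls:
    - hosts:
        - myapp.example.com
      secretName: myapp-tls
  rules:
    - host: myapp.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: myapp-internal-service   # hypothetical Service
                port:
                  number: 8080
```

Instead of writing the Secret by hand, kubectl create secret tls myapp-tls --cert=<cert-file> --key=<key-file> produces the same object from certificate files.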

10. Helm – Package Manager

Package Manager and Helm Charts

Helm: The package manager for Kubernetes
Helm Charts: Packages of templated Kubernetes manifests that can be versioned, shared, and installed as a unit

Templating Engine

Using Go templates in Helm
Creating dynamic Kubernetes manifests

Use Cases for Helm

Managing complex applications
Sharing applications
Managing releases and rollbacks

Helm Chart Structure

mychart/
  Chart.yaml
  values.yaml
  templates/
  charts/

Values injection into template files

Defining default values in values.yaml
Overriding values during installation or upgrade
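
The following sketch shows the idea with two files from a hypothetical chart. The value names replicaCount and image are assumptions; .Values and .Release.Name are built-in Helm template objects. The second document is a template, so it only becomes valid YAML after Helm renders it.

```yaml
# values.yaml (illustrative defaults)
replicaCount: 2
image:
  repository: nginx
  tag: "1.25"
---
# templates/deployment.yaml (illustrative template, rendered by Helm)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: "{{ .Release.Name }}-web"
spec:
  replicas: {{ .Values.replicaCount }}
  selector:
    matchLabels:
      app: "{{ .Release.Name }}-web"
  template:
    metadata:
      labels:
        app: "{{ .Release.Name }}-web"
    spec:
      containers:
        - name: web
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
```

Defaults can be overridden at install or upgrade time, for example with helm install <release-name> ./mychart --set replicaCount=3, or with -f <custom-values.yaml> to supply a whole file of overrides.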

Release Management / Tiller (Helm Version 2!)

Note: Tiller was removed in Helm 3
Brief explanation of the architectural changes in Helm 3

11. Persisting Data in Kubernetes with Volumes

The need for persistent storage & storage requirements

Stateful applications in Kubernetes
Challenges with pod lifecycle and data persistence

Persistent Volume (PV)

Definition and purpose
Types of Persistent Volumes
Reclaim policies

Local vs Remote Volume Types

Local storage: hostPath, local
Remote storage: NFS, cloud provider volumes (e.g., AWS EBS, Azure Disk)

Who creates the PV and when?

Static provisioning
Dynamic provisioning with Storage Classes

Persistent Volume Claim (PVC)

Definition and purpose
Relationship between PV and PVC
How pods use PVCs
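
Static provisioning can be sketched with three objects: an administrator creates the PV, a user creates a PVC that binds to it, and a pod mounts the PVC. Names, sizes, the storage class name manual, and the host path are assumptions. A hostPath-backed PV like this one is only suitable for single-node test clusters such as Minikube.

```yaml
# Illustrative PersistentVolume (created by a cluster administrator)
apiVersion: v1
kind: PersistentVolume
metadata:
  name: demo-pv
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain   # keep the data after the claim is deleted
  storageClassName: manual
  hostPath:
    path: /mnt/data                       # directory on the node (test setups only)
---
# Illustrative PersistentVolumeClaim (created by the application owner)
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: demo-pvc
spec:
  storageClassName: manual                # must match the PV to bind to it
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
# Illustrative Pod that mounts the claim
apiVersion: v1
kind: Pod
metadata:
  name: pvc-demo
spec:
  containers:
    - name: app
      image: nginx:1.25
      volumeMounts:
        - name: data
          mountPath: /usr/share/nginx/html
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: demo-pvc               # the pod only knows the claim, not the PV
```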

Levels of volume abstractions

Pod
PVC
PV
Storage backend

ConfigMap and Secret as volume types

Mounting ConfigMaps and Secrets as volumes
Use cases and examples
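
A minimal sketch of mounting both as volumes, assuming a ConfigMap named app-config and a Secret named app-secret already exist in the same namespace (both names are hypothetical). Each key becomes a file in the mount directory, with the value as the file content.

```yaml
# Illustrative Pod mounting a ConfigMap and a Secret as files
apiVersion: v1
kind: Pod
metadata:
  name: config-volume-demo
spec:
  containers:
    - name: app
      image: busybox:1.36
      # Lists the projected files, then idles so the pod can be inspected
      command: ["sh", "-c", "ls /etc/app-config /etc/app-secret && sleep 3600"]
      volumeMounts:
        - name: config
          mountPath: /etc/app-config
          readOnly: true
        - name: credentials
          mountPath: /etc/app-secret
          readOnly: true
  volumes:
    - name: config
      configMap:
        name: app-config          # assumed existing ConfigMap
    - name: credentials
      secret:
        secretName: app-secret    # assumed existing Secret
```

This pattern suits applications that expect configuration files or certificate files on disk rather than environment variables.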

Storage Class (SC)

Automating storage provisioning
Defining different classes of storage
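
Dynamic provisioning might look like the sketch below. It assumes a cluster on AWS with the EBS CSI driver installed; on other platforms the provisioner and parameters differ. The class name fast-ssd and the claim name are invented for the example.

```yaml
# Illustrative StorageClass (assumes the AWS EBS CSI driver is installed)
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer   # create the volume once a pod needs it
---
# Illustrative PVC that triggers dynamic provisioning through that class
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: fast-claim
spec:
  storageClassName: fast-ssd
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
```

With a StorageClass in place nobody has to create PVs by hand: the provisioner creates one for each claim that names the class.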

12. Deploying Stateful Apps with StatefulSet

What is a StatefulSet? Difference between stateless and stateful applications

Definition of StatefulSet
Characteristics of stateful applications
When to use StatefulSets vs Deployments

Deployment of stateful and stateless apps

Comparing Deployment and StatefulSet resources
Example use cases for each

Deployment vs StatefulSet

Ordered pod creation and deletion
Stable network identities
Persistent storage handling

Pod Identity

How StatefulSets provide unique identities to pods
Importance of pod identity in stateful applications

Scaling database applications: Master and Worker Pods

Example of a stateful database application
Handling master-worker replication (one pod accepts writes, the other pods replicate from it)

Pod state, Pod Identifier

How StatefulSets maintain pod state
Naming conventions for StatefulSet pods

2 Pod endpoints

Headless service for StatefulSets
Individual DNS names for pods
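
The pieces described in this section can be sketched together as below. The names, image tag, and sizes are assumptions. The sketch only shows identity and storage handling; it does not configure database replication itself.

```yaml
# Illustrative headless Service: gives each StatefulSet pod its own DNS name
apiVersion: v1
kind: Service
metadata:
  name: db
spec:
  clusterIP: None             # headless: no virtual IP, DNS returns pod addresses
  selector:
    app: db
  ports:
    - name: mongo
      port: 27017
---
# Illustrative StatefulSet with stable names and per-pod storage
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: db
spec:
  serviceName: db             # must reference the headless Service above
  replicas: 3
  selector:
    matchLabels:
      app: db
  template:
    metadata:
      labels:
        app: db
    spec:
      containers:
        - name: mongo
          image: mongo:6.0
          ports:
            - name: mongo
              containerPort: 27017
          volumeMounts:
            - name: data
              mountPath: /data/db
  # One PVC per pod, created from this template and kept across restarts
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi
```

The pods are created in order as db-0, db-1, and db-2. Each one is reachable at <pod-name>.db.<namespace>.svc.cluster.local and keeps its own claim (data-db-0, data-db-1, data-db-2) when it is rescheduled.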

13. Kubernetes Services Explained

What is a Service in Kubernetes and when do we need it?

Definition of Kubernetes Service
Service types and their use cases

ClusterIP Services

Internal communication within the cluster
How ClusterIP works

Service Communication

How services discover and route traffic to pods
Role of kube-proxy

Multi-Port Services

Configuring services with multiple ports
Use cases for multi-port services
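
A sketch of a service with two ports, for instance a database port plus a metrics port served by an exporter container in the same pod. The names and the metrics port number are assumptions. When a service defines more than one port, each port must have a name.

```yaml
# Illustrative multi-port Service
apiVersion: v1
kind: Service
metadata:
  name: mongodb-service
spec:
  selector:
    app: mongodb              # assumed label on the pods
  ports:
    - name: mongodb           # names are required with multiple ports
      protocol: TCP
      port: 27017
      targetPort: 27017
    - name: metrics
      protocol: TCP
      port: 9216              # hypothetical exporter port
      targetPort: 9216
```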

Headless Services

Definition and use cases
How they differ from regular services

NodePort Services

Exposing services externally using node ports
Configuration and limitations
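
A minimal NodePort sketch, with a hypothetical name and label. The nodePort value has to fall inside the cluster's NodePort range (30000-32767 by default); if the field is omitted, Kubernetes picks a free port from that range.

```yaml
# Illustrative NodePort Service
apiVersion: v1
kind: Service
metadata:
  name: web-nodeport
spec:
  type: NodePort
  selector:
    app: web                  # assumed label on the pods
  ports:
    - protocol: TCP
      port: 80                # port on the Service's cluster IP
      targetPort: 80          # port on the container
      nodePort: 30080         # port opened on every node
```

The application is then reachable at <node-ip>:30080 from outside the cluster. Because this opens a port on every node directly, it is mostly used for testing rather than production traffic.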

LoadBalancer Services

Cloud provider integration
How LoadBalancer services work
Considerations for on-premises clusters