PostgreSQL on Kubernetes — Complete Setup Guide with CloudNativePG

Running PostgreSQL in Kubernetes used to be a bad idea. StatefulSets were tricky, persistent volumes were unreliable, and failover meant data loss. Most teams defaulted to managed cloud databases and called it done.

That calculus has changed. CloudNativePG — the CNCF-listed PostgreSQL operator — handles high availability, automated failover, Point-in-Time Recovery, connection pooling, and streaming replication out of the box. In 2026 it’s the production-grade way to run PostgreSQL on Kubernetes, and the gap between “self-hosted on K8s” and “managed cloud database” has narrowed significantly.

This guide walks you through a complete CloudNativePG installation, deploying a production-ready PostgreSQL cluster, configuring backups, and the operational practices that actually matter when your database lives inside Kubernetes.


Why CloudNativePG Over a Plain StatefulSet?

Before we get into the setup, it’s worth being clear on why you’d use an operator instead of writing your own StatefulSet.

A plain StatefulSet gets you a pod with a persistent volume. That’s it. You still need to handle: primary election, streaming replication, automatic failover when the primary dies, backup scheduling, PITR, connection pooling, and certificate rotation. That’s months of work and ongoing operational burden.

CloudNativePG handles all of that. It’s a Kubernetes operator built specifically for PostgreSQL — it knows what Postgres is, not just that it’s a process that needs a volume. The operator watches your Cluster resources and continuously reconciles the actual state of your database cluster against the desired state, the same way ArgoCD reconciles your application manifests.

Key capabilities:

  • High availability — multi-instance clusters where the operator itself promotes the primary through the Kubernetes API, with no separate failover tool such as Patroni or repmgr to run
  • Streaming replication — synchronous or asynchronous, configurable per-cluster
  • Automated failover — promotes a standby in seconds when the primary fails
  • Backup and PITR — native integration with object storage (S3, GCS, Azure Blob)
  • Connection pooling — built-in PgBouncer support via the Pooler resource
  • TLS everywhere — certificates managed automatically, rotation handled by the operator

If you’re evaluating whether to self-host on Kubernetes or use a managed database service, our comparison of AWS, Azure, and GCP databases covers the trade-offs in depth. For teams that want operational control without managing the control plane, DigitalOcean Managed Databases is worth considering as an alternative.


Prerequisites

  • A running Kubernetes cluster (Kubernetes 1.26+)
  • kubectl configured — see our Kubernetes commands reference if you need a refresher
  • helm installed (v3+), only needed if you install the operator from its Helm chart instead of the manifest used below
  • An S3-compatible object store for backups (AWS S3, MinIO, DigitalOcean Spaces, GCS all work)
  • Basic familiarity with Kubernetes — if you’re new to K8s concepts, start with Kubernetes vs Docker

Step 1 — Install the CloudNativePG Operator

The operator runs in its own namespace and watches for Cluster, Backup, and Pooler resources across all namespaces.

# Install via kubectl (pinned to release 1.24.0; check the project's releases page for the current version)
kubectl apply --server-side -f \
  https://raw.githubusercontent.com/cloudnative-pg/cloudnative-pg/release-1.24/releases/cnpg-1.24.0.yaml

# Verify the operator is running
kubectl get deployment -n cnpg-system cnpg-controller-manager

Wait for the operator Deployment to report Available:

kubectl wait --for=condition=available \
  --timeout=300s \
  deployment/cnpg-controller-manager \
  -n cnpg-system
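
Once the Deployment is available, a quick sanity check is to confirm that the operator's custom resource definitions are registered. A minimal sketch: the function name cnpg_missing_crds is ours, and the CRD names assume the postgresql.cnpg.io API group used in the manifests throughout this guide.

```shell
# Print every expected CloudNativePG CRD that is NOT registered in the cluster.
# Prints nothing when all of them are present.
cnpg_missing_crds() {
  for crd in clusters backups scheduledbackups poolers; do
    kubectl get crd "${crd}.postgresql.cnpg.io" >/dev/null 2>&1 \
      || echo "${crd}.postgresql.cnpg.io"
  done
}
```

Calling cnpg_missing_crds with no arguments should print nothing on a healthy install; any name it prints points at an incomplete or failed manifest apply.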

Install the kubectl-cnpg plugin — it adds useful commands for managing clusters:

# macOS
brew install kubectl-cnpg

# Linux
curl -sSfL \
  https://github.com/cloudnative-pg/cloudnative-pg/raw/main/hack/install-cnpg-plugin.sh | \
  sudo sh -s -- -b /usr/local/bin

# Verify
kubectl cnpg version


Step 2 — Create Your First PostgreSQL Cluster

A CloudNativePG Cluster resource defines everything about your PostgreSQL cluster — instance count, storage, PostgreSQL version, backup config, and replication settings.

Start with a simple 3-instance cluster (1 primary + 2 standbys):

apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: postgres-cluster
  namespace: production
spec:
  instances: 3                          # 1 primary + 2 standbys
  imageName: ghcr.io/cloudnative-pg/postgresql:16.3

  postgresql:
    parameters:
      max_connections: "200"
      shared_buffers: "256MB"
      effective_cache_size: "768MB"
      maintenance_work_mem: "64MB"
      checkpoint_completion_target: "0.9"
      wal_buffers: "16MB"
      default_statistics_target: "100"
      log_min_duration_statement: "1000"  # Log queries taking over 1 second

  storage:
    size: 20Gi
    storageClass: standard               # Use your cluster's storage class

  resources:
    requests:
      memory: "512Mi"
      cpu: "500m"
    limits:
      memory: "1Gi"
      cpu: "1000m"

  bootstrap:
    initdb:
      database: myapp
      owner: myapp_user
      secret:
        name: postgres-app-secret        # Reference to a Secret (created below)

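Create the namespace and the application credentials secret first:

kubectl create namespace production

# Hex output avoids characters that would need escaping inside a connection URL
kubectl create secret generic postgres-app-secret \
  --type=kubernetes.io/basic-auth \
  --from-literal=username=myapp_user \
  --from-literal=password="$(openssl rand -hex 24)" \
  -n production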

Apply the cluster:

kubectl apply -f postgres-cluster.yaml

Watch the cluster come up — this takes 2–3 minutes on first run:

kubectl get cluster postgres-cluster -n production -w

You’ll see the status progress through Setting up primary → Creating replica → Cluster in healthy state.

Check the cluster status in detail:

kubectl cnpg status postgres-cluster -n production


Step 3 — Connect to Your PostgreSQL Cluster

CloudNativePG creates several services automatically:

kubectl get svc -n production | grep postgres

You’ll see:

  • postgres-cluster-rw — points to the primary (read-write)
  • postgres-cluster-ro — points to standbys (read-only, load-balanced)
  • postgres-cluster-r — points to any instance, primary included (for reads that don’t care which node answers)

Connect from inside the cluster:

# Example Deployment container env connecting to PostgreSQL
env:
  - name: DB_PASSWORD                    # Must be declared before the variables that expand it
    valueFrom:
      secretKeyRef:
        name: postgres-app-secret
        key: password
  - name: DATABASE_URL
    value: "postgresql://myapp_user:$(DB_PASSWORD)@postgres-cluster-rw.production.svc.cluster.local:5432/myapp"
  - name: DATABASE_READONLY_URL
    value: "postgresql://myapp_user:$(DB_PASSWORD)@postgres-cluster-ro.production.svc.cluster.local:5432/myapp"
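
One caveat with embedding a password in a URL: characters such as / + or @ have to be percent-encoded, otherwise the client may misparse the connection string. Here is a minimal POSIX shell sketch of that encoding. The helper names urlencode and build_dsn are ours, not part of CloudNativePG, and the encoder only handles ASCII input.

```shell
# Percent-encode a string so it is safe inside the userinfo part of a URI.
# Assumes ASCII input; multi-byte characters are not handled here.
urlencode() (
  s="$1"
  out=""
  while [ -n "$s" ]; do
    rest="${s#?}"        # everything after the first character
    c="${s%"$rest"}"     # the first character itself
    s="$rest"
    case "$c" in
      [A-Za-z0-9._~-]) out="${out}${c}" ;;
      *) out="${out}$(printf '%%%02X' "'$c")" ;;
    esac
  done
  printf '%s\n' "$out"
)

# Build a PostgreSQL connection URI: build_dsn USER PASSWORD HOST DATABASE
build_dsn() {
  printf 'postgresql://%s:%s@%s:5432/%s\n' \
    "$(urlencode "$1")" "$(urlencode "$2")" "$3" "$4"
}
```

For example, build_dsn myapp_user 'p/w+1' postgres-cluster-rw.production.svc.cluster.local myapp yields a URI whose password part reads p%2Fw%2B1. Generating passwords from a URL-safe alphabet in the first place avoids the problem entirely.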

Connect from outside the cluster (for debugging):

# Port-forward the primary service
kubectl port-forward svc/postgres-cluster-rw 5432:5432 -n production

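# Get the application user's password (superuser access is disabled by default in recent releases)
kubectl get secret postgres-app-secret \
  -n production \
  -o jsonpath='{.data.password}' | base64 -d

# Connect with psql (in a second terminal, while the port-forward is running)
psql -h localhost -U myapp_user -d myapp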

Use read replicas for read-heavy workloads. Point your application’s read queries at postgres-cluster-ro and writes at postgres-cluster-rw. The -ro Service spreads new connections across all healthy standbys; with asynchronous replication, those reads can lag slightly behind the primary.
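
To confirm that each Service really routes where you expect, you can ask the server whether it is in recovery: a standby answers true, the primary false. A small sketch follows. The function name pg_role is ours, and it assumes psql can find the password through PGPASSWORD or ~/.pgpass.

```shell
# Report whether the server behind a host is the primary or a standby.
# Usage: pg_role HOST
pg_role() {
  in_recovery="$(psql -h "$1" -U myapp_user -d myapp -tAc 'SELECT pg_is_in_recovery()')" || return 1
  case "$in_recovery" in
    t) echo standby ;;
    f) echo primary ;;
    *) return 1 ;;
  esac
}
```

Run from a pod inside the cluster, pg_role postgres-cluster-rw.production.svc.cluster.local should report primary, and the same call against the -ro Service should report standby.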


Step 4 — Configure Backups to Object Storage

No backup configuration = no database. This is non-negotiable. CloudNativePG integrates natively with S3-compatible object storage for both base backups and WAL archiving.

Create the backup credentials secret:

kubectl create secret generic backup-storage-creds \
  --from-literal=ACCESS_KEY_ID=your-access-key \
  --from-literal=ACCESS_SECRET_KEY=your-secret-key \
  -n production

Update your Cluster to enable WAL archiving:

apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: postgres-cluster
  namespace: production
spec:
  instances: 3
  imageName: ghcr.io/cloudnative-pg/postgresql:16.3

  # ... (previous config) ...

  backup:
    barmanObjectStore:
      destinationPath: s3://your-bucket-name/postgres-cluster/
      s3Credentials:
        accessKeyId:
          name: backup-storage-creds
          key: ACCESS_KEY_ID
        secretAccessKey:
          name: backup-storage-creds
          key: ACCESS_SECRET_KEY
      wal:
        compression: gzip
        maxParallel: 2
    retentionPolicy: "30d"              # Keep backups for 30 days

Create a ScheduledBackup for automated daily backups:

apiVersion: postgresql.cnpg.io/v1
kind: ScheduledBackup
metadata:
  name: postgres-daily-backup
  namespace: production
spec:
  schedule: "0 0 2 * * *"             # 2am UTC daily (six fields: the first one is seconds)
  backupOwnerReference: self
  cluster:
    name: postgres-cluster
  target: primary                      # Always back up from primary
  method: barmanObjectStore

Apply both:

kubectl apply -f postgres-cluster.yaml   # updated with backup config
kubectl apply -f scheduled-backup.yaml

Verify WAL archiving is working:

kubectl cnpg status postgres-cluster -n production | grep -A5 "Continuous Backup"

You should see Working WAL archiving: OK.
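
WAL archiving alone is not enough for recovery; you also need at least one completed base backup. The sketch below lists Backup objects that have not finished. The function name incomplete_backups is ours, and it assumes the operator reports a finished backup with the phase value completed.

```shell
# List Backup objects in a namespace whose phase is anything other than "completed".
# Usage: incomplete_backups NAMESPACE
incomplete_backups() {
  kubectl get backups.postgresql.cnpg.io -n "$1" \
    -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.status.phase}{"\n"}{end}' \
    | awk '$2 != "completed" { print $1 }'
}
```

You don't have to wait for 2am to try it: kubectl cnpg backup postgres-cluster -n production requests an on-demand backup, and incomplete_backups production should print nothing once it has finished.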


Step 5 — Set Up Connection Pooling with PgBouncer

Direct connections to PostgreSQL are expensive — each connection spawns a process and consumes memory. In a Kubernetes environment where you might have dozens of pods each opening their own connection pool, this adds up fast.

CloudNativePG’s Pooler resource deploys PgBouncer in front of your cluster:

apiVersion: postgresql.cnpg.io/v1
kind: Pooler
metadata:
  name: postgres-pooler
  namespace: production
spec:
  cluster:
    name: postgres-cluster
  instances: 2                          # PgBouncer replicas for HA
  type: rw                              # Pool connections to primary
  pgbouncer:
    poolMode: transaction               # Transaction pooling (most efficient)
    parameters:
      max_client_conn: "500"
      default_pool_size: "25"
      reserve_pool_size: "5"
      reserve_pool_timeout: "5"
      server_idle_timeout: "600"

Apply it:

kubectl apply -f pooler.yaml

This creates a Service named after the Pooler resource, postgres-pooler. Point your application at the pooler instead of directly at the cluster:

# Use the pooler, not the cluster directly
value: "postgresql://myapp_user:$(DB_PASSWORD)@postgres-pooler.production.svc.cluster.local:5432/myapp"


Step 6 — RBAC and Network Policies for PostgreSQL

Your database pods need tight access controls. Lock down who can connect at both the Kubernetes and network layer.

Keep Kubernetes RBAC tight: only cluster administrators and your deployment tooling should be able to create or modify Cluster, Backup, and Pooler resources. The CloudNativePG operator already creates its own RBAC roles, and application service accounts need no access to these resources at all; they only connect to PostgreSQL over the network.

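At the network level, add a Kubernetes Network Policy so that only your application pods, the pooler, and the operator can reach the database pods. If Prometheus scrapes the instances, it needs a similar rule for port 9187.

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-app-to-postgres
  namespace: production
spec:
  podSelector:
    matchLabels:
      cnpg.io/cluster: postgres-cluster   # Selects all PostgreSQL pods
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: myapp                   # Only your app pods
        - podSelector:
            matchLabels:
              cnpg.io/poolerName: postgres-pooler  # PgBouncer pods from Step 5
      ports:
        - protocol: TCP
          port: 5432
    - from:
        - podSelector:
            matchLabels:
              cnpg.io/cluster: postgres-cluster  # Allow replication between instances
      ports:
        - protocol: TCP
          port: 5432
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: cnpg-system
          podSelector:
            matchLabels:
              app.kubernetes.io/name: cloudnative-pg  # The operator polls the status port
      ports:
        - protocol: TCP
          port: 8000
        - protocol: TCP
          port: 5432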
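
A policy is only useful if your CNI plugin actually enforces it, so it is worth probing from a pod that should be blocked. The sketch below starts a throwaway pod with a chosen app label and runs pg_isready against the read-write Service. The function name probe_from_label and the postgres:16 image are our assumptions; any image that ships pg_isready will do.

```shell
# Run pg_isready from a temporary pod carrying the label app=<LABEL>.
# Exit status 0 means the database answered the connection attempt.
# Usage: probe_from_label LABEL
probe_from_label() {
  kubectl run "netpol-probe-$1" -n production --rm -i --restart=Never \
    --image=postgres:16 --labels="app=$1" --quiet -- \
    pg_isready -h postgres-cluster-rw -p 5432 -t 5
}
```

With the policy in place, probe_from_label myapp should succeed while a label such as probe_from_label other should time out. If both succeed, the cluster's network plugin is probably not enforcing NetworkPolicy.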


Step 7 — Testing Failover

One of the main reasons to use CloudNativePG over a plain StatefulSet is automated failover. Test it before you need it in production.

# Check which pod is currently the primary
kubectl cnpg status postgres-cluster -n production | grep "Primary"

# Simulate a primary failure by deleting the primary pod
# (replace postgres-cluster-1 with the primary reported above)
kubectl delete pod postgres-cluster-1 -n production

# Watch the failover happen; it should complete in 10-30 seconds (Ctrl-C to stop watching)
kubectl get pods -n production -w
kubectl cnpg status postgres-cluster -n production

The operator detects the primary is gone, promotes the most up-to-date standby, updates the postgres-cluster-rw service to point to the new primary, and starts rebuilding the failed pod as a new standby. Your application experiences a brief connection interruption — configure your connection retry logic to handle this gracefully.
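
Retry logic normally lives in your application or its database driver, but the idea is easy to show in shell: retry the failed operation with a growing pause instead of giving up on the first error. A minimal sketch, with the function name retry_with_backoff being ours:

```shell
# Run a command until it succeeds, doubling the pause between attempts.
# Usage: retry_with_backoff MAX_ATTEMPTS INITIAL_DELAY_SECONDS COMMAND [ARGS...]
retry_with_backoff() {
  max="$1"; delay="$2"; shift 2
  attempt=1
  while :; do
    "$@" && return 0                       # success: stop retrying
    [ "$attempt" -ge "$max" ] && return 1  # out of attempts: report failure
    sleep "$delay"
    delay=$((delay * 2))
    attempt=$((attempt + 1))
  done
}
```

During a failover test, something like retry_with_backoff 6 1 psql "$DATABASE_URL" -c 'SELECT 1' gives a rough feel for how long clients are locked out: it keeps trying for about half a minute before giving up.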


Point-in-Time Recovery

The real power of WAL archiving is PITR — recovering your database to any point in time, not just the last backup.

apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: postgres-cluster-restored
  namespace: production
spec:
  instances: 3
  imageName: ghcr.io/cloudnative-pg/postgresql:16.3

  storage:
    size: 20Gi

  bootstrap:
    recovery:
      source: postgres-cluster
      recoveryTarget:
        targetTime: "2026-06-10T14:30:00Z"   # Recover to this exact point in time (RFC 3339, explicit UTC)

  externalClusters:
    - name: postgres-cluster
      barmanObjectStore:
        destinationPath: s3://your-bucket-name/postgres-cluster/
        s3Credentials:
          accessKeyId:
            name: backup-storage-creds
            key: ACCESS_KEY_ID
          secretAccessKey:
            name: backup-storage-creds
            key: ACCESS_SECRET_KEY

This creates a new cluster restored to the specified point in time. Run it alongside your original cluster to verify data before cutting over.
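
One simple way to verify a restore is to compare row counts for a table you care about on both clusters. The sketch below prints them side by side. The function name compare_row_counts is ours, the Service names follow the naming pattern from Step 3, and psql is assumed to find the password through PGPASSWORD or ~/.pgpass.

```shell
# Print the row count of a table on the original and the restored cluster.
# The table name is interpolated into SQL, so only pass trusted values.
# Usage: compare_row_counts TABLE
compare_row_counts() {
  for host in postgres-cluster-rw postgres-cluster-restored-rw; do
    count="$(psql -h "$host" -U myapp_user -d myapp -tAc "SELECT count(*) FROM $1")" || return 1
    echo "$host $count"
  done
}
```

Called as compare_row_counts orders (orders being a hypothetical table), it prints one line per cluster. After a recovery to an earlier point in time the counts are expected to differ; what matters is that the restored number matches what you expect for the target time.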


Common Mistakes

1. No backup configuration on day one. The most expensive mistake. PITR can only reach points in time after WAL archiving was enabled and the first base backup completed, so anything written before that cannot be recovered. Configure backups before your first application write.

2. Connecting directly to pods instead of services. Pod IPs change. Always connect to the service (postgres-cluster-rw); the operator manages which pod the service points to during failover.

3. Not testing failover. Automated failover means nothing if you haven’t verified it works in your environment and your application handles the reconnection correctly. Test it in staging before you rely on it in production.

4. Under-sizing storage. PostgreSQL data grows. WAL files accumulate. Start with more storage than you think you need, and set up storage monitoring. Expanding a PVC later only works if the storage class has allowVolumeExpansion enabled, and on some storage backends the resize needs a pod restart.

5. Wrong pool mode for your workload. PgBouncer’s transaction mode is most efficient but breaks features that depend on session state (session-level advisory locks, SET commands, LISTEN, and SQL-level PREPARE; protocol-level prepared statements need PgBouncer 1.21+ with max_prepared_statements set). If your application uses these, use session mode instead.

6. Ignoring pg_hba.conf defaults. CloudNativePG configures pg_hba.conf to allow connections from within the cluster by default. Pair this with Network Policies and don’t assume Kubernetes networking provides sufficient isolation on its own.
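
For the storage point in particular, it helps to know before go-live whether your storage class can grow volumes at all. A small check, with the function name storage_class_expandable being ours:

```shell
# Succeed if the named StorageClass allows PVCs to be expanded.
# Returns 2 if the StorageClass cannot be read at all.
# Usage: storage_class_expandable STORAGE_CLASS
storage_class_expandable() {
  allowed="$(kubectl get storageclass "$1" -o jsonpath='{.allowVolumeExpansion}')" || return 2
  [ "$allowed" = "true" ]
}
```

For the cluster in this guide that would be storage_class_expandable standard. If it fails, growing the database later means moving to new volumes rather than resizing the existing ones.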


Best Practices

  • Size your instances properly — set resources.requests based on actual query load, not guesses. Use pg_stat_statements to identify heavy queries.
  • Use synchronous replication for critical data — set minSyncReplicas: 1 and maxSyncReplicas: 1 on your cluster so that a write is acknowledged only after at least one standby has received it
  • Monitor with Prometheus — CloudNativePG exposes metrics on port 9187. Scrape them with your existing Prometheus setup. Key metrics: cnpg_pg_postmaster_start_time, cnpg_backends_total, cnpg_pg_replication_lag
  • Separate namespaces for databases — don’t run your database cluster in the same namespace as your applications
  • Use kubectl cnpg promote for planned switchovers — it performs a clean switchover without data loss, unlike deleting the primary pod
  • Encrypt storage — use a storage class with encryption enabled; community PostgreSQL has no built-in TDE, so that option only applies to vendor distributions that ship it
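
As an illustration of the synchronous replication bullet, the two fields sit directly under spec of the Cluster resource. The sketch below sets them with a merge patch. The function name enable_sync_replication is ours, and if you manage manifests through GitOps you would set the same two fields in postgres-cluster.yaml instead of patching the live object.

```shell
# Require exactly one synchronous standby on an existing Cluster.
# Usage: enable_sync_replication CLUSTER NAMESPACE
enable_sync_replication() {
  kubectl patch clusters.postgresql.cnpg.io "$1" -n "$2" --type merge \
    -p '{"spec":{"minSyncReplicas":1,"maxSyncReplicas":1}}'
}
```

For this guide's cluster the call is enable_sync_replication postgres-cluster production. Keep in mind the trade-off: every commit now waits for a standby, so write latency goes up slightly.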

What’s Next

With PostgreSQL running on Kubernetes, the next logical steps are:

  • Database migrations in CI/CD — automating schema changes with Flyway or Liquibase in your GitOps pipeline, now that you have ArgoCD running (see our ArgoCD setup guide)
  • Monitoring PostgreSQL — scraping CloudNativePG metrics with Prometheus and building Grafana dashboards for query performance, replication lag, and connection pool saturation
  • Secrets management — moving database credentials out of Kubernetes Secrets and into a proper secrets backend using External Secrets Operator

If you decide that managing PostgreSQL on Kubernetes isn’t the right fit for your team’s operational capacity, DigitalOcean Managed Databases offers a fully managed PostgreSQL service with automated backups, failover, and connection pooling — without the operational overhead of running the operator yourself.

The right choice depends on your team’s Kubernetes experience and tolerance for database operations. CloudNativePG narrows the gap significantly, but managed databases still win on operational simplicity.
