
Running PostgreSQL in Kubernetes used to be a bad idea. StatefulSets were tricky, persistent volumes were unreliable, and failover meant data loss. Most teams defaulted to managed cloud databases and called it done.
That calculus has changed. CloudNativePG — the CNCF-listed PostgreSQL operator — handles high availability, automated failover, Point-in-Time Recovery, connection pooling, and streaming replication out of the box. In 2026 it’s the production-grade way to run PostgreSQL on Kubernetes, and the gap between “self-hosted on K8s” and “managed cloud database” has narrowed significantly.
This guide walks you through a complete CloudNativePG installation, deploying a production-ready PostgreSQL cluster, configuring backups, and the operational practices that actually matter when your database lives inside Kubernetes.
Why CloudNativePG Over a Plain StatefulSet?
Before we get into the setup, it’s worth being clear on why you’d use an operator instead of writing your own StatefulSet.
A plain StatefulSet gets you pods with stable identities and persistent volumes. That’s it. You still need to handle primary election, streaming replication, automatic failover when the primary dies, backup scheduling, PITR, connection pooling, and certificate rotation. That’s months of work and an ongoing operational burden.
CloudNativePG handles all of that. It’s a Kubernetes operator built specifically for PostgreSQL — it knows what Postgres is, not just that it’s a process that needs a volume. The operator watches your Cluster resources and continuously reconciles the actual state of your database cluster against the desired state, the same way ArgoCD reconciles your application manifests.
Key capabilities:
- High availability: multi-instance clusters where the operator itself elects and promotes the primary through the Kubernetes API, with no external failover manager or separate consensus store to run
- Streaming replication: synchronous or asynchronous, configurable per-cluster
- Automated failover: promotes a standby in seconds when the primary fails
- Backup and PITR: native integration with object storage (S3, GCS, Azure Blob)
- Connection pooling: built-in PgBouncer support via the Pooler resource
- TLS everywhere: certificates managed automatically, rotation handled by the operator
If you’re evaluating whether to self-host on Kubernetes or use a managed database service, our comparison of AWS, Azure, and GCP databases covers the trade-offs in depth. For teams that want operational control without managing the control plane, DigitalOcean Managed Databases is worth considering as an alternative.
Prerequisites
- A running Kubernetes cluster (Kubernetes 1.26+)
- kubectl configured (see our Kubernetes commands reference if you need a refresher)
- helm installed (v3+)
- An S3-compatible object store for backups (AWS S3, MinIO, DigitalOcean Spaces, GCS all work)
- Basic familiarity with Kubernetes — if you’re new to K8s concepts, start with Kubernetes vs Docker
Step 1 — Install the CloudNativePG Operator
The operator runs in its own namespace and watches for Cluster, Backup, and Pooler resources across all namespaces.
# Install via kubectl (pinned to 1.24.0 here; check the project's releases page for the current stable version)
kubectl apply --server-side -f \
  https://raw.githubusercontent.com/cloudnative-pg/cloudnative-pg/release-1.24/releases/cnpg-1.24.0.yaml
# Verify the operator is running
kubectl get deployment -n cnpg-system cnpg-controller-manager
Wait for the operator pod to be in Running state:
kubectl wait --for=condition=available \
--timeout=300s \
deployment/cnpg-controller-manager \
-n cnpg-system
Install the kubectl-cnpg plugin — it adds useful commands for managing clusters:
# macOS
brew install kubectl-cnpg
# Linux
curl -sSfL \
https://github.com/cloudnative-pg/cloudnative-pg/raw/main/hack/install-cnpg-plugin.sh | \
sudo sh -s -- -b /usr/local/bin
# Verify
kubectl cnpg version
Step 2 — Create Your First PostgreSQL Cluster
A CloudNativePG Cluster resource defines everything about your PostgreSQL cluster — instance count, storage, PostgreSQL version, backup config, and replication settings.
Start with a simple 3-instance cluster (1 primary + 2 standbys):
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: postgres-cluster
  namespace: production
spec:
  instances: 3  # 1 primary + 2 standbys
  imageName: ghcr.io/cloudnative-pg/postgresql:16.3
  postgresql:
    parameters:
      max_connections: "200"
      shared_buffers: "256MB"
      effective_cache_size: "768MB"
      maintenance_work_mem: "64MB"
      checkpoint_completion_target: "0.9"
      wal_buffers: "16MB"
      default_statistics_target: "100"
      log_min_duration_statement: "1000"  # Log queries taking over 1 second
  storage:
    size: 20Gi
    storageClass: standard  # Use your cluster's storage class
  resources:
    requests:
      memory: "512Mi"
      cpu: "500m"
    limits:
      memory: "1Gi"
      cpu: "1000m"
  bootstrap:
    initdb:
      database: myapp
      owner: myapp_user
      secret:
        name: postgres-app-secret  # Reference to a Secret (created below)
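Create the application credentials secret first (the production namespace must already exist):
# Hex output avoids characters such as / and + that would need escaping in a connection URL
kubectl create secret generic postgres-app-secret \
  --type=kubernetes.io/basic-auth \
  --from-literal=username=myapp_user \
  --from-literal=password="$(openssl rand -hex 24)" \
  -n production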
Apply the cluster:
kubectl apply -f postgres-cluster.yaml
Watch the cluster come up — this takes 2–3 minutes on first run:
kubectl get cluster postgres-cluster -n production -w
You’ll see the status progress through Setting up primary → Creating replica → Cluster in healthy state.
Check the cluster status in detail:
kubectl cnpg status postgres-cluster -n production
Step 3 — Connect to Your PostgreSQL Cluster
CloudNativePG creates several services automatically:
kubectl get svc -n production | grep postgres
You’ll see:
- postgres-cluster-rw: points to the primary (read-write)
- postgres-cluster-ro: points to the standbys only (read-only, load-balanced)
- postgres-cluster-r: points to any instance, primary included (for reads that can be served by any node)
Connect from inside the cluster:
# Example Deployment container env connecting to PostgreSQL
env:
  - name: DB_PASSWORD  # Must be defined before the variables that reference it
    valueFrom:
      secretKeyRef:
        name: postgres-app-secret
        key: password
  - name: DATABASE_URL
    value: "postgresql://myapp_user:$(DB_PASSWORD)@postgres-cluster-rw.production.svc.cluster.local:5432/myapp"
  - name: DATABASE_READONLY_URL
    value: "postgresql://myapp_user:$(DB_PASSWORD)@postgres-cluster-ro.production.svc.cluster.local:5432/myapp"
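The hostnames in these URLs follow a fixed pattern: cluster name, a role suffix (rw, ro, or r), the namespace, and the cluster domain. As a minimal sketch, the helper below builds them for scripts or templating. The function name cnpg_service_host is hypothetical, and it assumes the default cluster domain cluster.local.

```shell
# Build the in-cluster DNS name of a CloudNativePG service.
# Usage: cnpg_service_host <cluster> <rw|ro|r> <namespace>
# Hypothetical helper; assumes the default cluster domain "cluster.local".
cnpg_service_host() {
  cluster=$1
  role=$2
  namespace=$3
  case "$role" in
    rw|ro|r) ;;
    *) echo "role must be rw, ro, or r" >&2; return 1 ;;
  esac
  printf '%s-%s.%s.svc.cluster.local\n' "$cluster" "$role" "$namespace"
}
```

For example, cnpg_service_host postgres-cluster ro production prints the read-only hostname used above.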
Connect from outside the cluster (for debugging):
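# Port-forward the primary service (leave this running and use a second terminal)
kubectl port-forward svc/postgres-cluster-rw 5432:5432 -n production
# Get the application user's password
kubectl get secret postgres-app-secret \
  -n production \
  -o jsonpath='{.data.password}' | base64 -d
# Connect with psql as the application user
psql -h localhost -U myapp_user -d myapp
# Note: recent CloudNativePG releases only create the postgres-cluster-superuser
# secret when the Cluster sets enableSuperuserAccess: true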
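Use read replicas for read-heavy workloads. Point your application’s read queries at postgres-cluster-ro and writes at postgres-cluster-rw. The -ro Service spreads new connections across all healthy standbys. With asynchronous replication, keep in mind that a standby can lag slightly behind the primary, so send reads that must see the latest write to postgres-cluster-rw.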
Step 4 — Configure Backups to Object Storage
No backup configuration = no database. This is non-negotiable. CloudNativePG integrates natively with S3-compatible object storage for both base backups and WAL archiving.
Create the backup credentials secret:
kubectl create secret generic backup-storage-creds \
--from-literal=ACCESS_KEY_ID=your-access-key \
--from-literal=ACCESS_SECRET_KEY=your-secret-key \
-n production
Update your Cluster to enable WAL archiving:
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: postgres-cluster
  namespace: production
spec:
  instances: 3
  imageName: ghcr.io/cloudnative-pg/postgresql:16.3
  # ... (previous config) ...
  backup:
    barmanObjectStore:
      destinationPath: s3://your-bucket-name/postgres-cluster/
      s3Credentials:
        accessKeyId:
          name: backup-storage-creds
          key: ACCESS_KEY_ID
        secretAccessKey:
          name: backup-storage-creds
          key: ACCESS_SECRET_KEY
      wal:
        compression: gzip
        maxParallel: 2
    retentionPolicy: "30d"  # Keep backups for 30 days
Create a ScheduledBackup for automated daily backups:
apiVersion: postgresql.cnpg.io/v1
kind: ScheduledBackup
metadata:
  name: postgres-daily-backup
  namespace: production
spec:
  schedule: "0 0 2 * * *"  # Six fields, seconds first: 2am UTC daily
  backupOwnerReference: self
  cluster:
    name: postgres-cluster
  target: primary  # Always back up from primary
  method: barmanObjectStore
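CloudNativePG reads the schedule field as a six-field cron expression with seconds first, which differs from the five-field crontab format most people type from memory. The sketch below converts a five-field expression by prepending a seconds field of 0. The function name to_six_field_cron is hypothetical and exists only for illustration.

```shell
# Convert a five-field crontab expression into the six-field form
# (seconds first) by prepending "0". Six-field input passes through.
# Hypothetical helper, not part of kubectl or the cnpg plugin.
to_six_field_cron() {
  expr_in=$1
  # Count whitespace-separated fields; quoting keeps "*" from globbing.
  count=$(printf '%s\n' "$expr_in" | awk '{ print NF }')
  case "$count" in
    5) printf '0 %s\n' "$expr_in" ;;
    6) printf '%s\n' "$expr_in" ;;
    *) echo "expected 5 or 6 cron fields, got $count" >&2; return 1 ;;
  esac
}
```

Calling to_six_field_cron '0 2 * * *' yields the daily 2am schedule in the form the ScheduledBackup resource expects.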
Apply both:
kubectl apply -f postgres-cluster.yaml # updated with backup config
kubectl apply -f scheduled-backup.yaml
Verify WAL archiving is working:
kubectl cnpg status postgres-cluster -n production | grep -i "WAL archiving"
You should see a line reporting WAL archiving as OK.
Step 5 — Set Up Connection Pooling with PgBouncer
Direct connections to PostgreSQL are expensive — each connection spawns a process and consumes memory. In a Kubernetes environment where you might have dozens of pods each opening their own connection pool, this adds up fast.
CloudNativePG’s Pooler resource deploys PgBouncer in front of your cluster:
apiVersion: postgresql.cnpg.io/v1
kind: Pooler
metadata:
  name: postgres-pooler
  namespace: production
spec:
  cluster:
    name: postgres-cluster
  instances: 2  # PgBouncer replicas for HA
  type: rw  # Pool connections to primary
  pgbouncer:
    poolMode: transaction  # Transaction pooling (most efficient)
    parameters:
      max_client_conn: "500"
      default_pool_size: "25"
      reserve_pool_size: "5"
      reserve_pool_timeout: "5"
      server_idle_timeout: "600"
Apply it:
kubectl apply -f pooler.yaml
This creates a Service named after the Pooler resource, here postgres-pooler. Point your application at the pooler instead of directly at the cluster:
# Use the pooler, not the cluster directly
value: "postgresql://myapp_user:$(DB_PASSWORD)@postgres-pooler.production.svc.cluster.local:5432/myapp"
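It is worth checking that the pooler settings fit inside max_connections. Each PgBouncer instance keeps its own pools, so for one database/user pair the pooler tier can open up to instances × (default_pool_size + reserve_pool_size) server connections. The sketch below does that arithmetic. The function names and the headroom of 20 connections (left for replication, backups, and admin sessions) are assumptions for illustration, not CloudNativePG defaults.

```shell
# Upper bound on PostgreSQL backends the pooler tier can open for one
# database/user pair. Each PgBouncer instance has independent pools.
pooler_max_backends() {
  instances=$1
  default_pool_size=$2
  reserve_pool_size=$3
  echo $(( instances * (default_pool_size + reserve_pool_size) ))
}

# Succeeds when that bound fits within max_connections minus a headroom
# kept free for replication, backups, and admin sessions.
pooler_fits() {
  backends=$1
  max_connections=$2
  headroom=$3
  [ "$backends" -le $(( max_connections - headroom )) ]
}
```

With the values in this guide, pooler_max_backends 2 25 5 gives 60, comfortably below the max_connections of 200 set in Step 2.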
Step 6 — RBAC and Network Policies for PostgreSQL
Your database pods need tight access controls. Lock down who can connect at both the Kubernetes and network layer.
Make sure your cluster’s Kubernetes RBAC is configured so that only platform administrators and your deployment tooling can create or modify the CloudNativePG resources. The operator already creates its own RBAC roles. Application pods reach the database over the network, not through the Kubernetes API, so don’t give their service accounts access to Cluster or Backup resources.
At the network level, add a Kubernetes Network Policy to ensure only your application pods can reach the database:
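apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-app-to-postgres
  namespace: production
spec:
  podSelector:
    matchLabels:
      cnpg.io/cluster: postgres-cluster  # Selects all PostgreSQL pods
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: myapp  # Only your app pods
        - podSelector:
            matchLabels:
              cnpg.io/poolerName: postgres-pooler  # PgBouncer pods from Step 5
      ports:
        - protocol: TCP
          port: 5432
    - from:
        - podSelector:
            matchLabels:
              cnpg.io/cluster: postgres-cluster  # Allow replication between instances
      ports:
        - protocol: TCP
          port: 5432
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: cnpg-system  # Operator must reach the instance status endpoint
      ports:
        - protocol: TCP
          port: 8000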
Step 7 — Testing Failover
One of the main reasons to use CloudNativePG over a plain StatefulSet is automated failover. Test it before you need it in production.
# Look up which pod is currently the primary (it is not always postgres-cluster-1)
PRIMARY=$(kubectl get cluster postgres-cluster -n production \
  -o jsonpath='{.status.currentPrimary}')
echo "Current primary: $PRIMARY"
# Simulate a primary failure by deleting that pod
kubectl delete pod "$PRIMARY" -n production
# Watch the failover happen (Ctrl-C to stop); it should complete in 10-30 seconds
kubectl get pods -n production -w
# Then confirm which instance is the new primary
kubectl cnpg status postgres-cluster -n production
The operator detects the primary is gone, promotes the most up-to-date standby, updates the postgres-cluster-rw service to point to the new primary, and starts rebuilding the failed pod as a new standby. Your application experiences a brief connection interruption — configure your connection retry logic to handle this gracefully.
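Retry logic usually belongs in your application or database driver, but the idea is easy to show in shell, for example in an init container or a migration script. The sketch below retries a command with exponential backoff. The function name retry_with_backoff and the one-second starting delay are assumptions chosen for illustration.

```shell
# Run a command until it succeeds or max_attempts is reached,
# doubling the wait between attempts (1s, 2s, 4s, ...).
# Usage: retry_with_backoff <max_attempts> <command> [args...]
retry_with_backoff() {
  max_attempts=$1
  shift
  delay=1
  attempt=1
  while :; do
    if "$@"; then
      return 0
    fi
    if [ "$attempt" -ge "$max_attempts" ]; then
      return 1
    fi
    sleep "$delay"
    delay=$(( delay * 2 ))
    attempt=$(( attempt + 1 ))
  done
}
```

A script could wait out a failover with retry_with_backoff 6 pg_isready -h postgres-cluster-rw.production.svc.cluster.local -p 5432 before running its queries.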
Point-in-Time Recovery
The real power of WAL archiving is PITR — recovering your database to any point in time, not just the last backup.
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: postgres-cluster-restored
  namespace: production
spec:
  instances: 3
  imageName: ghcr.io/cloudnative-pg/postgresql:16.3
  storage:
    size: 20Gi
  bootstrap:
    recovery:
      source: postgres-cluster
      recoveryTarget:
        targetTime: "2026-06-10T14:30:00Z"  # Recover to this exact point in time (RFC 3339, explicit time zone)
  externalClusters:
    - name: postgres-cluster
      barmanObjectStore:
        destinationPath: s3://your-bucket-name/postgres-cluster/
        s3Credentials:
          accessKeyId:
            name: backup-storage-creds
            key: ACCESS_KEY_ID
          secretAccessKey:
            name: backup-storage-creds
            key: ACCESS_SECRET_KEY
This creates a new cluster restored to the specified point in time. Run it alongside your original cluster to verify data before cutting over.
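A recovery target is only useful if you know which moment to aim for. One simple habit, sketched below, is to record a UTC timestamp with an explicit time zone just before a risky change such as a schema migration, so you have a known-good targetTime if it goes wrong. The function name rfc3339_utc_now is hypothetical.

```shell
# Print the current time in UTC as an RFC 3339 timestamp,
# suitable for pasting into recoveryTarget.targetTime later.
rfc3339_utc_now() {
  date -u +%Y-%m-%dT%H:%M:%SZ
}
```

For instance, a migration script might run echo "pre-migration marker: $(rfc3339_utc_now)" as its first step and keep that line in its log.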
Common Mistakes
1. No backup configuration on day one. The most expensive mistake. You can enable WAL archiving later, but PITR only covers the period after the first base backup taken with archiving on, so anything written before that point cannot be recovered. Configure backups before your first application write.
2. Connecting directly to pods instead of services. Pod IPs change. Always connect to the service (postgres-cluster-rw). The operator manages which pod the service points to during failover.
3. Not testing failover. Automated failover means nothing if you haven’t verified it works in your environment and your application handles the reconnection correctly. Test it in staging before you rely on it in production.
4. Under-sizing storage. PostgreSQL data grows. WAL files accumulate. Start with more storage than you think you need, and set up storage monitoring. Expanding a PVC after the fact is possible only if the storage class allows volume expansion, and it can be disruptive.
5. Wrong pool mode for your workload. PgBouncer’s transaction mode is the most efficient, but it breaks features that depend on session state, such as session-level advisory locks, SET commands, LISTEN/NOTIFY, and SQL-level PREPARE (protocol-level prepared statements work only on newer PgBouncer releases). If your application relies on these, use session mode instead.
6. Ignoring pg_hba.conf defaults. CloudNativePG configures pg_hba.conf to allow connections from within the cluster by default. Pair this with Network Policies, and don’t assume Kubernetes networking provides sufficient isolation on its own.
Best Practices
- Size your instances properly: set resources.requests based on actual query load, not guesses. Use pg_stat_statements to identify heavy queries.
- Use synchronous replication for critical data: set minSyncReplicas: 1 and maxSyncReplicas: 1 on your cluster so that at least one standby has confirmed each write before it is acknowledged.
- Monitor with Prometheus: CloudNativePG exposes metrics on port 9187. Scrape them with your existing Prometheus setup. Key metrics: cnpg_pg_postmaster_start_time, cnpg_backends_total, cnpg_pg_replication_lag.
- Separate namespaces for databases: don’t run your database cluster in the same namespace as your applications. This guide uses a single production namespace for brevity; if you split them, the Network Policy in Step 6 needs a namespaceSelector for the application pods.
- Use kubectl cnpg promote for planned failovers: it performs a clean switchover without data loss, unlike deleting the primary pod.
- Encrypt storage: use a storage class with encryption enabled for data at rest. Community PostgreSQL does not include transparent data encryption (TDE); it is available only in some vendor distributions.
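The synchronous replication settings above can be applied to a running cluster with a merge patch. As a minimal sketch, the helper below only builds the JSON patch body; the function name sync_replica_patch is hypothetical, and it assumes the minSyncReplicas and maxSyncReplicas fields sit directly under spec, as in the 1.24 API used in this guide.

```shell
# Emit a JSON merge patch that sets the synchronous replica bounds.
# Usage: sync_replica_patch <minSyncReplicas> <maxSyncReplicas>
# Hypothetical helper; refuses a minimum greater than the maximum.
sync_replica_patch() {
  min=$1
  max=$2
  if [ "$min" -gt "$max" ]; then
    echo "minSyncReplicas must not exceed maxSyncReplicas" >&2
    return 1
  fi
  printf '{"spec":{"minSyncReplicas":%d,"maxSyncReplicas":%d}}\n' "$min" "$max"
}
```

It could then be used as kubectl patch cluster postgres-cluster -n production --type merge -p "$(sync_replica_patch 1 1)". Keeping the same values in your Cluster manifest avoids the patch being reverted on the next apply.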
What’s Next
With PostgreSQL running on Kubernetes, the next logical steps are:
- Database migrations in CI/CD — automating schema changes with Flyway or Liquibase in your GitOps pipeline, now that you have ArgoCD running (see our ArgoCD setup guide)
- Monitoring PostgreSQL — scraping CloudNativePG metrics with Prometheus and building Grafana dashboards for query performance, replication lag, and connection pool saturation
- Secrets management — moving database credentials out of Kubernetes Secrets and into a proper secrets backend using External Secrets Operator
If you decide that managing PostgreSQL on Kubernetes isn’t the right fit for your team’s operational capacity, DigitalOcean Managed Databases offers a fully managed PostgreSQL service with automated backups, failover, and connection pooling — without the operational overhead of running the operator yourself.
The right choice depends on your team’s Kubernetes experience and tolerance for database operations. CloudNativePG narrows the gap significantly, but managed databases still win on operational simplicity.