DevOps & Cloud Engineering / Lesson 7 — Continuous Delivery & Deployment

Continuous Delivery & Deployment

From green build to running in production. Pipelines, environments, and the difference between delivery and deployment.


Delivery vs Deployment

Two terms often conflated:

Continuous Delivery — every change that passes CI is automatically prepared for production. The artifact is built, deployment is staged, but the final "go" is a button click.

Continuous Deployment — every change that passes CI automatically deploys to production. No human in the loop. Many companies do this safely by relying on feature flags to ensure no user sees half-built features.

Text
Continuous Delivery:
git push → CI passes → artifact ready → human pushes deploy button

Continuous Deployment:
git push → CI passes → automatic deploy to production

Most teams start with continuous delivery (build trust) and graduate to continuous deployment as their tests, monitoring, and feature-flag discipline mature.

Either way, the goal is the same: collapse the time from "code written" to "code in production." Every minute of delay is a minute the change isn't being validated against real users.


The Deployment Pipeline

A typical CD pipeline extends CI with environment progression:

Text
   CI (every PR push) — build, lint, test, security
              │
   Merge to main
              │
   Build production artifact (Docker image, tarball)
              │ tag with git sha + version
              │ push to registry
              ▼
   Deploy to STAGING (automatic)
              │ run smoke tests
              │ run E2E suite
              ▼
   Deploy to PRODUCTION
              │ automatic, manual gate, or scheduled
              │ progressive: 1% → 10% → 50% → 100%
              │ monitor, automatic rollback on regression
              ▼
   Post-deploy verification
              │ synthetic checks pass
              │ metrics within SLO

Environments are typically:
• development — your laptop or a personal cloud env
• staging / pre-prod — production-like, integration testing
• production — the real thing

Some orgs add UAT, QA, or per-developer dev envs. The right number is "as few as you need." Each extra env is overhead — drift, cost, maintenance.


Environment Configuration

The same artifact must run in every environment, with only configuration differing. This is the config principle from the 12-Factor App methodology: store configuration in the environment.

Bad pattern:

JavaScript
const DB_URL = environment === 'production'
  ? 'postgres://prod-db.example.com'
  : 'postgres://localhost:5432/dev';

This bakes environment knowledge INTO the artifact. Now staging artifacts ≠ production artifacts.

Good pattern:

JavaScript
const DB_URL = process.env.DATABASE_URL;
if (!DB_URL) throw new Error('DATABASE_URL is not set');

The artifact reads from environment variables. Different environments inject different values. The artifact is identical between staging and production — making staging a TRUE rehearsal.

Where to inject configuration:
• Local dev — .env file (in .gitignore)
• Cloud — Secrets Manager / Parameter Store / Vault
• Kubernetes — ConfigMaps and Secrets
• CI/CD — pipeline secrets, environment-scoped variables
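Whatever the injection source, a common way to implement the good pattern is a small config module that reads and validates everything at startup, failing fast if anything is missing. The variable names here are illustrative:

```javascript
// Read all configuration from the environment at startup and fail fast.
// The variable names are illustrative, not a fixed convention.
function loadConfig(env = process.env) {
  const required = ['DATABASE_URL', 'REDIS_URL', 'LOG_LEVEL'];
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing required env vars: ${missing.join(', ')}`);
  }
  return {
    databaseUrl: env.DATABASE_URL,
    redisUrl: env.REDIS_URL,
    logLevel: env.LOG_LEVEL,
  };
}
```

Failing at startup beats failing on the first request: a misconfigured instance never passes readiness, so it never receives traffic.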

Critical: secrets NEVER go into Git. Use a secrets backend. Rotation is easier, audit is possible, leaks are bounded.

Tools:
• AWS Secrets Manager / Parameter Store
• GCP Secret Manager
• HashiCorp Vault (multi-cloud)
• Doppler, Infisical (developer-friendly SaaS)
• Kubernetes External Secrets Operator


Deployment Strategies

When you push a new version, you have choices about HOW it rolls out.

Recreate / Big Bang

Text
v1 stops → v2 starts

Simple but causes downtime. Use only for non-critical apps with maintenance windows.

Rolling deployment

Text
v1 instances slowly replaced by v2:
   v1 v1 v1 v1 v1
   v2 v1 v1 v1 v1
   v2 v2 v1 v1 v1
   ...
   v2 v2 v2 v2 v2

Default in Kubernetes. Zero downtime if old and new versions can coexist. Slow rollback.
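The replace-one-wait-verify loop an orchestrator runs can be sketched as follows; `startInstance` and `isReady` are stand-ins for the orchestrator's real calls:

```javascript
// Sketch of a rolling replacement: swap one instance at a time,
// advancing only after the new instance passes readiness.
// startInstance and isReady are stand-ins for orchestrator calls.
function rollingDeploy(instances, startInstance, isReady) {
  const result = [...instances];
  for (let i = 0; i < result.length; i++) {
    const replacement = startInstance('v2');
    if (!isReady(replacement)) {
      throw new Error('new instance failed readiness; rollout halted');
    }
    result[i] = replacement; // old instance is drained and stopped here
  }
  return result;
}
```

Note the halt condition: if a new instance never becomes ready, the rollout stops with most of the fleet still on the old version, which is why rolling deploys fail safe but roll back slowly.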

Blue-Green deployment

Text
Blue (v1) — currently serving live traffic
Green (v2) — fully provisioned, validated, idle

Switch the load balancer: Green now live
Blue stays up briefly for instant rollback

Instant rollback by flipping the LB back. Briefly doubles infrastructure cost. Both versions must work against the same database, so schema migrations need extra care (see the expand-contract pattern below).
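The flip itself is a single atomic change of which pool the load balancer points at. A sketch, with hypothetical pool contents and health gate:

```javascript
// Blue-green switch sketch: `live` is the pointer the load balancer follows.
// Pool contents and the `healthy` flag are illustrative.
const pools = {
  blue: { version: 'v1', healthy: true },
  green: { version: 'v2', healthy: true },
};
let live = 'blue';

function switchTo(color) {
  if (!pools[color].healthy) {
    throw new Error(`${color} failed validation; staying on ${live}`);
  }
  const previous = live;
  live = color;    // the "LB flip": one atomic change
  return previous; // keep the old pool warm for instant rollback
}
```

Rollback is the same operation in reverse: flip `live` back to the pool you kept warm.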

Canary deployment

Text
99% traffic → v1 (old)
 1% traffic → v2 (new)

Watch metrics. Looking good?
   90% → v1, 10% → v2
   ...
   100% v2

Best risk control. Detects bad deploys with minimal user impact. Requires good monitoring + traffic-splitting capability.
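The traffic split normally lives in the load balancer or service mesh, but the decision itself is simple enough to sketch; the promotion ladder mirrors the diagram above:

```javascript
// Weighted routing decision for a canary: send roughly `canaryPercent`
// of requests to the new version. In production this logic lives in the
// LB / mesh (NGINX, Envoy, ALB weighted target groups), not the app.
function routeToCanary(canaryPercent, random = Math.random) {
  return random() * 100 < canaryPercent;
}

// Promotion ladder from the diagram: advance only while metrics stay healthy.
const STAGES = [1, 10, 50, 100];
function nextStage(current) {
  const i = STAGES.indexOf(current);
  return i === -1 || i === STAGES.length - 1 ? current : STAGES[i + 1];
}
```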

Feature flags + dark launches

Text
v2 deployed everywhere immediately
  but new feature behind flag = OFF for users
  
Slowly enable the flag:
  - 1% of users see new feature
  - 10%
  - 100%

Decouples deploy from release. Rollback = flag toggle, no redeploy.

Most modern teams combine: rolling or blue-green for the deploy mechanic, feature flags for user-visible rollout.


Deploys That Are Boring

A "boring" deploy is the gold standard:

1. Tiny changes — small diffs are easier to debug. Aim for daily deploys with ~50-300 lines each.

2. Forward-only — when something breaks, push another small fix. Rollback is the EMERGENCY option, not the default.

3. Database migrations are separate from app deploys
   • Step 1: deploy migration (add column, with default)
   • Step 2: deploy app code that uses both old and new
   • Step 3: backfill data
   • Step 4: deploy app code that uses only new
   • Step 5: drop old column (later, much later)
   This "expand-contract" pattern allows zero-downtime schema changes.
4. Health checks gate traffic
   • Liveness: process is running
   • Readiness: process is ready to receive traffic
   New instances don't get traffic until readiness passes.
5. Automatic rollback on regression
   • Define metrics indicating health: error rate, latency p99, throughput
   • If they degrade after deploy, automatically halt rollout (canary) or revert (blue-green)
   • Argo Rollouts, Flagger, AWS CodeDeploy support this
6. Deploy windows? Increasingly avoided
   • If Friday deploys are scary, your CD process is broken — fix the process
   • That said: avoid deploying right before holidays and major events
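Step 2 of the expand-contract sequence means the app writes both the old and new shapes at once, so either app version can read either row. A sketch with hypothetical column names (splitting a single `name` column):

```javascript
// Expand phase: write both the legacy `name` column and the new
// `first_name`/`last_name` columns so v1 and v2 of the app coexist.
// Column names are hypothetical.
function toWriteRecord(user) {
  return {
    name: `${user.firstName} ${user.lastName}`, // old column (read by v1)
    first_name: user.firstName,                 // new columns (read by v2)
    last_name: user.lastName,
  };
}

// Read path tolerates rows that the backfill (step 3) hasn't reached yet.
function fromRow(row) {
  if (row.first_name != null) {
    return { firstName: row.first_name, lastName: row.last_name };
  }
  const [firstName, ...rest] = row.name.split(' ');
  return { firstName, lastName: rest.join(' ') };
}
```

Only after the backfill completes and no reader depends on `name` does step 5 (drop the old column) become safe.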

The dream state: deploy at 4 PM on Friday, be home by 5, no incidents. Many teams achieve this. It takes investment, but the productivity gain is enormous.


Rollback — When and How

Even with good processes, deploys go wrong. Your rollback story matters.

When to roll back:
• Crashes / 500s spike
• Latency increases significantly
• A critical feature breaks
• Data integrity issue (sometimes)

When NOT to roll back:
• Cosmetic issue (fix forward)
• Slow regression you can investigate (push a fix)
• Destructive DB migration (rollback may corrupt data)

Rollback mechanics for stateless services:

Bash
# Kubernetes
kubectl rollout undo deployment/myapp

# AWS ECS
aws ecs update-service --service myapp --task-definition myapp:42

For canary: stop the rollout and route 100% of traffic back to the old version. Argo Rollouts and Flagger automate this.
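The automated decision such tools make can be sketched as comparing canary and baseline error rates; the ratio threshold and minimum sample size below are hypothetical tuning knobs:

```javascript
// Decide whether to abort a canary based on error-rate regression.
// maxRatio and minRequests are hypothetical tuning knobs.
function shouldRollBack(baseline, canary, { maxRatio = 2, minRequests = 100 } = {}) {
  if (canary.requests < minRequests) return false; // not enough signal yet
  const baseRate = baseline.errors / baseline.requests;
  const canaryRate = canary.errors / canary.requests;
  return canaryRate > baseRate * maxRatio;
}
```

Real analysis engines also compare latency percentiles and use statistical tests rather than a raw ratio, but the shape is the same: baseline vs. canary, abort on regression.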

What makes rollback hard:
• Schema migrations that aren't reversible (the expand-contract pattern fixes this)
• Data already written by the new version
• Cache state from the new version

The discipline: design every deploy to be safe to roll back. If you wrote code that's incompatible with the previous version, you've created an irreversible deploy — handle it with extra care.

