Docker — Containers from First Principles
Images, layers, networking, multi-stage builds. The technology that made 'works on my machine' obsolete.
What a Container Actually Is
A container is a process running on a Linux host with an isolated view of its environment. That's it. It's not a VM. It's a regular Linux process with kernel features (namespaces, cgroups) restricting what it can see and how much resource it can use.
Compare to a VM:
VM Container
┌─────────────────┐ ┌─────────────────┐
│ Application │ │ Application │
├─────────────────┤ ├─────────────────┤
│ Guest OS kernel │ │ (uses host │
│ (full Linux) │ │ kernel) │
├─────────────────┤ └─────────────────┘
│ Virtualization │
│ (hypervisor) │ Shares host kernel
├─────────────────┤ Starts in milliseconds
│ Host OS │ Tiny memory overhead
└─────────────────┘ Image is ~50MB-1GB
Boots in seconds
GBs of memory
GBs of disk
Why this matters:
• A container starts in 100ms, a VM in 30+ seconds
• A container adds ~10MB overhead, a VM adds 100MB+
• You can run dozens of containers per host
• An image of 100MB packages your whole app + its dependencies
The kernel features that make it work:
• namespaces — isolate what a process can see (PIDs, network, mounts, users)
• cgroups — limit what a process can use (CPU, memory, I/O)
• Union filesystem — efficient layered image storage
Docker is the most popular tool for working with containers, but containers are a Linux feature, not a Docker feature. Other runtimes (containerd, CRI-O, podman) all do the same fundamental thing.
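You can see this isolation machinery from any ordinary Linux shell, no Docker required. A small sketch (Linux-only; the /proc/self/ns paths are standard kernel interfaces):

```shell
# Every Linux process already runs inside namespaces; a container simply
# gets fresh ones. /proc/self/ns lists the namespaces of this shell.
for ns in pid net mnt uts; do
  printf '%s -> %s\n' "$ns" "$(readlink /proc/self/ns/$ns)"
done
# Each target looks like "pid:[4026531836]". Two processes with the same
# inode number share that namespace; a containerized process would show
# different numbers here than processes on the host.
```

Tools like unshare(1) and docker create new namespace entries under these same paths; that is the entire "isolation" a container gets.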
Images vs Containers
Images and containers are like classes and instances:
- Image — read-only template. A snapshot of a filesystem + metadata about how to run it.
- Container — a running (or stopped) instance of an image. Has a writable layer on top.
Lifecycle:
# Pull an image from a registry
docker pull nginx:1.27
# Run a container from the image
docker run -d -p 8080:80 --name web nginx:1.27
# Container is now running. Inspect it:
docker ps # running containers
docker logs web # container's stdout/stderr
docker exec -it web bash # shell inside the running container
docker stats web # CPU, memory, network usage
# Stop and remove
docker stop web
docker rm web
# Or in one go
docker rm -f web
Images are stored locally and in registries:
docker images # local images
docker pull <image> # download
docker push <image> # upload to registry
docker rmi <image> # delete local image
Image names follow the pattern: [registry/]namespace/image:tag
- nginx:1.27 — Docker Hub default registry, library/nginx, tag 1.27
- gcr.io/my-project/api:v2 — Google Container Registry
- 123456.dkr.ecr.us-east-1.amazonaws.com/api:abc123 — AWS ECR
Always tag images explicitly. The default latest is a moving target and causes "it worked yesterday" bugs.

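To make the naming pattern concrete, here is a small illustrative parser (the function name and the defaulting behavior it encodes are for demonstration only, and it ignores digest references like image@sha256:...):

```shell
# Illustrative parser for [registry/]namespace/image:tag references.
# Mirrors Docker's defaulting: no registry -> docker.io, no namespace
# -> library/, no tag -> latest.
parse_image() {
  ref=$1
  # Tag: text after the last ":" that follows the last "/"
  case "${ref##*/}" in
    *:*) tag=${ref##*:}; ref=${ref%:*} ;;
    *)   tag=latest ;;
  esac
  # Registry: first path component, if it looks like a hostname
  case "$ref" in
    */*) first=${ref%%/*}
         case "$first" in
           *.*|*:*|localhost) registry=$first; repo=${ref#*/} ;;
           *) registry=docker.io; repo=$ref ;;
         esac ;;
    *)   registry=docker.io; repo=library/$ref ;;
  esac
  echo "$registry $repo $tag"
}
parse_image nginx:1.27               # → docker.io library/nginx 1.27
parse_image gcr.io/my-project/api:v2 # → gcr.io my-project/api v2
```

The heuristic in the middle (a first component containing a dot, a colon, or the word localhost is a registry) is the same rule the real docker CLI applies when deciding where to pull from.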
Dockerfile — Building Your Own Images
A Dockerfile is a script that builds an image. Each instruction creates a layer.
A simple Node.js app:
# Use an official base image
FROM node:20-alpine
# Set working directory inside the container
WORKDIR /app
# Copy package files first — layer caching
COPY package*.json ./
# --omit=dev supersedes the deprecated --only=production flag
RUN npm ci --omit=dev
# Copy app source
COPY . .
# Document which port the app listens on
EXPOSE 3000
# Default command when container starts
CMD ["node", "server.js"]
Build it:
docker build -t myapp:v1 .
The . is the build context — Docker tarballs everything in this directory and sends it to the daemon. Use .dockerignore to exclude things you don't want copied (node_modules, .git, secrets):
# .dockerignore
node_modules
.git
.env
*.log
README.md
Key Dockerfile instructions:
• FROM — base image (always required, first non-comment line)
• WORKDIR — set the directory for subsequent commands
• COPY / ADD — copy files from build context into image
• RUN — execute a shell command at build time
• ENV — set environment variables
• ARG — build-time variables (not in final image)
• EXPOSE — document port (doesn't actually publish — that's -p at runtime)
• USER — switch to a non-root user (security)
• ENTRYPOINT / CMD — what runs when the container starts
ENTRYPOINT vs CMD:
• Use CMD for the typical default command
• Use ENTRYPOINT when you want the image to behave like an executable
• If both: ENTRYPOINT is the command, CMD provides default args
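The interaction is easiest to see in a tiny image that behaves like an executable (the image name used below is invented for illustration):

```dockerfile
FROM alpine:3.20
# ENTRYPOINT fixes the program; CMD supplies default, overridable arguments
ENTRYPOINT ["ping"]
CMD ["-c", "3", "localhost"]
```

If this builds as pinger, then docker run pinger executes ping -c 3 localhost, while docker run pinger -c 1 example.com keeps the ENTRYPOINT and replaces only the CMD arguments.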
Layers and Caching
Each instruction in your Dockerfile creates a layer. Docker caches layers — if the inputs to a layer haven't changed, it reuses the cached layer.
This makes layer ordering critical. Bad:
FROM node:20-alpine
WORKDIR /app
# any source change invalidates this layer...
COPY . .
# ...and re-runs install every time
RUN npm ci
Every code change means re-installing dependencies. Slow.
Good:
FROM node:20-alpine
WORKDIR /app
# only invalidates when deps change
COPY package*.json ./
# cache hit unless package*.json changed
RUN npm ci
# source changes invalidate only this
COPY . .
CMD ["node", "server.js"]
Now npm ci is cached as long as package*.json hasn't changed. Builds drop from minutes to seconds.
The general rule: order instructions from least-to-most-changing. Stable things first (base image, deps), changing things last (source code).
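The cascade behind this rule can be modeled in a few lines of shell. Each layer's cache key chains off its parent's key plus the instruction, so changing an early input invalidates everything after it. (Docker's real keys also hash file contents for COPY/ADD; the hashing scheme here is invented purely for illustration.)

```shell
# Toy model of layer cache keys: key = hash(parent key + instruction).
layer_key() {
  printf '%s|%s' "$1" "$2" | sha256sum | cut -c1-12
}
base=$(layer_key ""      "FROM node:20-alpine")
deps=$(layer_key "$base" "COPY package*.json ./")
install=$(layer_key "$deps" "RUN npm ci")

# A change to package*.json alters the deps key, which cascades into a
# different install key: npm ci must re-run.
deps2=$(layer_key "$base" "COPY package*.json ./ (contents changed)")
install2=$(layer_key "$deps2" "RUN npm ci")
[ "$install" != "$install2" ] && echo "npm ci layer invalidated"
```

The same chaining explains why COPY . . placed before RUN npm ci ruins caching: every source edit changes the COPY key, so the install key downstream can never match.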
Inspect layers:
docker history myapp:v1
docker inspect myapp:v1
Each RUN, COPY, etc. creates a new layer. To minimize layers and image size, combine related commands:
# Bad — three layers, each layer keeps the apt cache
RUN apt-get update
RUN apt-get install -y curl
RUN rm -rf /var/lib/apt/lists/*
# Good — one layer, cache cleaned in same layer
RUN apt-get update && \
    apt-get install -y curl && \
    rm -rf /var/lib/apt/lists/*
Multi-Stage Builds
Production images shouldn't contain build tools. Multi-stage builds let you compile in one stage, copy artifacts to a slim final image:
# Stage 1: build
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build
# Stage 2: runtime
FROM node:20-alpine
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
COPY --from=builder /app/dist ./dist
# Don't run as root (the node base images ship a "node" user)
USER node
EXPOSE 3000
CMD ["node", "dist/server.js"]
The final image only contains:
• Node.js runtime (alpine variant: ~50MB)
• Production deps
• Compiled output
Build tools, dev deps, source code — none in final. A 500MB build stage produces a 100MB runtime image.
Even smaller: distroless images contain ONLY your app and its runtime, no shell, no package manager:
FROM golang:1.22-alpine AS builder
WORKDIR /app
COPY . .
RUN CGO_ENABLED=0 go build -o /myapp
FROM gcr.io/distroless/static-debian12
COPY --from=builder /myapp /myapp
ENTRYPOINT ["/myapp"]
The resulting image is ~10-20MB and has no shell, so the attack surface is much smaller. A common choice for production Go services.
Other slim base options:
• alpine — ~5MB, has busybox shell, common
• debian:slim — ~30MB, full glibc, useful for Python/Node
• distroless — minimal, no shell, secure
• scratch — empty, only for fully static binaries (Go, Rust)
Networking
By default, Docker creates a bridge network so containers can talk to each other and the host.
docker network ls # list networks
docker network inspect bridge # see details
Common patterns:
Publish a port to the host:
docker run -d -p 8080:80 nginx # host:8080 → container:80
docker run -d -p 80:80 nginx # low host port; the root daemon binds it
docker run -d -P nginx # publish all exposed ports to random host ports
Containers on the same custom network can reach each other by name:
docker network create app-net
docker run -d --name db --network app-net postgres:16
docker run -d --name api --network app-net -e DB_HOST=db myapp
# Inside the api container, "db" resolves to the postgres container's IP
DNS resolution between containers works on user-defined networks but not on the default bridge. Always create a network for multi-container apps.
Talking to the host from inside a container:
• Linux: pass --add-host=host.docker.internal:host-gateway to docker run (Docker 20.10+), or use the docker0 bridge IP (typically 172.17.0.1)
• macOS / Windows: host.docker.internal works out of the box
Volumes — persistent storage:
# Named volume
docker volume create pgdata
docker run -d -v pgdata:/var/lib/postgresql/data postgres:16
# Bind mount a host directory
docker run -d -v $(pwd)/code:/app -p 3000:3000 node:20
# Tmpfs (in-memory, gone when container stops)
docker run --tmpfs /tmp myapp
Use named volumes for persistent data (databases). Use bind mounts for development (hot-reloading code from host).
Image Tagging & Registries
Production images need consistent tagging so you know what's deployed.
Bad: only latest tag
docker build -t myapp:latest .
docker push myapp:latest
Now you can never roll back — every deploy overwrites the same tag.
Good: immutable tags + a moving alias
SHA=$(git rev-parse --short HEAD)
docker build -t myapp:$SHA -t myapp:latest .
docker push myapp:$SHA
docker push myapp:latest
Now you can pin deploys to specific commits:
# Kubernetes
image: myapp:abc1234
Common tag schemes:
• Git SHA: abc1234 — every commit gets a unique tag
• Semantic version: v1.2.3 — for releases
• Branch name: main, develop — for latest of a branch
• Combination: v1.2.3-abc1234 — version + commit
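As a sketch of how these schemes differ mechanically, a hypothetical helper (classify_tag is invented for illustration, not a docker command) can distinguish them by shape alone:

```shell
# Hypothetical classifier for the tag schemes listed above.
classify_tag() {
  case "$1" in
    v[0-9]*.[0-9]*.[0-9]*-*)           echo "version+commit" ;;
    v[0-9]*.[0-9]*.[0-9]*)             echo "semver" ;;
    [0-9a-f][0-9a-f][0-9a-f][0-9a-f]*) echo "git-sha" ;;
    *)                                 echo "branch" ;;
  esac
}
classify_tag abc1234        # → git-sha
classify_tag v1.2.3         # → semver
classify_tag v1.2.3-abc1234 # → version+commit
classify_tag main           # → branch
```

Note that shape alone is ambiguous: a branch named "deadline" starts with four hex characters and would classify as a SHA. That ambiguity is one more reason to prefer explicit, immutable tags over clever naming.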
Container registries:
• Docker Hub — public, free for public repos
• GitHub Container Registry (ghcr.io) — free for public, integrates with Actions
• AWS ECR — AWS-native, IAM-integrated
• Google Artifact Registry — GCP-native
• Azure Container Registry — Azure-native
• Self-hosted: Harbor (open source), JFrog Artifactory
Authentication:
# Docker Hub
docker login
# AWS ECR
aws ecr get-login-password | docker login --username AWS --password-stdin 123456.dkr.ecr.us-east-1.amazonaws.com
# Google
gcloud auth configure-docker
Production-Ready Dockerfile Template
A solid template combining everything:
# syntax=docker/dockerfile:1.6
ARG NODE_VERSION=20
# ─── BUILD STAGE ───────────────────────────────────────────────
FROM node:${NODE_VERSION}-alpine AS builder
WORKDIR /app
# Install only build deps
COPY package*.json ./
RUN --mount=type=cache,target=/root/.npm \
    npm ci
COPY . .
RUN npm run build
# ─── RUNTIME STAGE ─────────────────────────────────────────────
FROM node:${NODE_VERSION}-alpine
# Create an unprivileged user now, but switch to it only after installing,
# so the root-owned /root/.npm cache mount stays writable
RUN addgroup -S app && adduser -S app -G app
WORKDIR /app
# Only production deps; no npm cache clean needed, since the cache
# lives in the BuildKit mount, not in an image layer
COPY --chown=app:app package*.json ./
RUN --mount=type=cache,target=/root/.npm \
    npm ci --omit=dev
# Compiled output
COPY --from=builder --chown=app:app /app/dist ./dist
# Don't run as root
USER app
# Health check
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
    CMD wget -q --spider http://localhost:3000/health || exit 1
EXPOSE 3000
ENV NODE_ENV=production
CMD ["node", "dist/server.js"]
Notes:
• # syntax=docker/dockerfile:1.6 — enable BuildKit features
• --mount=type=cache — persistent cache across builds (faster CI)
• Non-root user — security
• HEALTHCHECK — Docker can monitor the container's health
• ENV NODE_ENV=production — many libraries optimize based on this
The next lesson covers image security — once you're building images, securing them is critical.