Build Optimization
Caching, parallelism, monorepo strategies. How to keep CI under 5 minutes as your codebase grows.
Why Build Speed is a Force Multiplier
A team with 3-minute CI ships dramatically faster than one with 30-minute CI:
- Faster feedback → less context to reload → fewer bugs slipping through
- Faster deploys → faster rollbacks → safer experimentation
- Faster signal on PRs → less context switching → more flow
- Reviewers wait less → faster reviews
Past 30 minutes, CI essentially becomes asynchronous and humans batch their work — a process failure mode that limits team throughput.
Most slow CI is fixable with standard techniques. This lesson covers the high-impact ones.
Profile Before You Optimize
Don't optimize what you don't measure.
What to measure:
• Total wall-clock time of the pipeline
• Time per step
• Time per job (jobs run in parallel — the LONGEST job is the bottleneck)
• Cache hit rate
• Time spent installing deps vs running tests vs building
GitHub Actions, CircleCI, GitLab — all show step-level timing.
Then ask:
• What's the longest single step?
• Which jobs are sequential that could be parallel?
• Where are caches missing or invalidating frequently?
Track this over time. A pipeline creeping from 5 to 15 minutes deserves attention BEFORE it hits 30. Most teams don't notice the slow drift until everyone is annoyed.
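The "longest job is the bottleneck" arithmetic can be sketched with a few lines of shell. The timings below are made up for illustration — in practice you'd export them from your provider (e.g. `gh run view <run-id> --json jobs` on GitHub Actions):

```shell
#!/usr/bin/env bash
# Hypothetical per-step timings in seconds (step name, duration).
timings="lint 34
typecheck 61
test 183
build 92"

# The longest single step is the first optimization target.
longest_step=$(echo "$timings" | sort -k2 -rn | head -1)

# Sequential total = what you pay if everything runs in one job.
total=$(echo "$timings" | awk '{sum += $2} END {print sum}')

echo "longest step: $longest_step"
echo "sequential total: ${total}s"
```

Run sequentially these steps cost 370s; run as parallel jobs the wall-clock floor is the slowest one (183s) — exactly the gap the parallelization section below exploits.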
Caching — The Biggest Win
Most CI runs install the same dependencies. Caching them is the single biggest speedup.
Built-in caching:
- uses: actions/setup-node@v4
  with:
    node-version: '20'
    cache: 'npm'  # caches based on package-lock.json hash
When package-lock.json doesn't change, the cache hits and dependencies aren't re-downloaded.
For more granular control:
- uses: actions/cache@v4
  with:
    path: |
      node_modules
      ~/.npm
    key: ${{ runner.os }}-deps-${{ hashFiles('**/package-lock.json') }}
    restore-keys: |
      ${{ runner.os }}-deps-
restore-keys is a fallback — if the exact key misses, try keys with this prefix. Useful when most deps are stable but a few change.
Docker layer caching — a giant win for image builds:
- uses: docker/setup-buildx-action@v3
- uses: docker/build-push-action@v5
  with:
    push: true
    tags: my-image:${{ github.sha }}
    cache-from: type=registry,ref=my-image:buildcache
    cache-to: type=registry,ref=my-image:buildcache,mode=max
Layers are cached in the registry. Subsequent builds reuse unchanged layers.
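How much layer caching saves depends on layer order. A sketch for a hypothetical Node service — copy the lockfile and install dependencies before copying source, so the expensive install layer survives source-only changes:

```dockerfile
# Hypothetical Node service — order layers least-to-most volatile.
FROM node:20-slim
WORKDIR /app

# This layer and the npm ci layer below are reused from cache
# until package.json or package-lock.json actually change.
COPY package.json package-lock.json ./
RUN npm ci

# Source edits invalidate only the layers from here down.
COPY . .
RUN npm run build
CMD ["node", "dist/server.js"]
```

If `COPY . .` came first, every source edit would invalidate the install layer and force a full `npm ci` on each build.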
Cache invalidation is the hardest part: too eager and you don't reuse; too lax and you serve stale data. Use file hashes in the key — when relevant files change, the key changes.
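The hash-in-the-key mechanism is easy to demonstrate locally — this sketch mimics what `hashFiles` does, fingerprinting a (made-up) lockfile so the key changes exactly when the file does:

```shell
#!/usr/bin/env bash
# Sketch of hash-based cache keys: the key embeds a fingerprint of the
# lockfile, so a rewritten-but-identical file still hits the cache,
# while any real dependency change produces a fresh key.
lockfile=$(mktemp)

echo 'left-pad@1.3.0' > "$lockfile"
key1="linux-deps-$(sha256sum "$lockfile" | cut -c1-16)"

echo 'left-pad@1.3.0' > "$lockfile"   # same content → same key
key2="linux-deps-$(sha256sum "$lockfile" | cut -c1-16)"

echo 'left-pad@1.4.0' > "$lockfile"   # dep bump → new key
key3="linux-deps-$(sha256sum "$lockfile" | cut -c1-16)"

echo "$key1 $key2 $key3"
rm -f "$lockfile"
```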
Parallelization
Most CI pipelines run things sequentially that could be parallel.
Wrong:
jobs:
  ci:
    steps:
      - run: npm run lint       # 30s
      - run: npm run typecheck  # 60s
      - run: npm run test       # 180s
      - run: npm run build      # 90s
# Total: 360s
Right — split into parallel jobs:
jobs:
  lint:
    runs-on: ubuntu-latest
    steps: [..., npm run lint]
  typecheck:
    runs-on: ubuntu-latest
    steps: [..., npm run typecheck]
  test:
    runs-on: ubuntu-latest
    steps: [..., npm run test]
  build:
    runs-on: ubuntu-latest
    steps: [..., npm run build]
# All run in parallel — total time ≈ slowest job (180s)
Tradeoff: each parallel job has setup cost (checkout, install deps). Use composite actions to share setup.
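A repo-local composite action is one way to share that setup — a sketch, with an illustrative path and steps (note that `run` steps inside a composite action must declare a `shell`):

```yaml
# .github/actions/setup/action.yml  (illustrative path)
name: Shared setup
runs:
  using: composite
  steps:
    - uses: actions/setup-node@v4
      with:
        node-version: '20'
        cache: 'npm'
    - run: npm ci
      shell: bash  # composite run steps must declare a shell
```

Each job then starts with `actions/checkout` followed by `uses: ./.github/actions/setup` — the setup cost is still paid once per job, but it's defined and tuned in one place.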
Test sharding — split big test suites across runners:
jobs:
  test:
    strategy:
      matrix:
        shard: [1, 2, 3, 4]
    steps:
      - run: npm test -- --shard=${{ matrix.shard }}/4
Many test runners support sharding natively (Jest's and Playwright's --shard) or via plugins (pytest-split). A 20-minute test suite split into 4 shards → roughly 5 minutes, plus per-shard setup overhead.
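When a runner has no shard flag, you can shard by partitioning the test-file list yourself. A sketch — assign file i to shard (i mod N); `SHARD` and `TOTAL` would come from the matrix, and the file names here are invented:

```shell
#!/usr/bin/env bash
# Deterministic round-robin sharding over a test-file list.
SHARD=2   # in CI: ${{ matrix.shard }}
TOTAL=4
files="a_test.go b_test.go c_test.go d_test.go e_test.go f_test.go"

i=0
mine=""
for f in $files; do
  # File i belongs to shard (i mod TOTAL) + 1.
  if [ $((i % TOTAL)) -eq $((SHARD - 1)) ]; then
    mine="$mine $f"
  fi
  i=$((i + 1))
done
mine=${mine# }

echo "shard $SHARD runs: $mine"
# ...then pass $mine to the test runner.
```

Because the assignment is deterministic, every shard sees the same partition on every run — no coordination between runners needed.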
Skip What Doesn't Need to Run
Don't run everything for every change.
Path filters:
on:
  pull_request:
    paths:
      - 'src/**'
      - 'package.json'
Now this workflow only runs when those paths change. Documentation-only PRs skip it.
Multiple workflows by area:
# .github/workflows/backend.yml
on:
  pull_request:
    paths: ['backend/**']

# .github/workflows/frontend.yml
on:
  pull_request:
    paths: ['frontend/**']
Or use a tool that detects which packages changed:
Monorepo-aware tools:
• Turborepo / Nx — detect affected packages, run only their tasks
• Bazel — fully hermetic, sub-second incremental builds
• git diff + scripting — the manual approach
# Detect changed packages and run their tests.
# Assumes packages live at packages/<name>/ (the cut below keeps the
# first two path segments) and origin/main history is available —
# watch fetch-depth on shallow clones.
CHANGED=$(git diff --name-only origin/main... | cut -d/ -f1-2 | sort -u)
for pkg in $CHANGED; do
  if [ -f "$pkg/package.json" ]; then
    (cd "$pkg" && npm test)
  fi
done
For monorepos with hundreds of packages, this is the difference between 60-minute CI and 5-minute CI.
Right-Sizing Runners
Default GitHub-hosted Linux runners (ubuntu-latest) are 4 vCPU / 16 GB RAM for public repos (private repos get smaller defaults). For most CI, this is fine. Sometimes more is faster AND cheaper.
GitHub paid larger runners:
• 8 vCPU, 16 vCPU, 32 vCPU
• Per-minute rates scale with size, but a 16 vCPU runner that finishes in a fraction of the time can cost about the same in total — or less
When bigger runners help:
• CPU-bound builds (Rust, C++, large compilation)
• Parallel test suites (more cores = more shards in parallel)
• Docker builds with multi-stage parallelism
When bigger doesn't help:
• I/O-bound work (waiting on network, disk)
• Single-threaded tasks
• Tests that don't parallelize
Self-hosted runners — for big shops, running your own runners on EC2/GCE can be much cheaper at scale. Use Actions Runner Controller (Kubernetes-based) to autoscale them.
Balance: GitHub-hosted is operationally free. Self-hosted is cheaper per minute but you maintain it. Most teams stay on hosted until CI costs become a real line item.
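With ARC's older summerwind-flavored CRDs, an autoscaled runner pool looks roughly like the sketch below (the names and repository are illustrative; newer ARC releases install runner scale sets via Helm instead):

```yaml
apiVersion: actions.summerwind.dev/v1alpha1
kind: RunnerDeployment
metadata:
  name: ci-runners
spec:
  template:
    spec:
      repository: my-org/my-repo   # illustrative repo
---
apiVersion: actions.summerwind.dev/v1alpha1
kind: HorizontalRunnerAutoscaler
metadata:
  name: ci-runners-autoscaler
spec:
  scaleTargetRef:
    name: ci-runners
  minReplicas: 1
  maxReplicas: 10
```

The autoscaler grows the pool when jobs queue up and shrinks it when they drain, which is where the cost savings over always-on runners come from.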
Monorepo CI Strategies
Monorepos (one repo, many packages) are popular but require thoughtful CI.
The naive approach: rebuild everything on every change. Doesn't scale past ~10 packages.
The smart approach: detect what actually changed, only rebuild affected packages and their dependents.
monorepo/
  packages/
    core/  ← change
    ui/    ← depends on core, must retest
    api/   ← depends on core, must retest
    cli/   ← unaffected, skip
    docs/  ← unaffected, skip
Tools that handle this:
Turborepo (JS/TS) — fast, simple, good defaults:
turbo run test --filter='[main]' # only packages changed since main
Nx — JS/TS, more sophisticated, supports affected graph:
nx affected --target=test --base=main
Bazel — Google's tool, polyglot, hermetic builds, perfect incrementality. Steep learning curve but unmatched at scale.
Pants — similar to Bazel, more pragmatic, Python-friendly.
For most teams: Turborepo or Nx. For massive scale: Bazel.
Critical pattern: cache build outputs across runs. Turborepo Remote Cache, Nx Cloud, or Bazel Remote Cache let one CI run reuse the artifacts another CI run produced earlier.
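For Turborepo, remote caching only pays off when tasks declare their outputs — that's what gets stored and replayed. A sketch of a v2-style turbo.json (task names and paths are illustrative); in CI, setting the TURBO_TOKEN and TURBO_TEAM environment variables points turbo at the shared remote cache:

```json
{
  "$schema": "https://turbo.build/schema.json",
  "tasks": {
    "build": {
      "dependsOn": ["^build"],
      "outputs": ["dist/**"]
    },
    "test": {
      "dependsOn": ["build"]
    }
  }
}
```

A task whose inputs hash to something already in the remote cache is skipped entirely and its declared outputs are restored, so one CI run's build work is reused by every subsequent run with the same inputs.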
Watch the Drift
Build optimization isn't a one-time task. CI gets slower over time:
• More tests → longer runs
• More dependencies → bigger installs
• More files → slower scans
• More features → more surface area to test
Make speed a continuous concern:
1. Track CI duration on a dashboard
2. Set alerts when it exceeds your target
3. Allocate time periodically to optimization
4. Reject PRs that significantly slow CI without justification
Some teams treat CI duration as an SLO with quarterly targets. That sounds extreme until you realize how much engineering productivity hangs on it.
The next lesson moves to containers — Docker, the technology that made consistent build/deploy artifacts possible across environments.