ship it and sleep

Build Caching and Selective Testing Strategies

Chapter 60 of 66

The Failure

The monorepo CI installed dependencies from scratch on every run, even when the lockfiles had not changed. Node.js: 90 seconds. Go modules: 60 seconds. Gradle: 120 seconds. Maven: 90 seconds. That adds up to 6 minutes per build spent downloading the same dependencies.

Caching dependencies by lockfile hash reduces install time from minutes to seconds.

The Mechanism

Cache Hit Scenarios

| Scenario | Dependencies | Source | Action |
|----------|--------------|--------|--------|
| No changes | Cached ✓ | Cached ✓ | Skip build |
| Source changed, deps same | Cached ✓ | Rebuild | Build with cached deps |
| Deps changed | Rebuild | Rebuild | Full rebuild |
| New CI runner | Cold | Cold | Full rebuild, populate cache |

Cache Key Strategy

Primary key:    {os}-{tool}-{service}-{hash(lockfile)}
Restore keys:   {os}-{tool}-{service}-
                 {os}-{tool}-

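The restore-key fallback allows partial cache hits. When no cache matches the primary key, the most recent cache whose key starts with one of the restore-key prefixes is restored instead. If the lockfile changed but only one package was added, most of the cached modules are still valid.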
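The key scheme above can be sketched in plain bash. This is an illustration only: `cache_keys` is a hypothetical helper, and it assumes `sha256sum` is available on the runner. The hash it prints will not equal what `hashFiles()` produces in a workflow; the point is only that the primary key changes whenever the lockfile does, while the restore prefixes stay stable.

```shell
#!/usr/bin/env bash
# Sketch: derive the primary key and restore-key prefixes for one service.
# cache_keys is a hypothetical helper, not part of actions/cache.
cache_keys() {
  local os="$1" tool="$2" service="$3" lockfile="$4"
  local hash
  # sha256sum prints "<hash>  <file>"; keep only the hash field.
  hash=$(sha256sum "$lockfile" | cut -d' ' -f1)
  echo "primary=${os}-${tool}-${service}-${hash}"
  # Fallback prefixes, most specific first.
  echo "restore=${os}-${tool}-${service}-"
  echo "restore=${os}-${tool}-"
}
```

For example, `cache_keys Linux npm catalog services/catalog/package-lock.json` prints one `primary=` line followed by two `restore=` lines.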

The Implementation

Multi-Layer Cache Configuration

# HARDENED: Layered caching for all language stacks
jobs:
  build-catalog:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      # Layer 1: Package manager cache
      - uses: actions/cache@v4
        id: npm-cache
        with:
          path: ~/.npm
          key: ${{ runner.os }}-npm-catalog-${{ hashFiles('services/catalog/package-lock.json') }}
          restore-keys: |
            ${{ runner.os }}-npm-catalog-
            ${{ runner.os }}-npm-

      # Layer 2: Build output cache, keyed on sources and the lockfile
      # so that a dependency change also invalidates the build output
      - uses: actions/cache@v4
        id: build-cache
        with:
          path: services/catalog/dist
          key: ${{ runner.os }}-build-catalog-${{ hashFiles('services/catalog/src/**', 'services/catalog/package-lock.json') }}

      - name: Install dependencies
        working-directory: services/catalog
        run: npm ci --prefer-offline

      # cache-hit is 'true' only on an exact key match
      - name: Build (skip on exact cache hit)
        if: steps.build-cache.outputs.cache-hit != 'true'
        working-directory: services/catalog
        run: npm run build

  build-checkout:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: actions/cache@v4
        with:
          path: |
            ~/go/pkg/mod
            ~/.cache/go-build
          key: ${{ runner.os }}-go-checkout-${{ hashFiles('services/checkout/go.sum') }}
          restore-keys: |
            ${{ runner.os }}-go-checkout-
            ${{ runner.os }}-go-

      - name: Build
        working-directory: services/checkout
        run: go build ./...

  build-payments:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: actions/cache@v4
        with:
          path: |
            ~/.gradle/caches
            ~/.gradle/wrapper
          key: ${{ runner.os }}-gradle-payments-${{ hashFiles('services/payments/**/*.gradle*', 'services/payments/gradle/wrapper/gradle-wrapper.properties') }}
          restore-keys: |
            ${{ runner.os }}-gradle-payments-
            ${{ runner.os }}-gradle-

      - name: Build
        working-directory: services/payments
        run: ./gradlew build -x test

Docker Build Caching

- name: Set up Docker Buildx
  uses: docker/setup-buildx-action@v3

- name: Build with GitHub Actions cache
  uses: docker/build-push-action@v5
  with:
    context: services/catalog
    push: ${{ github.event_name == 'push' }}
    tags: ghcr.io/acme/catalog-service:${{ github.sha }}
    cache-from: type=gha,scope=catalog
    cache-to: type=gha,mode=max,scope=catalog

Selective Test Execution

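test:
  needs: detect-changes
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
      with:
        fetch-depth: 0  # full history, needed if the script diffs against the base branch

    - name: Run affected tests
      run: |
        AFFECTED=$(bash scripts/detect-affected.sh)
        echo "Affected services: $AFFECTED"

        for svc in $AFFECTED; do
          echo "Testing $svc"
          # Subshells keep each cd local and fail the step if a directory is missing.
          case "$svc" in
            catalog)
              # A fresh job has no node_modules, so install before testing.
              (cd services/catalog && npm ci --prefer-offline && npm test)
              ;;
            checkout)
              (cd services/checkout && go test ./...)
              ;;
            payments)
              (cd services/payments && ./gradlew test)
              ;;
            *)
              # Fail loudly rather than silently skipping an unmapped service.
              echo "Unknown service: $svc" >&2
              exit 1
              ;;
          esac
        done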
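The job above relies on scripts/detect-affected.sh, which this chapter does not show. A minimal sketch of what such a script might do follows, assuming services live under services/<name>/ and that shared code lives under libs/. The function name, the service list, and the libs/ path are all hypothetical. Any change to a shared path marks every service as affected, which is the conservative choice.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the logic behind scripts/detect-affected.sh.
# Reads changed file paths on stdin, prints affected service names, one per line.
ALL_SERVICES="catalog checkout payments"   # assumed service list

affected_services() {
  local path svc all=0 found=""
  while IFS= read -r path; do
    case "$path" in
      services/*/*)
        # services/catalog/src/a.ts -> catalog
        svc=${path#services/}
        svc=${svc%%/*}
        found="$found $svc"
        ;;
      libs/*)
        # Assumed shared-code location: a change here affects everything.
        all=1
        ;;
    esac
  done
  if [ "$all" -eq 1 ]; then
    printf '%s\n' $ALL_SERVICES
  else
    # Deduplicate and drop the empty line printed when nothing matched.
    printf '%s\n' $found | sort -u | sed '/^$/d'
  fi
}
```

In CI this could be fed from git, for example `git diff --name-only origin/main...HEAD | affected_services`, where the base ref is an assumption about the branching model.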

Cache Metrics

- name: Report cache stats
  if: always()
  run: |
    echo "## Cache Statistics" >> $GITHUB_STEP_SUMMARY
    echo "| Layer | Status |" >> $GITHUB_STEP_SUMMARY
    echo "|-------|--------|" >> $GITHUB_STEP_SUMMARY
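    # Assumes the cache steps set id: npm-cache and id: build-cache.
    # cache-hit is the string 'true' only on an exact primary-key match,
    # so compare explicitly: a non-empty 'false' would otherwise count as a hit.
    echo "| npm | ${{ steps.npm-cache.outputs.cache-hit == 'true' && '✓ Hit' || '✗ Miss' }} |" >> $GITHUB_STEP_SUMMARY
    echo "| build | ${{ steps.build-cache.outputs.cache-hit == 'true' && '✓ Hit' || '✗ Miss' }} |" >> $GITHUB_STEP_SUMMARY
    # docker/build-push-action has no cache-hit output; look for CACHED layers in the build log instead.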

The Gate

Build time is the implicit gate. If CI takes more than 15 minutes, developers merge without waiting. Caching keeps CI under 5 minutes for single-service changes. Track cache hit rates in the pipeline dashboard (CH18).
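As a rough sketch of how a hit rate could be derived before it reaches a dashboard, assume each CI run appends a single word, hit or miss, to a log. That log format and the `cache_hit_rate` name are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: compute a cache hit rate from one "hit" or "miss" word per line on stdin.
# Lines containing anything else are ignored.
cache_hit_rate() {
  awk '
    $1 == "hit"                 { hits++ }
    $1 == "hit" || $1 == "miss" { total++ }
    END {
      if (total == 0) { print "n/a"; exit }
      # Integer percentage, truncated.
      printf "%d%%\n", (hits * 100) / total
    }'
}
```

Three hits and one miss, piped in as `printf 'hit\nhit\nmiss\nhit\n' | cache_hit_rate`, gives 75%.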

The Recovery

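Cache grows too large: GitHub Actions cache has a default limit of 10 GB per repository. Caches not accessed for 7 days are removed, and once the limit is exceeded the least recently used caches are evicted first. If the limit is hit frequently, reduce cache scope or exclude large directories.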

Cache poisoning: A corrupted cache causes all builds to fail. Delete the caches with the GitHub CLI: gh cache delete --all. The next build will be slow but clean.

Selective tests miss a regression: The dependency graph was incomplete. A shared utility changed but the affected service was not detected. Add integration tests that run on all merges to main as a safety net.