ship it and sleep

Knative in the GitOps Loop: ArgoCD Sync for Serverless Workloads

Chapter 45 of 66

The Failure

The team added Knative Services to their ArgoCD-managed infrastructure repo. After the first sync, ArgoCD showed the Knative Service as “Progressing” indefinitely. The health check never resolved because ArgoCD did not understand Knative’s custom resource status conditions. The team assumed ArgoCD was broken and manually applied the Knative manifests with kubectl apply. Two weeks later, someone pushed a conflicting change to the infra repo. ArgoCD synced and overwrote the manual changes. The product import service went down.

ArgoCD requires custom health checks for Knative CRDs. Without them, the GitOps loop is broken.

The Mechanism

Knative Resource Health

Knative uses Kubernetes-style conditions on its custom resources:

Resource        Ready Condition   Meaning
Service         Ready=True        All conditions met, traffic routed
Configuration   Ready=True        Latest revision created
Revision        Ready=True        Pod ready, serving traffic
Route           Ready=True        Traffic rules applied

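ArgoCD can only evaluate the health of a custom resource if it has a health check that knows how to read that resource's status.conditions. Without Lua health checks for the Knative kinds, ArgoCD cannot tell a Service that merely exists from one that is actually serving traffic, so the health it reports is not a reliable deployment signal.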
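For orientation, the block a health check has to read looks roughly like the sketch below. The condition types are the ones Knative Serving reports on a Service (note the plural ConfigurationsReady and RoutesReady on the Service itself); the reason and message values are illustrative placeholders, not captured controller output.

```yaml
# Illustrative status of a Knative Service whose newest revision did not start.
# All reason/message strings below are hypothetical examples.
status:
  conditions:
    - type: ConfigurationsReady
      status: "False"
      reason: RevisionFailed                      # hypothetical
      message: "latest revision failed to start"  # hypothetical
    - type: RoutesReady
      status: "Unknown"
    - type: Ready            # top-level summary; the condition a health check keys on
      status: "False"
      reason: RevisionFailed                      # hypothetical
      message: "latest revision failed to start"  # hypothetical
```

A health check that maps the Ready condition's True, False, and Unknown values to Healthy, Degraded, and Progressing is enough to give ArgoCD a meaningful signal for this resource.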

The Implementation

ArgoCD Custom Health Checks

# HARDENED: ArgoCD ConfigMap with Knative health checks
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cm
  namespace: argocd
data:
  resource.customizations.health.serving.knative.dev_Service: |
    hs = {}
    if obj.status ~= nil then
      if obj.status.conditions ~= nil then
        for _, condition in ipairs(obj.status.conditions) do
          if condition.type == "Ready" then
            if condition.status == "True" then
              hs.status = "Healthy"
              hs.message = "Service is ready"
            elseif condition.status == "False" then
              hs.status = "Degraded"
              hs.message = condition.message or "Service is not ready"
            else
              hs.status = "Progressing"
              hs.message = condition.message or "Service is progressing"
            end
            return hs
          end
        end
      end
    end
    hs.status = "Progressing"
    hs.message = "Waiting for conditions"
    return hs

  resource.customizations.health.serving.knative.dev_Revision: |
    hs = {}
    if obj.status ~= nil then
      if obj.status.conditions ~= nil then
        for _, condition in ipairs(obj.status.conditions) do
          if condition.type == "Ready" then
            if condition.status == "True" then
              hs.status = "Healthy"
            elseif condition.status == "False" then
              hs.status = "Degraded"
              hs.message = condition.message
            else
              hs.status = "Progressing"
            end
            return hs
          end
        end
      end
    end
    hs.status = "Progressing"
    return hs

GitOps Directory Structure for Knative

ecommerce-infra/
├── apps/
│   ├── product-import/
│   │   ├── base/
│   │   │   ├── kustomization.yaml
│   │   │   └── knative-service.yaml
│   │   └── overlays/
│   │       ├── staging/
│   │       │   ├── kustomization.yaml
│   │       │   └── patches/
│   │       │       └── scale.yaml
│   │       └── production/
│   │           ├── kustomization.yaml
│   │           └── patches/
│   │               └── scale.yaml

Base Knative Service

# apps/product-import/base/knative-service.yaml
# HARDENED: Base Knative Service definition
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: product-import
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/min-scale: "0"
        autoscaling.knative.dev/max-scale: "5"
    spec:
      containers:
        - image: ghcr.io/acme/product-import:latest
          ports:
            - containerPort: 8080
          resources:
            requests:
              cpu: 250m
              memory: 512Mi

Production Overlay

# apps/product-import/overlays/production/patches/scale.yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: product-import
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/max-scale: "10"
        autoscaling.knative.dev/scale-to-zero-pod-retention-period: "15m"
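The patch above only takes effect if the overlay's kustomization.yaml references it. A minimal sketch of that file follows, assuming the base kustomization.yaml lists knative-service.yaml under resources and that the workload runs in the production namespace; the newTag value is a placeholder.

```yaml
# apps/product-import/overlays/production/kustomization.yaml (sketch)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: production              # assumed target namespace
resources:
  - ../../base                     # assumes base/kustomization.yaml lists knative-service.yaml
patches:
  - path: patches/scale.yaml       # merges the production autoscaling annotations
images:
  - name: ghcr.io/acme/product-import
    newTag: v2.0.0                 # placeholder; a CI step would overwrite this
```

Because the patch touches only the annotations map, it merges key by key: the base min-scale value is kept, max-scale is overridden, and the retention period is added.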

Blue-Green with Knative Traffic Splitting

# HARDENED: Blue-green deployment via Knative traffic spec
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: product-import
spec:
  template:
    metadata:
      name: product-import-v2
    spec:
      containers:
        - image: ghcr.io/acme/product-import:v2.0.0
  traffic:
    # 100% to old revision until validated
    - revisionName: product-import-v1
      percent: 100
    # New revision accessible via tag URL only
    - revisionName: product-import-v2
      percent: 0
      tag: candidate

The candidate tag creates a URL like candidate-product-import.production.example.com. Test the new version via this URL. When validated, update the traffic split:

traffic:
  - revisionName: product-import-v2
    percent: 100
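If a hard cutover feels too abrupt, an intermediate commit could shift a small share of traffic first. The sketch below assumes the same two revision names as above; the 90/10 split is an arbitrary example, and the percentages must add up to 100.

```yaml
# Optional intermediate step before the full cutover (illustrative split)
traffic:
  - revisionName: product-import-v1
    percent: 90
  - revisionName: product-import-v2
    percent: 10
    tag: candidate    # keeps the tag URL available for direct testing
```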

CI Pipeline Updates Image Tag

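# .github/workflows/deploy.yml
# HARDENED: Update Knative image tag via Kustomize
- name: Update image tag
  run: |
    # kustomize edit must run in a directory that contains a kustomization.yaml.
    # The production overlay is assumed here; use the overlay this pipeline deploys.
    cd ecommerce-infra/apps/product-import/overlays/production
    kustomize edit set image \
      ghcr.io/acme/product-import:${{ github.sha }}
    git add kustomization.yaml
    git commit -m "deploy: product-import ${{ github.sha }}"
    git push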

ArgoCD detects the commit and syncs the Knative Service. Knative then creates a new Revision for the new image and, unless the traffic block pins a named revision, routes traffic to it once it is ready.
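For ArgoCD to pick up that commit on its own, an Application has to point at the overlay directory with automated sync enabled. A minimal sketch follows; the repository URL, branch, and project are hypothetical, and the path assumes the production overlay from the directory structure above.

```yaml
# Sketch of an ArgoCD Application for the production overlay
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: product-import
  namespace: argocd
spec:
  project: default                                          # assumed project
  source:
    repoURL: https://github.com/acme/ecommerce-infra.git    # hypothetical URL
    targetRevision: main                                    # assumed branch
    path: apps/product-import/overlays/production
  destination:
    server: https://kubernetes.default.svc
    namespace: production
  syncPolicy:
    automated:
      prune: true      # remove resources that were deleted from Git
      selfHeal: true   # revert manual changes made on the cluster
```

With selfHeal enabled, manual kubectl edits to the Service are reverted shortly after they are made rather than lingering until the next Git change.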

The Gate

ArgoCD health status is the gate. With the custom health checks in place, a Knative Service that reports Ready=False puts the ArgoCD Application into a Degraded state, and one that never leaves Ready=Unknown keeps it Progressing. Combined with ArgoCD notifications (Slack, webhook), this creates an automated alert when serverless deployments fail.
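One way to wire up that alert is a subscription annotation on the Application. This sketch assumes ArgoCD Notifications is already configured with a Slack service and the on-health-degraded trigger from the default catalog; the channel name is hypothetical.

```yaml
# Fragment of the Application manifest: subscribe a Slack channel to health alerts
metadata:
  annotations:
    # Format: notifications.argoproj.io/subscribe.<trigger>.<service>: <recipient>
    notifications.argoproj.io/subscribe.on-health-degraded.slack: deploy-alerts  # hypothetical channel
```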

The Recovery

ArgoCD sync succeeds but Knative Service stays Progressing: The container image is failing to start. Check the Revision’s pods: kubectl get pods -n production -l serving.knative.dev/service=product-import. Examine pod events and logs.

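ArgoCD prune deletes old Knative Revisions: This is expected. If you need to retain revisions for quick rollback, exclude them from pruning with the argocd.argoproj.io/sync-options: Prune=false annotation. The argocd.argoproj.io/compare-options: IgnoreExtraneous annotation only stops extra resources from marking the Application OutOfSync; it does not prevent pruning by itself.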

Traffic split changes are not applied: Knative’s traffic spec is declarative. If ArgoCD and manual kubectl patch commands both modify traffic, ArgoCD will overwrite manual changes on the next sync. All traffic changes must go through the Git repo.