Knative in the GitOps Loop: ArgoCD Sync for Serverless Workloads
The Failure
The team added Knative Services to their ArgoCD-managed infrastructure repo. After the first sync, ArgoCD showed the Knative Service as “Progressing” indefinitely. The health check never resolved because ArgoCD did not understand Knative’s custom resource status conditions. The team assumed ArgoCD was broken and manually applied the Knative manifests with kubectl apply. Two weeks later, someone pushed a conflicting change to the infra repo. ArgoCD synced and overwrote the manual changes. The product import service went down.
ArgoCD requires custom health checks for Knative CRDs. Without them, the GitOps loop is broken.
The Mechanism
Knative Resource Health
Knative uses Kubernetes-style conditions on its custom resources:
| Resource | Ready Condition | Meaning |
|---|---|---|
| Service | Ready=True | All conditions met, traffic routed |
| Configuration | Ready=True | Latest Revision created and ready |
| Revision | Ready=True | Revision resources provisioned and able to serve traffic |
| Route | Ready=True | Traffic rules applied |
A custom health check lets ArgoCD evaluate health from status.conditions. Without a Lua health check for the resource kind, ArgoCD falls back to generic resource health, which checks only that the resource exists, not that it is functional.
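For reference, the conditions a health check reads look roughly like the status block below. This is an illustrative sketch of a ready Knative Service, not output captured from a real cluster; the revision name and URL are assumed values.

```yaml
# Illustrative status of a ready Knative Service (values are assumptions)
status:
  observedGeneration: 1
  latestCreatedRevisionName: product-import-00001   # assumed revision name
  latestReadyRevisionName: product-import-00001
  url: http://product-import.production.example.com # assumed domain
  conditions:
    - type: ConfigurationsReady
      status: "True"
    - type: Ready          # the condition a health check should key on
      status: "True"
    - type: RoutesReady
      status: "True"
```

A status of "False" on the Ready condition signals failure, and "Unknown" signals that reconciliation is still in flight.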
The Implementation
ArgoCD Custom Health Checks
# HARDENED: ArgoCD ConfigMap with Knative health checks
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cm
  namespace: argocd
data:
  resource.customizations.health.serving.knative.dev_Service: |
    hs = {}
    if obj.status ~= nil then
      if obj.status.conditions ~= nil then
        for _, condition in ipairs(obj.status.conditions) do
          if condition.type == "Ready" then
            if condition.status == "True" then
              hs.status = "Healthy"
              hs.message = "Service is ready"
            elseif condition.status == "False" then
              hs.status = "Degraded"
              hs.message = condition.message or "Service is not ready"
            else
              hs.status = "Progressing"
              hs.message = condition.message or "Service is progressing"
            end
            return hs
          end
        end
      end
    end
    hs.status = "Progressing"
    hs.message = "Waiting for conditions"
    return hs
  resource.customizations.health.serving.knative.dev_Revision: |
    hs = {}
    if obj.status ~= nil then
      if obj.status.conditions ~= nil then
        for _, condition in ipairs(obj.status.conditions) do
          if condition.type == "Ready" then
            if condition.status == "True" then
              hs.status = "Healthy"
            elseif condition.status == "False" then
              hs.status = "Degraded"
              hs.message = condition.message
            else
              hs.status = "Progressing"
            end
            return hs
          end
        end
      end
    end
    hs.status = "Progressing"
    return hs
GitOps Directory Structure for Knative
ecommerce-infra/
├── apps/
│   ├── product-import/
│   │   ├── base/
│   │   │   ├── kustomization.yaml
│   │   │   └── knative-service.yaml
│   │   └── overlays/
│   │       ├── staging/
│   │       │   ├── kustomization.yaml
│   │       │   └── patches/
│   │       │       └── scale.yaml
│   │       └── production/
│   │           ├── kustomization.yaml
│   │           └── patches/
│   │               └── scale.yaml
Base Knative Service
# apps/product-import/base/knative-service.yaml
# HARDENED: Base Knative Service definition
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: product-import
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/min-scale: "0"
        autoscaling.knative.dev/max-scale: "5"
    spec:
      containers:
        - image: ghcr.io/acme/product-import:latest
          ports:
            - containerPort: 8080
          resources:
            requests:
              cpu: 250m
              memory: 512Mi
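The directory tree lists a kustomization.yaml next to the base Service, but its contents are not shown. A minimal sketch of what it might contain, assuming the base holds only the one manifest:

```yaml
# apps/product-import/base/kustomization.yaml
# Assumed contents: the base exposes the single Knative Service manifest
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - knative-service.yaml
```

Keeping the base free of environment-specific settings lets each overlay add its own namespace and scaling patches.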
Production Overlay
# apps/product-import/overlays/production/patches/scale.yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: product-import
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/max-scale: "10"
        autoscaling.knative.dev/scale-to-zero-pod-retention-period: "15m"
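The patch only takes effect once the overlay's kustomization.yaml references it. A minimal sketch of that file follows; the `production` namespace is an assumption inferred from the commands and URLs used elsewhere in this article.

```yaml
# apps/product-import/overlays/production/kustomization.yaml
# Assumed contents: pull in the base, set the namespace, apply the scale patch
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: production          # assumption: production workloads live here
resources:
  - ../../base
patches:
  - path: patches/scale.yaml   # merged onto the Service named product-import
```

Because the patch touches only annotation keys, it merges with the base annotations rather than replacing them: max-scale is overridden and min-scale is inherited.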
Blue-Green with Knative Traffic Splitting
# HARDENED: Blue-green deployment via Knative traffic spec
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: product-import
spec:
  template:
    metadata:
      name: product-import-v2
    spec:
      containers:
        - image: ghcr.io/acme/product-import:v2.0.0
  traffic:
    # 100% to old revision until validated
    - revisionName: product-import-v1
      percent: 100
    # New revision accessible via tag URL only
    - revisionName: product-import-v2
      percent: 0
      tag: candidate
The candidate tag creates a URL like candidate-product-import.production.example.com. Test the new version via this URL. When validated, update the traffic split:
  traffic:
    - revisionName: product-import-v2
      percent: 100
CI Pipeline Updates Image Tag
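# .github/workflows/deploy.yml
# HARDENED: Update Knative image tag via Kustomize
- name: Update image tag
  run: |
    # kustomize edit must run in a directory that contains kustomization.yaml.
    # Editing the base changes every overlay; use an overlay path to target one environment.
    cd ecommerce-infra/apps/product-import/base
    kustomize edit set image \
      ghcr.io/acme/product-import:${{ github.sha }}
    git add .
    git commit -m "deploy: product-import ${{ github.sha }}"
    git push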
ArgoCD detects the commit and syncs the Knative Service. Knative then creates a new Revision for the new image and, unless the traffic block pins specific revisions, routes traffic to it once it is ready.
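For ArgoCD to pick up those commits, an Application has to point at the overlay path. The manifest below is a sketch under stated assumptions: the repository URL, branch, Application name, and project are hypothetical, and automated sync with pruning and self-heal is one reasonable policy rather than the only one.

```yaml
# Hypothetical ArgoCD Application for the production overlay
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: product-import-production        # hypothetical name
  namespace: argocd
spec:
  project: default                        # assumption: default project
  source:
    repoURL: https://github.com/acme/ecommerce-infra.git  # hypothetical URL
    targetRevision: main                  # assumption: deploys track main
    path: apps/product-import/overlays/production
  destination:
    server: https://kubernetes.default.svc
    namespace: production
  syncPolicy:
    automated:
      prune: true     # remove resources that were deleted from Git
      selfHeal: true  # revert manual changes made with kubectl
```

With selfHeal enabled, manual edits are reverted soon after they are made, which surfaces drift immediately instead of weeks later.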
The Gate
ArgoCD's health status is the gate. With the custom health check in place, a Knative Service whose Ready condition is False marks the ArgoCD Application as Degraded, and one whose Ready condition is still Unknown keeps it Progressing. Combined with ArgoCD notifications (Slack, webhook), this produces an automated alert when a serverless deployment fails.
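One way to wire up that alert is a subscription annotation on the Application. The fragment below is a sketch: it assumes ArgoCD Notifications is installed with the `on-health-degraded` trigger from the default catalog and a configured Slack service, and the Application name and channel name are hypothetical.

```yaml
# Fragment of an Application manifest (spec omitted); names are hypothetical
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: product-import-production
  namespace: argocd
  annotations:
    # Format: notifications.argoproj.io/subscribe.<trigger>.<service>: <recipient>
    notifications.argoproj.io/subscribe.on-health-degraded.slack: deploy-alerts
```

The trigger fires when the Application's health becomes Degraded, which is what the Knative health check reports when Ready is False.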
The Recovery
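ArgoCD sync succeeds but Knative Service stays Progressing: The most likely cause is a container that fails to start. Check the Revision’s pods: kubectl get pods -n production -l serving.knative.dev/service=product-import. Then examine the pod events and logs.
ArgoCD prune deletes old Knative Revisions: This is expected when ArgoCD treats the Revisions as tracked resources that are absent from Git. If you need to retain revisions for quick rollback, exclude Revision resources from pruning with the annotation argocd.argoproj.io/sync-options: Prune=false. The annotation argocd.argoproj.io/compare-options: IgnoreExtraneous only stops those resources from marking the Application OutOfSync; it affects sync status, not pruning.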
Traffic split changes are not applied: Knative’s traffic spec is declarative. If ArgoCD and manual kubectl patch commands both modify traffic, ArgoCD will overwrite manual changes on the next sync. All traffic changes must go through the Git repo.