Optimizing Kubernetes Resource Management: Requests vs. Limits

Kubernetes resources

Kubernetes uses the resources block to decide where a pod is scheduled and how it behaves at runtime. Misjudging these values can leave pods stuck in Pending or terminated with OOMKilled when real-world traffic exceeds the original estimates.

Why This Matters

Many engineering teams treat resource definitions as a final, decorative step in a YAML manifest rather than a critical operational decision. In practice, these values dictate how the scheduler places workloads and how nodes respond under pressure. Setting a request too low leads to node oversubscription, while omitting limits allows a single application to exhaust shared resources and destabilize its node, and potentially the wider cluster.
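
One way to guard against containers that ship without any resources block is a namespace-level LimitRange, which fills in default requests and limits. The following is a minimal sketch, not a recommendation: the object name, the namespace, and the numbers are assumptions chosen to mirror the standard example in this article.

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: container-defaults     # hypothetical name
  namespace: team-a            # hypothetical namespace
spec:
  limits:
    - type: Container
      # Applied as requests when a container declares none.
      defaultRequest:
        cpu: "250m"
        memory: "256Mi"
      # Applied as limits when a container declares none.
      default:
        cpu: "500m"
        memory: "512Mi"
```

Defaults like these are only a safety net; values set explicitly on a container take precedence and should still come from observed usage.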

Key Insights

OOMKilled errors occur when a container exceeds its defined memory limit, such as an API peaking at 500 MiB against a 256 MiB limit (Coles C, 2026).
The scheduler uses requests as the minimum a node must have available, so a pod is only placed on a node with enough unreserved capacity.
Ephemeral-storage management is critical for Jenkins pipelines or Kaniko builds that generate large intermediate files or caches.
CPU throttling happens when a container reaches its CPU limit, causing performance degradation under load without necessarily crashing the pod.
Requests should align with baseline usage, while limits provide a buffer for peaks to keep the cluster predictable.

Working Examples

Standard resource block defining minimum requests for the scheduler and maximum limits for runtime execution.

resources:
  requests:
    cpu: "250m"
    memory: "256Mi"
  limits:
    cpu: "500m"
    memory: "512Mi"
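
For context, the resources block is set per container inside the pod template. The sketch below places the same values in a Deployment; the names, labels, replica count, and image are placeholders, not taken from the original article.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-api            # hypothetical name
spec:
  replicas: 2                  # assumed replica count
  selector:
    matchLabels:
      app: example-api
  template:
    metadata:
      labels:
        app: example-api
    spec:
      containers:
        - name: api
          image: registry.example.com/example-api:1.0.0   # placeholder image
          resources:
            requests:          # what the scheduler reserves on the node
              cpu: "250m"
              memory: "256Mi"
            limits:            # runtime ceiling for this container
              cpu: "500m"
              memory: "512Mi"
```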

Adjusted configuration for an API with peaks of 500 MiB. The 768 MiB memory limit leaves roughly 268 MiB of headroom above the peak, preventing OOMKilled errors.

resources:
  requests:
    cpu: "300m"
    memory: "512Mi"
  limits:
    cpu: "1000m"
    memory: "768Mi"

Configuration including ephemeral-storage, so that heavy disk usage in /tmp or artifact downloads is accounted for at scheduling time and capped at runtime, reducing unexpected evictions.

resources:
  requests:
    cpu: "500m"
    memory: "512Mi"
    ephemeral-storage: "1Gi"
  limits:
    cpu: "1000m"
    memory: "1Gi"
    ephemeral-storage: "2Gi"
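
As an illustration of how this might look for a build job, the sketch below reuses the values above and mounts a disk-backed emptyDir at /tmp with its own sizeLimit. Disk-backed emptyDir usage counts toward the pod's ephemeral storage, and exceeding either the volume's sizeLimit or the ephemeral-storage limit can get the pod evicted. The pod name, image, volume name, and the 1Gi sizeLimit are assumptions.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-build          # hypothetical name
spec:
  restartPolicy: Never
  containers:
    - name: build
      image: registry.example.com/ci/builder:latest   # placeholder image
      resources:
        requests:
          cpu: "500m"
          memory: "512Mi"
          ephemeral-storage: "1Gi"
        limits:
          cpu: "1000m"
          memory: "1Gi"
          ephemeral-storage: "2Gi"
      volumeMounts:
        - name: scratch
          mountPath: /tmp      # intermediate files and caches land here
  volumes:
    - name: scratch
      emptyDir:
        sizeLimit: "1Gi"       # assumed cap for the scratch volume
```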

Practical Applications

Use Case: Setting requests close to observed baseline usage for a production API ensures the scheduler places pods on nodes with genuine capacity. Pitfall: Using ‘reasonable’ numbers without validation, leading to node oversubscription and eventual Evicted status.
Use Case: Defining ephemeral-storage for CI/CD runners or Kaniko builds that write large caches to disk. Pitfall: Ignoring disk limits, which causes Kubernetes to evict pods when the node’s local storage is pressured.

References:

https://dev.to/coles980/kubernetes-resources-50lo
