Optimizing Kubernetes Resource Management: Requests vs. Limits

These articles are AI-generated summaries. Please check the original sources for full details.

Kubernetes uses the resources block to determine pod scheduling and runtime behavior. Miscalculated values can leave pods stuck in Pending or get containers terminated with OOMKilled when real-world traffic exceeds the estimates.

Why This Matters

Many engineering teams treat resource definitions as a final, decorative step in a YAML manifest rather than a critical operational decision. In practice, these values dictate how the scheduler places workloads and how nodes respond under pressure. Requests set too low let the scheduler pack more pods onto a node than it can actually serve, leaving the node oversubscribed, while missing limits allow a single application to exhaust shared resources and destabilize the cluster.

Key Insights

  • OOMKilled errors occur when a container exceeds its defined memory limit, such as an API peaking at 500 MiB against a 256 MiB limit (Coles C, 2026).
  • The scheduler treats requests as the minimum threshold for pod placement: a pod is admitted only to a node whose unreserved capacity covers the pod's requests.
  • Ephemeral-storage management is critical for Jenkins pipelines or Kaniko builds that generate large intermediate files or caches.
  • CPU throttling happens when a container reaches its CPU limit, causing performance degradation under load without necessarily crashing the pod.
  • Requests should align with baseline usage, while limits provide a buffer for peaks, which keeps cluster behavior predictable.

Working Examples

Standard resource block defining minimum requests for the scheduler and maximum limits for runtime execution.

resources:
  requests:
    cpu: "250m"
    memory: "256Mi"
  limits:
    cpu: "500m"
    memory: "512Mi"
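
For context, the fragment above sits under each container entry in a pod spec. The following is a minimal sketch of a complete manifest, assuming a hypothetical pod named example-api and a placeholder image; neither name comes from the original article.

```yaml
# Hypothetical pod: the name and image are placeholders for illustration only.
apiVersion: v1
kind: Pod
metadata:
  name: example-api
spec:
  containers:
    - name: api
      image: registry.example.com/example-api:1.0.0  # placeholder image
      resources:
        requests:          # what the scheduler reserves on the node
          cpu: "250m"      # 250 millicores, a quarter of one CPU core
          memory: "256Mi"
        limits:            # enforced at runtime on the node
          cpu: "500m"      # CPU use beyond this is throttled
          memory: "512Mi"  # memory use beyond this gets the container OOMKilled
```

Resources are declared per container, so a pod with several containers needs a block for each one.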

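Adjusted configuration for an API with peaks of 500 MiB, preventing OOMKilled errors.

resources:
  requests:
    cpu: "300m"
    memory: "512Mi"   # reserved on the node; covers the observed 500 MiB peak
  limits:
    cpu: "1000m"      # one full core before throttling starts
    memory: "768Mi"   # headroom above the 500 MiB peak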

Configuration including ephemeral-storage to prevent pod eviction during high disk usage in /tmp or artifact downloads.

resources:
  requests:
    cpu: "500m"
    memory: "512Mi"
    ephemeral-storage: "1Gi"
  limits:
    cpu: "1000m"
    memory: "1Gi"
    ephemeral-storage: "2Gi"

Practical Applications

  • Use Case: Setting requests close to observed baseline usage for a production API ensures the scheduler places pods on nodes with genuine capacity. Pitfall: Using ‘reasonable’ numbers without validation, leading to node oversubscription and eventual Evicted status.
  • Use Case: Defining ephemeral-storage for CI/CD runners or Kaniko builds that write large caches to disk. Pitfall: Ignoring disk limits, which causes Kubernetes to evict pods when the node’s local storage comes under pressure.
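
The CI/CD use case above can be sketched as a build pod that writes its scratch data to an emptyDir volume mounted at /tmp. This is an illustration under stated assumptions: the pod name, image, volume name, and the 1Gi sizeLimit are hypothetical and should be replaced with values taken from observed build usage.

```yaml
# Hypothetical CI build pod: name, image, and sizes are illustrative assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: example-build
spec:
  restartPolicy: Never
  containers:
    - name: builder
      image: registry.example.com/ci-builder:1.0.0  # placeholder image
      resources:
        requests:
          cpu: "500m"
          memory: "512Mi"
          ephemeral-storage: "1Gi"   # disk the scheduler reserves on the node
        limits:
          cpu: "1000m"
          memory: "1Gi"
          ephemeral-storage: "2Gi"   # exceeding this makes the pod an eviction candidate
      volumeMounts:
        - name: scratch
          mountPath: /tmp            # build caches and intermediate files land here
  volumes:
    - name: scratch
      emptyDir:
        sizeLimit: "1Gi"             # caps this scratch volume on its own
```

A disk-backed emptyDir counts toward the pod's ephemeral-storage usage, so the sizeLimit is kept below the container limit here to leave room for logs and the container's writable layer.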

References:
