
Index Lifecycle Management: Hot-Warm-Cold

4 min read Chapter 34 of 60

The documentation platform accumulates content over time. Version 3.0 documentation receives 10,000 searches per day. Version 1.2, deprecated two years ago, receives 15. Both versions sit on the same high-performance SSD nodes. The cluster runs 12 nodes of identical hardware, and 80% of their storage holds data that almost nobody reads.

Tiered Architecture

The hot-warm-cold architecture assigns indices to hardware tiers based on access frequency:

| Tier | Hardware | Purpose | Typical Age |
| --- | --- | --- | --- |
| Hot | NVMe SSD, high CPU, 64GB RAM | Active indexing and frequent search | 0-30 days |
| Warm | SATA SSD, moderate CPU, 32GB RAM | Infrequent search, no writes | 30-180 days |
| Cold | HDD, minimal CPU, 16GB RAM | Rare search, compliance retention | 180+ days |
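
The age column of the table can be read as a simple lookup. The following is a minimal sketch, not part of OpenSearch: the `Tier` enum and `forAgeDays` method are hypothetical names, and the sketch assumes the boundaries are lower-inclusive (day 30 counts as warm, day 180 as cold), which the table does not specify.

```java
/** Storage tiers from the table above (hypothetical helper, not an OpenSearch type). */
enum Tier {
    HOT, WARM, COLD;

    /**
     * Maps an index age in days to a tier using the "Typical Age" column.
     * Assumption: boundaries are lower-inclusive, so day 30 is warm and day 180 is cold.
     */
    static Tier forAgeDays(long ageDays) {
        if (ageDays < 0) {
            throw new IllegalArgumentException("age must be non-negative: " + ageDays);
        }
        if (ageDays < 30) {
            return HOT;
        }
        if (ageDays < 180) {
            return WARM;
        }
        return COLD;
    }
}
```

In practice the mapping is enforced by the ISM policy rather than by application code; the sketch only makes the age thresholds explicit.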

A custom node attribute (here, temp) declares which tier a node belongs to. The node roles only say what kind of work the node does:

# opensearch.yml for a warm node
node.attr.temp: warm
node.roles:
  - data
  - ingest

OpenSearch ISM Policy

OpenSearch uses Index State Management (ISM) policies instead of Elasticsearch’s ILM. ISM policies are JSON documents that define states, transitions, and actions.

{
  "policy": {
    "description": "Documentation index lifecycle",
    "default_state": "hot",
    "states": [
      {
        "name": "hot",
        "actions": [
          {
            "rollover": {
              "min_doc_count": 500000,
              "min_index_age": "30d"
            }
          }
        ],
        "transitions": [
          {
            "state_name": "warm",
            "conditions": {
              "min_index_age": "30d"
            }
          }
        ]
      },
      {
        "name": "warm",
        "actions": [
          {
            "replica_count": {
              "number_of_replicas": 1
            }
          },
          {
            "allocation": {
              "require": {
                "temp": "warm"
              }
            }
          },
          {
            "force_merge": {
              "max_num_segments": 1
            }
          }
        ],
        "transitions": [
          {
            "state_name": "cold",
            "conditions": {
              "min_index_age": "180d"
            }
          }
        ]
      },
      {
        "name": "cold",
        "actions": [
          {
            "replica_count": {
              "number_of_replicas": 0
            }
          },
          {
            "allocation": {
              "require": {
                "temp": "cold"
              }
            }
          },
          {
            "read_only": {}
          }
        ],
        "transitions": [
          {
            "state_name": "delete",
            "conditions": {
              "min_index_age": "730d"
            }
          }
        ]
      },
      {
        "name": "delete",
        "actions": [
          {
            "delete": {}
          }
        ],
        "transitions": []
      }
    ],
    "ism_template": [
      {
        "index_patterns": ["docs-v*"],
        "priority": 100
      }
    ]
  }
}
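
A policy like this is registered through the ISM REST API with a PUT to `_plugins/_ism/policies/<policy_id>`. The sketch below only builds that request with the JDK's `java.net.http` types. The class name `IsmPolicyRequest`, the base URL, and the absence of authentication are assumptions for illustration; the policy ID `docs-lifecycle-policy` is the one referenced later in this chapter.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Hypothetical helper that builds the request for registering an ISM policy. */
final class IsmPolicyRequest {

    private IsmPolicyRequest() {}

    /**
     * Builds a PUT request against the ISM policy endpoint.
     *
     * @param baseUrl    cluster address, e.g. "http://localhost:9200" (assumed, no auth or TLS)
     * @param policyId   ID under which the policy is stored
     * @param policyJson the policy document, such as the JSON shown above
     */
    static HttpRequest build(String baseUrl, String policyId, String policyJson) {
        return HttpRequest.newBuilder()
            .uri(URI.create(baseUrl + "/_plugins/_ism/policies/" + policyId))
            .header("Content-Type", "application/json")
            .PUT(HttpRequest.BodyPublishers.ofString(policyJson))
            .build();
    }
}
```

Sending it is a matter of passing the request to `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`. A secured cluster would additionally need TLS and credentials, which this sketch leaves out.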

Rollover-Based Index Management

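// HARDENED: Bootstrap the first rollover-managed index with its write alias and ISM settings

import java.io.IOException;
import java.util.Map;

import org.opensearch.client.json.JsonData;
import org.opensearch.client.opensearch.OpenSearchClient;

public class IndexLifecycleManager {

    private final OpenSearchClient client;

    public IndexLifecycleManager(OpenSearchClient client) {
        this.client = client;
    }

    public void setupRolloverIndex(String tenantId) throws IOException {
        String alias = "docs-" + tenantId;
        String firstIndex = "docs-" + tenantId + "-000001";

        // Create the first index with the alias and is_write_index=true
        client.indices().create(c -> c
            .index(firstIndex)
            .aliases(alias, a -> a.isWriteIndex(true))
            .settings(s -> s
                .numberOfShards("2")
                .numberOfReplicas("1")
                .putAll(Map.of(
                    // Pin the index to nodes tagged node.attr.temp: hot
                    "index.routing.allocation.require.temp",
                    JsonData.of("hot"),
                    // Attach the lifecycle policy to this index
                    "plugins.index_state_management.policy_id",
                    JsonData.of("docs-lifecycle-policy"),
                    // The ISM rollover action needs to know which alias to roll over
                    "plugins.index_state_management.rollover_alias",
                    JsonData.of(alias)
                ))
            )
        );
    }
}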

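When either rollover condition is met (500,000 documents or 30 days), ISM creates a new index (docs-tenant-000002) and moves the write alias to it. The old index moves to the warm state once it is 30 days old, as the transition condition requires. Reads continue through the same alias, which spans all indices in the series. The new index does not inherit settings from its predecessor, so an index template matching the series should carry the hot allocation and ISM settings, and the policy's ism_template pattern (docs-v* above) has to match the tenant's index names for ISM to manage the new index.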
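
The rollover trigger and the index naming can be sketched in a few lines. This is an illustration of the behavior described above, not ISM's implementation: the class and method names are hypothetical, and the thresholds are copied from the policy in this chapter.

```java
/** Illustrative model of the rollover trigger and naming (hypothetical helper). */
final class RolloverSketch {

    // Thresholds taken from the rollover action in the policy above.
    static final long MIN_DOC_COUNT = 500_000;
    static final long MIN_INDEX_AGE_DAYS = 30;

    private RolloverSketch() {}

    /** Rollover fires when ANY condition is met, not only when all are. */
    static boolean shouldRollOver(long docCount, long ageDays) {
        return docCount >= MIN_DOC_COUNT || ageDays >= MIN_INDEX_AGE_DAYS;
    }

    /**
     * Computes the next name in a rollover series by incrementing the numeric
     * suffix and zero-padding it to six digits:
     * docs-tenant-000001 becomes docs-tenant-000002.
     */
    static String nextIndexName(String current) {
        int dash = current.lastIndexOf('-');
        if (dash < 0 || dash == current.length() - 1) {
            throw new IllegalArgumentException("no numeric suffix in: " + current);
        }
        // Long.parseLong throws NumberFormatException for a non-numeric suffix.
        long next = Long.parseLong(current.substring(dash + 1)) + 1;
        return current.substring(0, dash + 1) + String.format("%06d", next);
    }
}
```

A busy tenant that reaches 500,000 documents on day 5 rolls over on day 5, while a quiet tenant rolls over on day 30 regardless of document count.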

Hot-warm-cold architecture showing index progression through tiers with ISM policy transitions

The diagram shows three hardware tiers with indices progressing from hot (active writes, NVMe) to warm (read-only, SSD, force-merged) to cold (rare reads, HDD, zero replicas). ISM policy transitions trigger automatically based on index age.

Cost Modeling

Annual storage cost comparison for 10TB of documentation indices:

| Strategy | Hot Nodes | Warm Nodes | Cold Nodes | Annual Cost |
| --- | --- | --- | --- | --- |
| All hot (no tiering) | 12 x NVMe | 0 | 0 | $72,000 |
| Hot-warm | 4 x NVMe | 6 x SATA SSD | 0 | $42,000 |
| Hot-warm-cold | 3 x NVMe | 4 x SATA SSD | 3 x HDD | $31,000 |

The hot-warm-cold strategy reduces costs by 57% compared to the all-hot baseline. The savings come from two sources: cheaper hardware for infrequently accessed data, and reduced replica counts (cold indices with zero replicas store half the data of hot indices with one replica).
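
The arithmetic behind those two claims is short enough to check directly. The sketch below uses hypothetical helper names; the dollar figures come from the table above.

```java
/** Worked arithmetic for the cost table (hypothetical helper). */
final class TieringSavings {

    private TieringSavings() {}

    /** Percentage saved relative to the baseline annual cost. */
    static double savingsPercent(double baselineCost, double tieredCost) {
        if (baselineCost <= 0) {
            throw new IllegalArgumentException("baseline must be positive");
        }
        return (baselineCost - tieredCost) / baselineCost * 100.0;
    }

    /** Total data on disk: one primary copy plus one copy per replica. */
    static double storedTb(double primaryTb, int replicas) {
        return primaryTb * (1 + replicas);
    }
}
```

With the table's figures, `savingsPercent(72_000, 31_000)` is about 56.9, which rounds to the 57% quoted, and hot-warm alone saves about 41.7%. For the replica effect, `storedTb(10, 1)` is 20 TB against `storedTb(10, 0)` at 10 TB.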

The Decision Rule

Implement tiered storage when more than 30% of index data is accessed less than once per day. Above that threshold, the hardware cost savings justify the operational complexity of ISM policies.
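
The 30% rule can be evaluated from per-index size and search-rate figures. The following is a minimal sketch with hypothetical names; it assumes the two arrays are parallel (one entry per index) and that "accessed" is approximated by average searches per day.

```java
/** Applies the 30% decision rule to per-index statistics (hypothetical helper). */
final class TieringDecision {

    static final double THRESHOLD = 0.30;

    private TieringDecision() {}

    /** Fraction of total data held by indices searched less than once per day. */
    static double rarelyAccessedFraction(double[] sizeGb, double[] searchesPerDay) {
        if (sizeGb.length != searchesPerDay.length) {
            throw new IllegalArgumentException("arrays must have the same length");
        }
        double total = 0;
        double rarelyAccessed = 0;
        for (int i = 0; i < sizeGb.length; i++) {
            total += sizeGb[i];
            if (searchesPerDay[i] < 1.0) {
                rarelyAccessed += sizeGb[i];
            }
        }
        return total == 0 ? 0 : rarelyAccessed / total;
    }

    /** True when more than 30% of the data is rarely accessed. */
    static boolean shouldTier(double[] sizeGb, double[] searchesPerDay) {
        return rarelyAccessedFraction(sizeGb, searchesPerDay) > THRESHOLD;
    }
}
```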

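Set rollover conditions based on shard size targets (25-50GB per shard) and document count, not just age. Time-based rollover alone produces uneven shard sizes across tenants with different write volumes. The policy above sets only min_doc_count and min_index_age; ISM's rollover action also accepts a min_size condition for size-based rollover.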
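
To see why age-only rollover skews shard sizes, compare two tenants over the same 30-day window. This is a back-of-the-envelope sketch with hypothetical names and ingest rates; it assumes data is spread evenly across primary shards and ignores compression and deletes.

```java
/** Estimates shard size at rollover time (hypothetical helper). */
final class ShardSizing {

    static final double MIN_TARGET_GB = 25;
    static final double MAX_TARGET_GB = 50;

    private ShardSizing() {}

    /** Size of each primary shard after the given number of days of steady ingest. */
    static double shardSizeGb(double dailyIngestGb, int days, int primaryShards) {
        if (primaryShards <= 0) {
            throw new IllegalArgumentException("primaryShards must be positive");
        }
        return dailyIngestGb * days / primaryShards;
    }

    /** True when the shard falls inside the 25-50GB target range. */
    static boolean withinTarget(double shardGb) {
        return shardGb >= MIN_TARGET_GB && shardGb <= MAX_TARGET_GB;
    }
}
```

With two primary shards and a 30-day rollover, a tenant writing 0.5GB per day ends up with 7.5GB shards, while one writing 20GB per day ends up with 300GB shards. Neither falls in the target range, which is the case for adding a size-based condition.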

Force merge warm indices to a single segment. This eliminates deleted documents, reduces segment overhead, and improves read performance for the rare queries that hit warm data. Never force merge hot indices: the merge cost interferes with ongoing writes.