Mapping Versioning and Migration Strategy
The Symptom
The documentation platform needs to add a difficulty_level keyword field to the mapping. The developer runs PUT docs-v1/_mapping with the new field, and the request succeeds. The next sprint, the team needs to change the body field’s analyzer from standard to the custom code_analyzer. This time the mapping update API rejects the change, because the analyzer of an existing field is immutable. The only path forward is a full reindex into a new index with the correct mapping.
The team has no process for this. They create docs-v2 manually, reindex with the _reindex API, update all application code to point to docs-v2, and spend the next four hours debugging why the reindex missed 12,000 documents.
The Internals
OpenSearch index mappings are effectively append-only. The mapping API allows adding new fields, but it rejects changes to an existing field’s type or analyzer. This is a Lucene constraint: the inverted index, doc values, and stored fields for existing documents were built with the original field configuration. Changing the configuration retroactively would require rewriting every segment.
A mapping versioning strategy treats this constraint as a feature, not a limitation. By naming indices with version suffixes and accessing them through aliases, mapping changes become deployment operations rather than emergencies.
The Implementation
Index Naming Convention
docs-v1 → initial mapping
docs-v2 → added code_analyzer, changed body field analysis
docs-v3 → added vector field for semantic search
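The naming convention above is easy to automate. The following is a minimal sketch, assuming index names always end in -v followed by a number; the class name VersionedIndexName and its methods are hypothetical and not part of any OpenSearch client.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for the "<base>-v<N>" index naming convention. */
final class VersionedIndexName {
    // Greedy base name, then "-v" and a numeric version at the end of the name.
    private static final Pattern SUFFIX = Pattern.compile("^(.+)-v(\\d+)$");

    private VersionedIndexName() {}

    private static Matcher parse(String indexName) {
        Matcher m = SUFFIX.matcher(indexName);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a versioned index name: " + indexName);
        }
        return m;
    }

    /** Extracts N from "<base>-v<N>". */
    static int versionOf(String indexName) {
        return Integer.parseInt(parse(indexName).group(2));
    }

    /** Returns the next index name in the sequence, for example docs-v1 -> docs-v2. */
    static String next(String indexName) {
        Matcher m = parse(indexName);
        return m.group(1) + "-v" + (Integer.parseInt(m.group(2)) + 1);
    }
}
```

A migration script could call VersionedIndexName.next on the index currently behind the write alias to decide the name of the index it creates.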
Alias-Based Access
// HARDENED: Application code always uses the alias, never the index name.
// Index swaps are transparent to the application.
private static final String DOCS_READ_ALIAS = "docs-read";
private static final String DOCS_WRITE_ALIAS = "docs-write";

// Initial setup: create index and aliases
client.indices().create(c -> c.index("docs-v1"));
client.indices().updateAliases(a -> a
    .actions(act -> act.add(add -> add
        .index("docs-v1")
        .alias(DOCS_READ_ALIAS)
    ))
    .actions(act -> act.add(add -> add
        .index("docs-v1")
        .alias(DOCS_WRITE_ALIAS)
    ))
);

// All search operations use the read alias
SearchRequest searchRequest = SearchRequest.of(s -> s
    .index(DOCS_READ_ALIAS)
    .query(q -> q.match(m -> m.field("body").query(userQuery)))
);

// All index operations use the write alias
IndexRequest<DocPage> indexRequest = IndexRequest.of(r -> r
    .index(DOCS_WRITE_ALIAS)
    .id(docId)
    .document(page)
);
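The setup above attaches both aliases to docs-v1. When a new index is ready, the aliases can be moved in one request to the _aliases endpoint that pairs a remove action with an add action for each alias. The sketch below only builds that request with the JDK HTTP client types; the class AliasSwap and its method names are hypothetical, and it assumes index and alias names contain no characters that need JSON escaping.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.util.StringJoiner;

/** Hypothetical builder for an alias swap request; it does not send anything. */
final class AliasSwap {
    private AliasSwap() {}

    /** Builds the _aliases body: one remove and one add action per alias. */
    static String swapBody(String oldIndex, String newIndex, String... aliases) {
        StringJoiner actions = new StringJoiner(",", "{\"actions\":[", "]}");
        for (String alias : aliases) {
            actions.add(String.format(
                "{\"remove\":{\"index\":\"%s\",\"alias\":\"%s\"}}", oldIndex, alias));
            actions.add(String.format(
                "{\"add\":{\"index\":\"%s\",\"alias\":\"%s\"}}", newIndex, alias));
        }
        return actions.toString();
    }

    /** Wraps the body in a POST to <cluster>/_aliases. */
    static HttpRequest swapRequest(URI cluster, String oldIndex, String newIndex,
                                   String... aliases) {
        return HttpRequest.newBuilder(cluster.resolve("/_aliases"))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(swapBody(oldIndex, newIndex, aliases)))
            .build();
    }
}
```

Because all actions travel in a single request, there is no window in which an alias points at neither index. The same actions can be expressed through the updateAliases builder shown earlier.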
Mapping Migration Test
@Test
void mappingMigrationPreservesSearchBehavior() throws Exception {
    // Create v1 index with original mapping
    createDocsV1Index(client);

    // Index representative documents
    indexTestDocuments(client, "docs-v1", testDocuments());

    // Run representative queries and capture results
    List<SearchResult> v1Results = runQueryTestSet(client, "docs-v1");

    // Create v2 index with new mapping
    createDocsV2Index(client);

    // Reindex from v1 to v2
    client.reindex(r -> r
        .source(src -> src.index("docs-v1"))
        .dest(dst -> dst.index("docs-v2"))
    );
    client.indices().refresh(r -> r.index("docs-v2"));

    // Verify document count matches
    long v1Count = client.count(c -> c.index("docs-v1")).count();
    long v2Count = client.count(c -> c.index("docs-v2")).count();
    assertThat(v2Count).isEqualTo(v1Count);

    // Run the same queries against v2 and compare results
    List<SearchResult> v2Results = runQueryTestSet(client, "docs-v2");

    // Guard the index-based loop below: both runs must cover the same queries
    assertThat(v2Results).hasSameSizeAs(v1Results);

    // Every document v1 returned must still be returned by v2
    // (ordering may change with analyzer changes, so only membership is checked)
    for (int i = 0; i < v1Results.size(); i++) {
        assertThat(v2Results.get(i).documentIds())
            .as("Query '%s' result set changed", v1Results.get(i).query())
            .containsAll(v1Results.get(i).documentIds());
    }
}
This test catches two categories of migration failures: documents lost during reindex (count mismatch) and search behavior regressions in which a document that v1 returned for a query is no longer returned by v2. Running it in CI before deploying a mapping change prevents the manual debugging session that follows a botched reindex.
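The containsAll assertion in the test reports only documents that disappeared from a result set. When an analyzer change is expected to alter matching, it can help to see both sides of the change. The following sketch, using a hypothetical ResultSetDiff class over plain document ID lists, separates lost from gained documents for one query.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper: which document IDs a query lost or gained after a migration. */
final class ResultSetDiff {
    final Set<String> lost;    // returned before the migration, missing after
    final Set<String> gained;  // not returned before, returned after

    private ResultSetDiff(Set<String> lost, Set<String> gained) {
        this.lost = lost;
        this.gained = gained;
    }

    /** Compares two result lists by membership only; ranking order is ignored. */
    static ResultSetDiff between(List<String> before, List<String> after) {
        Set<String> lost = new LinkedHashSet<>(before);
        lost.removeAll(after);
        Set<String> gained = new LinkedHashSet<>(after);
        gained.removeAll(before);
        return new ResultSetDiff(lost, gained);
    }

    boolean isUnchanged() {
        return lost.isEmpty() && gained.isEmpty();
    }
}
```

A failing migration test can print lost and gained per query, which makes it easier to judge whether a change is an intended effect of the new analyzer or a regression.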
Mapping Compatibility Check
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

import org.opensearch.client.opensearch._types.mapping.Property;

public class MappingValidator {

    /**
     * Validate that a new mapping is compatible with the existing one.
     * Returns a list of incompatible changes that require reindexing.
     * Only field type changes and field removals are detected; parameter
     * changes within the same type (such as a different analyzer) are not.
     */
    public List<String> validateCompatibility(
            Map<String, Property> currentMapping,
            Map<String, Property> newMapping) {
        List<String> incompatibilities = new ArrayList<>();

        // Fields present in the new mapping: flag type changes on existing fields
        for (Map.Entry<String, Property> entry : newMapping.entrySet()) {
            String fieldName = entry.getKey();
            Property newProp = entry.getValue();
            if (!currentMapping.containsKey(fieldName)) {
                continue; // New field, always compatible
            }
            Property currentProp = currentMapping.get(fieldName);
            if (!currentProp._kind().equals(newProp._kind())) {
                incompatibilities.add(
                    "Field '%s': type change from %s to %s requires reindex"
                        .formatted(fieldName, currentProp._kind(), newProp._kind())
                );
            }
        }

        // Fields missing from the new mapping: removal is never an in-place update
        for (String fieldName : currentMapping.keySet()) {
            if (!newMapping.containsKey(fieldName)) {
                incompatibilities.add(
                    "Field '%s': removal requires reindex (cannot remove fields from mapping)"
                        .formatted(fieldName)
                );
            }
        }
        return incompatibilities;
    }
}
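The validator above compares field types only, so it would not flag the analyzer change from the opening scenario. One way to cover that case is to compare selected parameters of each field. The sketch below works on plain maps (as parsed from the mapping JSON) instead of the client's Property type; the class FieldParameterCheck is hypothetical, and the parameter list is an illustrative assumption, not an exhaustive list of what OpenSearch refuses to update.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Objects;

/** Hypothetical parameter-level check for a single field definition. */
final class FieldParameterCheck {
    // Assumption: an illustrative subset of parameters that cannot be updated in place.
    static final List<String> REINDEX_PARAMETERS =
        List.of("type", "analyzer", "normalizer", "index", "doc_values", "store");

    private FieldParameterCheck() {}

    /**
     * Returns one message per listed parameter whose value differs.
     * Absent values compare as null, so an implicit default versus the same
     * default written explicitly is reported as a change (conservative).
     */
    static List<String> changedParameters(String fieldName,
                                          Map<String, Object> current,
                                          Map<String, Object> proposed) {
        List<String> changes = new ArrayList<>();
        for (String parameter : REINDEX_PARAMETERS) {
            Object before = current.get(parameter);
            Object after = proposed.get(parameter);
            if (!Objects.equals(before, after)) {
                changes.add(String.format(
                    "Field '%s': %s changed from %s to %s, requires reindex",
                    fieldName, parameter, before, after));
            }
        }
        return changes;
    }
}
```

Called with the body field's current definition (analyzer standard) and the proposed one (analyzer code_analyzer), this returns a single message naming the analyzer, while adding a parameter outside the list returns nothing.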
The Measurement
Track mapping versions and migration history:
| Version | Change | Reindex Required | Migration Time (1M docs) | Downtime |
|---|---|---|---|---|
| v1 | Initial mapping | N/A | N/A | N/A |
| v2 | Added difficulty_level keyword | No | 0 (PUT _mapping) | 0 |
| v3 | Changed body analyzer | Yes | 12 minutes | 0 (alias swap) |
| v4 | Added embedding knn_vector | Yes | 45 minutes (includes embedding generation) | 0 (alias swap) |
Reindex time grows in proportion to document count and shrinks in proportion to bulk indexing throughput. With the alias-based access pattern, the cutover itself causes no downtime: when the remove and add actions are submitted in a single alias update request, the swap is atomic.
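The relationship between document count, throughput, and reindex time is simple division, which makes it easy to project a migration window from one measured run. As a rough reading of the table, 1M documents in 12 minutes works out to about 1,390 documents per second. The sketch below encodes that arithmetic; ReindexEstimate is a hypothetical helper, and it assumes throughput stays constant as the index grows, which real clusters may not honor.

```java
import java.time.Duration;

/** Hypothetical back-of-the-envelope estimator for reindex duration. */
final class ReindexEstimate {
    private ReindexEstimate() {}

    /** Throughput observed in a past run, in documents per second. */
    static double observedThroughput(long documentCount, Duration elapsed) {
        if (elapsed.getSeconds() <= 0) {
            throw new IllegalArgumentException("elapsed must be at least one second");
        }
        return documentCount / (double) elapsed.getSeconds();
    }

    /** Estimated duration, rounded up to whole seconds. */
    static Duration estimate(long documentCount, double docsPerSecond) {
        if (docsPerSecond <= 0) {
            throw new IllegalArgumentException("docsPerSecond must be positive");
        }
        return Duration.ofSeconds((long) Math.ceil(documentCount / docsPerSecond));
    }
}
```

Recording the observed throughput alongside each row of the migration history gives the next migration a starting estimate instead of a guess.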
The Decision Rule
Always access indices through aliases. Never reference index names directly in application code. This makes reindexing a deployment operation, not a code change.
Version-suffix index names. When a mapping change requires reindexing, create the new versioned index, reindex into it, verify with the migration test, and atomically swap the alias. The old index can be deleted after verification.
Test mapping migrations with Testcontainers in CI. The test should verify document count preservation and search behavior stability. A mapping migration that changes search results should be intentional and measured with the relevance evaluation framework (chapter 9), not discovered in production.