Facets, Aggregations, and the Memory Cost of Cardinality

The documentation search platform displays filter counts alongside search results: 45 results in “Guides,” 23 in “API Reference,” 12 in “Changelog,” across 3 versions. These counts are aggregations. Building them incorrectly either produces wrong numbers or consumes gigabytes of heap memory.

How Aggregations Execute

Aggregations run in two phases that follow the same scatter-gather path as the query itself:

Shard-level aggregation. Each shard computes its local aggregation result: term buckets with counts, metric values, or histogram bins. The computation uses doc values (column-oriented storage) for keyword, numeric, and date fields.
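Coordinating node reduction. The coordinating node merges shard-level results. For terms aggregations, this means combining buckets from all shards, summing counts, and returning the global top N. Because each shard returns only its top shard_size buckets, the merged counts can be approximate when a field has more unique values than shard_size.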
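
The two phases can be imitated in plain Java to see how the merge behaves. This is only a sketch under simplifying assumptions: it uses no OpenSearch classes, the names TwoPhaseTermsSketch, shardTopBuckets, and reduce are hypothetical, and each "shard" is just a list of field values. It shows that when a shard is limited to fewer buckets than it has unique values, the summed counts on the coordinating side can undercount.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java simulation of shard-level aggregation and coordinating-node reduction.
final class TwoPhaseTermsSketch {

    // Phase 1: one shard counts its local values and returns only its top shardSize buckets.
    static Map<String, Long> shardTopBuckets(List<String> shardValues, int shardSize) {
        Map<String, Long> counts = new HashMap<>();
        for (String value : shardValues) {
            counts.merge(value, 1L, Long::sum);
        }
        return topN(counts, shardSize);
    }

    // Phase 2: the coordinating node sums counts per key and keeps the global top N.
    static Map<String, Long> reduce(List<Map<String, Long>> shardResults, int size) {
        Map<String, Long> merged = new HashMap<>();
        for (Map<String, Long> shard : shardResults) {
            shard.forEach((key, count) -> merged.merge(key, count, Long::sum));
        }
        return topN(merged, size);
    }

    // Orders by count descending, then key ascending, and keeps the first n entries.
    private static Map<String, Long> topN(Map<String, Long> counts, int n) {
        List<Map.Entry<String, Long>> entries = new ArrayList<>(counts.entrySet());
        entries.sort(Map.Entry.<String, Long>comparingByValue(Comparator.reverseOrder())
            .thenComparing(Map.Entry.comparingByKey()));
        Map<String, Long> top = new LinkedHashMap<>();
        for (Map.Entry<String, Long> e : entries.subList(0, Math.min(n, entries.size()))) {
            top.put(e.getKey(), e.getValue());
        }
        return top;
    }

    public static void main(String[] args) {
        List<String> shard0 = List.of("guide", "guide", "guide", "api", "api", "changelog");
        List<String> shard1 = List.of("api", "api", "api", "changelog", "changelog", "guide");

        // Each shard may return only 2 of its 3 buckets: guide and changelog are undercounted.
        System.out.println(reduce(
            List.of(shardTopBuckets(shard0, 2), shardTopBuckets(shard1, 2)), 3));
        // prints {api=5, guide=3, changelog=2}

        // Each shard returns all 3 buckets: the merged counts are exact.
        System.out.println(reduce(
            List.of(shardTopBuckets(shard0, 3), shardTopBuckets(shard1, 3)), 3));
        // prints {api=5, guide=4, changelog=3}
    }
}
```

For low-cardinality facet fields such as content_type, every shard returns every bucket, so the second case applies and the counts are exact.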

The memory cost of aggregations is proportional to the number of unique values (cardinality) in the aggregated field. A terms aggregation on content_type (5 unique values) allocates minimal memory. A terms aggregation on api_method (50,000 unique values) allocates significantly more.

Global Ordinals

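For keyword fields, OpenSearch builds a global ordinals mapping that assigns each unique field value a shard-wide integer ID. The mapping is built per shard and held in heap memory. By default it is built lazily, on the first aggregation that needs it after a refresh has changed the shard's segments; with eager_global_ordinals enabled on the field, it is rebuilt at refresh time instead. The mapping size is proportional to the field’s cardinality.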

For a field with 50,000 unique values across 10 million documents, the global ordinals mapping consumes approximately 50,000 × (average_value_length + 8 bytes of overhead) bytes of heap. For short values (10 characters), that is roughly 1MB. For long values (200 characters), it is roughly 10MB. These figures are per shard.
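
The arithmetic behind those figures can be checked with a few lines of Java. This is a back-of-envelope sketch that simply encodes the estimate from the text; the class and method names are hypothetical, and the 8-byte overhead is the text's assumption rather than a measured value.

```java
// Back-of-envelope estimate of global ordinals heap usage for one shard.
final class GlobalOrdinalsEstimate {

    // Assumed per-value overhead in bytes, taken from the estimate in the text.
    static final long OVERHEAD_BYTES = 8;

    // uniqueValues * (averageValueLength + overhead), in bytes, for a single shard.
    static long estimateBytesPerShard(long uniqueValues, long averageValueLength) {
        return uniqueValues * (averageValueLength + OVERHEAD_BYTES);
    }

    public static void main(String[] args) {
        System.out.println(estimateBytesPerShard(50_000, 10));   // prints 900000 (about 1MB)
        System.out.println(estimateBytesPerShard(50_000, 200));  // prints 10400000 (about 10MB)
    }
}
```

Multiplying the result by the number of shards a node hosts gives a rough per-node figure for that field.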

Faceted Search Implementation

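// HARDENED: Faceted search with post_filter for accurate facet counts

// Imports assume the opensearch-java client; DocPage and client are defined elsewhere.
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

import org.opensearch.client.opensearch._types.query_dsl.BoolQuery;
import org.opensearch.client.opensearch._types.query_dsl.Query;
import org.opensearch.client.opensearch._types.query_dsl.TextQueryType;
import org.opensearch.client.opensearch.core.SearchRequest;
import org.opensearch.client.opensearch.core.SearchResponse;
import org.opensearch.client.opensearch.core.search.Hit;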

public record FacetedSearchResult(
    List<Hit<DocPage>> hits,
    Map<String, List<FacetBucket>> facets
) {}

public record FacetBucket(String value, long count) {}

public FacetedSearchResult facetedSearch(String tenantId, String query,
        Map<String, String> selectedFacets) throws IOException {

    SearchRequest.Builder builder = new SearchRequest.Builder()
        .index("docs-v1")
        .routing(tenantId)
        .size(10);

    // Base query: text relevance + tenant filter
    Query baseQuery = Query.of(q -> q
        .bool(b -> b
            .filter(f -> f.term(t -> t.field("tenant_id").value(tenantId)))
            .must(mu -> mu.multiMatch(mm -> mm
                .query(query)
                .fields("title^3", "body")
                .type(TextQueryType.CrossFields)
            ))
        )
    );
    builder.query(baseQuery);

    // Post-filter: narrow results WITHOUT affecting aggregation counts
    // This ensures facet counts reflect the base query, not the filtered results.
    if (!selectedFacets.isEmpty()) {
        BoolQuery.Builder postFilter = new BoolQuery.Builder();
        for (var facet : selectedFacets.entrySet()) {
            postFilter.filter(f -> f.term(t -> t
                .field(facet.getKey())
                .value(facet.getValue())
            ));
        }
        builder.postFilter(Query.of(q -> q.bool(postFilter.build())));
    }

    // Aggregations: compute facet counts on the base query results
    builder.aggregations("content_type", a -> a
        .terms(t -> t.field("content_type").size(20))
    );
    builder.aggregations("version", a -> a
        .terms(t -> t.field("version").size(50))
    );
    builder.aggregations("tags", a -> a
        .terms(t -> t.field("tags").size(30))
    );

    SearchResponse<DocPage> response = client.search(builder.build(), DocPage.class);

    // Extract facets from aggregation results
    Map<String, List<FacetBucket>> facets = new LinkedHashMap<>();

    for (var aggEntry : response.aggregations().entrySet()) {
        List<FacetBucket> buckets = aggEntry.getValue()
            .sterms().buckets().array().stream()
            .map(b -> new FacetBucket(b.key().stringValue(), b.docCount()))
            .toList();
        facets.put(aggEntry.getKey(), buckets);
    }

    return new FacetedSearchResult(response.hits().hits(), facets);
}

The post_filter pattern is critical for faceted search UX. Without it, selecting a “Guides” content type filter would recalculate the facet counts with only guides included, showing content_type: Guides (45) and no other content types. With post_filter, the facet counts reflect the base query, and the selected filter narrows only the result list. Users see all available refinement options with accurate counts.
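
The difference between the two orderings can be reproduced without a cluster. The following is a minimal in-memory sketch, not OpenSearch code: the Doc and Result records and both method names are hypothetical, and the list passed in stands for the documents matched by the base query.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Predicate;

// In-memory illustration of aggregating before or after the facet filter is applied.
final class PostFilterSketch {

    record Doc(String title, String contentType) {}

    record Result(List<Doc> hits, Map<String, Long> contentTypeCounts) {}

    // Counts documents per content type, sorted by key for stable output.
    static Map<String, Long> countByType(List<Doc> docs) {
        Map<String, Long> counts = new TreeMap<>();
        for (Doc doc : docs) {
            counts.merge(doc.contentType(), 1L, Long::sum);
        }
        return counts;
    }

    // post_filter style: aggregate over everything the base query matched, then narrow the hits.
    static Result withPostFilter(List<Doc> baseMatches, Predicate<Doc> selected) {
        List<Doc> hits = baseMatches.stream().filter(selected).toList();
        return new Result(hits, countByType(baseMatches));
    }

    // Main-query style: narrow first, so the aggregation only sees the filtered subset.
    static Result withMainQueryFilter(List<Doc> baseMatches, Predicate<Doc> selected) {
        List<Doc> filtered = baseMatches.stream().filter(selected).toList();
        return new Result(filtered, countByType(filtered));
    }

    public static void main(String[] args) {
        List<Doc> baseMatches = List.of(
            new Doc("Install", "guide"), new Doc("Upgrade", "guide"), new Doc("Tuning", "guide"),
            new Doc("Search API", "api"), new Doc("Index API", "api"),
            new Doc("Release notes", "changelog"));
        Predicate<Doc> guidesOnly = doc -> doc.contentType().equals("guide");

        System.out.println(withPostFilter(baseMatches, guidesOnly).contentTypeCounts());
        // prints {api=2, changelog=1, guide=3}
        System.out.println(withMainQueryFilter(baseMatches, guidesOnly).contentTypeCounts());
        // prints {guide=3}
    }
}
```

Both variants return the same three hits; only the facet counts differ.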

// FRAGILE: Filtering in the main query instead of post_filter
// Facet counts only reflect the filtered subset, not the full query results.
// Users cannot see alternative filter options.

Query filteredQuery = Query.of(q -> q
    .bool(b -> b
        .filter(f -> f.term(t -> t.field("tenant_id").value(tenantId)))
        .filter(f -> f.term(t -> t.field("content_type").value("guide")))
        .must(mu -> mu.match(m -> m.field("body").query(query)))
    )
);
// Aggregation on content_type now shows only "guide" bucket

High-Cardinality Aggregation Costs

// FRAGILE: Terms aggregation on a high-cardinality field
// api_method has 50,000 unique values. This aggregation builds
// global ordinals for all 50,000 values in heap memory, and size(50000)
// asks every shard to collect and return up to 50,000 buckets per request.

builder.aggregations("api_method", a -> a
    .terms(t -> t.field("api_method").size(50000))
);

// HARDENED: Use composite aggregation for high-cardinality fields
// Paginates through buckets so no single response carries all of them.
// To fetch the next page, pass the previous response's after_key in the after parameter.

builder.aggregations("api_method", a -> a
    .composite(c -> c
        .size(100)
        // sources takes one single-entry map per source, in sort order
        .sources(Map.of(
            "method", CompositeAggregationSource.of(cs -> cs
                .terms(t -> t.field("api_method"))
            )
        ))
    )
);

The composite aggregation returns buckets in pages, ordered by key rather than by count. The first request returns the first 100 buckets. Subsequent requests pass the after_key from the previous response in the after parameter to get the next 100. Bucket memory per request is bounded by the page size, not the field cardinality.
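
The paging loop on the caller's side can be sketched in plain Java. This is a simulation under stated assumptions, not client code: a sorted in-memory map stands in for the index, the names CompositePagingSketch, Page, nextPage, and countAllBuckets are hypothetical, and a real server would not hold every bucket in memory the way this stand-in does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Simulates after_key pagination over buckets that are sorted by key.
final class CompositePagingSketch {

    record Page(List<Map.Entry<String, Long>> buckets, String afterKey) {}

    // Returns up to pageSize buckets whose key sorts strictly after afterKey (null means start).
    static Page nextPage(NavigableMap<String, Long> sortedBuckets, String afterKey, int pageSize) {
        NavigableMap<String, Long> remaining =
            afterKey == null ? sortedBuckets : sortedBuckets.tailMap(afterKey, false);
        List<Map.Entry<String, Long>> page = new ArrayList<>();
        for (Map.Entry<String, Long> e : remaining.entrySet()) {
            if (page.size() == pageSize) {
                break;
            }
            page.add(Map.entry(e.getKey(), e.getValue()));
        }
        // The key of the last bucket on the page becomes the cursor for the next request.
        String next = page.isEmpty() ? null : page.get(page.size() - 1).getKey();
        return new Page(page, next);
    }

    // Caller loop: keep requesting pages until an empty page comes back.
    static int countAllBuckets(NavigableMap<String, Long> sortedBuckets, int pageSize) {
        int total = 0;
        String afterKey = null;
        while (true) {
            Page page = nextPage(sortedBuckets, afterKey, pageSize);
            if (page.buckets().isEmpty()) {
                break;
            }
            total += page.buckets().size(); // only one page is held at a time
            afterKey = page.afterKey();
        }
        return total;
    }

    public static void main(String[] args) {
        NavigableMap<String, Long> buckets = new TreeMap<>();
        for (int i = 0; i < 250; i++) {
            buckets.put(String.format("method-%03d", i), 1L);
        }
        System.out.println(countAllBuckets(buckets, 100)); // prints 250 (pages of 100, 100, 50)
    }
}
```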

Figure: Aggregation execution, showing shard-level computation, the coordinating node merge, and the post_filter separation for faceted search.

The diagram illustrates the two-phase aggregation execution. Each shard computes its local term buckets from doc values and returns them to the coordinating node. The coordinating node merges buckets by summing counts. The post_filter is applied to the hit list after aggregations are computed, so facet counts reflect the broader query while the result list reflects the user’s selected filters.

The Decision Rule

Use post_filter for all faceted search interfaces. The alternative (filtering in the main query) produces incorrect facet counts that confuse users and break the “drill-down” search pattern.

Use terms aggregations for fields with fewer than 1,000 unique values. Use composite aggregations for fields with higher cardinality. The memory difference is proportional to cardinality and compounds across concurrent queries.

Monitor global ordinals memory via _nodes/stats/indices/fielddata (add ?fields=* for a per-field breakdown). If fielddata memory exceeds 10% of heap, identify the high-cardinality fields responsible and switch their aggregations to composite or reduce their cardinality through value normalization.