Query Optimization Patterns
The Symptom
A search query uses a function_score with four scoring functions (recency decay, popularity boost, content type weight, and a Painless script for custom relevance). The Profile API shows the scoring phase consumes 70% of query time. Every matched document passes through all four scoring functions, even though only the top 10 results are returned to the user.
The Internals
OpenSearch evaluates queries in two contexts:
- Filter context. Binary yes/no matching with no scoring. Filters that recur can be cached as per-segment bitsets in the node query cache, so repeated identical filters are often far cheaper than their first execution.
- Query context. Matching with scoring. Each matching document receives a relevance score. With multiple scoring functions or scripts, scoring is typically the most expensive per-document operation.
The rescore API applies expensive scoring to only the top N documents (the window_size, counted per shard) from an initial cheap query. This reduces the number of function_score evaluations from “all matching documents” to “top N candidates per shard.”
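The two-phase idea can be sketched in plain Java without any client library. This is a simplified model, not OpenSearch's implementation: `TwoPhaseRanking`, `Hit`, and `rescoreTop` are hypothetical names, the scores are made up, and the expensive scorer simply replaces the first-pass score instead of being combined with it.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.ToDoubleFunction;

public class TwoPhaseRanking {
    /** A matched document and its current score. */
    record Hit(int docId, double score) {}

    private static final Comparator<Hit> BY_SCORE_DESC =
            Comparator.<Hit>comparingDouble(Hit::score).reversed();

    /**
     * Ranks all matches by their cheap score, then applies the expensive
     * scorer only to the first windowSize hits and re-sorts that window.
     * Hits outside the window keep their first-pass order and are never
     * passed to the expensive scorer.
     */
    static List<Hit> rescoreTop(List<Hit> matches, int windowSize,
                                ToDoubleFunction<Hit> expensiveScorer) {
        if (windowSize < 0) {
            throw new IllegalArgumentException("windowSize must be >= 0");
        }
        List<Hit> ranked = new ArrayList<>(matches);
        ranked.sort(BY_SCORE_DESC);
        int n = Math.min(windowSize, ranked.size());
        List<Hit> window = new ArrayList<>(n);
        for (Hit h : ranked.subList(0, n)) {
            window.add(new Hit(h.docId(), expensiveScorer.applyAsDouble(h)));
        }
        window.sort(BY_SCORE_DESC);
        List<Hit> result = new ArrayList<>(window);
        result.addAll(ranked.subList(n, ranked.size()));
        return result;
    }

    public static void main(String[] args) {
        List<Hit> matches = new ArrayList<>();
        for (int i = 0; i < 50_000; i++) {
            matches.add(new Hit(i, (i * 37 % 1000) / 100.0)); // made-up cheap scores
        }
        AtomicInteger expensiveCalls = new AtomicInteger();
        List<Hit> ranked = rescoreTop(matches, 100, h -> {
            expensiveCalls.incrementAndGet();
            return h.score() * 1.5; // stand-in for a costly scoring function
        });
        System.out.println("expensive evaluations: " + expensiveCalls.get()); // 100
        System.out.println("hits returned: " + ranked.size());                // 50000
    }
}
```

The counter shows the point of the technique: the expensive scorer runs once per window entry, not once per match.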
The Implementation
Rescore for Expensive Scoring
// FRAGILE: function_score applied to all 50,000 matching documents
// Every scoring function executes on every match (the content type weight is not shown)
SearchRequest expensive = SearchRequest.of(s -> s
.index("docs-v1")
.query(q -> q.functionScore(fs -> fs
.query(baseQuery)
.functions(fn -> fn.decayDate(d -> d
.field("published_date")
.origin(JsonData.of("now"))
.scale(JsonData.of("30d"))
))
.functions(fn -> fn.fieldValueFactor(fvf -> fvf
.field("view_count")
.modifier(FieldValueFactorModifier.Log1p)
))
.functions(fn -> fn.scriptScore(ss -> ss
.script(sc -> sc.inline(is -> is
.source("_score * doc['boost_factor'].value")
))
))
))
.size(10)
);
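// HARDENED: Cheap initial query + rescore on the top 100 candidates per shard
// The rescore function_score has no inner query, so it defaults to match_all and
// _score in the script is a constant, not the BM25 score. The rescore score is
// then combined with the original score using the weights below.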
SearchRequest optimized = SearchRequest.of(s -> s
.index("docs-v1")
.query(baseQuery) // Cheap BM25 scoring only
.rescore(r -> r
.windowSize(100) // Rescore only the top 100 candidates on each shard
.query(rq -> rq
.rescoreQuery(q -> q.functionScore(fs -> fs
.functions(fn -> fn.decayDate(d -> d
.field("published_date")
.origin(JsonData.of("now"))
.scale(JsonData.of("30d"))
))
.functions(fn -> fn.fieldValueFactor(fvf -> fvf
.field("view_count")
.modifier(FieldValueFactorModifier.Log1p)
))
.functions(fn -> fn.scriptScore(ss -> ss
.script(sc -> sc.inline(is -> is
.source("_score * doc['boost_factor'].value")
))
))
))
.queryWeight(0.7)
.rescoreQueryWeight(1.2)
)
)
.size(10)
);
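To make the two weights concrete, here is a small worked example of how a document's final score is combined, assuming the default rescore score mode (`total`), which adds the weighted original score to the weighted rescore score. The class name and the two input scores are hypothetical; only the weights 0.7 and 1.2 come from the request above.

```java
public class RescoreScoreMath {
    /** Final score under score mode "total": weighted sum of both passes. */
    static double combinedTotal(double originalScore, double rescoreScore,
                                double queryWeight, double rescoreQueryWeight) {
        return queryWeight * originalScore + rescoreQueryWeight * rescoreScore;
    }

    public static void main(String[] args) {
        double bm25 = 8.0;          // hypothetical first-pass score
        double functionScore = 2.5; // hypothetical function_score result
        // 0.7 * 8.0 + 1.2 * 2.5 = 5.6 + 3.0, approximately 8.6
        System.out.println(combinedTotal(bm25, functionScore, 0.7, 1.2));
    }
}
```

Because the combination is additive here, the ranking will not match a single function_score query that multiplies the function results into the BM25 score, so relevance should be re-evaluated after switching to rescore.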
Filter Context Optimization
// HARDENED: Move non-scoring clauses to filter context
SearchRequest filterOptimized = SearchRequest.of(s -> s
.index("docs-v1")
.query(q -> q.bool(b -> b
// Scoring clauses: affect ranking
.must(mu -> mu.multiMatch(mm -> mm
.query(query)
.fields("title^3", "body")
.type(TextQueryType.CrossFields)
))
// Filter clauses: binary match, cacheable, no scoring cost
.filter(f -> f.term(t -> t.field("tenant_id").value(tenantId)))
.filter(f -> f.term(t -> t.field("status").value("published")))
.filter(f -> f.range(r -> r
.field("published_date")
.gte(JsonData.of("2023-01-01"))
))
))
.size(10)
);
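Clauses in filter context skip scoring entirely, and the node query cache can store their matches as per-segment bitsets. Caching is selective: it generally applies to filters that recur and to larger segments, and very cheap clauses such as a single term lookup may not be cached at all because they are already fast. When a filter such as the date range is served from the cache, it costs a small fraction of its first execution. Clauses in must context do not benefit from this cache because a score has to be computed for every match on every request.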
Bool Clause Ordering
// HARDENED: List filter clauses from most selective to least selective
// The written order documents intent. Lucene generally estimates each clause's
// cost and leads the conjunction with the sparsest one, whatever order is written here.
.filter(f -> f.term(t -> t.field("tenant_id").value(tenantId))) // 2% of docs
.filter(f -> f.term(t -> t.field("status").value("published"))) // 90% of remaining
.filter(f -> f.range(r -> r.field("published_date").gte(JsonData.of("2023-01-01")))) // 60% of remaining
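The most selective filter (tenant_id, matching 2% of documents) eliminates 98% of documents, so the remaining filters and the scoring query only need to consider that 2%. Lucene generally estimates the cost of each clause and iterates from the sparsest one, so the engine usually arrives at this plan even when the clauses are written in another order. Writing them most-selective-first keeps the intent explicit, and any latency difference attributed to ordering should be confirmed with the Profile API.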
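The percentages in the clause comments compound multiplicatively. The arithmetic can be checked with a short sketch; the index size of 1,000,000 documents and the names `FilterSelectivity` and `survivors` are assumptions for illustration only.

```java
public class FilterSelectivity {
    /** Documents left after a chain of filters, each given as the fraction it keeps. */
    static long survivors(long totalDocs, double... keepFractions) {
        double remaining = totalDocs;
        for (double fraction : keepFractions) {
            remaining *= fraction;
        }
        return Math.round(remaining);
    }

    public static void main(String[] args) {
        long totalDocs = 1_000_000; // hypothetical index size
        System.out.println(survivors(totalDocs, 0.02));             // after tenant_id: 20000
        System.out.println(survivors(totalDocs, 0.02, 0.90));       // after status: 18000
        System.out.println(survivors(totalDocs, 0.02, 0.90, 0.60)); // after date range: 10800
        // The final count is the same in any order; only the intermediate
        // candidate counts (and therefore the work done) differ.
        System.out.println(survivors(totalDocs, 0.60, 0.90, 0.02)); // 10800
    }
}
```

Under these assumptions about 1.08% of the index reaches the scoring clause, which is why the sparse tenant_id filter dominates the cost.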
The Measurement
Query optimization impact on p50 and p99 latency (50,000 matching documents, returning top 10):
| Optimization | p50 Latency | p99 Latency | p50 Reduction vs. Baseline |
|---|---|---|---|
| function_score on all matches | 180ms | 420ms | Baseline |
| Rescore (window=100) | 35ms | 95ms | 81% |
| + Filter context for non-scoring | 28ms | 72ms | 84% |
| + Selective filter ordering | 24ms | 65ms | 87% |
The combined optimizations reduce p50 latency by 87%, from 180ms to 24ms. The rescore optimization alone provides the largest single improvement (180ms to 35ms) by reducing the expensive scoring computation from 50,000 documents to the 100-document rescore window.
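The reduction figures follow directly from the latencies in the table. A minimal sketch of the calculation, with `LatencyImprovement` and `reductionPercent` as illustrative names:

```java
import java.util.Locale;

public class LatencyImprovement {
    /** Percentage reduction of a latency relative to its baseline. */
    static double reductionPercent(double baselineMs, double optimizedMs) {
        return (baselineMs - optimizedMs) / baselineMs * 100.0;
    }

    public static void main(String[] args) {
        double baselineP50 = 180, baselineP99 = 420;
        double[][] rows = { {35, 95}, {28, 72}, {24, 65} }; // {p50, p99} per table row
        for (double[] row : rows) {
            System.out.println(String.format(Locale.ROOT,
                    "p50 -%.1f%%, p99 -%.1f%%",
                    reductionPercent(baselineP50, row[0]),
                    reductionPercent(baselineP99, row[1])));
        }
    }
}
```

The same function applied to the p99 column shows a slightly smaller relative gain at the tail than at the median.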
The Decision Rule
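Use rescore when the scoring function involves decay, field_value_factor, or script_score and the cheap first-pass query already ranks good candidates near the top. Rescoring can only reorder documents inside the window, so a document that the first pass ranks below the window can never surface. As a starting point, set the rescore window to 10x the requested result size. A window of 100 for a result size of 10 usually provides sufficient candidate diversity while limiting the scoring cost. The window applies per shard.
Move every clause that does not affect ranking to filter context. This includes tenant filters, status filters, date range filters, and access control filters. Filter context clauses skip scoring and can be served from the query cache when they recur, so they add little cost.
List filter clauses from most selective to least selective. The filter that eliminates the most documents provides the largest benefit to the other filters and the scoring query. Because Lucene generally chooses the cheapest clause to lead regardless of written order, treat the ordering as a readability convention and verify any measured gain with the Profile API.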
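One way to turn the 10x guideline into code is a small sizing helper. This is a hypothetical policy, not part of any OpenSearch API: `RescoreWindow`, `windowSize`, and the `maxPages` parameter are assumptions. It keeps the window fixed across pages, since hits beyond the window are not rescored and a window that changes between pages can shift results.

```java
public class RescoreWindow {
    /**
     * Picks one window size for all pages of a search: at least
     * multiplier * pageSize, and large enough to cover the deepest
     * page that should still be rescored.
     */
    static int windowSize(int pageSize, int maxPages, int multiplier) {
        if (pageSize <= 0 || maxPages <= 0 || multiplier <= 0) {
            throw new IllegalArgumentException("all arguments must be positive");
        }
        int byMultiplier = Math.multiplyExact(pageSize, multiplier);
        int byDepth = Math.multiplyExact(pageSize, maxPages);
        return Math.max(byMultiplier, byDepth);
    }

    public static void main(String[] args) {
        System.out.println(windowSize(10, 1, 10));  // single page of 10: 100
        System.out.println(windowSize(10, 20, 10)); // 20 rescored pages: 200
    }
}
```

A larger window raises the rescoring cost on every shard, so the depth that deserves rescoring is a product decision as much as a performance one.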