Tenant Isolation Testing with Testcontainers
The Symptom
A developer adds a new search feature and forgets to include the routing parameter on a query. The query returns results from all tenants. Tenant A sees Tenant B’s documents in search results. The bug reaches production because the test suite only tests with a single tenant.
The Internals
Tenant isolation in OpenSearch depends on three mechanisms working together:
- Custom routing. Documents are routed to shards by tenant_id. A query with the same routing value only hits the tenant’s shards.
- Filtered alias. The alias includes a tenant_id filter, ensuring only the tenant’s documents are returned even if routing is omitted.
- Application-layer enforcement. The application always uses the tenant alias, never the concrete index name.
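The first two mechanisms can be attached to a single alias. The sketch below builds a request body for the `_aliases` API that adds a filtered alias with a routing value. The class and method names are hypothetical, the index and alias naming (docs-shared-v1, docs-<tenant>) is taken from the test suite in this section, and the term filter assumes tenant_id is mapped as a keyword field.

```java
// Sketch: request body for POST /_aliases that registers a per-tenant alias
// carrying both a tenant_id filter and a routing value.
final class TenantAliasBody {
    // Assumption: shared index name as used by the test suite in this section.
    static final String SHARED_INDEX = "docs-shared-v1";

    static String forTenant(String tenantId) {
        // Reject ids that could break out of the JSON string or the alias name.
        if (tenantId == null || !tenantId.matches("[a-z0-9][a-z0-9-]*")) {
            throw new IllegalArgumentException("Unsafe tenant id: " + tenantId);
        }
        return String.format(
            "{\"actions\":[{\"add\":{"
                + "\"index\":\"%s\","
                + "\"alias\":\"docs-%s\","
                + "\"filter\":{\"term\":{\"tenant_id\":\"%s\"}},"
                + "\"routing\":\"%s\"}}]}",
            SHARED_INDEX, tenantId, tenantId, tenantId);
    }
}
```

Calling `TenantAliasBody.forTenant("acme")` yields an action that adds the alias docs-acme on docs-shared-v1. Whether the MultiTenantIndexManager used later in this section sets routing on the alias is not shown in the source, so treat that part as one possible design.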
Each mechanism is a defense layer. Testing must verify that all three layers function correctly and that removing any single layer does not expose data from other tenants.
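Routing and filtering do different jobs, which is why each layer needs its own test. The in-memory model below is a deliberately simplified sketch, not OpenSearch itself: `String.hashCode()` stands in for the real routing hash, and all class and method names are hypothetical. It shows that routing only selects a shard, so tenants that share a shard are separated only by the filter.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of a shared index: routing picks a shard, a filter picks documents.
final class IsolationLayersModel {
    record Doc(String id, String tenantId) {}

    private final int shardCount;
    private final List<List<Doc>> shards = new ArrayList<>();

    IsolationLayersModel(int shardCount) {
        this.shardCount = shardCount;
        for (int i = 0; i < shardCount; i++) {
            shards.add(new ArrayList<>());
        }
    }

    // Simplified stand-in for the routing hash used by the real engine.
    private int shardFor(String routing) {
        return Math.floorMod(routing.hashCode(), shardCount);
    }

    // Documents are routed by tenant id, as in the custom-routing layer.
    void index(Doc doc) {
        shards.get(shardFor(doc.tenantId())).add(doc);
    }

    // Routing only: narrows the search to one shard but filters nothing.
    List<Doc> searchWithRoutingOnly(String tenantId) {
        return List.copyOf(shards.get(shardFor(tenantId)));
    }

    // Filter only: scans every shard and keeps the tenant's documents.
    List<Doc> searchWithFilterOnly(String tenantId) {
        return shards.stream()
            .flatMap(List::stream)
            .filter(doc -> doc.tenantId().equals(tenantId))
            .toList();
    }
}
```

With a single shard, every tenant lands on the same shard, so `searchWithRoutingOnly("acme")` returns Globex documents too, while `searchWithFilterOnly("acme")` does not. In this model routing narrows the work and the filter provides the isolation.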
The Implementation
Multi-Tenant Integration Test Suite
```java
import static org.junit.jupiter.api.Assertions.assertEquals;

import java.io.IOException;
import java.util.List;

import org.junit.jupiter.api.BeforeAll;
import org.junit.jupiter.api.Test;
import org.opensearch.client.opensearch.OpenSearchClient;
import org.opensearch.client.opensearch._types.Refresh;
import org.opensearch.client.opensearch.core.BulkRequest;
import org.testcontainers.containers.GenericContainer;
import org.testcontainers.containers.wait.strategy.Wait;
import org.testcontainers.junit.jupiter.Container;
import org.testcontainers.junit.jupiter.Testcontainers;

@Testcontainers
public class TenantIsolationTest {

    // Static field: the container is started once and shared by all tests.
    @Container
    private static final GenericContainer<?> opensearch =
        new GenericContainer<>("opensearchproject/opensearch:2.12.0")
            .withExposedPorts(9200)
            .withEnv("discovery.type", "single-node")
            .withEnv("plugins.security.disabled", "true")
            // 2.12+ images otherwise expect an initial admin password
            // for the demo security configuration.
            .withEnv("DISABLE_INSTALL_DEMO_CONFIG", "true")
            // Wait for the HTTP API, not just an open port.
            .waitingFor(Wait.forHttp("/_cluster/health")
                .forPort(9200)
                .forStatusCode(200));

    private static OpenSearchClient client;
    private static MultiTenantIndexManager indexManager;

    @BeforeAll
    static void setup() throws Exception {
        // createClient is a helper (defined elsewhere) that builds an
        // OpenSearchClient for the given host and port.
        client = createClient(opensearch.getHost(),
            opensearch.getMappedPort(9200));
        indexManager = new MultiTenantIndexManager(client);

        // Create shared index
        indexManager.createSharedIndex();

        // Onboard two tenants
        indexManager.onboardTenant("acme");
        indexManager.onboardTenant("globex");

        // Index documents for each tenant
        indexDocuments("acme", List.of(
            new DocPage("acme-auth", "acme", "Authentication Guide",
                "How to authenticate with Acme API", "guide", "v1"),
            new DocPage("acme-billing", "acme", "Billing API Reference",
                "Billing endpoint documentation", "api-ref", "v1")
        ));
        indexDocuments("globex", List.of(
            new DocPage("globex-setup", "globex", "Setup Guide",
                "How to set up Globex SDK", "guide", "v1"),
            new DocPage("globex-webhooks", "globex", "Webhook Reference",
                "Webhook event types and payloads", "api-ref", "v1")
        ));

        // Make the indexed documents visible to search
        client.indices().refresh(r -> r.index("docs-shared-v1"));
    }

    @Test
    void tenantSearchReturnsOnlyOwnDocuments() throws IOException {
        // Search through Acme's alias
        var acmeResults = client.search(s -> s
                .index("docs-acme")
                .query(q -> q.matchAll(m -> m))
                .size(100),
            DocPage.class
        );
        assertEquals(2, acmeResults.hits().total().value());
        acmeResults.hits().hits().forEach(hit ->
            assertEquals("acme", hit.source().tenantId(),
                "All results must belong to Acme"));
    }

    @Test
    void tenantCannotAccessOtherTenantDocuments() throws IOException {
        // Acme searches for a term that exists only in Globex documents
        var acmeResults = client.search(s -> s
                .index("docs-acme")
                .query(q -> q.match(m -> m
                    .field("body")
                    .query("Globex SDK")
                ))
                .size(100),
            DocPage.class
        );
        assertEquals(0, acmeResults.hits().total().value(),
            "Acme must not see Globex documents");
    }

    @Test
    void queryWithoutRoutingStillFiltered() throws IOException {
        // Even without explicit routing, the filtered alias ensures isolation
        var results = client.search(s -> s
                .index("docs-acme")
                .query(q -> q.matchAll(m -> m))
                .size(100),
            DocPage.class
        );
        // Assert the count so an empty result cannot pass vacuously
        assertEquals(2, results.hits().total().value());
        results.hits().hits().forEach(hit ->
            assertEquals("acme", hit.source().tenantId(),
                "Filtered alias must enforce tenant isolation"));
    }

    @Test
    void directIndexAccessExposesAllTenants() throws IOException {
        // Direct index access (bypassing the alias) returns all tenants.
        // This documents why the application must never do this.
        // The count of 4 relies on the offboarding test cleaning up after itself.
        var allResults = client.search(s -> s
                .index("docs-shared-v1")
                .query(q -> q.matchAll(m -> m))
                .size(100),
            DocPage.class
        );
        assertEquals(4, allResults.hits().total().value(),
            "Direct index access returns all tenants' data; " +
            "application must always use tenant alias");
    }

    @Test
    void tenantDeletionRemovesAllTenantData() throws Exception {
        // Onboard and then offboard a test tenant
        indexManager.onboardTenant("temp-tenant");
        indexDocuments("temp-tenant", List.of(
            new DocPage("temp-doc", "temp-tenant", "Temp Doc",
                "Temporary content", "guide", "v1")
        ));
        client.indices().refresh(r -> r.index("docs-shared-v1"));

        // Verify document exists
        assertEquals(1, client.search(s -> s
                .index("docs-temp-tenant")
                .query(q -> q.matchAll(m -> m))
                .size(100),
            DocPage.class
        ).hits().total().value());

        // Offboard tenant
        indexManager.offboardTenant("temp-tenant");
        client.indices().refresh(r -> r.index("docs-shared-v1"));

        // Verify data is gone, checking the concrete index directly
        var remainingDocs = client.search(s -> s
                .index("docs-shared-v1")
                .query(q -> q.term(t -> t
                    .field("tenant_id").value("temp-tenant")))
                .size(100),
            DocPage.class
        );
        assertEquals(0, remainingDocs.hits().total().value(),
            "All tenant data must be deleted after offboarding");
    }

    // Bulk-indexes documents into the shared index, routed by tenant id.
    private static void indexDocuments(String tenantId,
            List<DocPage> docs) throws IOException {
        BulkRequest.Builder bulk = new BulkRequest.Builder()
            .index("docs-shared-v1")
            .refresh(Refresh.True);
        for (DocPage doc : docs) {
            bulk.operations(op -> op
                .index(idx -> idx
                    .id(doc.slug())
                    .routing(tenantId)
                    .document(doc)
                )
            );
        }
        client.bulk(bulk.build());
    }
}
```
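The suite above checks leakage in one direction only, from Acme's alias toward Globex's data. A symmetric check runs the same assertion for every ordered pair of tenants. The helper below is a small sketch with hypothetical names that enumerates those pairs; it is plain Java and makes no OpenSearch calls.

```java
import java.util.ArrayList;
import java.util.List;

// Enumerates (searcher, owner) pairs so each tenant's alias can be checked
// against every other tenant's documents.
final class TenantPairs {
    record Pair(String searcher, String owner) {}

    static List<Pair> ordered(List<String> tenants) {
        List<Pair> pairs = new ArrayList<>();
        for (String searcher : tenants) {
            for (String owner : tenants) {
                // A tenant seeing its own documents is not a leak.
                if (!searcher.equals(owner)) {
                    pairs.add(new Pair(searcher, owner));
                }
            }
        }
        return pairs;
    }
}
```

For the two tenants in this section, `TenantPairs.ordered(List.of("acme", "globex"))` yields the pairs (acme, globex) and (globex, acme). One way to use it is as the argument source of a JUnit 5 parameterized test that searches through docs-<searcher> and asserts that no hit has the owner's tenant id.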
The Measurement
Test execution time for the isolation suite:
| Test Count | Container Startup | Test Execution | Total |
|---|---|---|---|
| 5 isolation tests | 12s | 3s | 15s |
| + 10 search quality tests | 12s | 8s | 20s |
| + 5 offboarding tests | 12s | 6s | 18s |
Container startup dominates execution time. Because the @Container field is static, the Testcontainers JUnit 5 extension starts the container once and shares it across all tests in the class, amortizing the 12-second startup over the entire suite.
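The effect of sharing the container can be estimated with simple arithmetic. The sketch below uses the figures from the first table row (12s startup, 3s execution, 5 tests) and assumes that startup cost is constant and that test execution time does not change. The per-test figure is an estimate under those assumptions, not a measurement.

```java
// Back-of-the-envelope comparison of a shared container and one container per test.
final class StartupCost {
    // Shared container: startup is paid once for the whole class.
    static int sharedContainerSeconds(int startupSeconds, int executionSeconds) {
        return startupSeconds + executionSeconds;
    }

    // One container per test: startup is paid again for every test method.
    static int perTestContainerSeconds(int startupSeconds, int executionSeconds,
                                       int testCount) {
        return startupSeconds * testCount + executionSeconds;
    }
}
```

With these inputs the shared container gives 12 + 3 = 15 seconds, matching the table, while restarting the container for each of the 5 tests would take roughly 5 × 12 + 3 = 63 seconds.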
The Decision Rule
Include tenant isolation tests in every CI pipeline. A single missing filter, or one code path that queries the concrete index instead of the tenant alias, is enough to expose another tenant’s data. The tests catch this before production.
Test with at least two tenants where both have documents matching the same search terms. Single-tenant tests cannot detect cross-tenant data leakage because there is no other tenant’s data to leak.
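A fixture only satisfies this rule if the tenants really do share search terms, and that is easy to check before relying on it. The helper below is a sketch with hypothetical names: it lowercases and splits titles on non-word characters, a crude stand-in for a real analyzer, and returns the terms that occur in every tenant's documents.

```java
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Finds terms that appear in the titles of every tenant in a test fixture.
final class FixtureOverlap {
    static Set<String> sharedTerms(Map<String, List<String>> titlesByTenant) {
        Set<String> shared = null;
        for (List<String> titles : titlesByTenant.values()) {
            Set<String> terms = new TreeSet<>();
            for (String title : titles) {
                // Crude tokenization: lowercase, split on non-word characters.
                for (String token : title.toLowerCase(Locale.ROOT).split("\\W+")) {
                    if (!token.isEmpty()) {
                        terms.add(token);
                    }
                }
            }
            if (shared == null) {
                shared = terms;            // first tenant seeds the set
            } else {
                shared.retainAll(terms);   // keep only terms every tenant has
            }
        }
        return shared == null ? Set.of() : shared;
    }
}
```

For the titles used in this section's fixture (Authentication Guide and Billing API Reference for Acme, Setup Guide and Webhook Reference for Globex), the shared terms are guide and reference. A leak test that searches one of those terms through docs-acme has Globex documents that would match if isolation failed.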
Verify that direct index access (bypassing the alias) returns data from all tenants. This test documents the security contract: the application must always use the tenant alias, never the concrete index name. If a code path accidentally uses the concrete index, tests that exercise that path against this two-tenant fixture will see the other tenant’s documents and fail before deployment.
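The contract can also be enforced in code rather than by convention alone. The guard below is a minimal sketch under stated assumptions: the class name, method names, and the docs-shared-v prefix check are hypothetical, derived only from the index and alias names used in this section. It resolves tenant ids to alias names and rejects concrete index names and multi-index expressions.

```java
// Sketch of an application-layer guard: search code obtains index names only
// through this class, so a concrete index name is rejected before any query runs.
final class TenantIndexGuard {
    static final String ALIAS_PREFIX = "docs-";
    // Assumption: concrete indices follow the docs-shared-v<N> pattern.
    static final String CONCRETE_PREFIX = "docs-shared-v";

    static String aliasFor(String tenantId) {
        if (tenantId == null || !tenantId.matches("[a-z0-9][a-z0-9-]*")) {
            throw new IllegalArgumentException("Invalid tenant id: " + tenantId);
        }
        // Also rejects tenant ids that would collide with a concrete index name.
        return requireTenantAlias(ALIAS_PREFIX + tenantId);
    }

    static String requireTenantAlias(String indexName) {
        boolean multiIndex = indexName.contains("*") || indexName.contains(",");
        if (multiIndex
                || indexName.startsWith(CONCRETE_PREFIX)
                || !indexName.startsWith(ALIAS_PREFIX)) {
            throw new IllegalStateException(
                "Searches must target a single tenant alias, got: " + indexName);
        }
        return indexName;
    }
}
```

Here `TenantIndexGuard.aliasFor("acme")` returns docs-acme, while `requireTenantAlias("docs-shared-v1")` and `requireTenantAlias("docs-*")` both throw. A unit test on the guard complements the container-based suite, because it fails without needing OpenSearch to be running.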