Testing Resilience
A unit test for a circuit breaker checks whether CircuitBreaker.executeSupplier() throws CallNotPermittedException after the failure threshold is crossed. That confirms the library works. It does not confirm that your application behaves correctly when the circuit breaker opens: that the fallback activates, the metrics are emitted, the response code is 200 (degraded) rather than 503, and the customer sees a slightly less accurate fraud score instead of an error page.
Resilience testing requires integration tests that simulate real failure conditions: network timeouts, connection refusals, slow responses, and partial failures. The dependency must be a real (or realistically simulated) service, not a mock that returns instantly.
The Test Infrastructure
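// PRODUCTION - Base class for resilience integration tests
import com.github.tomakehurst.wiremock.client.WireMock;
import io.github.resilience4j.circuitbreaker.CircuitBreaker;
import io.github.resilience4j.circuitbreaker.CircuitBreakerRegistry;
import org.junit.jupiter.api.BeforeEach;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.test.context.DynamicPropertyRegistry;
import org.springframework.test.context.DynamicPropertySource;
import org.testcontainers.containers.GenericContainer;
import org.testcontainers.junit.jupiter.Container;
import org.testcontainers.junit.jupiter.Testcontainers;
import org.testcontainers.utility.DockerImageName;

@SpringBootTest(webEnvironment = SpringBootTest.WebEnvironment.RANDOM_PORT)
@Testcontainers
abstract class ResilienceTestBase {

    // WireMock container standing in for the fraud service
    @Container
    static GenericContainer<?> fraudService = new GenericContainer<>(
            DockerImageName.parse("wiremock/wiremock:latest"))
        .withExposedPorts(8080)
        .withCommand("--global-response-templating");

    // WireMock container standing in for the balance service
    @Container
    static GenericContainer<?> balanceService = new GenericContainer<>(
            DockerImageName.parse("wiremock/wiremock:latest"))
        .withExposedPorts(8080);

    @Autowired
    private CircuitBreakerRegistry cbRegistry;

    // Point the application at the dynamically mapped container ports.
    // getHost() is used instead of a hard-coded "localhost" so the tests
    // also work when Docker runs on a remote host.
    @DynamicPropertySource
    static void configureProperties(DynamicPropertyRegistry registry) {
        registry.add("fraud.service.url", () ->
            "http://" + fraudService.getHost() + ":"
                + fraudService.getMappedPort(8080));
        registry.add("balance.service.url", () ->
            "http://" + balanceService.getHost() + ":"
                + balanceService.getMappedPort(8080));
    }

    protected WireMock fraudWireMock() {
        return new WireMock(fraudService.getHost(),
            fraudService.getMappedPort(8080));
    }

    protected WireMock balanceWireMock() {
        return new WireMock(balanceService.getHost(),
            balanceService.getMappedPort(8080));
    }

    // Every test starts from empty stub mappings and CLOSED circuit breakers
    @BeforeEach
    void resetWireMock() {
        fraudWireMock().resetMappings();
        balanceWireMock().resetMappings();
        resetCircuitBreakers();
    }

    private void resetCircuitBreakers() {
        cbRegistry.getAllCircuitBreakers()
            .forEach(CircuitBreaker::reset);
    }
}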
Each test starts with clean WireMock mappings and reset circuit breakers. Because the @Container fields are static, Testcontainers starts each container once per test class, before the first test method runs, and stops it after the last. Port mapping is dynamic, which avoids conflicts with other services on the host.
Testing Circuit Breaker State Transitions
// PRODUCTION - Circuit breaker opens after threshold failures
import static org.assertj.core.api.Assertions.assertThat;

import com.github.tomakehurst.wiremock.client.WireMock;
import io.github.resilience4j.circuitbreaker.CircuitBreaker;
import io.github.resilience4j.circuitbreaker.CircuitBreakerRegistry;
import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.web.client.TestRestTemplate;
import org.springframework.http.HttpStatus;
import org.springframework.http.ResponseEntity;

class CircuitBreakerTransitionTest extends ResilienceTestBase {

    @Autowired
    private TestRestTemplate restTemplate;

    @Autowired
    private CircuitBreakerRegistry cbRegistry;

    @Test
    void circuitBreakerOpens_afterFailureThreshold() {
        // Configure fraud service to return 500
        fraudWireMock().register(
            WireMock.post("/fraud/score")
                .willReturn(WireMock.serverError()
                    .withBody("{\"error\":\"internal\"}")));

        CircuitBreaker cb = cbRegistry.circuitBreaker("fraudDetection");

        // Minimum calls before circuit breaker evaluates: 20
        // Failure rate threshold: 50%
        // Send 20 requests that all fail
        for (int i = 0; i < 20; i++) {
            restTemplate.postForEntity("/payments",
                samplePayment(), PaymentResponse.class);
        }

        // Circuit breaker should now be OPEN
        assertThat(cb.getState())
            .isEqualTo(CircuitBreaker.State.OPEN);

        // Next request should get fallback (not 500)
        ResponseEntity<PaymentResponse> response =
            restTemplate.postForEntity("/payments",
                samplePayment(), PaymentResponse.class);
        assertThat(response.getStatusCode()).isEqualTo(HttpStatus.OK);
        assertThat(response.getBody().fraudCheckStatus())
            .isEqualTo("FALLBACK");
    }

    @Test
    void circuitBreakerRecovery_halfOpenToClosedTransition() {
        // Open the circuit breaker
        CircuitBreaker cb = cbRegistry.circuitBreaker("fraudDetection");
        cb.transitionToOpenState();

        // Configure fraud service to return success
        fraudWireMock().register(
            WireMock.post("/fraud/score")
                .willReturn(WireMock.okJson(
                    "{\"score\":0.1,\"decision\":\"PERMIT\"}")));

        // Transition to half-open
        cb.transitionToHalfOpenState();

        // Send permitted-number-of-calls-in-half-open-state requests (5)
        for (int i = 0; i < 5; i++) {
            ResponseEntity<PaymentResponse> response =
                restTemplate.postForEntity("/payments",
                    samplePayment(), PaymentResponse.class);
            assertThat(response.getStatusCode()).isEqualTo(HttpStatus.OK);
        }

        // Circuit breaker should transition back to CLOSED
        assertThat(cb.getState())
            .isEqualTo(CircuitBreaker.State.CLOSED);
    }
}
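The request counts in these tests (20 calls to open, 5 calls to close) follow from the count-based thresholds quoted in the comments. The sketch below is a deliberately simplified plain-Java model of those rules, included only to make the arithmetic visible. BreakerModel and its method names are hypothetical, and the model is not Resilience4j's implementation: it assumes the sliding window is the same size as the minimum number of calls and ignores slow-call rates and time-based windows.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Simplified count-based breaker model (hypothetical; not Resilience4j code). */
class BreakerModel {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int minimumCalls;            // calls needed before the rate is evaluated
    private final double failureRateThreshold; // fraction, e.g. 0.5 for 50%
    private final int halfOpenCalls;           // trial calls evaluated in HALF_OPEN
    private final Deque<Boolean> window = new ArrayDeque<>(); // true = failed call
    private State state = State.CLOSED;
    private int trialCalls;
    private int trialFailures;

    BreakerModel(int minimumCalls, double failureRateThreshold, int halfOpenCalls) {
        this.minimumCalls = minimumCalls;
        this.failureRateThreshold = failureRateThreshold;
        this.halfOpenCalls = halfOpenCalls;
    }

    State state() {
        return state;
    }

    /** Mirrors a manual transition to HALF_OPEN and restarts the trial counters. */
    void transitionToHalfOpen() {
        state = State.HALF_OPEN;
        trialCalls = 0;
        trialFailures = 0;
    }

    /** Records one call outcome and applies the transition rules. */
    void record(boolean failed) {
        if (state == State.OPEN) {
            return; // calls are rejected while OPEN, so nothing is recorded
        }
        if (state == State.HALF_OPEN) {
            trialCalls++;
            if (failed) {
                trialFailures++;
            }
            if (trialCalls >= halfOpenCalls) {
                state = atOrAboveThreshold(trialFailures, trialCalls)
                        ? State.OPEN : State.CLOSED;
                window.clear();
            }
            return;
        }
        // CLOSED: keep the most recent minimumCalls outcomes
        window.addLast(failed);
        if (window.size() > minimumCalls) {
            window.removeFirst();
        }
        if (window.size() >= minimumCalls) {
            int failures = 0;
            for (boolean f : window) {
                if (f) {
                    failures++;
                }
            }
            if (atOrAboveThreshold(failures, window.size())) {
                state = State.OPEN;
            }
        }
    }

    private boolean atOrAboveThreshold(int failures, int calls) {
        return (double) failures / calls >= failureRateThreshold;
    }
}
```

In this model, 19 consecutive failures leave the breaker CLOSED because the minimum number of calls has not been reached; the 20th failure opens it. That is why the first test sends exactly 20 failing requests before asserting on the state.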
Testing Timeout Behavior
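// PRODUCTION - Verify timeout fires at the configured duration
import static org.assertj.core.api.Assertions.assertThat;

import com.github.tomakehurst.wiremock.client.WireMock;
import java.time.Duration;
import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.web.client.TestRestTemplate;
import org.springframework.http.ResponseEntity;

class TimeoutResilienceTest extends ResilienceTestBase {

    @Autowired
    private TestRestTemplate restTemplate;

    @Test
    void timeout_activatesFallback() {
        // Configure fraud service to respond slowly (3 seconds)
        fraudWireMock().register(
            WireMock.post("/fraud/score")
                .willReturn(WireMock.okJson(
                        "{\"score\":0.1,\"decision\":\"PERMIT\"}")
                    .withFixedDelay(3000)));

        long start = System.nanoTime();
        ResponseEntity<PaymentResponse> response =
            restTemplate.postForEntity("/payments",
                samplePayment(), PaymentResponse.class);
        long elapsed = Duration.ofNanos(
            System.nanoTime() - start).toMillis();

        // Response should arrive before the 3-second delay
        // (TimeLimiter timeout is 2s; 500 ms of slack for overhead)
        assertThat(elapsed).isLessThan(2500);

        // Fallback should have activated
        assertThat(response.getBody().fraudCheckStatus())
            .isEqualTo("FALLBACK");
    }
}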
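The shape of that assertion (a slow dependency, a shorter timeout, a fallback value, and a bound on elapsed time) can be reproduced without any framework. The following is a minimal sketch using only the JDK's CompletableFuture. TimeoutFallbackSketch, callWithTimeout, and slowFraudCheck are hypothetical names, and this is not how Resilience4j's TimeLimiter is implemented.

```java
import java.time.Duration;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Plain-JDK illustration of "time out and fall back" (hypothetical helper). */
class TimeoutFallbackSketch {

    /**
     * Runs the call asynchronously and returns its result, or the fallback
     * if the call has not completed within the timeout.
     */
    static <T> T callWithTimeout(Supplier<T> call, Duration timeout, T fallback) {
        return CompletableFuture.supplyAsync(call)
                .completeOnTimeout(fallback, timeout.toMillis(), TimeUnit.MILLISECONDS)
                .join();
    }

    /** Stand-in for a dependency that takes delayMillis to answer. */
    static String slowFraudCheck(long delayMillis) {
        try {
            Thread.sleep(delayMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "CHECKED";
    }
}
```

Note that completeOnTimeout only completes the future early; the slow call itself keeps running in the background. Whether the underlying HTTP request is actually cancelled in the real service depends on the client and the TimeLimiter configuration, which is one more reason to measure elapsed time in an integration test rather than assume it.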
Testing Concurrent Load and Bulkhead Rejection
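// PRODUCTION - Bulkhead rejects when concurrency exceeds limit
import static org.assertj.core.api.Assertions.assertThat;

import com.github.tomakehurst.wiremock.client.WireMock;
import io.github.resilience4j.bulkhead.Bulkhead;
import io.github.resilience4j.bulkhead.BulkheadRegistry;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.web.client.TestRestTemplate;
import org.springframework.http.ResponseEntity;

class BulkheadResilienceTest extends ResilienceTestBase {

    @Autowired
    private TestRestTemplate restTemplate;

    @Autowired
    private BulkheadRegistry bulkheadRegistry;

    @Test
    void bulkhead_rejectsExcessConcurrency() throws Exception {
        // Configure fraud service to respond slowly (hold permits).
        // The delay stays below the 2s TimeLimiter timeout; otherwise the
        // permitted calls would time out and fall back as well.
        fraudWireMock().register(
            WireMock.post("/fraud/score")
                .willReturn(WireMock.okJson(
                        "{\"score\":0.1,\"decision\":\"PERMIT\"}")
                    .withFixedDelay(1000)));

        Bulkhead bulkhead = bulkheadRegistry.bulkhead("fraudDetection");
        int maxConcurrent = bulkhead.getBulkheadConfig()
            .getMaxConcurrentCalls();

        // Send more requests than the bulkhead allows
        int totalRequests = maxConcurrent + 10;
        ExecutorService executor = Executors.newFixedThreadPool(
            totalRequests);
        CountDownLatch latch = new CountDownLatch(totalRequests);
        AtomicInteger rejections = new AtomicInteger();

        for (int i = 0; i < totalRequests; i++) {
            executor.submit(() -> {
                try {
                    ResponseEntity<PaymentResponse> response =
                        restTemplate.postForEntity("/payments",
                            samplePayment(),
                            PaymentResponse.class);
                    // Any fallback response is counted as a rejection
                    if ("FALLBACK".equals(
                            response.getBody().fraudCheckStatus())) {
                        rejections.incrementAndGet();
                    }
                } finally {
                    latch.countDown();
                }
            });
        }

        // Fail the test if the requests did not all finish in time
        assertThat(latch.await(10, TimeUnit.SECONDS)).isTrue();
        executor.shutdown();

        // At least some requests should have been rejected by bulkhead
        assertThat(rejections.get()).isGreaterThan(0);
        // But not all (maxConcurrent requests should succeed)
        assertThat(rejections.get()).isLessThan(totalRequests);
    }
}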
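The integration test above can only assert "some but not all" rejections, because it cannot control exactly when each request reaches the bulkhead. The mechanism itself is easier to see in isolation. Below is a minimal sketch of a semaphore-style bulkhead in plain Java, with a demo that holds every permit and then makes one extra call. SemaphoreBulkheadSketch and its methods are hypothetical names; the sketch rejects immediately when no permit is free, which corresponds to a bulkhead configured with no wait time and is not Resilience4j's implementation.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

/** Semaphore-style bulkhead that rejects immediately when full (hypothetical). */
class SemaphoreBulkheadSketch {
    private final Semaphore permits;

    SemaphoreBulkheadSketch(int maxConcurrentCalls) {
        this.permits = new Semaphore(maxConcurrentCalls);
    }

    /** Runs the call if a permit is free; otherwise returns the fallback at once. */
    <T> T execute(Supplier<T> call, T fallback) {
        if (!permits.tryAcquire()) {
            return fallback;
        }
        try {
            return call.get();
        } finally {
            permits.release();
        }
    }

    /**
     * Saturates the bulkhead with blocked calls, makes one extra call, then
     * releases the blocked calls and calls again.
     * Returns { result while saturated, result after release }.
     */
    static String[] demo(int maxConcurrentCalls) {
        SemaphoreBulkheadSketch bulkhead = new SemaphoreBulkheadSketch(maxConcurrentCalls);
        CountDownLatch allHolding = new CountDownLatch(maxConcurrentCalls);
        CountDownLatch release = new CountDownLatch(1);
        Thread[] holders = new Thread[maxConcurrentCalls];
        for (int i = 0; i < holders.length; i++) {
            holders[i] = new Thread(() -> bulkhead.execute(() -> {
                allHolding.countDown();  // signal that a permit is held
                awaitQuietly(release);   // keep holding it, like a slow dependency
                return "CHECKED";
            }, "FALLBACK"));
            holders[i].start();
        }
        awaitQuietly(allHolding);        // every permit is now taken
        String whileSaturated = bulkhead.execute(() -> "CHECKED", "FALLBACK");
        release.countDown();
        for (Thread holder : holders) {
            joinQuietly(holder);
        }
        String afterRelease = bulkhead.execute(() -> "CHECKED", "FALLBACK");
        return new String[] { whileSaturated, afterRelease };
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    private static void joinQuietly(Thread thread) {
        try {
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Calling SemaphoreBulkheadSketch.demo(3) yields "FALLBACK" for the call made while all three permits are held and "CHECKED" for the call made after they are released. The latches make the saturation deterministic, which the HTTP-level test cannot do; the two approaches complement each other.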
Testing the Full Resilience Stack
// PRODUCTION - End-to-end resilience test: degradation and recovery
import static org.assertj.core.api.Assertions.assertThat;

import com.github.tomakehurst.wiremock.client.WireMock;
import io.github.resilience4j.circuitbreaker.CircuitBreaker;
import io.github.resilience4j.circuitbreaker.CircuitBreakerRegistry;
import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.web.client.TestRestTemplate;
import org.springframework.http.HttpStatus;
import org.springframework.http.ResponseEntity;

class FullResilienceStackTest extends ResilienceTestBase {

    @Autowired
    private TestRestTemplate restTemplate;

    @Autowired
    private CircuitBreakerRegistry cbRegistry;

    @Test
    void fullDegradationAndRecovery() {
        // Phase 1: Normal operation
        fraudWireMock().register(
            WireMock.post("/fraud/score")
                .willReturn(WireMock.okJson(
                    "{\"score\":0.1,\"decision\":\"PERMIT\"}")));
        ResponseEntity<PaymentResponse> normal =
            restTemplate.postForEntity("/payments",
                samplePayment(), PaymentResponse.class);
        assertThat(normal.getBody().fraudCheckStatus())
            .isEqualTo("CHECKED");

        // Phase 2: Dependency failure -> circuit breaker opens
        fraudWireMock().resetMappings();
        fraudWireMock().register(
            WireMock.post("/fraud/score")
                .willReturn(WireMock.serverError()));
        // Send enough requests to open the circuit breaker
        for (int i = 0; i < 25; i++) {
            restTemplate.postForEntity("/payments",
                samplePayment(), PaymentResponse.class);
        }
        CircuitBreaker cb = cbRegistry.circuitBreaker("fraudDetection");
        assertThat(cb.getState())
            .isEqualTo(CircuitBreaker.State.OPEN);

        // Phase 3: Verify fallback during open state
        ResponseEntity<PaymentResponse> degraded =
            restTemplate.postForEntity("/payments",
                samplePayment(), PaymentResponse.class);
        assertThat(degraded.getStatusCode()).isEqualTo(HttpStatus.OK);
        assertThat(degraded.getBody().fraudCheckStatus())
            .isEqualTo("FALLBACK");

        // Phase 4: Dependency recovers -> circuit breaker closes
        fraudWireMock().resetMappings();
        fraudWireMock().register(
            WireMock.post("/fraud/score")
                .willReturn(WireMock.okJson(
                    "{\"score\":0.2,\"decision\":\"PERMIT\"}")));
        // Skip the wait in OPEN state by triggering HALF_OPEN manually
        cb.transitionToHalfOpenState();
        for (int i = 0; i < 5; i++) {
            restTemplate.postForEntity("/payments",
                samplePayment(), PaymentResponse.class);
        }
        assertThat(cb.getState())
            .isEqualTo(CircuitBreaker.State.CLOSED);

        // Phase 5: Normal operation resumed
        ResponseEntity<PaymentResponse> recovered =
            restTemplate.postForEntity("/payments",
                samplePayment(), PaymentResponse.class);
        assertThat(recovered.getBody().fraudCheckStatus())
            .isEqualTo("CHECKED");
    }
}
This test walks through the complete lifecycle: normal operation, degradation, fallback, recovery, and return to normal. Each phase verifies a different aspect of the resilience configuration. The test takes approximately 30 seconds to run (including container startup) and provides confidence that the entire resilience stack works together.
What These Tests Do Not Cover
Integration tests verify that patterns activate correctly under simulated failures. They do not verify:
- Performance under sustained load. A 30-second test does not reveal thread pool exhaustion that takes 5 minutes to develop. Load tests with tools like Gatling or k6 are needed.
- Cascading failures across services. The test uses WireMock, not real services. A staging environment with multiple services and injected failures (Chapter 15) tests cross-service cascading.
- Clock-dependent behavior. Circuit breaker wait-duration-in-open-state uses real time. Tests that need to verify automatic transitions from OPEN to HALF_OPEN either wait the full duration or manually trigger the transition (as done above).
- JVM-level failures. Out-of-memory, garbage collection pauses, and thread starvation are not reproducible with WireMock delays. Chaos engineering (Chapter 15) addresses these.
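On the clock-dependent point: for components you own, a third option besides waiting or forcing the transition is to make the time source injectable, so a test can advance time instantly. The sketch below shows the idea with java.time.Clock. MutableClock and OpenStateTimer are hypothetical classes written for this illustration; whether the resilience library in use accepts a custom clock depends on the library and its version, so treat this as a pattern for your own code rather than a description of any library API.

```java
import java.time.Clock;
import java.time.Duration;
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZoneOffset;

/** Test clock that only moves when told to (hypothetical helper). */
class MutableClock extends Clock {
    private Instant now;

    MutableClock(Instant start) {
        this.now = start;
    }

    void advance(Duration amount) {
        now = now.plus(amount);
    }

    @Override
    public Instant instant() {
        return now;
    }

    @Override
    public ZoneId getZone() {
        return ZoneOffset.UTC;
    }

    @Override
    public Clock withZone(ZoneId zone) {
        return this; // the zone is irrelevant for this sketch
    }
}

/** Models only the wait between OPEN and HALF_OPEN (hypothetical). */
class OpenStateTimer {
    private final Clock clock;
    private final Duration waitInOpenState;
    private Instant openedAt;

    OpenStateTimer(Clock clock, Duration waitInOpenState) {
        this.clock = clock;
        this.waitInOpenState = waitInOpenState;
    }

    /** Records the moment the breaker opened. */
    void open() {
        openedAt = clock.instant();
    }

    /** True once the configured wait has elapsed since open() was called. */
    boolean mayTransitionToHalfOpen() {
        return openedAt != null
                && !clock.instant().isBefore(openedAt.plus(waitInOpenState));
    }
}
```

With a 30-second wait, a test can call open(), advance the MutableClock by 29 seconds and see that the transition is still blocked, then advance one more second and see it permitted, all without sleeping. Production code would pass Clock.systemUTC() instead.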