resilience patterns in production

Testing Resilience

Chapter 29 of 40

A unit test for a circuit breaker checks whether CircuitBreaker.executeSupplier() throws CallNotPermittedException after the threshold is crossed. This confirms that the library works. It does not confirm that your application behaves correctly when the circuit breaker opens: that the fallback activates, the metrics are emitted, the response code is 200 (degraded) rather than 503, and the customer sees a slightly less accurate fraud score rather than an error page.

Resilience testing requires integration tests that simulate real failure conditions: network timeouts, connection refusals, slow responses, and partial failures. The dependency must be a real (or realistically simulated) service, not a mock that returns instantly.
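
To make the difference from an instantly returning mock concrete, here is a minimal sketch that uses only JDK classes (com.sun.net.httpserver.HttpServer and java.net.http.HttpClient) rather than the WireMock setup used in the rest of this chapter. The class name SlowDependencyDemo, the method names, and the delay values are illustrative assumptions, not part of the payment service.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.time.Duration;

// Hypothetical demo class: a real socket plus a real delay, so the
// client's timeout handling is actually exercised.
final class SlowDependencyDemo {

    /** Starts a local stub that answers /fraud/score after delayMillis. */
    static HttpServer startSlowStub(long delayMillis) {
        try {
            // Port 0 asks the OS for any free port, like dynamic port mapping
            HttpServer server = HttpServer.create(
                    new InetSocketAddress("127.0.0.1", 0), 0);
            server.createContext("/fraud/score", exchange -> {
                try {
                    Thread.sleep(delayMillis); // simulate a slow dependency
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                byte[] body = "{\"score\":0.1,\"decision\":\"PERMIT\"}"
                        .getBytes(StandardCharsets.UTF_8);
                exchange.sendResponseHeaders(200, body.length);
                try (OutputStream out = exchange.getResponseBody()) {
                    out.write(body);
                }
            });
            server.start();
            return server;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns "CHECKED" on a timely 200, "FALLBACK" on timeout or error. */
    static String call(int port, Duration timeout) {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(
                        URI.create("http://127.0.0.1:" + port + "/fraud/score"))
                .timeout(timeout)
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();
        try {
            HttpResponse<String> response =
                    client.send(request, HttpResponse.BodyHandlers.ofString());
            return response.statusCode() == 200 ? "CHECKED" : "FALLBACK";
        } catch (IOException e) {
            // HttpTimeoutException is an IOException, so timeouts land here
            return "FALLBACK";
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "FALLBACK";
        }
    }
}
```

With a 400 ms stub delay, a call with a 100 ms timeout takes the fallback path, while a call with a generous timeout returns normally. A mock that returns instantly would never reach the timeout branch.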

The Test Infrastructure

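// PRODUCTION - Base class for resilience integration tests
import com.github.tomakehurst.wiremock.client.WireMock;
import io.github.resilience4j.circuitbreaker.CircuitBreaker;
import io.github.resilience4j.circuitbreaker.CircuitBreakerRegistry;
import org.junit.jupiter.api.BeforeEach;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.test.context.DynamicPropertyRegistry;
import org.springframework.test.context.DynamicPropertySource;
import org.testcontainers.containers.GenericContainer;
import org.testcontainers.junit.jupiter.Container;
import org.testcontainers.junit.jupiter.Testcontainers;
import org.testcontainers.utility.DockerImageName;
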
@SpringBootTest(webEnvironment = SpringBootTest.WebEnvironment.RANDOM_PORT)
@Testcontainers
abstract class ResilienceTestBase {

    @Container
    static GenericContainer<?> fraudService = new GenericContainer<>(
            DockerImageName.parse("wiremock/wiremock:latest"))
            .withExposedPorts(8080)
            .withCommand("--global-response-templating");

    @Container
    static GenericContainer<?> balanceService = new GenericContainer<>(
            DockerImageName.parse("wiremock/wiremock:latest"))
            .withExposedPorts(8080);

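    @DynamicPropertySource
    static void configureProperties(DynamicPropertyRegistry registry) {
        // Use getHost() instead of a hard-coded "localhost": the Docker
        // host can differ, for example with a remote Docker daemon
        registry.add("fraud.service.url", () ->
                "http://" + fraudService.getHost() + ":"
                        + fraudService.getMappedPort(8080));
        registry.add("balance.service.url", () ->
                "http://" + balanceService.getHost() + ":"
                        + balanceService.getMappedPort(8080));
    }

    // Admin client for stubbing the fraud service container
    protected WireMock fraudWireMock() {
        return new WireMock(fraudService.getHost(),
                fraudService.getMappedPort(8080));
    }

    // Admin client for stubbing the balance service container
    protected WireMock balanceWireMock() {
        return new WireMock(balanceService.getHost(),
                balanceService.getMappedPort(8080));
    }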

    @BeforeEach
    void resetWireMock() {
        fraudWireMock().resetMappings();
        balanceWireMock().resetMappings();
        resetCircuitBreakers();
    }

    @Autowired
    private CircuitBreakerRegistry cbRegistry;

    private void resetCircuitBreakers() {
        cbRegistry.getAllCircuitBreakers()
                .forEach(CircuitBreaker::reset);
    }
}

Each test starts with clean WireMock mappings and reset circuit breakers. Because the @Container fields are static, Testcontainers starts each container once before the first test method of a test class and stops it after the last, rather than restarting it for every test. Port mapping is dynamic, which avoids conflicts with other services on the host.

Testing Circuit Breaker State Transitions

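// PRODUCTION - Circuit breaker opens after threshold failures
import static org.assertj.core.api.Assertions.assertThat;

import com.github.tomakehurst.wiremock.client.WireMock;
import io.github.resilience4j.circuitbreaker.CircuitBreaker;
import io.github.resilience4j.circuitbreaker.CircuitBreakerRegistry;
import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.web.client.TestRestTemplate;
import org.springframework.http.HttpStatus;
import org.springframework.http.ResponseEntity;
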
class CircuitBreakerTransitionTest extends ResilienceTestBase {

    @Autowired
    private TestRestTemplate restTemplate;

    @Autowired
    private CircuitBreakerRegistry cbRegistry;

    @Test
    void circuitBreakerOpens_afterFailureThreshold() {
        // Configure fraud service to return 500
        fraudWireMock().register(
                WireMock.post("/fraud/score")
                        .willReturn(WireMock.serverError()
                                .withBody("{\"error\":\"internal\"}")));

        CircuitBreaker cb = cbRegistry.circuitBreaker("fraudDetection");

        // Minimum calls before circuit breaker evaluates: 20
        // Failure rate threshold: 50%
        // Send 20 requests that all fail
        for (int i = 0; i < 20; i++) {
            restTemplate.postForEntity("/payments",
                    samplePayment(), PaymentResponse.class);
        }

        // Circuit breaker should now be OPEN
        assertThat(cb.getState())
                .isEqualTo(CircuitBreaker.State.OPEN);

        // Next request should get fallback (not 500)
        ResponseEntity<PaymentResponse> response =
                restTemplate.postForEntity("/payments",
                        samplePayment(), PaymentResponse.class);

        assertThat(response.getStatusCode()).isEqualTo(HttpStatus.OK);
        assertThat(response.getBody().fraudCheckStatus())
                .isEqualTo("FALLBACK");
    }

    @Test
    void circuitBreakerRecovery_halfOpenToClosedTransition() {
        // Open the circuit breaker
        CircuitBreaker cb = cbRegistry.circuitBreaker("fraudDetection");
        cb.transitionToOpenState();

        // Configure fraud service to return success
        fraudWireMock().register(
                WireMock.post("/fraud/score")
                        .willReturn(WireMock.okJson(
                                "{\"score\":0.1,\"decision\":\"PERMIT\"}")));

        // Transition to half-open
        cb.transitionToHalfOpenState();

        // Send permitted-number-of-calls-in-half-open-state requests (5)
        for (int i = 0; i < 5; i++) {
            ResponseEntity<PaymentResponse> response =
                    restTemplate.postForEntity("/payments",
                            samplePayment(), PaymentResponse.class);
            assertThat(response.getStatusCode()).isEqualTo(HttpStatus.OK);
        }

        // Circuit breaker should transition back to CLOSED
        assertThat(cb.getState())
                .isEqualTo(CircuitBreaker.State.CLOSED);
    }
}
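
The thresholds quoted in the test comments (a minimum of 20 calls and a 50% failure rate) determine how many failing requests the test has to send. The following is a toy sketch of that arithmetic only. FailureRateWindow is a hypothetical class, not Resilience4j's implementation, and it ignores sliding-window eviction for brevity.

```java
// Hypothetical helper: count-based failure-rate evaluation without
// sliding-window eviction. Not Resilience4j code.
final class FailureRateWindow {

    private final int minimumCalls;
    private final double thresholdPercent;
    private int calls;
    private int failures;

    FailureRateWindow(int minimumCalls, double thresholdPercent) {
        this.minimumCalls = minimumCalls;
        this.thresholdPercent = thresholdPercent;
    }

    /** Records one outcome and reports whether the breaker would open. */
    boolean recordAndCheck(boolean failed) {
        calls++;
        if (failed) {
            failures++;
        }
        if (calls < minimumCalls) {
            return false; // too few calls recorded to evaluate the rate
        }
        // Opens when the failure rate reaches or exceeds the threshold
        return 100.0 * failures / calls >= thresholdPercent;
    }
}
```

With every call failing, the first 19 calls leave this toy breaker closed and the 20th opens it, which is why the loop in the test sends exactly 20 requests. With 9 failures in 20 calls (45%), it stays closed.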

Testing Timeout Behavior

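// PRODUCTION - Verify timeout fires at the configured duration
import static org.assertj.core.api.Assertions.assertThat;

import com.github.tomakehurst.wiremock.client.WireMock;
import java.time.Duration;
import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.web.client.TestRestTemplate;
import org.springframework.http.ResponseEntity;

class TimeoutResilienceTest extends ResilienceTestBase {

    // Declared here because the base class does not expose a TestRestTemplate
    @Autowired
    private TestRestTemplate restTemplate;
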
    @Test
    void timeout_activatesFallback() {
        // Configure fraud service to respond slowly (3 seconds)
        fraudWireMock().register(
                WireMock.post("/fraud/score")
                        .willReturn(WireMock.okJson(
                                "{\"score\":0.1,\"decision\":\"PERMIT\"}")
                                .withFixedDelay(3000)));

        long start = System.nanoTime();
        ResponseEntity<PaymentResponse> response =
                restTemplate.postForEntity("/payments",
                        samplePayment(), PaymentResponse.class);
        long elapsed = Duration.ofNanos(
                System.nanoTime() - start).toMillis();

        // Response should arrive after the 2s TimeLimiter timeout
        // but before the 3-second stub delay (bounds allow some jitter)
        assertThat(elapsed).isBetween(1900L, 2500L);

        // Fallback should have activated
        assertThat(response.getBody().fraudCheckStatus())
                .isEqualTo("FALLBACK");
    }
}
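
What the timeout test measures can be sketched in isolation: a call bounded by a time limit, with a fallback value returned when the limit is hit. The sketch below uses CompletableFuture from the JDK and is not Resilience4j's TimeLimiter. TimeLimitedCall and callWithFallback are hypothetical names.

```java
import java.time.Duration;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

// Hypothetical helper: bounds a dependency call and substitutes a fallback.
final class TimeLimitedCall {

    static String callWithFallback(Supplier<String> dependency,
                                   Duration timeout, String fallback) {
        // Runs the dependency on the common pool so the caller can stop waiting
        CompletableFuture<String> future =
                CompletableFuture.supplyAsync(dependency);
        try {
            return future.get(timeout.toMillis(), TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Marks the future cancelled; CompletableFuture does not
            // interrupt the task that is still running
            future.cancel(true);
            return fallback;
        } catch (ExecutionException e) {
            return fallback; // the dependency itself failed
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return fallback;
        }
    }
}
```

A test around this helper can assert two things: that the result is the fallback, and that the elapsed time sits between the timeout and the dependency's delay. An elapsed time far below the timeout would suggest the fallback fired for some other reason.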

Testing Concurrent Load and Bulkhead Rejection

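// PRODUCTION - Bulkhead rejects when concurrency exceeds limit
import static org.assertj.core.api.Assertions.assertThat;

import com.github.tomakehurst.wiremock.client.WireMock;
import io.github.resilience4j.bulkhead.Bulkhead;
import io.github.resilience4j.bulkhead.BulkheadRegistry;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.web.client.TestRestTemplate;
import org.springframework.http.ResponseEntity;

class BulkheadResilienceTest extends ResilienceTestBase {

    // Declared here because the base class does not expose a TestRestTemplate
    @Autowired
    private TestRestTemplate restTemplate;

    @Autowired
    private BulkheadRegistry bulkheadRegistry;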

    @Test
    void bulkhead_rejectsExcessConcurrency() throws Exception {
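        // Configure fraud service to respond slowly (hold permits).
        // The delay stays below the 2s TimeLimiter timeout so that admitted
        // calls succeed instead of timing out into the fallback as well
        fraudWireMock().register(
                WireMock.post("/fraud/score")
                        .willReturn(WireMock.okJson(
                                "{\"score\":0.1,\"decision\":\"PERMIT\"}")
                                .withFixedDelay(1000)));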

        Bulkhead bulkhead = bulkheadRegistry.bulkhead("fraudDetection");
        int maxConcurrent = bulkhead.getBulkheadConfig()
                .getMaxConcurrentCalls();

        // Send more requests than the bulkhead allows
        int totalRequests = maxConcurrent + 10;
        ExecutorService executor = Executors.newFixedThreadPool(
                totalRequests);
        CountDownLatch latch = new CountDownLatch(totalRequests);
        AtomicInteger rejections = new AtomicInteger();

        for (int i = 0; i < totalRequests; i++) {
            executor.submit(() -> {
                try {
                    ResponseEntity<PaymentResponse> response =
                            restTemplate.postForEntity("/payments",
                                    samplePayment(),
                                    PaymentResponse.class);
                    if ("FALLBACK".equals(
                            response.getBody().fraudCheckStatus())) {
                        rejections.incrementAndGet();
                    }
                } finally {
                    latch.countDown();
                }
            });
        }

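        // Fail fast if some requests never completed; otherwise the
        // assertions below would run against partial results
        assertThat(latch.await(10, TimeUnit.SECONDS)).isTrue();
        executor.shutdown();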

        // At least some requests should have been rejected by bulkhead
        assertThat(rejections.get()).isGreaterThan(0);
        // But not all (maxConcurrent requests should succeed)
        assertThat(rejections.get()).isLessThan(totalRequests);
    }
}
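
The counting logic behind the bulkhead test can be sketched without HTTP. The toy below uses a java.util.concurrent.Semaphore as the bulkhead and a start gate so that all callers attempt admission at the same moment. BulkheadSketch and countRejections are hypothetical names, and this is not Resilience4j's Bulkhead.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of a semaphore bulkhead under concurrent load.
final class BulkheadSketch {

    /** Fires totalRequests at once; returns how many were rejected. */
    static int countRejections(int maxConcurrent, int totalRequests) {
        Semaphore permits = new Semaphore(maxConcurrent);
        CountDownLatch startGate = new CountDownLatch(1);
        CountDownLatch attempted = new CountDownLatch(totalRequests);
        CountDownLatch done = new CountDownLatch(totalRequests);
        AtomicInteger rejections = new AtomicInteger();
        ExecutorService executor = Executors.newFixedThreadPool(totalRequests);
        try {
            for (int i = 0; i < totalRequests; i++) {
                executor.submit(() -> {
                    try {
                        startGate.await(); // release all callers together
                        boolean admitted = permits.tryAcquire();
                        attempted.countDown();
                        if (!admitted) {
                            rejections.incrementAndGet(); // fallback path
                            return;
                        }
                        try {
                            // Hold the permit until every caller has tried,
                            // standing in for the slow dependency
                            attempted.await();
                        } finally {
                            permits.release();
                        }
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    } finally {
                        done.countDown();
                    }
                });
            }
            startGate.countDown();
            if (!done.await(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("requests did not complete");
            }
            return rejections.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            executor.shutdownNow();
        }
    }
}
```

Because permits are held until every caller has attempted admission, countRejections(5, 15) returns exactly 10. Over real HTTP the overlap is not this strict, which is why the integration test asserts only that rejections are greater than zero and fewer than the total.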

Testing the Full Resilience Stack

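// PRODUCTION - End-to-end resilience test: degradation and recovery
import static org.assertj.core.api.Assertions.assertThat;

import com.github.tomakehurst.wiremock.client.WireMock;
import io.github.resilience4j.circuitbreaker.CircuitBreaker;
import io.github.resilience4j.circuitbreaker.CircuitBreakerRegistry;
import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.web.client.TestRestTemplate;
import org.springframework.http.HttpStatus;
import org.springframework.http.ResponseEntity;

class FullResilienceStackTest extends ResilienceTestBase {

    // Declared here: the base class keeps its registry private and
    // does not expose a TestRestTemplate
    @Autowired
    private TestRestTemplate restTemplate;

    @Autowired
    private CircuitBreakerRegistry cbRegistry;
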
    @Test
    void fullDegradationAndRecovery() {
        // Phase 1: Normal operation
        fraudWireMock().register(
                WireMock.post("/fraud/score")
                        .willReturn(WireMock.okJson(
                                "{\"score\":0.1,\"decision\":\"PERMIT\"}")));

        ResponseEntity<PaymentResponse> normal =
                restTemplate.postForEntity("/payments",
                        samplePayment(), PaymentResponse.class);
        assertThat(normal.getBody().fraudCheckStatus())
                .isEqualTo("CHECKED");

        // Phase 2: Dependency failure -> circuit breaker opens
        fraudWireMock().resetMappings();
        fraudWireMock().register(
                WireMock.post("/fraud/score")
                        .willReturn(WireMock.serverError()));

        // Send enough requests to open the circuit breaker
        for (int i = 0; i < 25; i++) {
            restTemplate.postForEntity("/payments",
                    samplePayment(), PaymentResponse.class);
        }

        CircuitBreaker cb = cbRegistry.circuitBreaker("fraudDetection");
        assertThat(cb.getState())
                .isEqualTo(CircuitBreaker.State.OPEN);

        // Phase 3: Verify fallback during open state
        ResponseEntity<PaymentResponse> degraded =
                restTemplate.postForEntity("/payments",
                        samplePayment(), PaymentResponse.class);
        assertThat(degraded.getStatusCode()).isEqualTo(HttpStatus.OK);
        assertThat(degraded.getBody().fraudCheckStatus())
                .isEqualTo("FALLBACK");

        // Phase 4: Dependency recovers -> circuit breaker closes
        fraudWireMock().resetMappings();
        fraudWireMock().register(
                WireMock.post("/fraud/score")
                        .willReturn(WireMock.okJson(
                                "{\"score\":0.2,\"decision\":\"PERMIT\"}")));

        cb.transitionToHalfOpenState();

        for (int i = 0; i < 5; i++) {
            restTemplate.postForEntity("/payments",
                    samplePayment(), PaymentResponse.class);
        }

        assertThat(cb.getState())
                .isEqualTo(CircuitBreaker.State.CLOSED);

        // Phase 5: Normal operation resumed
        ResponseEntity<PaymentResponse> recovered =
                restTemplate.postForEntity("/payments",
                        samplePayment(), PaymentResponse.class);
        assertThat(recovered.getBody().fraudCheckStatus())
                .isEqualTo("CHECKED");
    }
}

This test walks through the complete lifecycle: normal operation, degradation, fallback, recovery, and return to normal. Each phase verifies a different aspect of the resilience configuration. The test takes approximately 30 seconds to run (including container startup) and provides confidence that the entire resilience stack works together.

What These Tests Do Not Cover

Integration tests verify that patterns activate correctly under simulated failures. They do not verify:

  • Performance under sustained load. A 30-second test does not reveal thread pool exhaustion that takes 5 minutes to develop. Load tests with tools like Gatling or k6 are needed.
  • Cascading failures across services. The test uses WireMock, not real services. A staging environment with multiple services and injected failures (Chapter 15) tests cross-service cascading.
  • Clock-dependent behavior. Circuit breaker wait-duration-in-open-state uses real time. Tests that need to verify automatic transitions from OPEN to HALF_OPEN either wait the full duration or manually trigger the transition (as done above).
  • JVM-level failures. Out-of-memory, garbage collection pauses, and thread starvation are not reproducible with WireMock delays. Chaos engineering (Chapter 15) addresses these.
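
The clock-dependent limitation can be illustrated with a toy breaker that reads time from an injected java.time.Clock, so a test can advance time instead of sleeping. MutableClock and ToyBreaker are hypothetical classes written for this sketch. Whether your Resilience4j version offers a comparable hook is something to check in its documentation.

```java
import java.time.Clock;
import java.time.Duration;
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZoneOffset;

// Hypothetical test clock that only moves when told to.
final class MutableClock extends Clock {
    private Instant now;

    MutableClock(Instant start) {
        this.now = start;
    }

    void advance(Duration amount) {
        now = now.plus(amount);
    }

    @Override public ZoneId getZone() { return ZoneOffset.UTC; }
    @Override public Clock withZone(ZoneId zone) { return this; }
    @Override public Instant instant() { return now; }
}

// Hypothetical breaker modelling only the OPEN -> HALF_OPEN wait.
final class ToyBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final Clock clock;
    private final Duration waitInOpenState;
    private State state = State.CLOSED;
    private Instant openedAt;

    ToyBreaker(Clock clock, Duration waitInOpenState) {
        this.clock = clock;
        this.waitInOpenState = waitInOpenState;
    }

    void open() {
        state = State.OPEN;
        openedAt = clock.instant();
    }

    /** Moves to HALF_OPEN once the wait duration has elapsed on the clock. */
    State state() {
        if (state == State.OPEN
                && !clock.instant().isBefore(openedAt.plus(waitInOpenState))) {
            state = State.HALF_OPEN;
        }
        return state;
    }
}
```

With a 60-second wait, the toy breaker is still OPEN after the clock advances 59 seconds and HALF_OPEN after one more second, all without real waiting. The integration tests in this chapter take the other route and trigger the transition manually.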