the readable codebase

Code Review as a Readability Discipline: What Automation Misses and What Human Review Must Catch

Chapter 22 of 27

Code Review as a Readability Discipline

The logistics platform team runs every pull request through a CI pipeline: Checkstyle for formatting, SpotBugs for bug patterns, SonarQube for complexity and duplication, ArchUnit for dependency rules, and a full test suite. A pull request that passes all of these checks has correct formatting, no known bug patterns, acceptable complexity scores, no dependency violations, and passing tests.

It can still make the codebase harder to understand.

A method can pass every automated check and still be named processData. A class can have acceptable complexity scores and still span three unrelated responsibilities. A package can have no dependency violations and still expose internal classes that should be package-private. A test can pass and still assert on something other than the behavior it claims to cover. A new abstraction can satisfy every linter and still add a layer of indirection that makes the code harder to trace.
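The naming case can be made concrete with a minimal sketch. Everything in it is hypothetical and chosen only for illustration: the class ShipmentWeights, the method totalWeightKg, and the idea that the data is parcel weights in kilograms are assumptions, not code from the logistics platform. Both methods are well formatted, trivially simple, and behave identically, so a formatter, a complexity metric, or a bug-pattern scan would likely treat them the same. Only a reader can tell which one explains itself.

```java
import java.util.List;

// Hypothetical example: every name here is illustrative, not taken from a real codebase.
final class ShipmentWeights {

    // Well formatted and trivially simple, but the name says nothing
    // about what is computed or in which unit.
    static double processData(List<Double> data) {
        double result = 0.0;
        for (double d : data) {
            result += d;
        }
        return result;
    }

    // Identical behavior; the names now state the intent and the unit.
    static double totalWeightKg(List<Double> parcelWeightsKg) {
        double totalKg = 0.0;
        for (double weightKg : parcelWeightsKg) {
            totalKg += weightKg;
        }
        return totalKg;
    }

    public static void main(String[] args) {
        List<Double> parcelWeightsKg = List.of(1.5, 2.0, 4.5);
        // Both calls print the same sum; only the second call site reads as a sentence.
        System.out.println(processData(parcelWeightsKg));
        System.out.println(totalWeightKg(parcelWeightsKg));
    }
}
```

The difference shows up at the call site: a reviewer reading totalWeightKg(parcelWeightsKg) does not need to open the method to know what it returns.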

Automated tools enforce rules that can be expressed as structural patterns. Naming quality, design coherence, abstraction fitness, and responsibility alignment cannot be expressed as structural patterns. They require a human reader who can hold the context of the change, understand the system’s trajectory, and judge whether the code will be easier or harder to navigate six months from now.
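A small sketch shows what "expressible as a structural pattern" means in practice. It is not how any particular tool is implemented. The class NamingCheckSketch and the method passesStructuralCheck are hypothetical, and the regular expression is just a typical lowerCamelCase pattern of the kind a style checker might be configured with. The check can reject a name with the wrong shape, but it has no way to tell a vague name from a descriptive one.

```java
import java.util.regex.Pattern;

// Hypothetical sketch of a structural naming rule; not any real tool's implementation.
final class NamingCheckSketch {

    // Assumed rule: method names must be lowerCamelCase.
    static final Pattern LOWER_CAMEL_CASE = Pattern.compile("^[a-z][a-zA-Z0-9]*$");

    // Returns true when the name has the required shape. Meaning is never inspected.
    static boolean passesStructuralCheck(String methodName) {
        return LOWER_CAMEL_CASE.matcher(methodName).matches();
    }

    public static void main(String[] args) {
        String[] candidates = {"processData", "totalWeightKg", "Process_Data"};
        for (String name : candidates) {
            System.out.println(name + " -> " + passesStructuralCheck(name));
        }
    }
}
```

In this sketch processData and totalWeightKg both pass, and only Process_Data is rejected. Telling the first two apart is the part that falls to the human reviewer.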

Code review is where these judgments happen. When review culture is strong, each pull request is an opportunity to improve naming, tighten boundaries, and prevent design drift. When review culture is weak, each pull request is a rubber stamp that lets the codebase degrade one merge at a time.

What Automation Catches vs What Review Catches

This diagram divides code quality concerns into two zones. The left zone shows what automated tools catch: formatting violations, cyclomatic complexity thresholds, known bug patterns, dependency rule violations, test failures. The right zone shows what only human review catches: naming quality, responsibility coherence, abstraction fitness, design trajectory, unnecessary complexity. The left zone is necessary but insufficient. The right zone is where readability is defended or lost.