The Curiosity Protocol

Every engineer profiled in this chapter shares one habit. It’s not reading textbooks. It’s not taking courses. It’s not attending conferences. It’s this: when something works, they ask “but how?”

That’s the entire protocol. Two words. But those two words, applied consistently, produce more systems understanding in a year than a computer science degree produces in four, because they’re applied to real systems you actually use, on problems you actually have, in a context where the answer actually matters to you.

Most engineers ask “how” only when something breaks. The codebase throws an exception, you trace the stack, you find the bug. That’s reactive learning — you learn just enough to survive the incident, and then you stop. The Curiosity Protocol is proactive: you ask “how” when things are working, precisely because working systems are stable enough to investigate safely. You have time to explore. No pager is going off. Nobody is losing revenue. You’re just… curious.

The Five Whys, Repurposed

You’ve probably encountered the Five Whys in the context of incident response. A system failed: why? Because the database connection timed out. Why? Because the connection pool was exhausted. Why? Because transactions weren’t being committed. Why? Because the ORM’s auto-commit was disabled. Why? Because someone copy-pasted a configuration from a blog post about batch processing.

The Five Whys work for incidents because they force you past the surface symptom. The same technique works for learning — but instead of asking “why did this fail?” you ask “how does this work?”

The difference is subtle and critical. Incident-response Whys are convergent: they narrow toward a root cause. Learning Whys are divergent: each answer opens new territory. You don’t need to reach a root cause. You just need to go deeper than you were before.

Here’s the protocol applied to a mundane task you probably performed today:

Example 1: `docker build .`

You run docker build . and it takes 3 minutes and 42 seconds. You’ve run this command hundreds of times. You’ve never asked why it takes 3 minutes and 42 seconds.

Why does it take 3 minutes? Because Docker is executing each instruction in the Dockerfile: installing dependencies, copying files, running build commands. Each instruction that changes the filesystem (RUN, COPY, ADD) creates a layer.

How does layering work? Each layer is a filesystem diff. When you RUN apt-get install python3, Docker captures the set of files that changed — the new binaries, libraries, package metadata — and stores that delta as a layer. The final image is a stack of these deltas, applied in order.
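The stacking idea can be sketched in a few lines of Python. This is a toy model rather than Docker's storage format: a layer here is just a dictionary of path to content, and the use of None as a deletion marker is an assumption made for the illustration.

```python
def apply_layers(layers):
    """Apply filesystem deltas in order and return the resulting file tree."""
    filesystem = {}
    for delta in layers:
        for path, content in delta.items():
            if content is None:
                filesystem.pop(path, None)  # toy deletion marker
            else:
                filesystem[path] = content  # added or overwritten file
    return filesystem

# Hypothetical layers: a base image, a package install, and a cleanup step.
base = {"/bin/sh": "shell", "/etc/os-release": "debian"}
install_python = {"/usr/bin/python3": "python binary", "/tmp/pkg-cache": "downloads"}
cleanup = {"/tmp/pkg-cache": None}

image = apply_layers([base, install_python, cleanup])
print(sorted(image))
```

Note that the cleanup layer hides the cache file in the final view, but the earlier layer that added it is still part of the stack.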

How does Docker know when to rebuild a layer? Cache invalidation. For each instruction, Docker checks whether it already has a layer built on the same parent layer from the same inputs. For most instructions, such as RUN, the input is the instruction text itself. For COPY, Docker also hashes the content of the files being copied. If everything matches a previously built layer, Docker reuses the cached layer and skips the instruction. If any layer’s cache is invalidated, all subsequent layers are also invalidated, because a later layer might depend on what an earlier layer produced.

What’s a content hash? A cryptographic digest of the file contents. Docker uses SHA-256. Two files with identical content, regardless of filename or modification time, produce the same hash. This is called content-addressable storage: you address stored data by what it is, not where it is.
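A minimal sketch of content-addressable storage, using only Python's hashlib. The ContentStore class and the sample file contents are hypothetical; the point is only that the storage key is derived from the bytes, never from a filename.

```python
import hashlib

class ContentStore:
    """Toy content-addressable store: the key is the SHA-256 of the bytes."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        self._blobs[digest] = data  # identical content lands on the same key
        return digest

    def get(self, digest: str) -> bytes:
        return self._blobs[digest]

    def __len__(self):
        return len(self._blobs)

store = ContentStore()
key_a = store.put(b"flask==3.0.0\n")  # imagine this came from requirements.txt
key_b = store.put(b"flask==3.0.0\n")  # same bytes from a differently named file
key_c = store.put(b"flask==3.0.1\n")  # one character changed

print(key_a == key_b, key_a == key_c, len(store))
```

Storing the same bytes twice costs nothing extra, which is one reason the pattern shows up in both Docker and Git.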

So why did my build take 3 minutes? Because the Dockerfile copies your source files early, before the dependency installation step. Changing any source file invalidated the cache for that COPY layer and for every layer after it, including the dependency installation. If you restructure the Dockerfile to copy the dependency manifest first, install dependencies, and then copy the source files, the dependency layer stays cached when only source code changes, and the build drops to 12 seconds.
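The effect of the reordering can be simulated with a chained hash. This is a sketch under stated assumptions, not Docker's actual cache-key derivation: each step's key is assumed to fold in the previous key, the instruction text, and the bytes of any copied files. The file names and instructions are hypothetical.

```python
import hashlib

def layer_keys(steps, files):
    """Return one cache key per step; each key folds in the previous key."""
    keys = []
    parent = ""
    for instruction, inputs in steps:
        h = hashlib.sha256()
        h.update(parent.encode())
        h.update(instruction.encode())
        for name in inputs:
            h.update(files[name])  # content of copied files
        parent = h.hexdigest()
        keys.append(parent)
    return keys

def cached_steps(steps, before, after):
    """Count the leading steps whose keys are unchanged between two builds."""
    count = 0
    for old, new in zip(layer_keys(steps, before), layer_keys(steps, after)):
        if old != new:
            break  # everything after the first miss is rebuilt
        count += 1
    return count

source_first = [
    ("COPY . /app", ["requirements.txt", "app.py"]),
    ("RUN pip install -r requirements.txt", []),
]
manifest_first = [
    ("COPY requirements.txt /app/", ["requirements.txt"]),
    ("RUN pip install -r requirements.txt", []),
    ("COPY . /app", ["requirements.txt", "app.py"]),
]

before = {"requirements.txt": b"flask\n", "app.py": b"print('v1')\n"}
after = {"requirements.txt": b"flask\n", "app.py": b"print('v2')\n"}  # source edit only

print(cached_steps(source_first, before, after))    # 0: the install step reruns
print(cached_steps(manifest_first, before, after))  # 2: the install step is reused
```

With the manifest copied first, a source-only edit leaves the first two keys untouched, so only the final copy is rebuilt.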

You just learned content-addressable storage, cache invalidation by hash chain, and a practical Dockerfile optimization. Total time: maybe ten minutes. You didn’t study these topics. You followed the thread from a question about your own build time.

Example 2: `git push`

You type git push origin main and it works. It has always worked. You have never asked what happens between pressing Enter and seeing the remote update.

What happens when I run git push? Your Git client connects to the remote server. If the remote URL starts with git@, it’s using SSH. Git spawns an SSH process, which connects to port 22 on the remote host.

How does SSH connect? The SSH client and server perform a key exchange. Each side sends the list of cryptographic algorithms it supports, and they settle on the first of the client’s preferences that the server also supports. They agree on a key exchange method (commonly Curve25519 for modern setups) and derive a shared session key without either side transmitting it. This is Diffie-Hellman key exchange, here in its elliptic-curve form, and it is one of the most elegant algorithms in computing.
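The core trick can be shown with classic finite-field Diffie-Hellman, which is simpler to write down than the Curve25519 variant mentioned above. This is a toy sketch: the 64-bit modulus and the generator are assumptions chosen for readability and are far too small for real use.

```python
import secrets

# Toy public parameters, known to everyone including eavesdroppers.
P = 2**64 - 59  # a small prime modulus, for illustration only
G = 5           # public base

def keypair():
    """Pick a random private exponent and compute the matching public value."""
    private = secrets.randbelow(P - 3) + 2  # value in [2, P - 2]
    public = pow(G, private, P)
    return private, public

client_private, client_public = keypair()
server_private, server_public = keypair()

# Only the public values cross the wire. Each side combines the other's
# public value with its own private exponent.
client_secret = pow(server_public, client_private, P)
server_secret = pow(client_public, server_private, P)

print(client_secret == server_secret)
```

Both sides arrive at the same number because raising G to the two exponents gives the same result in either order, yet the shared value itself was never transmitted.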

What happens after the connection is established? Git runs the “smart protocol”: the server first advertises the refs it has (branches and the commit hashes they point to). The client compares that list with its own history, tells the server which refs it wants to update, and computes the minimum set of objects (commits, trees, blobs) that the server needs. It packs them into a packfile, a compressed binary format that deduplicates content and uses delta compression to minimize transfer size.
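The object selection step amounts to a graph walk that stops wherever the server already has history. A minimal sketch, assuming a hypothetical commit graph stored as a dictionary of parents and tracking commits only (real Git also gathers the trees and blobs those commits reference):

```python
def commits_to_send(parents, local_tip, remote_has):
    """Walk history from local_tip and collect commits the remote lacks."""
    missing = []
    seen = set()
    stack = [local_tip]
    while stack:
        commit = stack.pop()
        if commit in seen or commit in remote_has:
            continue  # remote has this commit, so assume it has the ancestors too
        seen.add(commit)
        missing.append(commit)
        stack.extend(parents.get(commit, []))
    return missing

# Hypothetical linear history: c1 <- c2 <- c3 <- c4, with the remote at c2.
parents = {"c4": ["c3"], "c3": ["c2"], "c2": ["c1"], "c1": []}
print(commits_to_send(parents, "c4", {"c2"}))  # ['c4', 'c3']
```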

What’s delta compression? Instead of sending a full copy of a modified file, Git sends the difference between the old version and the new version. The server already has the old version (it told Git what it has during the negotiation phase), so it can reconstruct the new version from the base object plus the delta.
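A delta can be represented as a list of "copy this range from the base" and "insert these literal bytes" instructions. The sketch below builds one with Python's difflib, which is a stand-in chosen for brevity and not the algorithm Git uses; the function names and sample content are hypothetical.

```python
from difflib import SequenceMatcher

def make_delta(base: bytes, target: bytes):
    """Encode target as copy-from-base and insert-literal instructions."""
    ops = []
    matcher = SequenceMatcher(None, base, target, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2 - i1))      # offset and length in base
        elif tag in ("replace", "insert"):
            ops.append(("insert", target[j1:j2]))  # new bytes sent literally
        # "delete" needs no instruction: those base bytes are simply not copied
    return ops

def apply_delta(base: bytes, ops) -> bytes:
    """Rebuild the target from the base object plus the delta."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            _, offset, length = op
            out += base[offset:offset + length]
        else:
            out += op[1]
    return bytes(out)

base = b"def greet(name):\n    print('hello', name)\n" * 5
target = base.replace(b"hello", b"goodbye", 1)

delta = make_delta(base, target)
rebuilt = apply_delta(base, delta)
literal_bytes = sum(len(op[1]) for op in delta if op[0] == "insert")

print(rebuilt == target, literal_bytes, len(target))
```

Only the literal bytes and a few small copy instructions need to travel; the receiver supplies everything else from the base it already holds.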

You just learned SSH key exchange, Git’s smart transfer protocol, packfile construction, and delta compression. From git push. A command you run twenty times a day.

Example 3: `curl https://api.example.com`

What happens? DNS resolution to get an IP address. TCP handshake (SYN, SYN-ACK, ACK). TLS handshake (ClientHello, ServerHello, certificate exchange, key derivation). HTTP request over the encrypted channel. Response.

How does TLS work? The client sends a ClientHello listing supported cipher suites. The server picks one and sends its certificate. The client verifies the certificate against its trusted certificate authorities. They also perform a key exchange (typically ECDHE; in TLS 1.3 the key shares travel in the hello messages themselves) to derive the symmetric encryption keys for the session. All subsequent data is encrypted with AES-GCM or ChaCha20-Poly1305.
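You can inspect the client side of this from Python's standard ssl module without opening a connection. A small sketch: it shows the defaults a Python client would bring to the handshake, which are not necessarily the same as what curl offers on your machine.

```python
import ssl

context = ssl.create_default_context()  # client-side defaults

# Certificate verification against the trusted CA store is on by default,
# and so is checking that the certificate matches the requested hostname.
print(context.verify_mode == ssl.CERT_REQUIRED)
print(context.check_hostname)

# A few of the cipher suites this client would be willing to offer.
suites = context.get_ciphers()
for suite in suites[:5]:
    print(suite["name"], suite["protocol"])
```

The exact list depends on the OpenSSL build behind your Python installation, so the output will vary from machine to machine.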

Why symmetric encryption if they just did an asymmetric key exchange? Because symmetric ciphers such as AES are far faster than asymmetric operations such as RSA or elliptic-curve key agreement. A commonly quoted figure is roughly 1,000 times, though the real ratio depends on the algorithms and the hardware. The asymmetric exchange is used once to agree on the symmetric keys. Then the fast symmetric cipher handles the actual data. This is why HTTPS doesn’t noticeably slow down your API calls: the per-request encryption cost after the handshake is typically microseconds.

You just learned TLS handshake mechanics and the asymmetric-then-symmetric encryption pattern from a curl command.

The 15-Minute Rule

The examples above took 5–10 minutes each. Some investigations will take less. Some will take more. The challenge is knowing when to stop — because curiosity without bounds leads to a three-hour Wikipedia rabbit hole about elliptic curve cryptography when you’re supposed to be shipping a feature.

The rule is simple: spend 15 minutes. Not an hour. Not “until you understand fully.” Fifteen minutes.

Set a timer if you need to. When it goes off, stop investigating and write down what you learned. Some days you’ll reach a satisfying stopping point in three minutes. Some days you’ll hit a wall at minute two and spend the remaining thirteen minutes confused. Both outcomes are fine.

Over the course of a month, you’ll have roughly 20 investigation sessions (one per workday, skipping the days you forget or are too busy). Some will yield a clear “aha” — you’ll understand why your Docker builds are slow, or what CLOSE-WAIT means. Others will leave you with a half-formed question you’ll revisit later. The ratio doesn’t matter. What matters is the habit.

Over the course of a year, that’s roughly 240 sessions. At 15 minutes each, that’s 60 hours, about the equivalent of a college course, except every minute is spent on systems and tools you actually use. Retention also tends to be much higher than with textbook study, because you’re learning in context, about systems you’ll touch again tomorrow.

The TIL Journal

Write it down. Not in a blog. Not in a polished document. In a file. A single file. One line per day. The format:

2026-02-27: Docker layer cache invalidates all subsequent layers when one layer changes (hash chain)
2026-02-26: git push uses delta compression in packfiles — sends diffs, not full files
2026-02-25: CLOSE-WAIT = remote closed, local app hasn't called close() yet — usually a connection leak
2026-02-24: PostgreSQL EXPLAIN shows "Seq Scan" when no usable index exists OR when the table is small enough that seq scan is faster
2026-02-23: TLS uses asymmetric crypto only for key exchange, then switches to symmetric (AES) for speed

This file serves three purposes:

Retrieval: Six months from now, when you’re debugging a slow Docker build, you’ll search your TIL for “docker layer” and find the entry you wrote. This beats re-Googling because your TIL entry is written in your own words, about your own system, with the context that mattered to you.

Motivation: On days when you feel like you’re not learning anything, scroll through the last thirty entries. You’ll find it hard to argue that you haven’t made progress.

Compounding evidence: After a year, you’ll have 150–200 entries. Read them in sequence and you’ll see your understanding deepen in real time. The early entries are surface-level observations. The later entries reference earlier ones, build on them, correct them. You’re watching yourself develop a mental model.
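If you want to script the journal, a minimal sketch might look like the following. The file name til.txt and both helper functions are hypothetical, and the demo writes to a temporary folder so it leaves nothing behind. It appends new entries at the end of the file, so the newest line is last rather than first as in the sample above.

```python
import tempfile
from datetime import date
from pathlib import Path
from typing import List, Optional

def add_entry(journal: Path, text: str, day: Optional[date] = None) -> str:
    """Append one dated line in the journal format and return it."""
    line = f"{(day or date.today()).isoformat()}: {text}"
    with journal.open("a", encoding="utf-8") as handle:
        handle.write(line + "\n")
    return line

def search(journal: Path, term: str) -> List[str]:
    """Return every entry containing term, ignoring case."""
    if not journal.exists():
        return []
    needle = term.lower()
    lines = journal.read_text(encoding="utf-8").splitlines()
    return [line for line in lines if needle in line.lower()]

# Demo against a throwaway file; in practice, point this at your own journal.
with tempfile.TemporaryDirectory() as folder:
    journal = Path(folder) / "til.txt"  # hypothetical file name
    first = add_entry(journal, "Docker layer cache invalidates all subsequent layers", date(2026, 2, 27))
    add_entry(journal, "git push uses delta compression in packfiles", date(2026, 2, 26))
    hits = search(journal, "docker layer")

print(hits)
```

A plain text file plus a search function is all the tooling the habit needs; anything heavier is optional.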

How Curiosity Compounds

The first month feels slow. You learn disconnected facts: DNS TTLs, TCP states, cgroup limits, query planner heuristics. They feel like trivia. There’s no unifying framework.

By month three, the facts start linking together. You learn that a slow API call might be DNS resolution (a networking fact) or connection pool exhaustion (an application fact) or a full GC pause (a runtime fact), and you can check all three in under a minute because you know which tool to use for each layer. The facts aren’t trivia anymore — they’re an investigative toolkit.

By month six, you stop separating “my application” from “the system it runs on.” They’re the same thing. When you write code, you have a background awareness of what the runtime, the OS, and the network will do with that code. You don’t think about it explicitly every time — that would be paralyzing. But when something behaves unexpectedly, you have a mental model to debug against. You’re no longer guessing. You’re reasoning.

By year one, you’re the engineer other people call when something is broken and nobody knows why. Not because you memorized more facts, but because your mental model covers more layers. When someone describes a symptom, you can generate hypotheses at multiple layers and eliminate them systematically. This is the debugging superpower that the engineers in Section 18 demonstrated, and it was built fifteen minutes at a time.

The protocol is simple. The habit is daily. The payoff is permanent.

Start tomorrow. Pick any command you run every day and ask: “but how?”