Your Code Is Disposable. Your Decisions Aren't.

Every developer using Claude Code has a CLAUDE.md file. Every Cursor user has a rules file. Every team on Copilot has a copilot-instructions.md. These files tell the AI how you build: your conventions, your constraints, your preferences, what to avoid. They persist across sessions. They shape every interaction. They’re the reason the AI produces code that fits your codebase instead of generic slop.

They’re architecture decision records — or at least, they’re trying to be. What they capture is the what: the conventions, the patterns, the constraints. What they miss is the why: the alternatives you considered, the trade-offs you accepted, the context that made the decision necessary. That missing piece is the difference between a conventions document and an architecture decision record. And it’s the piece that matters most when the LLM needs to make a ‘judgment call’.

The inversion

For at least thirty years, code was the permanent artifact and documentation was the overhead. The code shipped. The docs — if they existed — gathered dust in a wiki nobody read. The implicit hierarchy was clear: code is the product, documentation is the cost of explaining it.

AI inverted this. When regenerating a function costs thirty seconds, the function itself isn’t the valuable thing. The decision that shaped which functions exist, how they interact, and why the database schema looks the way it does — that’s the durable artifact. The code is the output. The reasoning is the asset.

GitHub’s engineering team is making this explicit with what has become known as spec-driven development: the specification becomes the source of truth, and the code is generated from it. The specification captures intent: what the system should do, what constraints it operates under, what trade-offs were accepted. The code implements it. When the implementation needs to change, you regenerate from the spec. When the spec needs to change, that’s an architecture decision, and it requires a human.

Most codebases are already bifurcating along these lines. A fraction of the code is durable — core business logic, domain rules, the decisions that define what the system is. The rest is glue: CRUD endpoints, data transformations, UI scaffolding, configuration. The proportion varies by domain — a trading system has more durable code than a content site — but the pattern holds. The glue can be regenerated repeatedly over the life of the system. The core will evolve deliberately. The question is whether the reasoning that shaped the core is written down — or whether it lives in the head of someone who might leave next quarter.

The decision layer and the map layer

I’ve argued before that context quality matters more than prompting skill — that the differentiator in AI-assisted work is the quality of the context the model receives, not the cleverness of the instruction. One of the questions this post answers is: what kind of context?

Two tools have existed for years. Both were designed for human communication. Both turn out to be exactly what LLMs need.

Michael Nygard proposed Architecture Decision Records in 2011: a lightweight format for capturing the decision, the alternatives considered, the trade-offs accepted, and the context that made the decision necessary. The format includes a status — proposed, accepted, deprecated, superseded — that tracks whether the decision is still current. Fifteen years later, adoption sits at roughly 48% according to IcePanel’s 2025 State of Software Architecture survey. Half the industry doesn’t use them. And they’ve never been more critical.

Each ADR is a piece of retrievable context for every future AI interaction with your codebase. The decision you documented today will shape the code an AI generates six months from now — if it can find the record. But there’s a risk that cuts both ways: a stale ADR is worse than no ADR. An outdated record that says “we chose Kafka for ordering guarantees” when you migrated to SQS last quarter will cause the AI to generate code against an architecture that no longer exists. The status field — deprecated, superseded — exists precisely to prevent this. Treat ADRs as living documents or don’t write them at all.
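To make the status field concrete, here is a minimal sketch of how a context loader might use it. Everything in it is an assumption for illustration: the DecisionRecord shape, the ADR numbers, and the current_context helper are hypothetical, not part of Nygard’s format or any particular tool. The only idea it demonstrates is that superseded and deprecated records get filtered out before anything reaches the model.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(Enum):
    PROPOSED = "proposed"
    ACCEPTED = "accepted"
    DEPRECATED = "deprecated"
    SUPERSEDED = "superseded"


@dataclass(frozen=True)
class DecisionRecord:
    """Hypothetical in-memory form of one ADR."""
    number: int
    title: str
    status: Status
    decision: str
    superseded_by: Optional[int] = None  # number of the replacing ADR, if any


def current_context(records):
    """Keep only the records that still describe the live architecture."""
    return [r for r in records if r.status is Status.ACCEPTED]


# Hypothetical records mirroring the Kafka-to-SQS example above.
records = [
    DecisionRecord(7, "Event transport", Status.SUPERSEDED,
                   "Use Kafka for ordering guarantees", superseded_by=12),
    DecisionRecord(12, "Event transport", Status.ACCEPTED, "Use SQS"),
]

for record in current_context(records):
    print(f"ADR-{record.number}: {record.decision}")
```

With the stale Kafka record marked as superseded, only ADR-12 is handed to the model. If nobody had updated the status, both records would pass the filter, which is exactly the failure described above.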

Simon Brown’s C4 model offers the other half: the map. Four levels of zoom — system context, containers (Brown’s term for deployable units, not Docker containers), components, code. Level 2 — the container diagram showing your services, databases, and message brokers — is the architectural orientation an LLM needs before touching anything. Without it, the model navigates from file names and import statements — which is like finding your way through a city at street level without ever seeing the map.

C4 was designed for human communication — helping teams build shared understanding of system structure. Brown’s focus has remained there. But the fit with AI tooling is natural. Diagram-as-code tools like Structurizr DSL and C4-PlantUML mean the architectural map lives in the repository alongside the code: versioned, diffable, and directly feedable to AI. The teams still exporting PNGs from draw.io and committing them to a docs folder are maintaining a map the AI can’t read.

The compound return

I started treating architecture documentation as AI infrastructure about a year ago — not as overhead, not as compliance, but as the context layer that determines whether AI-assisted development produces coherent output or confident nonsense.

The moment that crystallised it came from two related systems at the same company. The core trading engine had an ADR for fixed-point arithmetic: every monetary value stored as a scaled long integer, every calculation done in primitive longs, zero allocations. BigDecimal’s object allocation and garbage collection overhead made it unusable on a hot path processing millions of operations per second. The LLM never once generated BigDecimal in the engine. It respected the constraint from the first interaction.

The back-office system had no such ADR. It handled the same currency types, the same prices and quantities, but without documented constraints the LLM defaulted to doubles, the path of least resistance for any model trained on a million codebases where monetary precision wasn’t a concern. The code compiled. The calculations looked right to four decimal places. Had it reached production, a month of reconciliation across thousands of trades would have surfaced discrepancies that didn’t match the trading engine’s exact figures, the kind of bug that’s difficult to catch in testing and expensive in production.

The fix was an ADR for the back office: BigDecimal or the trading engine’s scaled domain types only — doubles rejected because precision loss compounds over volume and the back office must reconcile exactly with the core system’s fixed-point output. Same company, same currency types, different architectural constraints — and the ADR was the only place those constraints were captured.
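The drift is easy to reproduce. The sketch below is in Python rather than the language of the systems described, on the assumption that the effect carries over because Python floats are IEEE 754 doubles. The price, the fill count, and the scale factor are all hypothetical. It accumulates the same value one million times, once as a double and once as a scaled integer, and checks which total is exact.

```python
from decimal import Decimal

SCALE = 10_000      # hypothetical scale: four decimal places
PRICE = "0.10"      # hypothetical price per fill
FILLS = 1_000_000   # hypothetical number of fills


def to_scaled(text):
    """Parse a decimal string into an integer number of 1/SCALE units."""
    return int(Decimal(text) * SCALE)


price_as_double = float(PRICE)
price_as_scaled = to_scaled(PRICE)

float_total = 0.0
scaled_total = 0
for _ in range(FILLS):
    float_total += price_as_double   # rounding error accumulates each step
    scaled_total += price_as_scaled  # integer addition is exact

print("double total exact:", float_total == 100_000.0)
print("scaled total exact:", scaled_total == 100_000 * SCALE)
```

The double total lands close to 100,000 but not on it, while the scaled total is exact. The difference is small enough to pass a glance at four decimal places, which is why it tends to surface in reconciliation rather than in tests.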

A C4 container diagram had the same effect at a different scale in a different project. Once the LLM could ‘see’ the service boundaries — which services owned which data, where the message broker sat, what talked to what — it stopped proposing changes that crossed boundaries. It understood the topology before it touched the code.
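A rough way to picture what the diagram gives the model: the container map reduced to data, plus a check for whether a proposed change stays inside a boundary. This is only an illustration under stated assumptions. The service names, the OWNS and TALKS_TO tables, and the crosses_boundary helper are all hypothetical, and a real setup would keep the map in Structurizr DSL or C4-PlantUML rather than in a Python dictionary.

```python
# Hypothetical container map: which service owns which data store.
OWNS = {
    "orders-service": {"orders-db"},
    "billing-service": {"billing-db"},
}

# Hypothetical relationships drawn on the diagram (caller, callee).
TALKS_TO = {
    ("orders-service", "message-broker"),
    ("billing-service", "message-broker"),
}


def crosses_boundary(service, target):
    """True if `service` would touch something it neither owns nor is wired to."""
    if target in OWNS.get(service, set()):
        return False
    return (service, target) not in TALKS_TO


print(crosses_boundary("orders-service", "orders-db"))       # own data store
print(crosses_boundary("orders-service", "message-broker"))  # drawn relationship
print(crosses_boundary("orders-service", "billing-db"))      # another service's data
```

The first two checks stay inside the boundary. The third does not: the orders service reaching into the billing database is the kind of change the model stopped proposing once it had the map.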

The practical minimum is smaller than most teams think, but slightly larger than “just write something.” Four elements per decision: what was decided, what alternatives were considered, why this option won, and whether it’s still current. A C4 Level 2 diagram per bounded context; a single diagram for the whole system only works if the system is small. A project-level instructions file that captures conventions and constraints. All committed to the repo. All treated as code.

The investment compounds in a way that code doesn’t. Code depreciates — it’s refactored, rewritten, replaced. Every line has a half-life. Documentation of decisions appreciates — each record makes every subsequent AI interaction more accurate, every new developer’s onboarding faster, every regeneration cycle more reliable. The more decisions you’ve captured, the better the AI understands your system. The better it understands your system, the less time you spend correcting it.

Your code will be rewritten — probably by an AI, probably sooner than you expect. The decisions that shaped it are the only thing that survives the rewrite. If they’re written down, the next version will respect them. If they’re not, it will guess. And the guess will be wrong in exactly the ways that are most expensive to discover: silently, structurally, and months after the code shipped.