Your AI Problem Is a Documentation Problem

Your development team is using AI coding agents. Subscriptions are paid, some developers use them actively, others are still stuck at chat-completion. And the productivity gain the tool vendors promised still hasn’t materialized.

It’s not the model, and it’s not the tool. The missing lever is context, and most teams aren’t providing it systematically.

This isn’t a new problem; AI agents just make it more visible. I’ve described what agents can already do today, and where the limits lie, in KI in der Softwareentwicklung.

What this actually costs

When an agent starts working without structured context, it explores the repository first: build scripts, conventions, architecture decisions. It guesses. It hallucinates. And when the output doesn’t match, it gets corrected — and starts over.

This is not a one-off. Every task begins with this discovery phase, and every iteration costs tokens, time, and frustration. A team that uses AI agents without documented context pays this price with every single task.

In business terms: you are paying for tool subscriptions that don’t deliver the promised productivity. Agent license costs add up, and when every task starts with guessing, correcting, and restarting instead of actual progress, you are burning money on something that was supposed to help.

I’ve described this pattern in 5 Anzeichen, dass euer Team KI-Tools falsch einsetzt. What appears there as developer behavior often has a single root cause: missing context.

Empirical Evidence
A controlled study by Lulla et al. (124 pull requests, 10 repositories, OpenAI Codex) found that a well-maintained AGENTS.md reduces median task execution time by 28.6% and token consumption by 16.6%. Same tasks. Better context. Faster. Cheaper.

Source: Lulla et al., “On the Impact of AGENTS.md Files on the Efficiency of AI Coding Agents”, arXiv:2601.20404 (January 2026)

Four problems teams face with their documentation

In practice, good AI context rarely fails because of a single problem. More often, several layers compound and reinforce each other.

1. No documentation, or at the wrong abstraction level

The most common problem is the simplest: There is no systematic documentation. Or it exists, but at an abstraction level that’s irrelevant for AI agents.

A detailed description of how a specific algorithm is implemented in Kotlin doesn’t help an agent. The agent can read the code itself. What it’s missing is the “why”: Why was this approach chosen? What alternatives were considered and rejected? What constraints govern the code?

What you write for humans is simultaneously the best context for AI agents. If the documentation exists, you’ve already covered half the ground.

2. Documentation lives somewhere the agent can’t reach

Word files on a shared drive, Confluence pages behind a login, Visio diagrams on the lead developer’s machine. None of this is visible to AI agents: they can’t parse binary Word files, and they can’t click through a browser login.

The simplest solution: Documentation in the repository. Next to the code. In a text-based format that an agent can read. Markdown or AsciiDoc. If existing formats like Word are indispensable, an MCP server can serve as a bridge. But that bridge is overhead you only incur when you think about documentation from the tool upward instead of from the fundamentals.
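
As one possible layout (the paths and comments are illustrative, not a convention your project must follow):

```text
repo/
├── AGENTS.md        # entry point for agents: overview plus pointers
├── docs/
│   ├── adr/         # architecture decision records, Markdown
│   └── diagrams/    # PlantUML / Mermaid sources, version-controlled
└── src/             # the code the documentation talks about
```

Everything an agent needs is then reachable from the repository root, in formats it can read and diff.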

3. Dump everything → Context becomes unusable

“I’ll just throw everything into the AGENTS.md!” That was my first instinct. Every architecture decision, every diagram, every rule from the arc42 chapters. In it goes.

Anthropic warns about this explicitly: “Bloated CLAUDE.md files cause Claude to ignore your actual instructions.” The logic is simple: the more information a model has to process, the harder it becomes to distinguish relevant instructions from noise. The Chroma study (2025)¹ confirmed this empirically: it tested 18 LLMs and found that performance decreases continuously with growing context length, not in jumps but gradually. The more information in the context, the less reliable recall becomes, and distractors amplify the effect.

The problem gets worse when context isn’t curated but generated. An ETH Zurich study (Gloaguen et al., February 2026)² found a sobering result: LLM-generated context files degrade performance by three percentage points, and they increase inference costs by over 20% even when they provide only marginal benefit.

Quality over Quantity
The ETH study by Gloaguen et al. shows that auto-generated context files harm performance. Curated, human-written documents with minimal, essential content are more effective than generated files or “dump everything in” approaches.

4. Documentation and code drift apart

When architecture diagrams live in Visio, ADRs exist in the lead architect’s head, and build conventions sit in a Word file nobody updates, documentation and code diverge immediately. An agent reading the documentation orients itself to an outdated reality. A human who consults it does the same.

Text-based diagrams in PlantUML or Mermaid solve this problem because they live directly in the repository, are version-controlled, and can be discussed in pull requests via diffs. An agent can read them, and it can update them. Visio files are invisible to it.
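
A minimal Mermaid sketch, with placeholder component names, shows the idea:

```mermaid
flowchart LR
    UI[Web Frontend] --> API[REST API]
    API --> Core[Domain Core]
    Core --> DB[(Database)]
```

Rename a component or add an arrow, and the pull request diff shows exactly what changed in the architecture.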

I’ve described how architecture documentation works in practice, and why it often fails at exactly this point, in Endlich eine gute Architekturdokumentation.

What productive AI context looks like in practice

Once you’ve avoided the four mistakes, the question becomes: how do you tell if AI context is good? Not by length. Not by completeness. By three criteria you can apply to your own documentation tomorrow.

The context is actionable when every paragraph contains a decision or rule an agent can act on. “What” is not context. “How” and “why not” are. An agent doesn’t need a list of API endpoints — it needs the rule explaining why the API was designed that way and when you break it. An agent doesn’t need every configuration parameter — it needs the decision behind why the default works in 95% of cases and when you override it.
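
An invented example of the difference:

```markdown
<!-- Not context: the agent can read this from the code. -->
GET /invoices/{id} returns the invoice as JSON.

<!-- Context: a rule the agent can act on. -->
The public API never exposes internal database IDs; invoices are
addressed by their external reference number. The only exception:
admin audit endpoints, which are access-restricted.
```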

The context is manageable when it fits within a reasonable context window. If an agent needs 40k tokens for onboarding before starting the first task, that’s not a feature — that’s compute waste. The three percentage point performance drop measured by the ETH study isn’t the only argument. The economic one carries more weight: 40k extra tokens per task are 40k tokens not available for the actual work.

The context is removable when you know exactly what doesn’t belong. The best AI context is the one you can shorten without losing anything important. If you can’t judge that, your context is too unstructured. Curate instead of copy — that’s the mindset that makes the difference.

In an earlier essay, Surprising Documentation, I wrote about what documentation really needs, and why most teams start at the wrong end. The core insight is the same: less is more, if what remains is the right thing.

A case from practice

At my project einfache-erechnung.de, I had the task of improving the PDF-to-XML parsing algorithm for invoice data extraction. The task sounds narrowly technical but is complex: invoices never look the same, information sits in different places, and the rule set is full of edge cases.

I had written an ADR that specifies how to handle discrepancies: cases where the sum of the line items doesn’t match the total amount stated on the invoice.

An agent that hadn’t read this ADR simply recalculated the total amount and overwrote the original. That produces incorrect XML files, because the amount on the invoice is legally relevant.

A second agent, which had read the ADR, recognized the problem immediately and autonomously threw an error instead of self-correcting. No human intervention needed. The agent could make the decision on its own. The context was there.
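
To make that rule concrete, here is a minimal Kotlin sketch of the check the ADR demands. All names and types are invented for illustration; the actual einfache-erechnung.de code looks different:

```kotlin
import java.math.BigDecimal

// Illustrative model: just enough structure to express the rule.
data class ParsedInvoice(
    val lineItemTotals: List<BigDecimal>,
    val statedTotal: BigDecimal, // the amount printed on the invoice
)

class InvoiceDiscrepancyException(message: String) : Exception(message)

// The ADR's rule: the stated total is legally relevant, so a mismatch
// must fail loudly instead of being silently "corrected".
fun validateTotals(invoice: ParsedInvoice) {
    val computed = invoice.lineItemTotals.fold(BigDecimal.ZERO, BigDecimal::add)
    if (computed.compareTo(invoice.statedTotal) != 0) {
        throw InvoiceDiscrepancyException(
            "Line items sum to $computed, but the invoice states " +
                "${invoice.statedTotal}. Flagging for human review instead " +
                "of overwriting the legally relevant amount."
        )
    }
}
```

The point is not the code; it is that the decision was written down where the agent could find it.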

Your AI problem is a documentation problem. The good news: You already know how to document.

Writing an AGENTS.md is the easy part. Curating context so that it actually helps — not too much, not too little, always current and relevant for the decisions your team needs to make — that is the real challenge. And that is where experience matters: anyone who has worked with architecture documentation in practice can immediately tell which architectural rules are relevant for an agent and which just create noise in the context window.

When this doesn’t apply

Good documentation is no cure-all. If your software is fundamentally misdesigned, if the domain model is wrong or service boundaries are misplaced, better documentation won’t help an AI agent much. An agent working on a fundamentally broken foundation only writes bad code faster, and in the wrong place.

And if the team doesn’t master basic development practices — no tests, no review process, no clean version control — AI is not the right first step. The foundation has to come first, with or without AI.

Order Matters
Documentation improves output quality, but it doesn’t make a bad product good. If the foundation isn’t sound, address it first. With or without AI.

What you can start on tomorrow

You don’t need to rebuild your entire architecture documentation overnight. Three steps with the biggest leverage:

1. Create an AGENTS.md in the repository root. It should not be a mere list of rules, but an overview of the project and its domain. And it should include pointers to where specific decisions are documented, and when to consult them. An agent that knows where to find the relevant information is better than one that expects everything on a single page. A minimal skeleton follows after this list.

2. Create your next architecture diagram in a text-based format. PlantUML or Mermaid. Directly in the repository. Not in Visio, not in Confluence. The agent can read it, and you can discuss it in pull requests.

3. Write the next ADR. Not for the filing cabinet, but for the next four weeks, when your team and the next agent need to make decisions that stay consistent with existing context. An ADR with status, context, decision, and consequences; a skeleton for that, too, follows below.
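
For the first step, a minimal AGENTS.md skeleton might look like this. The structure, paths, and build command are illustrative assumptions, not a standard; adapt them to your project:

```markdown
# Project overview
One paragraph: what the system does, for whom, and the two or three
domain concepts an agent must not get wrong.

## Conventions
- Build and verify: ./gradlew build (must pass before any PR)
- Tests live next to the code they cover

## Where decisions live
- Architecture decisions: docs/adr/ (consult before changing module
  boundaries or data formats)
- Diagrams: docs/diagrams/ (Mermaid sources, kept current)
```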
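
For the third step, the classic ADR structure, shown here with invented content that mirrors the invoice example above:

```markdown
# ADR 012: Never recalculate the stated invoice total

Status: Accepted

Context: Parsed line items sometimes do not add up to the total
printed on the invoice. The printed amount is legally relevant.

Decision: On a mismatch, the parser raises an error and flags the
document for human review. It never overwrites the stated total.

Consequences: Fewer silently wrong XML files; more documents land in
manual review. Revisit if review volume becomes a bottleneck.
```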

The first step is quick to complete. The second and third are not. Learning to curate context — recognizing which architectural rules actually matter for your team and which just create noise in the context window — that takes experience. And keeping the context current as your system grows and changes is an organizational challenge, not a technical one. That is where most teams fail: not in writing, but in maintaining.

When the foundation is solid, documentation becomes a real lever. I describe how this approach works in a real team in Agentic Coding im Team: once the first steps are taken, the leverage kicks in.

Beyond Code
AI agents amplify your existing way of working. Good documentation doesn’t just improve AI output quality. It helps your developers even when they work without an agent. The context you build for AI is the same context you need for your team.

The point where AI becomes truly productive

At JavaLand 2026 in March, during my talk “Context Is Everything”, I asked around 50 developers where they stood in AI adoption. At Stage 1, every hand went up. At Stage 3, just one step beyond occasional chat-completion, only four or five hands remained.

This is not a tool problem. It is a preparation problem. Companies that enable their developers to use AI agents systematically and with good context today will deliver faster in two years. Not because they have better developers, but because their existing developers work with significantly better tools.

You don’t need a new tool. You need a new understanding of what your documentation already does — and what it could do.

Want to improve your development practices but unsure where to start?

Let's talk →

1. Hong, Troynikov, Huber (Chroma Research): “Context Rot”, July 2025. research.trychroma.com/context-rot

2. Gloaguen, Mündler, Müller, Raychev, Vechev (ETH Zürich SRI Lab): “Evaluating AGENTS.md: Are Repository-Level Context Files Helpful?”, February 2026. arxiv.org/html/2602.11988v1
