Draft
This article is a draft.
Please do not share or link to this URL until I remove this notice.
Context Anchoring
AI conversations are ephemeral by design — decisions made early fade as sessions lengthen, and nothing survives the session boundary. Developers hold on to long conversations not because long sessions are productive, but because the context lives nowhere else. I propose externalizing decision context into a living document — external memory that persists what the context window cannot, turning transient alignment into durable shared understanding.
09 March 2026
This article is part of a series.
When I work with a colleague on a feature that spans several days, we keep a shared document. Not formal documentation: a working record. What we decided, why, what we rejected, what questions remain open. If either of us is absent for a day, the other picks up where we left off. Neither of us relies on memory alone. The document is our external memory — it persists what individual recall cannot.
With AI coding assistants, the conversation is still largely the record. Some tools now offer persistent memory features (Claude's project memory, Cursor's rules files, Copilot's workspace indexing), but these operate at the project level, not the feature level. They remember that the project uses Fastify, not that yesterday's session rejected a RetryQueue abstraction for specific reasons. For feature-level decisions, every constraint and every piece of reasoning still lives in the chat history and nowhere else. This creates a dynamic I have come to recognize as a vicious cycle: developers keep conversations running far longer than they should, not because long sessions are productive, but because closing the session means losing everything. The context lives nowhere else. There is no external record. And so the conversation stretches on, growing unwieldy, while the AI's ability to recall earlier decisions quietly degrades. The longer I hold on, the less reliable the thing I am holding on to becomes.
Here is a test I find revealing: could I close this conversation right now and start a new one without anxiety? If that question creates discomfort, if I feel I would lose something important — my context is trapped inside a medium that was never designed to preserve it.
Why Context Erodes
The degradation is not random. It follows from how large language models process context.
Every model has a finite context window: a hard limit on how many tokens it can attend to at once. Current models offer windows ranging from hundreds of thousands to over a million tokens. These numbers sound generous, but a productive development session generates context quickly: code snippets, design discussions, decision rationale, file contents. The window fills faster than most developers expect.
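To make that arithmetic concrete, here is a rough sketch of how quickly a window fills. The four-characters-per-token ratio is a common heuristic for English text, not a real tokenizer, and the exchange size is an illustrative assumption:

```python
CHARS_PER_TOKEN = 4  # rough heuristic; real tokenizers vary by model

def estimate_tokens(text: str) -> int:
    """Approximate token count from character length."""
    return len(text) // CHARS_PER_TOKEN

def exchanges_until_full(window_tokens: int, tokens_per_exchange: int) -> int:
    """How many exchanges of a given size fit in the context window."""
    return window_tokens // tokens_per_exchange

# One exchange that pastes a 300-line file plus some discussion is
# roughly 16,000 characters, or about 4,000 tokens.
per_exchange = estimate_tokens("x" * 16_000)
print(per_exchange)                                 # → 4000
print(exchanges_until_full(200_000, per_exchange))  # → 50
```

Fifty substantial exchanges sounds generous until a session starts pasting whole files; a handful of large files can consume the same budget in minutes.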
Research confirms what practitioners experience intuitively. The 2023 study “Lost in the Middle” by Liu et al. demonstrated that language models perform significantly worse on information placed in the middle of long contexts compared to the beginning or end. The effect is substantial: recall accuracy drops measurably for content that is neither recent nor at the very start of the conversation. This is not a quirk of a particular model; it is a property of the attention mechanism itself. Recent tokens and system-level instructions receive disproportionate weight. Everything in between competes for a shrinking share of the model's focus.
The study establishes that things fade by position. What it does not address — and what I have observed repeatedly in practice — is what fades first. In my experience, the reasoning behind decisions degrades faster than the decisions themselves. The AI might remember “we are using PostgreSQL” but forget why PostgreSQL was chosen over MongoDB: the need for JSONB support, the team's operational expertise, the multi-tenancy requirements that ruled out document stores. This is a subtle but expensive failure mode: the AI continues to follow the stated decision while making suggestions that violate its intent. It proposes a schema structure that would work well in a document store but fights against PostgreSQL's relational strengths. Technically compliant with the stated choice, but architecturally misguided.
The solution is the same one developers apply instinctively to their own cognition: externalize what matters. Persist it outside the medium that forgets.
Some tools attempt to manage this problem automatically, compacting or summarizing earlier conversation history as the context window fills. But this introduces a different concern: the compaction is a black box. The developer has no visibility into what was preserved verbatim, what was summarized, and what was silently dropped. The algorithm optimizes for general coherence, not for the specific nuances that matter to a particular design decision. And the reasoning behind decisions, being verbose, explanatory, and contextual, is precisely the kind of content most vulnerable to automated compression. The what survives; the why does not. Trusting an opaque process to preserve what matters is not a strategy; it is a hope.
This is the missing piece in the alignment techniques I have described elsewhere — sharing curated project context with AI (what I call Knowledge Priming) and structuring design conversations in sequential levels (Design-First collaboration) both build a shared mental model between human and AI. But that alignment is, by default, as transient as the conversation that created it. The shared mental model we invest in building erodes as the session lengthens — and vanishes entirely when the session ends.
Context anchoring is the practice of making that alignment durable.
External Memory
The solution is to treat decision context as external state: a living document that exists outside the conversation, captures decisions as they happen, and serves as the authoritative reference for both human and AI across sessions.
This is not the same as the priming document from that earlier work. The distinction matters:
A priming document captures project-level context: the tech stack, architecture patterns, naming conventions, code examples. It is relatively stable, updated quarterly or when significant architectural changes occur. It is shared across all features and all sessions. It tells the AI “here is how this project works.”
A feature document captures feature-level context: the specific decisions made during development, the constraints that shaped them, what was considered and rejected, what remains open, and the current state of progress. It evolves rapidly, potentially every session. It tells the AI “here is where we are on this specific piece of work, and how we got here.”
Together, they form two layers of the same context strategy. When starting a new session, both are loaded: the project context as the stable foundation, the feature context as the record of where things stand. The priming document provides the vocabulary. The feature document provides the history.
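A session warm start can be sketched as nothing more than concatenation: stable layer first, volatile layer after it. The function name and file paths below are illustrative, not a prescribed convention:

```python
from pathlib import Path

def build_opening_prompt(priming_path: str, feature_path: str) -> str:
    """Assemble both context layers for a fresh session.

    The priming document (stable: stack, conventions) comes first;
    the feature document (volatile: decisions, state) follows it.
    """
    priming = Path(priming_path).read_text(encoding="utf-8")
    feature = Path(feature_path).read_text(encoding="utf-8")
    return (
        "## Project context (stable)\n" + priming
        + "\n\n## Feature context (where we are)\n" + feature
        + "\n\nContinue from the state above. "
          "Do not re-propose rejected alternatives."
    )
```

Whether this lives in a script, a snippet, or a tool's rules mechanism matters less than the ordering: vocabulary before history.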
A natural objection is that modern AI tools (Cursor with its file references, Copilot with workspace indexing) can already read codebases directly. If the AI can see the code, why maintain a separate document?
Because code captures outcomes, not reasoning. A codebase that uses BullMQ directly for retry handling tells the reader nothing about whether a RetryQueue abstraction was proposed, debated, and deliberately rejected — or whether the direct approach was simply the first thing generated and never questioned. The rejected alternative is invisible in the code. The constraint that drove the decision is invisible. The open question that remains is invisible.
There is a practical byproduct worth noting. A feature document of fifty lines carries the same decision context that hundreds or thousands of lines of implementation code cannot express at all, and it does so at a fraction of the token cost. Less context in the window means the model's attention holds up better; the degradation that long contexts produce simply has less to degrade. Token efficiency is not the reason to maintain a feature document (reasoning preservation is) but it is a compounding benefit whose cost implications at scale deserve separate examination.
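The back-of-envelope arithmetic, using the same rough four-characters-per-token heuristic and illustrative line lengths:

```python
CHARS_PER_TOKEN = 4  # rough heuristic for English text and code

def doc_tokens(lines: int, avg_chars_per_line: int) -> int:
    """Approximate token cost of a document in the context window."""
    return lines * avg_chars_per_line // CHARS_PER_TOKEN

feature_doc = doc_tokens(50, 60)        # ~750 tokens for a 50-line feature doc
implementation = doc_tokens(2_000, 40)  # ~20,000 tokens for the code behind it
print(feature_doc, implementation)      # → 750 20000
```

Roughly a 25x difference, and the smaller artifact is the one that actually carries the reasoning.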
This is exactly the gap that Michael Nygard identified when he proposed Architecture Decision Records (ADRs) in 2011. Code shows what was built. It does not show what was rejected, what constraints shaped the choice, what trade-offs were accepted, or what remains unresolved. ADRs exist because experienced engineers recognized that the reasoning behind code is at least as valuable as the code itself — and far more fragile.
The feature document fills this same gap for AI collaboration. It is, in essence, a living ADR: one that evolves in real time as decisions are made, rather than being written after the fact.
For teams already using ADRs, the feature document is an ADR in progress. When the feature ships, significant decisions graduate to formal ADRs. For teams not yet using ADRs, this is a natural entry point: lighter-weight, more iterative, and immediately practical.
There is one more dimension that purely individual tools miss: coordination across the team. When multiple developers work on the same feature (each with their own AI sessions) the feature document becomes the shared record. Developer A's design decisions, made with AI in one session, are available to Developer B's AI session started independently. Without the document, Developer B's AI might re-propose the very abstractions Developer A already rejected. The shared mental model is not just shared between one human and one AI; it is shared across the team, across sessions, across time.
The feature doc survives what the context window cannot.
What This Looks Like in Practice
The notification service from that earlier Design-First work provides a useful illustration.
After the design conversation (capabilities confirmed, components debated, contracts agreed) I had a set of decisions worth preserving. BullMQ used directly for retries, no wrapper abstraction. Functional services, no classes. Email-only for v1. SendGrid for delivery. These decisions, and crucially the reasoning behind each, went into a feature document alongside the current constraints, open questions, and the state of implementation.
The document was brief, under fifty lines. Not a formal template, but a working record: decisions with their reasoning, current constraints the AI must respect, open questions that remain unresolved, and a simple checklist of what was done versus what remained. Enough to capture the essential state without becoming documentation for its own sake.
```markdown
# Feature: Notification Service v1

## Decisions
| Decision | Reason | Rejected Alternative |
|---|---|---|
| BullMQ directly, no wrapper | Native retry with backoff is sufficient | RetryQueue abstraction (unnecessary indirection) |
| Functional services | Match codebase convention | Class-based (rejected: convention) |
| SendGrid for delivery | Deliverability + team experience | SES (cheaper, less reliable), Mailgun (no team exp) |

## Constraints
- Email-only for v1 (no SMS/push)
- All queries include tenantId (multi-tenant)
- Must use existing auth middleware

## Open Questions
- [ ] Rate limiting strategy (awaiting product input)

## State
- [x] Design approved (all 5 levels)
- [x] NotificationHandler + TemplateRenderer implemented
- [ ] DeliveryTracker (next session)
```
The value became clear at the start of the third session. Rather than reconstructing forty-five minutes of prior conversation (re-explaining the tech stack, re-establishing the design decisions, re-stating the constraints) I shared the feature document. The AI had full alignment in thirty seconds. Not because it remembered the previous sessions, but because the decisions had been externalized into a form it could read fresh. Every new session became a warm start rather than a cold one. The shared mental model did not need to be rebuilt; it was loaded.
In practice, the updates happened at natural pause points: at the end of a design level, when a significant decision was made, or when an open question was resolved. Sometimes I wrote the update myself. Sometimes I asked the AI to summarize the decision and its reasoning, then edited that summary into the document. The effort was minimal: a few lines after each significant moment, not a documentation exercise.
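The update itself can be as mechanical as appending a table row. The helper below is a hypothetical sketch: it assumes the pipe-table layout of the feature document shown earlier and does no real Markdown parsing:

```python
def append_decision(doc: str, decision: str, reason: str, rejected: str) -> str:
    """Append one row to the Decisions table of a feature document.

    A sketch under the assumption that the document contains a
    '## Decisions' heading followed by a pipe table.
    """
    row = f"| {decision} | {reason} | {rejected} |"
    lines = doc.splitlines()
    start = lines.index("## Decisions")
    last = start
    for i in range(start + 1, len(lines)):
        if lines[i].startswith("|"):
            last = i          # still inside the table
        elif last > start:
            break             # first non-table line after the table
    lines.insert(last + 1, row)
    return "\n".join(lines)
```

Whether the row is typed by hand, pasted from an AI-written summary, or appended by a helper like this, the important part is that it happens at the pause point, while the reasoning is still fresh.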
There was a secondary benefit I had not anticipated. The discipline of updating the document streamlined my own thinking. Writing down why we chose direct BullMQ integration over a wrapper forced me to articulate the reasoning clearly — and occasionally revealed that my reasoning was weaker than I thought. The document was not just external memory for the AI. It was a forcing function for clarity in my own decision-making.
Over three sessions, the document evolved: new decisions accumulated, open questions were resolved, the implementation state progressed. A colleague joining the feature — or a new AI session — could read this document and have the full context of days of work in minutes. No repetition. No re-explanation. The document carried the shared understanding forward.
Calibration
Context anchoring is not needed everywhere. It is specifically valuable when a feature spans multiple sessions — when the risk of losing context is real and the cost of re-establishing it is high.
| Scenario | Anchoring Needed? | Why |
|---|---|---|
| Quick question, single utility | No | Conversation is short enough that decay is irrelevant |
| Single-session feature (under an hour) | Lightweight — capture key decisions if revisiting is possible | A few bullet points of decisions and state, enough to restart |
| Multi-day feature spanning sessions | Yes — full feature document | The cost of lost context is hours, not minutes |
| Feature with multiple developers | Yes — shared document | Coordinates decisions across independent AI sessions |
For a quick debugging question or a one-off utility, the overhead of maintaining a document is not justified. For a feature that takes an afternoon, a lightweight capture may be worthwhile if there is any chance of revisiting. For work that stretches across days, full context anchoring pays for itself many times over. This is where the vicious cycle of clinging to long conversations is most likely to emerge, and where externalizing decisions breaks the cycle most effectively.
The litmus test returns here: if I can close my chat session and start a new one without anxiety, without feeling I have lost something that cannot be recovered — my context is properly anchored. If I feel the need to keep a session alive, that discomfort is the signal. It means decisions exist only in the conversation, and the conversation is the wrong place for them to live permanently.
Conclusion
This is, at its core, a shift from chat-driven development to document-driven development. The conversation remains the medium for making decisions, but the document becomes the record. Conversations are disposable by design — they are where thinking happens, not where conclusions are stored. The document persists.
The shared mental model between human and AI does not have to be transient. It can be documented, durable, and shareable. Together with the techniques that precede it — sharing curated project context before a session begins, structuring design conversations in sequential levels before any code is written — context anchoring completes a progression: static context, dynamic alignment, persistent decisions. Each layer builds on the last.
And the simplest test of whether it is working is also the most practical: close the session. Start fresh. If that feels effortless — if starting over costs thirty seconds of document-sharing rather than thirty minutes of re-explanation — the context is where it belongs. Outside the conversation, in a form that both human and AI can read, anytime.
Significant Revisions
09 March 2026: first published

