I Fixed the Dumbest Thing About AI Agents

AI agents are clever right up until you ask them to remember something.

Then they turn into goldfish.

Every new session starts cold. Every task needs the same briefing again. Every decision risks contradicting the last one because the system has no persistent memory of what happened yesterday.

That’s not intelligence. That’s a context restart penalty baked into every interaction. And if you’re running a serious multi-agent stack — not a demo, an actual production system — it quietly kills you every single day.

So I fixed it.

The problem is worse when you’re running multiple agent technologies (not just one)


Here’s what makes my situation messier than most people posting agent content on LinkedIn:

I’m not running a single framework on a single model.

I’ve got OpenClaw agents — Jarvis handling PM and orchestration, Archie doing engineering, Dexter on design — all running on GPT-4 via ChatGPT Team OAuth. I’ve got Claude Code CLI doing the heavy infrastructure lifting over SSH. I’ve got Claude.ai for strategic thinking. I’ve got OpenFang running a separate fleet of agents on OpenRouter with its own model routing.

Four different runtimes. Four different models. Four completely separate session lifecycles.

The agents do talk to each other — Jarvis orchestrates Archie and Dexter, tasks flow through the stack, results come back signed and verified. That bit works. What didn’t work was the memory layer underneath all of it. Each agent was operating on its own island of context. Jarvis knew what Jarvis had been told. Archie knew what Archie had been told. And the moment any session ended, that context was gone.

You can have perfect agent-to-agent communication and still have a system with collective amnesia. Agents that forget everything between sessions don’t get smarter over time — they just restart. That’s exactly what I had.

What I actually built (and what I deliberately didn’t build)


Not a vector database. Not another SaaS dependency. Not some inflated “cognitive architecture” dressed in enough buzzwords to survive a Series A pitch.

A git-backed markdown vault. Running on a dedicated VPS — always on, never tied to any single agent session, accessible by every agent in the stack regardless of what runtime they’re sitting in.

The format is plain markdown, which means it’s Obsidian-compatible. I use Obsidian as the local window into the vault — graph view, linked notes, the full visual knowledge map across every agent, every project, every decision. But Obsidian isn’t the point. The point is a versioned, structured, persistent memory layer for AI agents that’s built on the same principles Obsidian is built on: plain files, git-backed, no proprietary lock-in. If Obsidian disappeared tomorrow, the vault would still work. The principles matter more than the tool.
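Under the hood, "versioned and persistent" can be as boring as a file write plus a git commit. Here's a minimal sketch, assuming git is on the PATH; the `write_note` helper, the paths, and the commit message format are illustrative, not the actual bridge code:

```python
import subprocess
import tempfile
from pathlib import Path

def write_note(vault: Path, relpath: str, body: str, message: str) -> None:
    """Write a markdown note into the vault and commit it to git."""
    note = vault / relpath
    note.parent.mkdir(parents=True, exist_ok=True)
    note.write_text(body, encoding="utf-8")
    subprocess.run(["git", "-C", str(vault), "add", relpath], check=True)
    subprocess.run(["git", "-C", str(vault), "commit", "-q", "-m", message], check=True)

# Demo against a throwaway repo
vault = Path(tempfile.mkdtemp())
subprocess.run(["git", "-C", str(vault), "init", "-q"], check=True)
subprocess.run(["git", "-C", str(vault), "config", "user.email", "agent@example.com"], check=True)
subprocess.run(["git", "-C", str(vault), "config", "user.name", "agent"], check=True)
write_note(vault, "bugs/2024-001.md",
           "# Bug 2024-001\n\nRoot cause: stale cache key.\n",
           "archie: record bug 2024-001")
```

Every note lands in git history, so "what changed and when" is a `git log` away rather than a feature you have to build.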

That last bit matters for the agents too. The shared memory had to be model-agnostic and runtime-agnostic from day one. Because if your long-term memory for AI agents only works for one framework, you haven’t solved the problem. You’ve just moved it.

So the setup works like this:

  • OpenClaw agents read and write via the HMAC-signed trust bridge
  • Claude Code reads and writes directly over SSH
  • Claude.ai gets a context snapshot at session start
  • OpenFang agents get their own feed

Same brain. Four technologies. Zero platform dependency. It doesn’t sleep.
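The HMAC-signed trust bridge mentioned above is, at its core, a signature over a canonical request body. A minimal sketch, assuming a per-agent shared secret and a JSON wire format (all names here are illustrative):

```python
import hashlib
import hmac
import json

SECRET = b"per-agent-shared-secret"  # hypothetical: in practice each agent gets its own key

def sign_request(agent: str, operation: str, payload: dict, key: bytes = SECRET) -> dict:
    """Serialize the request canonically and attach an HMAC-SHA256 signature."""
    body = json.dumps({"agent": agent, "op": operation, "payload": payload},
                      sort_keys=True, separators=(",", ":"))
    sig = hmac.new(key, body.encode(), hashlib.sha256).hexdigest()
    return {"body": body, "signature": sig}

def verify_request(req: dict, key: bytes = SECRET) -> bool:
    """Recompute the signature and compare in constant time."""
    expected = hmac.new(key, req["body"].encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, req["signature"])

req = sign_request("jarvis", "write", {"path": "sessions/2024-06-01.md"})
assert verify_request(req)
```

The canonical serialization (sorted keys, fixed separators) matters: signer and verifier must byte-for-byte agree on what was signed, and `compare_digest` avoids timing leaks.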

The vault stores everything worth keeping — session logs, bug records, architecture decisions, project state, runbooks, design specs, commercial thinking. Not scattered across chat histories. Linked. Versioned. Committed to git. There when the next session starts — for any agent, on any runtime.

The security layer is the bit most people don’t bother building


Every agent connects through a cryptographically signed bridge.

All agents can read the vault. Write access is scoped per agent, per operation — enforced in code, not convention.

Jarvis writes session logs and project notes. Archie writes bug records. Dexter writes design specs. Platform architecture is locked to a single trusted writer. If an agent tries to write somewhere it isn’t supposed to, the bridge returns PERMISSION_DENIED. That’s a real response, not a gentleman’s agreement.

Shared memory, yes. Shared vandalism, no.
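Per-agent, per-operation write scoping doesn't need to be complicated to be enforced in code rather than convention. A sketch, with hypothetical agents and path prefixes standing in for the real ACL:

```python
# Hypothetical write ACL: which vault path prefixes each agent may write to.
WRITE_ACL = {
    "jarvis": ("sessions/", "projects/"),
    "archie": ("bugs/",),
    "dexter": ("design/",),
    "claude-code": ("architecture/",),  # platform architecture: single trusted writer
}

def authorize_write(agent: str, path: str) -> str:
    """Return OK if the agent may write this path, PERMISSION_DENIED otherwise."""
    prefixes = WRITE_ACL.get(agent, ())
    if any(path.startswith(p) for p in prefixes):
        return "OK"
    return "PERMISSION_DENIED"
```

The point is that the denial is a code path with a real return value, not a prompt instruction an agent can talk itself out of.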

This matters more than it sounds. In a multi-agent system, the thing that breaks your trust infrastructure isn’t usually a malicious agent — it’s two agents writing to the same place at the same time and corrupting each other’s work. The locking, permission model, and atomic write protocol exist to prevent exactly that. Context persistence doesn’t mean much if the persisted context is garbage.
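The standard recipe for that on a POSIX filesystem is an exclusive lock plus a temp-file-and-rename write, so a reader never sees a half-written note. A sketch of what such a write protocol could look like (`fcntl` makes this Unix-only; the helper is illustrative):

```python
import fcntl
import os
import tempfile
from pathlib import Path

def atomic_write(path: Path, body: str) -> None:
    """Serialize writers with an exclusive lock, then swap the note in atomically."""
    path.parent.mkdir(parents=True, exist_ok=True)
    lock = path.with_suffix(path.suffix + ".lock")
    with open(lock, "w") as lf:
        fcntl.flock(lf, fcntl.LOCK_EX)      # one writer per note at a time
        fd, tmp = tempfile.mkstemp(dir=path.parent)
        try:
            with os.fdopen(fd, "w") as f:
                f.write(body)
                f.flush()
                os.fsync(f.fileno())        # durable before the swap
            os.replace(tmp, path)           # atomic rename on POSIX
        finally:
            if os.path.exists(tmp):         # only if the swap never happened
                os.remove(tmp)
```

Writing to the temp file in the same directory matters: `os.replace` is only atomic within a single filesystem.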

The part that actually changes how agents behave mid-session

This is the bit I’m most pleased about.

Agents don’t just read the shared knowledge base at the start of a session. They write to it during work — at key milestones, not just at the end.

Archie diagnoses and fixes a bug — writes a bug note immediately, root cause and fix documented before moving on. Jarvis completes a task handoff — writes a session log before the context window moves to the next thing. Dexter finishes a component — writes a design spec note while the context is fresh. A major architectural decision gets made — it’s recorded with links to related decisions and runbooks.

Why does this matter?

Because context windows fill up. Sessions end unexpectedly. Models get swapped out mid-build. Without mid-session writes, everything lives in the context window until it doesn’t — and then it’s gone. The next session starts from scratch, and you’re back to the twenty-minute recap.

With mid-session writes, knowledge gets committed to persistent memory before the context window becomes a problem. The next session — any agent, any runtime — picks up exactly where things left off.

No panic at the edge of the context window. No agent amnesia. No “right, let me remind you where we were.”

Just continuity.
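Concretely, a milestone note like Archie's bug record can be plain markdown with YAML frontmatter and Obsidian-style wikilinks. A hypothetical renderer (the field names and note shape are illustrative, not the actual vault schema):

```python
from datetime import datetime, timezone

def milestone_note(agent: str, kind: str, title: str, body: str, links: list[str]) -> str:
    """Render a markdown note written at a milestone, not at session end."""
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    linked = "\n".join(f"- [[{link}]]" for link in links)  # Obsidian-style wikilinks
    return (f"---\nagent: {agent}\ntype: {kind}\nupdated: {stamp}\n---\n\n"
            f"# {title}\n\n{body}\n\n## Related\n{linked}\n")
```

The wikilinks are what make the graph view useful later: each milestone note lands already connected to the decisions and runbooks around it.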

Why I didn’t start with semantic search (and why that was the right call)

The obvious trap would’ve been jumping straight to embeddings, vector search, semantic retrieval, and all the other shiny bits people use to make a half-built system sound more advanced than it is.

No thanks.

First make memory real. Then make it clever.

The rule on this is boring engineering first, which is usually the bit that actually works.

Qdrant is on the roadmap. But a vault with thirty notes doesn’t need semantic search — it needs to exist first. Getting the infrastructure right, the permissions right, the write protocol right — that’s Phases 1 and 2. The fancy bits come later, once the vault has enough depth to deserve them.

The proof

I restarted Jarvis on Telegram this morning. Clean session. Zero context from me. No preamble, no summary, no “here’s where we are.”

I just asked: “What’s the current state of the vault?”

He came back with the last updated timestamp, the active project, the last session log, open P1 items, and recent activity across the team.

Unprompted. From persistent memory. Cold start.

That’s the line I wanted to cross. Not “the demo works.” Not “prototype complete.” Not “MVP pending polish.”

A fresh session that already knows what happened. An agent that walks in with context instead of asking for it. Agents that remember across sessions — all of them, not just one.

That’s what changes the game.

The moat isn’t the model (it never was)

Models will keep getting cheaper. Faster. More capable. More interchangeable. That part isn’t going to save you, and anyone building their competitive advantage on top of a specific model is going to have a bad time in about eighteen months.

The moat is everything wrapped around execution:

Always-on infrastructure that doesn’t depend on any single session. Cryptographic trust boundaries that enforce who can do what. Per-agent, per-operation permissions so shared context stays clean. Mid-session writes so nothing gets lost to context drift. And continuity — real continuity — across every agent, every session, every runtime in the stack.

Build that, and you don’t have a clever demo.

You have a system.

What’s next

The vault is live with thirty-two notes across seven categories. All four agent technologies are wired in and sharing the same persistent knowledge base. The write protocol is running in production. Obsidian is synced locally and the graph view is genuinely useful — you can see the links between agents, decisions, bugs, and runbooks in a way that a flat file list never gives you.

Next phase is Qdrant semantic search — once the vault has a few months of real operational history behind it. After that, I’m building an MCP server so Claude.ai gets direct read/write access to the vault rather than a context snapshot. That makes this conversation — right now — part of the same persistent memory system as everything else. Shared context across every agent, every session, every runtime. No more cold starts. No more re-briefing.

But that’s next week’s problem.

I’m in the process of cleaning both repos up for public release — the trust bridge and the vault scaffold. If you want early access before they go public, drop me an email or find me on X at @keith_de_A.

If your agents start every session blind, you don’t have a system.

You have a very expensive goldfish.

Your move.
