Claude has memory now. Here’s what it doesn’t do.
The Claude Memory feature is good. It is also chat-only, siloed by context, scoped to facts about you, personal rather than shared, and tied to a single model. Five structural limits that matter once you are building anything past the chat box.
In March 2026, Anthropic expanded Claude Memory to the free tier, ending a roughly six-month period in which it had been paid-only. The rollout was a real product win. Casual users get a Claude that remembers their tone, their projects, their preferred level of detail across sessions; the engineering work behind it - storage, retrieval, the user-facing review surface where you can edit what Claude remembers about you - is genuinely well-executed. If you are using claude.ai for personal work, you should turn it on.
The marketing rollout, predictably, framed this as “Claude has memory now,” full stop. For people building software on top of the Claude API, that framing is misleading in a way that costs real engineering time. What Anthropic shipped is a memory feature for the consumer chat product - a deliberate, scoped feature that solves a specific user-facing problem. It is not a memory layer for your agent, your application, or your Claude Code workflow. The five structural limits below are not bugs. They are the product surface, accurately described. The gap they leave - cross-model, project-scoped, artifact-aware persistent memory at the API layer - is exactly what we ship at Ragionex behind POST /v1/memory/write and POST /v1/memory/search.
Limit 1: It is claude.ai only. The API does not see it.
This is the most consequential limit, and it is the one that most surprises developers reading the announcement. The Memory feature lives inside the consumer chat product. It is not exposed through the Anthropic API, it is not available to applications that call messages.create, and it is not part of Claude Code's default context. Independent reviewers writing about the rollout in 2026 have flagged this as the single largest gap between expectation and reality.
What this means in practice is that every developer integration starts fresh. If you build a customer support agent on the Claude API, the agent does not know that this user has already had three conversations on claude.ai about their refund policy preferences. If you use Claude Code on a repository, the assistant does not inherit your claude.ai memory of the language conventions you prefer. The integration boundary is hard: chat product on one side, API on the other, no bridge. For an application developer, the announcement reads “Claude users now have memory,” but the API surface the application is built on still does not.
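To make the boundary concrete, here is the entire state a Messages API call sees - a minimal sketch; the model ID is a placeholder for whatever is current. Anything the agent is supposed to remember has to arrive in the request body, because nothing from claude.ai memory ever does:

curl https://api.anthropic.com/v1/messages \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{
    "model": "claude-sonnet-4-5",
    "max_tokens": 1024,
    "system": "Any memory the agent has is whatever you retrieved and injected here yourself.",
    "messages": [
      {"role": "user", "content": "What refund policy did we agree on for this customer?"}
    ]
  }'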
Limit 2: Project chats and standalone chats are siloed.
Inside claude.ai itself, the memory is not one big pool. Anthropic has been clear about this in their Memory documentation: project chats keep their own scoped memory, standalone chats keep theirs, and the two do not see each other. There are good product reasons for this - keeping a confidential client project from leaking into a casual conversation, for instance - and it is the right default for a consumer surface where users would otherwise be surprised by cross-context bleed.
The trade is that your memory splits when your work splits. A user who treats their Acme Refactor project as a workspace will have memory there that the standalone chat cannot recall, and vice versa. For casual chat that is a feature. For an application that wants one consistent recall pool per user across every interaction, it is a constraint to engineer around, not a feature you can rely on. The architectural decision about where memory lives matters here in a way that the marketing copy does not flag.
Limit 3: It captures preferences, not artifacts or decisions.
This one is more subtle and more important. The Memory feature is shaped, by design, around remembering facts about you. You prefer Postgres. You write in British English. You like terse code review feedback. These are valuable, and they are the right things for a consumer chat product to remember, because they make every future conversation feel less like talking to a stranger.
What it does not capture, by design, is the corpus of decisions and artifacts that pile up during real engineering work. It does not remember the schema you settled on for the billing service last Tuesday after a forty-minute deliberation. It does not remember why you chose Postgres specifically over MongoDB - what the constraints were, what the trade was, which option you rejected. It does not store the exact API contract you agreed on for the internal microservice. Reviewers tracking the limits have been blunt about this: the feature captures preferences, not provenance.
Knowing that the user prefers Postgres is not the same as knowing why they chose Postgres last Tuesday and the schema they settled on.
The distinction matters because the second category - decisions, artifacts, reasoning chains - is exactly what an engineering agent needs to recall later. A coding assistant that knows your language preferences but not your architectural decisions is operating with the easy half of the context. The interesting half lives elsewhere, and Claude Memory is not designed to hold it.
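The gap is easy to see in the records themselves. Side by side - the first line is roughly the kind of fact Claude Memory holds; the decision record below it is an illustrative shape, not any vendor's actual format:

# Preference-shaped: what the chat product remembers
"User prefers Postgres."

# Decision-shaped: what the engineering work needs back later
{
  "decision": "Postgres over MongoDB for the billing service",
  "date": "2026-04-15",
  "reason": "transactional integrity across the ledger and invoice tables",
  "artifact": "ledger_entries(id, account_id, delta, txn_id)",
  "rejected": ["MongoDB"]
}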
Limit 4: It is personal, not shared. There is no team or multi-agent surface.
Claude Memory is scoped to your individual user account. There is no concept of a team-shared memory pool that several engineers can write into and read from. There is no concept of a multi-agent shared store where, say, your code-review agent and your deployment agent can both see the same architectural decisions. The memory is mine; my colleague has theirs; the two never interact.
For a consumer product this is correct. Memory should not leak between users by default. For software that needs this team's coding conventions, this organization's architectural decisions, this incident's shared context across multiple agents working on the same problem, the personal-only model leaves the workload uncovered. Engineering memory is fundamentally collaborative. The Claude Memory primitive is fundamentally individual. The mismatch is not a small one.
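For contrast, here is the shape the workload actually wants - a sketch against the write/search surface shown in the integration section below, with the agent roles and content purely illustrative. Two agents holding the same key share one pool:

# The code-review agent records a decision it enforced...
curl -X POST https://api.ragionex.com/v1/memory/write \
  -H "X-API-Key: rgx_memory_..." \
  -H "Content-Type: application/json" \
  -d '{"content": "Code review 2026-05-02: all billing mutations must carry idempotency keys.", "project": "acme-billing"}'

# ...and the deployment agent, same key, recalls it before it acts.
curl -X POST https://api.ragionex.com/v1/memory/search \
  -H "X-API-Key: rgx_memory_..." \
  -H "Content-Type: application/json" \
  -d '{"query": "rules for billing mutations", "project": "acme-billing"}'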
Limit 5: It is Claude-only. Switching models means starting over.
The memory belongs to Claude, not to you. If you decide tomorrow that GPT-5 is better for your workflow, your accumulated memory on claude.ai does not come with you. There is no export to a portable format, no shared memory layer that multiple model vendors plug into, no model-agnostic primitive. The lock-in is structural - the memory is part of the Claude product, not a separable layer.
For users this is fine; vendor lock-in to a chat assistant is a low-stakes problem. For developers building agents that may use different models for different tasks (a small fast model for triage, a large model for hard reasoning, a specialized model for code), or for organizations that hedge against vendor risk by maintaining model optionality, the personal-memory-tied-to-one-vendor model is the wrong shape. Memory should be portable. The user, not the model, should own it.
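Portability, concretely: the recall call never changes; only the reasoning call on top of it does. Pair the /v1/memory/search request from the integration section below with whichever vendor is on duty - a sketch, with the model ID as a placeholder:

# Yesterday the recalled context fed Claude; today it feeds GPT.
# Swapping vendors changes this call only - the store is untouched:
curl https://api.openai.com/v1/chat/completions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5",
    "messages": [
      {"role": "system", "content": "Recalled context: <template the memory search results in here>"},
      {"role": "user", "content": "Extend the billing schema to cover refunds."}
    ]
  }'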
What an API-level memory layer fills in
The five limits above are not arguments that Claude Memory is bad. They are arguments that it is a different product than what an application developer needs. The five gaps - API access, cross-context recall, decision-and-artifact storage, multi-agent sharing, model portability - are the exact shape of the problem an external memory API exists to solve. None of them are hypothetical. All of them are showing up in real agent codebases as the gap between “Claude has memory” and “my agent on the Claude API has memory.”
The shape of an API-level memory layer follows directly from the gaps:
- API-native, not chat-product-native. One HTTP call to write, one to recall. Works with any LLM you call from your code. The integration path is small enough that the developer cost is hours, not weeks.
- Project-scoped recall. Memories live under a project label your agent passes on every call. The user can have one global recall pool that spans projects, or scope queries to a single project, or both - because both are useful at different times.
- Decision-shaped, not preference-shaped. The store does not impose an opinion about what kind of fact you put in it. If you want to write “User chose Postgres for the billing service on 2026-04-15 because of transactional integrity across the ledger and invoice tables,” you write that. The recall is by meaning, not by category.
- Multi-agent shared store. Memories belong to the user, not to a particular agent or model. A code-review agent and a deployment agent that share an API key share a recall pool, by design.
- Model-agnostic. Switching from Claude to GPT to a local model on Tuesday does not orphan your memory. The store does not know or care which model is doing the reasoning on top.
The integration shape
This is the smallest possible illustration of what the API-level layer looks like in code - exactly the surface that Claude Memory does not expose, and exactly the gap it leaves:
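# Write: store the decision and the artifact it produced, scoped to a project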
curl -X POST https://api.ragionex.com/v1/memory/write \
  -H "X-API-Key: rgx_memory_..." \
  -H "Content-Type: application/json" \
  -d '{
    "content": "User chose Postgres for the billing service on 2026-04-15. Reason: transactional integrity across the ledger and invoice tables. Schema settled on: ledger_entries(id, account_id, delta, txn_id), invoices(id, txn_id, status). MongoDB was explicitly rejected.",
    "project": "acme-billing"
  }'
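# Recall: any agent holding the same key gets the decision back, by meaning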
curl -X POST https://api.ragionex.com/v1/memory/search \
  -H "X-API-Key: rgx_memory_..." \
  -H "Content-Type: application/json" \
  -d '{
    "query": "Why did we pick this database for billing and what is the schema?",
    "scope": "full",
    "results": 3,
    "project": "acme-billing"
  }'

That call works the same whether the agent on top is Claude, GPT, Gemini, or a local model. The store is the durable layer. The model is the reasoning layer. The two are deliberately separated so that the team can swap either one without re-architecting the other - the same separation we argue for on the documentation side in “Why we don't call an LLM at query time.”
The honest summary
Claude Memory is the right product for a consumer chat assistant. If you spend hours a week in claude.ai, turn it on. If you read the rollout announcement and concluded that this means your agent on the Anthropic API now has persistent memory, the conclusion does not survive contact with the actual product surface. The chat product and the API are two different things, and the memory feature lives entirely on one side of that line.
For application developers, the architectural primitive that fills the gap is the same one that solves the broader stateless-reset problem: an external memory layer with a small HTTP surface, a project scoping model that matches how engineers actually organize work, and an opinion-light store that holds decisions and artifacts as readily as it holds preferences. Claude Memory is a complement to this, not a replacement. They solve different halves of the same problem, and most software that ships in 2026 will need both.