Add Persistent Memory to Cursor, Claude Code, and Windsurf in 3 Lines
One MCP config, three popular AI coding agents, persistent memory across every session. The actual integration is shorter than this paragraph.
If you are reading this in Cursor or Claude Code, the integration takes one config block, three lines of JSON inside it, and one environment variable. By the end of the next paragraph you will have the working snippet. The rest of this post is the verification, the troubleshooting, and the boring details about where each tool puts its config file.
The reason this is short is the Model Context Protocol. Anthropic published it in late 2024 as a standard interface between AI agents and external tools. Cursor adopted it. Windsurf adopted it. Claude Desktop ships with it. So one MCP server, written once, works in all three coding agents without per-tool wrappers. The Ragionex Memory MCP server is published as @ragionex/mcp-memory on npm and runs locally via npx, which means you do not install anything globally.
Here is the full config. It is the same JSON in every tool; the only thing that changes between tools is the file path you paste it into.
{
  "mcpServers": {
    "ragionex-memory": {
      "command": "npx",
      "args": ["-y", "@ragionex/mcp-memory"],
      "env": { "RAGIONEX_API_KEY": "rgx_memory_..." }
    }
  }
}

Get a key at app.ragionex.com/keys. Replace the placeholder. Restart your editor. Done. Your agent now has memory_write, memory_search, memory_list, and four other tools available, and it will use them when the conversation calls for them.
The 3-Line Config: One MCP Server, Three Coding Agents
The three lines that matter are command, args, and env. Everything else in the JSON above is the wrapper MCP requires. Cursor, Claude Desktop, and Windsurf all parse the same mcpServers object shape. They differ only in which file they read it from, which is the next section.
Why npx instead of a global install? Two reasons. First, MCP servers are versioned and most editors restart them on demand, so a stale global install is a real failure mode; npx -y resolves the latest published version when it runs. Second, there is nothing to upgrade or uninstall by hand, since npx manages its own package cache. The -y flag suppresses the install confirmation prompt the first time the package is fetched.
The MCP server itself is a thin wrapper around the public Memory API. Every tool call it exposes maps to a single HTTP endpoint. memory_write is POST /v1/memory/write. memory_search is POST /v1/memory/search. The full mapping is in the MCP package README. If you want to skip MCP and integrate directly, the curl path is at the bottom of this post.
Setup: Where Each Tool Looks for the Config
Each tool reads the same JSON from a different location. None of them require restarting your shell or your machine, only the editor itself.
Claude Desktop reads claude_desktop_config.json. On macOS the path is ~/Library/Application Support/Claude/claude_desktop_config.json. On Windows it is %APPDATA%\Claude\claude_desktop_config.json. On Linux there is no official desktop build yet, but the same file location pattern under ~/.config/Claude/ works for community builds. If the file does not exist, create it with the JSON above as the entire content. Quit and reopen Claude Desktop. The MCP indicator in the bottom-right of the input box should show one connected server.
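If the file does not exist yet, a couple of shell lines create it with the config block above as the entire contents. This is a sketch for the macOS path given above; adjust the path for Windows or community Linux builds, and note it overwrites an existing file, so merge by hand if you already have one:

```shell
# Create the Claude Desktop config with the Ragionex Memory server (macOS path).
# Overwrites any existing file; merge manually if you already have MCP servers.
CONFIG="$HOME/Library/Application Support/Claude/claude_desktop_config.json"
mkdir -p "$(dirname "$CONFIG")"
cat > "$CONFIG" <<'EOF'
{
  "mcpServers": {
    "ragionex-memory": {
      "command": "npx",
      "args": ["-y", "@ragionex/mcp-memory"],
      "env": { "RAGIONEX_API_KEY": "rgx_memory_..." }
    }
  }
}
EOF
```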
Claude Code (the CLI / VS Code extension) reads MCP servers from ~/.claude.json at the user level, or from .mcp.json at the project root for per-project servers. The official documentation at docs.claude.com/en/docs/claude-code/mcp covers both. Per-project is the right scope for memory if you want different agents in different repos to keep their stores separated; user-level is the right scope if you want one shared brain across everything.
Cursor reads MCP servers from its in-app settings. Open the command palette, run Cursor Settings, and the MCP tab is under the agent panel. There is also a JSON-on-disk mode at ~/.cursor/mcp.json that mirrors the same shape as the Claude Desktop file. The Cursor team documents both at docs.cursor.com/context/model-context-protocol. Cursor reloads MCP servers on settings save without an editor restart, which is one fewer step.
Windsurf reads MCP servers from ~/.codeium/windsurf/mcp_config.json. The Cascade panel exposes a + Add server button that writes to that file for you. The current docs are at docs.codeium.com/windsurf/mcp. Windsurf needs a Cascade reload after a config change, which is the small refresh icon on the panel header.
Across all four locations the JSON shape is identical. If you set it up in one and want to migrate to another, copy the file contents over and you are done.
Verify It Works: Three Smoke Tests
The core check is two prompts in two separate sessions; the third test is an optional cross-tool bonus. The point is to prove that what you save in session A is recallable in session B, which is the property that distinguishes a real memory layer from in-context conversation history.
Test one, in any of the three tools. Open a fresh chat and type: "Save a memory for me. The project label is conventions. The content is: I prefer fail-fast error handling. No silent fallbacks, no try/except wrapping every call. Errors should propagate." The agent should call memory_write and respond with a memory ID like mem_K7X2P9Q4R1. If it does not, the MCP server is not connected. Skip to the troubleshooting section below.
Test two: close the session entirely. Close the conversation tab in Cursor, or the chat in Claude Desktop, or the Cascade panel in Windsurf. Open a new one. There is no shared in-context history at this point. Ask: "How do I prefer to handle errors in this codebase?" The agent should call memory_search with a query like "user error handling preferences", get back the memory you saved, and answer correctly. If it answers based on a hallucinated guess, check that the agent has tool-use enabled and the API key is valid.
Test three: cross-tool recall. If you have configured the same API key in two tools, save in Cursor and recall in Claude Desktop. Same key means same user, same memory store, same recall. This is the property that markdown-vault patterns cannot give you without manual sync. For a deeper take on why solo dev vault patterns hit a ceiling here, see Beyond CLAUDE.md.
The integration is one config block. The interesting work is what you choose to save and how you organize it across projects.
What This Replaces (And What It Doesn't)
Persistent memory replaces three patterns developers reach for when they want their agent to remember things. It replaces the sprawling CLAUDE.md or .cursorrules file that grows past the context budget and starts getting truncated. It replaces the manual context paste between sessions where you copy your last conversation summary into a new chat. And it replaces the project-wide system prompt that has to be updated every time a new convention is decided. Anthropic's own field guide on agent context, Effective Context Engineering for AI Agents, makes the same point: as the agent's context budget tightens, retrieval beats stuffing.
It does not replace the model's reasoning, the in-session scratchpad work the agent does while it figures out a problem, or your version-controlled documentation. The right mental model is that memory is a small, queryable pool of stable facts and decisions that the agent reaches for when needed. Code and architecture documents still live in your repo. Specs still live in your project tracker. The memory layer is where the agent stores the "we already decided this, we already tried that" residue that has nowhere else to live. If you also want the agent to retrieve from a documentation corpus the same way - meaning-based, no model on the read path - that is what a Context Engine does for docs.
It also does not replace conversation history within a single session. Inside one long-running chat, the conversation buffer is fine. Memory is what survives the buffer reset. For more on why "amnesia" is the wrong framing for what is actually a retrieval problem, see Your Agent Doesn't Have a Memory Problem.
Direct API: Skip MCP If You Are Building a Backend
If your agent is not in an MCP-aware editor and you are calling Claude or another model directly from your own backend, the API is a straightforward HTTP call. The same RAGIONEX_API_KEY works.
curl -X POST https://api.ragionex.com/v1/memory/write \
-H "X-API-Key: rgx_memory_..." \
-H "Content-Type: application/json" \
-d '{
"content": "I prefer fail-fast error handling. No silent fallbacks.",
"project": "conventions"
}'

Recall is symmetrical. Note that scope is required and accepts either segment for the matching slice of a memory or full for the entire original memory.
curl -X POST https://api.ragionex.com/v1/memory/search \
-H "X-API-Key: rgx_memory_..." \
-H "Content-Type: application/json" \
-d '{
"query": "How does the user prefer to handle errors?",
"scope": "segment",
"results": 5,
"project": "conventions"
}'

The response is a JSON object with a results array. Each item carries the matched content and metadata. Drop those into your agent's context window and the recall path is done.
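The exact field names inside each result item are not spelled out here, so treat the sample payload and the content field below as assumptions for illustration. A minimal sketch of pulling the matched text out of a response with python3:

```shell
# Extract each matched memory's text from a search response.
# The payload shape and the "content" field name are assumptions for illustration.
RESPONSE='{"results": [{"id": "mem_K7X2P9Q4R1", "content": "I prefer fail-fast error handling."}]}'
echo "$RESPONSE" | python3 -c '
import json, sys
for item in json.load(sys.stdin)["results"]:
    print(item["content"])
'
```

In a real pipeline you would replace the RESPONSE variable with the output of the curl call above and feed the printed lines into your agent's context.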
Free Tier: What 1,000 Memories and 10,000 Searches Cover
The free tier on Ragionex Memory is sized to be useful for an actual solo developer rather than a toy. The current limits are 1,000 memories total, 100 projects, 500 writes per UTC month, and 10,000 searches per UTC month. To put that in workload terms: at a steady writing rate of one memory per workday across a year of paid working days, you write 250 memories. Even a heavy capture habit of three per day across a year of weekdays is roughly 780 memories, comfortably under the cap.
The 10,000 search budget is the one that varies most by use case. An agent that calls memory_search on every turn at 50 turns per workday hits 1,000 searches per work month, well inside the budget. An agent that searches on every keystroke would burn through it in days, but no real workflow does that. The right pattern is for the agent to call search at the start of a task or when the user introduces a new topic.
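The arithmetic behind those estimates, assuming a 20-workday month and roughly 250 workdays or 260 weekdays per year (the day counts are assumptions, not plan terms):

```shell
# Free-tier budget math; workday/weekday counts are rough assumptions.
echo "searches per month at 50 turns per workday: $((50 * 20))"   # 1000, well inside 10,000
echo "memories per year at 1 per workday:         $((1 * 250))"   # 250, under the 1,000 cap
echo "memories per year at 3 per weekday:         $((3 * 260))"   # 780, still under the cap
```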
The 100-project ceiling matters more than people expect. Project labels are how you keep different codebases, different clients, and different parts of your life from polluting each other's recall. We recommend one project label per repo, plus one for personal conventions, plus one for ad-hoc scratch. Most users sit at five or six.
If It Doesn't Work: Three Things to Check
If the agent does not call the memory tools, the issue is almost always one of three things. One: the editor was not restarted after the config write. Claude Desktop and Windsurf cache MCP servers at startup; Cursor reloads on settings save. Two: the API key is wrong or the prefix is missing. Memory keys start with rgx_memory_ followed by 32 URL-safe characters. Knowledge keys start with rgx_knowledge_ and will not work for memory endpoints. Three: the agent's tool-use is disabled. In Claude Desktop check the tools panel; in Cursor check that the model you selected supports tool calls; in Windsurf check that Cascade is in agent mode rather than plain chat.
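For item two, a quick offline check of the key's shape is possible before touching the API. This only validates the format described above (rgx_memory_ plus 32 URL-safe characters), not whether the key is active, and the placeholder below is not a real key:

```shell
# Sanity-check a key's format before blaming the editor config.
KEY="rgx_memory_AbCdEf0123456789_-AbCdEf01234567"   # placeholder, not a real key
if printf '%s' "$KEY" | grep -Eq '^rgx_memory_[A-Za-z0-9_-]{32}$'; then
  echo "looks like a memory key"
elif printf '%s' "$KEY" | grep -q '^rgx_knowledge_'; then
  echo "knowledge key: will not work for memory endpoints"
else
  echo "unrecognized key format"
fi
```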
Once those three are verified, the integration is reliable. Karpathy's llm-wiki gist from April 2026 captures the broader cultural shift toward agents that have a place to put what they learn. The MCP standard is what made the multi-tool version of that pattern actually work without a per-editor SDK. The three-line config is the part you actually have to write.