How to Give Claude Perfect Memory
Three layers of memory, each building on the last. Layer one takes five minutes and covers 90%+ of users.
By default, Claude’s memory is basically decorative. It forgets context mid-conversation. You re-explain yourself constantly. Even after you do, the next session starts from zero.
Most people have been living with this for months, assuming it’s just how LLMs work. It’s how LLMs work absent a system. With a system, everything changes.
I use Claude every single day. More screen time than any other app on my Mac. I need it sharp, consistent, and carrying forward every decision, preference, and hard-won lesson from the sessions before.
The Idea (60 Seconds)
Three layers of memory, each building on the last. Layer one takes five minutes and covers 90%+ of users.
Layer two takes about an hour and changes how Claude operates entirely.
Layer three turns Claude into a self-evolving second brain, trained on all your data, with persistent search and recall across every conversation you’ve ever had.
Why Build a Memory System Instead of Re-explaining
Every time you start a new Claude session, you burn tokens re-establishing context. Over a month, that compounds into hours of wasted time and inconsistent outputs. A memory system pays for itself on day one. Layer one alone saves ten minutes per session. Layer three makes Claude genuinely useful for long-running projects where consistency across weeks matters.
Here are all three.
Layer 1: Basic Memory (5 Minutes)
Four quick wins. Minutes to set up. Immediate improvement in every conversation.
1. Memory Editing Tool
Go to Settings → Memory right now.
This is the most overlooked page in Claude. Most people have zero awareness it exists.
What you’ll find: everything Claude has stored about you, accumulated passively across every conversation. Preferences, facts, habits, working styles. Left alone, your memory fills up with garbage fast.
The fix: read through everything on this page. Delete anything outdated, inaccurate, or irrelevant. Then manually add the context you actually want Claude to carry permanently.
Stick to the basics here (your role, key preferences). We’ll build highly specific systems soon.
2. Project Instructions
If you use Claude Projects (you should), fill in your Project Instructions field.
My advice: create projects for all your most-used workflows, then voice-prompt all your context into a Google Doc and upload it as a PDF for each project.
3. Tell Claude Directly
The simplest memory hack on this list. Mid-conversation, just tell Claude what to remember.
Things like:
“Remember that I prefer responses under 400 words.”
“Remember that my role is [x].”
“Update your memory with [x].”
Claude stores these immediately. You can also tell it to forget things: “Forget that I mentioned [x].”
4. Memory Imports and Exports
If you’ve been using ChatGPT (or another LLM) and have built up significant context there, you have two options to transfer it:
a) Tell ChatGPT you’re switching platforms and ask it to generate a memory export document: “I’m switching this project to Claude, give me a summary document...”
b) Use Import/Export in Claude. In Settings → Memory, you can import full data from other LLMs.
These four edits cover 90%+ of users and make an immediate impact on how Claude responds.
The next section is for people who want a real system.
Layer 2: Context File System (~60 Minutes)
Layer 1 fixes the basic memory problems. Layer 2 builds something more powerful: a file-based memory architecture that lives on your computer and loads into Cowork and Claude Code whenever you attach it.
The concept: instead of prompting Claude for context every time, you store all of that context in Markdown (.md) files on your desktop that Claude has access to. You can also attach these markdown files to any LLM or AI agent system.
Create a new desktop folder, label it “Claude Master Folder”, and build these four markdown files within it (Claude can help you do this):
1. Instructions.md
This file tells Claude all your rules and instructions:
## Who you are
## What you do
## Rules
## What good outputs look like
Important to include: “Update Memory.md with my preferences over time.”
This line is crucial. It’s how you get Claude to create a running memory log of your data in the second markdown file.
2. Memory.md
This is the “brain” of Claude, continuously updated over time.
## Preferences
## Corrections
## Patterns
## Decisions
Now whenever you say something like “stop using em dashes,” Claude goes into the memory file and updates it.
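If you ever want to make the same kind of update by hand or from a script, here is a minimal Python sketch of what "updating the memory file" amounts to. The function name add_memory_entry is hypothetical, and it assumes Memory.md uses the "## Section" headings shown above. It is not how Claude edits the file internally, just an illustration of the result.

```python
from pathlib import Path


def add_memory_entry(memory_file: Path, section: str, entry: str) -> None:
    """Insert `- entry` as the last bullet of the `## section` block."""
    lines = memory_file.read_text(encoding="utf-8").splitlines()
    heading = f"## {section}"
    if heading not in lines:
        # Unknown section: create it at the end of the file.
        lines += ["", heading]
    start = lines.index(heading) + 1
    # The section ends at the next "## " heading, or at end of file.
    end = next(
        (i for i in range(start, len(lines)) if lines[i].startswith("## ")),
        len(lines),
    )
    # Step back over trailing blank lines so the bullet stays with its section.
    while end > start and not lines[end - 1].strip():
        end -= 1
    lines.insert(end, f"- {entry}")
    memory_file.write_text("\n".join(lines) + "\n", encoding="utf-8")
```

Calling add_memory_entry(Path("Memory.md"), "Corrections", "Stop using em dashes") adds one bullet under Corrections and leaves every other section untouched.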
3. Context.md
The specific context file for a given project. What’s in this file changes depending on your project. You can also create a general “business context” or “life context” markdown mega file.
4. Archive Copies
This one is purely protective but worth doing.
Claude will update your memory files automatically as you work. Occasionally, it overwrites something incorrectly or makes a change you missed. Absent a backup system, that context is gone.
The fix: once a week, copy your entire master folder (Instructions, Memory, Context, and everything else) into a separate archive folder that Claude has zero access to. Label it with the date.
If anything breaks or gets overwritten incorrectly, restore from the archive.
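The weekly copy is easy to script. This is a minimal sketch using only the Python standard library; the function name archive_master_folder and both folder paths are placeholders you would swap for your own, and the archive location is assumed to be somewhere Claude has no access to.

```python
import shutil
from datetime import date
from pathlib import Path
from typing import Optional


def archive_master_folder(master: Path, archive_root: Path,
                          today: Optional[date] = None) -> Path:
    """Copy `master` to `archive_root/<folder name> YYYY-MM-DD` and return the copy."""
    stamp = (today or date.today()).isoformat()
    target = archive_root / f"{master.name} {stamp}"
    # copytree raises FileExistsError if the target already exists, so an
    # archive made earlier the same day is never silently overwritten.
    shutil.copytree(master, target)
    return target
```

To restore, copy the dated folder back over the live one. Keeping each week in its own folder means a bad overwrite costs you at most a week of changes.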
Setting It Up
Just create a new folder called “Claude Master Folder,” attach it to a new Cowork chat, and paste this prompt:
Go into my "Claude Master Folder" in my connected workspace and build
these four markdown files inside it:
Instructions.md - includes sections for: Who You Are, What You Do,
Rules, What Good Outputs Look Like, and a line telling Claude to
update Memory.md with my preferences over time.
Memory.md - includes sections for: Preferences, Corrections,
Patterns, Decisions, and Personal Context. Pre-fill with placeholder
examples so I know what to add.
Context.md - includes sections for: About This Project/Business,
Audience, Key People & Collaborators, Active Projects & Priorities,
Tools & Stack, and Important Background/History. Use a template
format with placeholders I can fill in.
Archive-Guide.md - a step-by-step guide explaining why to archive,
how to do it weekly (duplicate the folder, rename with the date,
move it somewhere Claude has zero access to), what to include,
how to restore if something breaks, and where to store the backups.
Anytime you’re working in Cowork or Claude Code, attach your Master Folder and Claude uses it as a mini memory database. It edits the memory markdown file, leaving you with something you can attach to any LLM, new chat, or AI agent.
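If you would rather create the skeleton yourself than have Claude build it, the three core files can be scaffolded with a short script. This is a sketch, not part of Cowork or Claude Code: the function name scaffold_master_folder and the "(fill in)" placeholder text are my own, while the section names come straight from the prompt above.

```python
from pathlib import Path

# Section headings taken from the setup prompt above.
TEMPLATES = {
    "Instructions.md": ["Who You Are", "What You Do", "Rules",
                        "What Good Outputs Look Like"],
    "Memory.md": ["Preferences", "Corrections", "Patterns", "Decisions",
                  "Personal Context"],
    "Context.md": ["About This Project/Business", "Audience",
                   "Key People & Collaborators", "Active Projects & Priorities",
                   "Tools & Stack", "Important Background/History"],
}
UPDATE_LINE = "Update Memory.md with my preferences over time."


def scaffold_master_folder(folder: Path) -> list:
    """Create any missing template files and return the paths that were created."""
    folder.mkdir(parents=True, exist_ok=True)
    created = []
    for name, sections in TEMPLATES.items():
        path = folder / name
        if path.exists():
            continue  # never clobber a file that already holds real context
        body = "\n\n".join(f"## {section}\n(fill in)" for section in sections)
        if name == "Instructions.md":
            body += f"\n\n{UPDATE_LINE}"
        path.write_text(body + "\n", encoding="utf-8")
        created.append(path)
    return created
```

Running it a second time creates nothing, so it is safe to use on a folder you have already started filling in.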
This system is a complete game-changer. But Layer 3 takes it further.
Layer 3: AI Second Brain (1-2 Hours)
This is the deepest level. It requires setup and ongoing maintenance, but for those who build it, it’s the most advanced memory system available for Claude today.
Two options depending on how you work. Option 1 is the fast path. Option 2 is the power-user path, requiring 1-2 hours of dedicated building.
Keep in mind: for your AI second brain memory vault to be effective, you have to spend time maintaining it and updating your databases. This is a living system; a set-and-forget approach produces decay.
Option 1: Claude x Notion (5 Minutes)
Connecting Claude to Notion is the highest-leverage thing you can do in 5 minutes.
Go to Claude → Settings → Connectors, then enable the Notion connector.
Once connected, Claude reads your Notion workspace directly inside any chat.
All your tasks, CRMs, notes, and tables are now accessible to Claude and editable by it.
I recommend creating a new “Memory Database” where you store all your AI preferences, rules, and important AI context. As you’re working with Claude, you can say: “Send this to my Notion Memory Database.”
You can then export this Notion data to other LLMs or AI platforms via a CSV file or by using the Notion MCP connector.
This setup is similar to Layer 2, except you gain Notion’s built-in board views, to-do lists, and additional functionality.
Option 2: Claude x Obsidian x AI Engram (1-2 Hours)
This is the setup I personally use. It combines three things:
Obsidian for local markdown storage (your files, your machine, your control)
Karpathy’s LLM Knowledge Base schema for structuring how Claude organizes and compounds knowledge over time
AI Engram for persistent search and memory across every conversation
Here’s why this stack matters: Layer 2 gives Claude a folder of files to read. Layer 3 Option 2 gives Claude a searchable, evolving knowledge system that compounds with every conversation.
Step 1: Download Obsidian
Go to obsidian.md and download the app.
Create a new Vault (think of this as a desktop folder where Claude Code stores and accesses your data). Your data stays local. Zero cloud dependency.
Step 2: Point Claude at Your Vault
Open the Claude desktop app and click ‘Select Folder.’ Point it at your Obsidian Vault folder. Claude now has direct read and write access to everything inside it.
Step 3: Inject the Knowledge Base Schema
Paste Andrej Karpathy’s LLM Knowledge Base system prompt into the chatbox. This is the instruction set that tells Claude Code how to build, maintain, and evolve your wiki over time.
The prompt is available here: gist.github.com/karpathy/442a6bf555914893e9891c11519de94f
I wrote about this system in detail in my earlier article, “Build an LLM Knowledge Base That Actually Compounds.” The key architecture:
your-vault/
├── raw/ # Immutable source documents (AI reads, never modifies)
├── wiki/ # AI-maintained wiki with domain folders
│ ├── index.md # Navigation hub
│ └── log.md # Append-only action log
├── outputs/ # Generated reports and query answers
└── AGENTS.md # Schema defining how the AI organizes, ingests, and queries
The AGENTS.md schema is the single most important file. It defines identity, architecture, conventions, and workflows. Every wiki page gets YAML frontmatter. Wiki-links cross-reference topics. Source citations are required. Contradictions get flagged.
Three core workflows defined in the schema:
Ingest: Read a source, extract key information, create/update summary pages, update index, add backlinks, flag contradictions, log it. A single source touches 10-15 wiki pages.
Query: Read index first, find relevant pages, synthesize answer with citations, offer to file insights back into wiki.
Lint (monthly): Check contradictions, stale claims, orphan pages, missing cross-references, unattributed claims. Output a severity-leveled report.
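One of the lint checks, orphan pages, is simple enough to run yourself between monthly passes. The sketch below is my own illustration rather than anything taken from the schema: the function name find_orphan_pages is hypothetical, it assumes Obsidian-style [[wiki-links]], and it treats index.md and log.md as hubs that never count as orphans.

```python
import re
from pathlib import Path

# Captures the target of [[Target]], [[Target|alias]] and [[Target#heading]].
WIKI_LINK = re.compile(r"\[\[([^\]|#]+)")
HUB_PAGES = {"index", "log"}  # assumption: hub pages are exempt from the check


def find_orphan_pages(wiki_dir: Path) -> list:
    """Return the names of wiki pages that no other page links to."""
    pages = {p.stem: p for p in wiki_dir.rglob("*.md")}
    linked = set()
    for stem, path in pages.items():
        for target in WIKI_LINK.findall(path.read_text(encoding="utf-8")):
            # Links may include a folder path; compare on the page name only.
            name = target.strip().split("/")[-1]
            if name != stem:  # a page linking to itself does not count
                linked.add(name)
    return sorted(s for s in pages if s not in linked and s not in HUB_PAGES)
```

Pages are matched by file name alone, so two pages with the same name in different domain folders would be treated as one. That is fine for a personal vault but worth knowing about.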
This system alone is powerful. But it has a gap: every new conversation starts with zero recall of past conversations. Claude reads your wiki files, sure, but it has zero memory of the decisions, preferences, and insights from previous chat sessions.
That gap is exactly what AI Engram fills.
Step 4: Install AI Engram
AI Engram is an MCP (Model Context Protocol) server that gives Claude persistent conversation memory and deep search over your markdown workspace. It runs entirely locally. Zero cloud services. Zero API calls.
pip install ai-engram
# or clone from github.com/MikeS071/ai-engram
Add it to your Claude Desktop MCP config:
{
  "mcpServers": {
    "ai-engram": {
      "command": "python",
      "args": ["aiengram_mcp.py"],
      "cwd": "/path/to/your/vault"
    }
  }
}
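If your config file already lists other MCP servers, pasting the block above over it would wipe them out. Here is a small sketch that merges the entry instead. The function name add_mcp_server is hypothetical, and you supply the path to wherever your Claude Desktop config file lives; nothing here is an official Claude or AI Engram utility.

```python
import json
from pathlib import Path


def add_mcp_server(config_path: Path, name: str, server: dict) -> dict:
    """Add or replace one entry under "mcpServers", keeping all other entries."""
    text = config_path.read_text(encoding="utf-8") if config_path.exists() else ""
    config = json.loads(text) if text.strip() else {}
    config.setdefault("mcpServers", {})[name] = server
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


# Same values as the config shown above; the vault path is a placeholder.
AI_ENGRAM_SERVER = {
    "command": "python",
    "args": ["aiengram_mcp.py"],
    "cwd": "/path/to/your/vault",
}
```

Call add_mcp_server(your_config_path, "ai-engram", AI_ENGRAM_SERVER), then restart the desktop app so it rereads the file.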
AI Engram gives Claude 13 new tools, split into two groups:
Content Search (6 tools):
| Tool | What It Does |
| --- | --- |
| search_blog | BM25 keyword search with relevance scoring and snippets |
| semantic_search_blog | Meaning-based search via sentence-transformer embeddings |
| build_index | Pre-build or refresh the semantic embedding cache |
| list_blog_files | List markdown files, filterable by collection |
| blog_stats | File counts and word totals across collections |
| read_blog_file | Read full markdown file content (with fuzzy path matching) |
Conversation Memory (7 tools):
| Tool | What It Does |
| --- | --- |
| remember | Store a memory with category and optional tags |
| recall | Semantic search across stored memories |
| recall_all | Cross-search memories AND blog content via RRF fusion |
| list_memories | Browse memories by category, newest first |
| forget | Delete a specific memory by ID |
| memory_stats | Memory counts by category and storage size |
| get_system_prompt | Load the context memory protocol instructions |
The search pipeline combines BM25 (keyword) and semantic (embedding) search via Reciprocal Rank Fusion. BM25 catches exact terms. Semantic catches meaning. Together, they find things that either approach alone would miss.
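To make the fusion step concrete, here is a minimal Python sketch of Reciprocal Rank Fusion. It is a generic version of the technique, not AI Engram's actual code: the function name is mine, and k=60 is the constant commonly used with RRF, which AI Engram may or may not share.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one list, best first.

    Each document scores sum(1 / (k + rank)) over every list it appears in,
    with ranks starting at 1.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest score first; ties are broken by ID so the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


bm25_hits = ["a", "b", "c"]      # keyword results, best first
semantic_hits = ["b", "d", "a"]  # embedding results, best first
print(reciprocal_rank_fusion([bm25_hits, semantic_hits]))
# → ['b', 'a', 'd', 'c']
```

Documents that both searches agree on ("a" and "b") rise to the top, while "c" and "d" each survive even though only one search found them. Because RRF uses ranks rather than raw scores, the BM25 and embedding scores never need to be put on the same scale.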
Step 5: How Memory Actually Works
AI Engram stores memories as JSONL entries (append-only, easy to inspect, easy to recover). Each memory has an ID, category, content, tags, timestamp, and source. Six categories:
| Category | Use Case |
| --- | --- |
| decision | Architectural choices, workflow rules, rejected approaches |
| preference | Tool choices, formatting styles, workflow preferences |
| insight | Key learnings, patterns discovered, breakthroughs |
| context | Background information, project state, environment details |
| task | Completed work, milestones, deliverables |
| note | General purpose, anything worth persisting |
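To show why append-only JSONL is easy to inspect and recover, here is a small sketch of a store with the same six fields and six categories. The function names echo the remember and list_memories tools, but this is my own illustration and not AI Engram's implementation; the exact key names, the ID format, and the ISO 8601 timestamp are all assumptions.

```python
import json
import uuid
from datetime import datetime, timezone
from pathlib import Path

CATEGORIES = {"decision", "preference", "insight", "context", "task", "note"}


def remember(store: Path, content: str, category: str = "note",
             tags=None, source: str = "conversation") -> dict:
    """Append one memory to the JSONL file and return the stored entry."""
    if category not in CATEGORIES:
        raise ValueError(f"unknown category: {category!r}")
    entry = {
        "id": uuid.uuid4().hex[:8],  # assumed ID format
        "category": category,
        "content": content,
        "tags": list(tags or []),
        "timestamp": datetime.now(timezone.utc).isoformat(),  # assumed format
        "source": source,
    }
    # Append-only: existing lines are never rewritten.
    with store.open("a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")
    return entry


def list_memories(store: Path, category=None) -> list:
    """Return stored memories, newest first, optionally filtered by category."""
    if not store.exists():
        return []
    lines = store.read_text(encoding="utf-8").splitlines()
    entries = [json.loads(line) for line in lines if line.strip()]
    if category is not None:
        entries = [e for e in entries if e["category"] == category]
    return sorted(entries, key=lambda e: e["timestamp"], reverse=True)
```

Each memory is one line of JSON, so you can read the file in any text editor, and a damaged line costs you one memory rather than the whole store.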
The Context Memory Protocol works like this:
At conversation start, Claude calls recall_all with a relevant query, then list_memories with category “decision” to load workflow decisions from past sessions.
During conversation, Claude automatically stores decisions, preferences, completed tasks, important context, insights, and notes using the remember tool.
The result: every conversation builds on every conversation before it. Decisions persist. Preferences stick. Insights compound.
The Final Product
Your Obsidian Vault now contains:
your-vault/
├── raw/ # Source documents (immutable)
├── wiki/ # Evolving knowledge base
│ ├── index.md # Navigation hub
│ ├── log.md # Append-only action log
│ └── [domain folders] # Topic-organized wiki pages
├── outputs/ # Generated reports
├── AGENTS.md # Knowledge base schema
├── .aiengram_memory.jsonl # Persistent conversation memory
└── .aiengram_cache.pkl # Semantic embedding cache
Claude reads your wiki. Claude searches your files with hybrid BM25+semantic search. Claude remembers every decision across every session. Your knowledge base compounds. Your memory persists.
Where This System Breaks
Context window ceiling. Beyond roughly 100 articles or 400K words, selective reading via the index introduces blind spots. Claude reads the index first and may miss relevant pages further down.
Error compounding. The AI writes a subtle mistake into your wiki. A later query uses that mistake. It files back insights reinforcing the error. This is the compounding downside of a compounding system.
Hallucination persists. Your wiki looks authoritative with citations and structured formatting. But the AI can still synthesize false connections. The structure makes mistakes look more credible.
Cost adds up. Frontier models run $1-2 per ingest operation, so ten sources a day comes to roughly $10-20 a day, or $300-600 over a 30-day month. Cheaper models work for simple updates; reserve frontier models for complex ingestion.
AI Engram requires maintenance. The JSONL memory file grows. Occasionally you need to review, prune, and forget outdated memories. A set-and-forget approach produces the same decay as Layer 1’s unmanaged memory page.
Scaling caps out around 10K sources. This system serves individuals and small teams well. Enterprise-scale knowledge management requires a different architecture.
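For the maintenance point above, pruning the memory file can be partly scripted. This sketch assumes each JSONL line carries a "category" key and an ISO 8601 "timestamp" key; those key names, the function name prune_memories, and the .bak backup are my assumptions, so check them against your own file before running anything like this.

```python
import json
import shutil
from datetime import datetime, timedelta, timezone
from pathlib import Path


def prune_memories(store: Path, max_age_days: int,
                   keep_categories=(), now=None) -> int:
    """Drop memories older than `max_age_days`; return how many were removed.

    Entries whose category is in `keep_categories` are kept regardless of age.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    kept, dropped = [], 0
    for line in store.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        entry = json.loads(line)
        stamp = datetime.fromisoformat(entry["timestamp"])
        if stamp.tzinfo is None:
            stamp = stamp.replace(tzinfo=timezone.utc)  # treat naive times as UTC
        if entry["category"] in keep_categories or stamp >= cutoff:
            kept.append(line)
        else:
            dropped += 1
    # Keep a copy of the unpruned file next to the original before rewriting.
    shutil.copy2(store, store.with_name(store.name + ".bak"))
    store.write_text("".join(item + "\n" for item in kept), encoding="utf-8")
    return dropped
```

For example, prune_memories(path, 180, keep_categories=("decision",)) drops anything older than about six months while keeping every decision. Age is a blunt filter, so it complements a manual review rather than replacing it.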
Which Layer Should You Build?
| Layer | Time | Best For |
| --- | --- | --- |
| 1: Basic Memory | 5 minutes | Everyone. Start here. |
| 2: Context Files | ~60 minutes | Power users with repeatable workflows |
| 3 Option 1: Notion | 5 minutes | People already in Notion who want visual dashboards |
| 3 Option 2: Obsidian + Engram | 1-2 hours | People who want local control, deep search, and persistent memory across sessions |
My recommendation: start at Layer 1 today. Build Layer 2 this week. Graduate to Layer 3 Option 2 when you’re ready to stop repeating yourself across every conversation.
The difference between Claude with default memory and Claude with a second brain is the difference between a goldfish and an elephant. Same fishbowl. Completely different relationship with time.
This article was built from real systems: the LLM Knowledge Base architecture (covered in detail at archonhq.ai) and AI Engram (github.com/MikeS071/ai-engram), an open-source MCP server for persistent AI memory. Both run locally. Both compound. Go build yours.


