NeuralMemory FAQ¶
Installation¶
Q: What does pip install neural-memory install?¶
- Core: pydantic, networkx, python-dateutil, typer, aiohttp, aiosqlite, rich
- CLI tools: nmem, neural-memory, nmem-mcp
- Optional extras:
  - [server] — FastAPI + Uvicorn
  - [neo4j] — Neo4j graph database
  - [nlp-en] — English NLP (spaCy)
  - [nlp-vi] — Vietnamese NLP (underthesea, pyvi)
  - [all] — All of the above
  - [dev] — Development tools (pytest, ruff, mypy, etc.)
Q: pip not working on Windows?¶
Use python -m pip instead:
python -m pip install neural-memory
Q: How to install from source?¶
git clone https://github.com/nhadaututtheky/neural-memory.git
cd neural-memory
pip install -e ".[all,dev]"
The -e flag enables editable mode — code changes take effect immediately without reinstalling.
Q: Too many commands — is there a simpler UI/UX approach?¶
Yes. If you use VS Code, the NeuralMemory extension provides a full GUI — no terminal commands needed:
- Encode memory: Select text → Ctrl+Shift+M E
- Query memory: Ctrl+Shift+M Q → type your question
- Start/stop server: Ctrl+Shift+P → NeuralMemory: Start Server
- Switch brain: Ctrl+Shift+P → NeuralMemory: Switch Brain
- View graph: Ctrl+Shift+P → NeuralMemory: Open Graph View
The sidebar panel also shows neurons, fibers, and brain stats at a glance.
Q: How to update to the latest version?¶
If installed from source:
git pull
pip install -e ".[all,dev]"
VS Code Extension¶
Q: How to use NeuralMemory without writing Python code?¶
Install the VS Code extension:
- Open VS Code
- Go to Extensions (Ctrl+Shift+X)
- Search NeuralMemory
- Click Install

Note: The extension still requires Python and the neural-memory package installed on the machine as its backend.
Q: Extension not showing data?¶
- Start the server: Ctrl+Shift+P → NeuralMemory: Start Server
- Switch brain if needed: Ctrl+Shift+P → NeuralMemory: Switch Brain → select your brain
- Click refresh
Q: Server running on a different port than the extension expects?¶
The extension defaults to port 8000. If your server runs on a different port, update the setting:
- Open VS Code Settings (Ctrl+,)
- Search neuralmemory.serverUrl
- Set it to your server's URL (e.g. http://127.0.0.1:8080)
Q: Server is running, correct port set, but extension still shows nothing?¶
A new brain starts empty — there is no data to display yet. You need to encode at least one memory first:
- Select any text in your editor
- Press Ctrl+Shift+M E to encode it
- Click refresh in the NeuralMemory sidebar
After encoding, neurons and fibers will appear in the sidebar.
Q: I opened the server URL in the browser but only see JSON info, not my data?¶
The root URL (/) only shows basic API info. To view your data:
- Graph visualization (UI): go to /ui (e.g. http://127.0.0.1:8000/ui)
- API documentation (Swagger): go to /docs (e.g. http://127.0.0.1:8000/docs)
- Neurons list (API): go to /memory/neurons (requires the X-Brain-ID header)
The VS Code sidebar is the main way to browse your neurons and fibers.
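As a concrete illustration, the neurons endpoint can also be called from Python. The base URL and the brain name "default" below are assumptions; the X-Brain-ID header is the one named above:

```python
import urllib.request

# Build a request against the /memory/neurons endpoint described above.
# "default" is an assumed brain name; X-Brain-ID is the required header.
def neurons_request(base_url: str, brain_id: str) -> urllib.request.Request:
    return urllib.request.Request(
        f"{base_url}/memory/neurons",
        headers={"X-Brain-ID": brain_id},
    )

# With a server running, urllib.request.urlopen(neurons_request(...)) would
# return the JSON list of neurons.
```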
Per-Project Configuration¶
Q: How do I keep memories from different projects separate?¶
You don't need to do anything. NeuralMemory auto-detects which git repo and branch you're in, then tags your memories accordingly. When you recall, it prioritizes memories from your current context.
Just use NeuralMemory normally — the right memories surface for the right project.
Q: What if I want complete separation between projects?¶
Think of a brain as a separate notebook. By default, everything goes into one notebook called default. If you want a completely separate notebook for a different project, just tell the AI:
"Create a new brain called work-api and switch to it"
To switch back later:
"Switch to my default brain"
Memories in work-api and default never mix.
Q: Can I make Claude Code auto-pick the right brain per project?¶
Yes. Drop a .mcp.json file in each project root:
// ~/projects/work-api/.mcp.json
{
"mcpServers": {
"neuralmemory": {
"command": "python",
"args": ["-m", "neural_memory.mcp"],
"env": { "NEURALMEMORY_BRAIN": "work-api" }
}
}
}
Now when you open that project in Claude Code, it automatically uses the work-api brain.
Q: Can I save project rules that the AI always follows?¶
Yes. Just tell the AI in plain language:
"Remember: this project uses PostgreSQL, never SQLite"
"Remember: all API responses must use the {data, error, meta} format"
These are saved in your current brain and automatically surface when relevant — you don't need to know any commands.
For bigger project context, just say:
"Save this project context: name is work-api, tech stack is Python, FastAPI, PostgreSQL"
Next session, say "recap" or "what were we working on?" and everything reloads instantly.
Database Training¶
Q: Can NeuralMemory learn from my database?¶
Yes. The nmem_train_db tool teaches your brain to understand database structure:
This extracts schema knowledge (tables, relationships, patterns) — NOT raw data rows.
Q: Which databases are supported?¶
SQLite only in v1. The architecture supports adding PostgreSQL and MySQL dialects in the future.
Q: Will it read my actual data?¶
No. DB-to-Brain opens the database in read-only mode and only reads PRAGMA metadata (table definitions, foreign keys, indexes). No SELECT queries are run on your data tables (except COUNT(*) for row count estimates).
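The read-only, metadata-only access described above can be sketched with the standard sqlite3 module. This is an illustration of the approach, not NeuralMemory's actual code:

```python
import sqlite3

# Sketch of schema-only extraction: open the database read-only, read
# PRAGMA metadata, and never SELECT from user data tables.
def extract_schema(db_path: str) -> dict:
    conn = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
    try:
        tables = [r[0] for r in conn.execute(
            "SELECT name FROM sqlite_master WHERE type='table'")]
        schema = {}
        for t in tables:
            cols = conn.execute(f"PRAGMA table_info({t})").fetchall()
            fks = conn.execute(f"PRAGMA foreign_key_list({t})").fetchall()
            schema[t] = {
                # table_info rows: (cid, name, type, notnull, dflt, pk)
                "columns": [c[1] for c in cols],
                # foreign_key_list rows: (id, seq, table, from, to, ...)
                "foreign_keys": [(fk[2], fk[3], fk[4]) for fk in fks],
            }
        return schema
    finally:
        conn.close()
```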
Q: How do I check what schema knowledge is stored?¶
Use the nmem_stats MCP tool (or ask the agent for brain stats) — it shows how many schema entities have been trained into the current brain.
AI Agent Skills¶
Q: What are NeuralMemory skills?¶
Composable AI agent workflows that follow the ship-faster SKILL.md pattern. They give Claude Code structured methods for memory management:
| Skill | What it does |
|---|---|
| memory-intake | Converts messy notes into well-typed, tagged memories via 1-question-at-a-time clarification |
| memory-audit | 6-dimension quality review (purity, freshness, coverage, clarity, relevance, structure) with A-F grading |
| memory-evolution | Evidence-based optimization — consolidation, enrichment, pruning, tag normalization |
Q: How do I install the skills?¶
Copy from the NeuralMemory repo to your Claude Code skills directory:
# Linux/macOS
cp -r skills/memory-* ~/.claude/skills/
# Windows
xcopy /E /I skills\memory-intake %USERPROFILE%\.claude\skills\memory-intake
xcopy /E /I skills\memory-audit %USERPROFILE%\.claude\skills\memory-audit
xcopy /E /I skills\memory-evolution %USERPROFILE%\.claude\skills\memory-evolution
Q: How do I use them?¶
In Claude Code, invoke by name:
/memory-intake "Meeting notes: decided on Redis caching, Bob handles migration, deadline Friday"
/memory-audit
/memory-evolution "Focus on the auth topic"
Q: Do they require anything besides NeuralMemory?¶
Just the NeuralMemory MCP server configured in Claude Code. The skills use existing MCP tools (nmem_remember, nmem_recall, nmem_stats, nmem_health, etc.) — no new dependencies.
Data & Multi-tool Sharing¶
Q: Do I need to install NeuralMemory per project?¶
No. Install once globally and it works for the entire machine:
Data is stored in ~/.neuralmemory/ — not tied to any specific project. All tools (CLI, AntiGravity, Claude Code, VS Code extension) read from the same location.
To separate data per project, use different brains:
Q: How to share brain data between AntiGravity and Claude Code?¶
They already share the same brain automatically. Both read/write to the same files:
- Config: ~/.neuralmemory/config.toml
- Data: ~/.neuralmemory/brains/<name>.db
As long as both tools point to the same current_brain, all memories are synced. Verify with:
Q: How do I let Claude Code query my brain using natural language?¶
Add the NeuralMemory MCP server to ~/.claude.json:
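The config snippet is elided above; based on the per-project .mcp.json example earlier in this FAQ, the global entry likely looks like this (verify against the NeuralMemory docs):

```json
{
  "mcpServers": {
    "neuralmemory": {
      "command": "python",
      "args": ["-m", "neural_memory.mcp"]
    }
  }
}
```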
After restarting Claude Code, just ask naturally:
"What did we decide about the volume spike feature?"
The agent will automatically call the recall tool to search your brain.
NeuralMemory vs RAG¶
Q: I have 100 document files (law, specs, etc.) — should I use NeuralMemory or RAG?¶
RAG/Vector Search is better for document lookup. Here's why:
| Aspect | RAG | NeuralMemory |
|---|---|---|
| Best for | "Find the paragraph about X" | "Why did we decide X?" |
| Query style | Semantic similarity search | Associative recall through graph traversal |
| 100 law docs | Chunks + embeddings → sub-10ms exact match | SimHash + keyword anchors → weaker semantic matching |
| Embedding quality | LLM understands synonyms, paraphrasing | No embeddings — pure algorithmic, may miss paraphrases |
| Returns | Exact text chunks from source docs | Fiber summaries, not original verbatim text |
| Cost | ~$0.02/1K queries (embedding API) | $0.00 — fully offline |
When to use which:
| Scenario | Winner |
|---|---|
| "What does Article 42 say?" (document lookup) | RAG |
| "Which law covers remote work?" (search) | RAG |
| "Why did the team choose Article 42 over 43?" (decision chain) | NeuralMemory |
| "What pattern do we see in deploy failures?" (habit detection) | NeuralMemory |
| "What did Alice suggest in yesterday's meeting?" (episodic recall) | NeuralMemory |
Can I use both? Yes — and that's often the best approach:
- RAG for document corpus search (laws, specs, API docs)
- NeuralMemory for agent memory (decisions, errors, workflows, context across sessions)
They complement each other. RAG answers "what does the document say?", NeuralMemory answers "what did we learn and decide?"
Q: What about nmem_train for documents — doesn't that compete with RAG?¶
nmem_train ingests documents into the neural graph as permanent (pinned) knowledge. It works well for:
- Project docs you reference occasionally
- Onboarding guides where relationships matter
- Small doc sets (<20 files) where graph traversal adds value
For large document corpus search (100+ files, legal compliance, exact text retrieval), RAG with a proper vector database (Chroma, Pinecone, Weaviate) will outperform NeuralMemory significantly. NeuralMemory's strength is associative recall and relationship tracing, not semantic text search.
Q: Is NeuralMemory basically a Knowledge Graph database?¶
Close, but no. They share graph structure, but serve fundamentally different purposes:
| Aspect | Knowledge Graph DB (Neo4j, FalkorDB) | NeuralMemory |
|---|---|---|
| Purpose | Static knowledge map — store entities & relationships | Living agent memory — learn from experience |
| Schema | Explicit: entities, relationships defined upfront | Emergent: neurons form connections organically |
| Query | Cypher/GQL — precise graph queries | Spreading activation — signal propagates like a reflex |
| Lifecycle | Data persists unchanged until explicitly modified | Memories decay, strengthen with use (Hebbian learning), consolidate during "sleep" |
| Scale target | Millions of nodes, enterprise workloads | Agent-scale (thousands of neurons), sub-ms latency |
| Behavior | Database — stores what you tell it | Brain — forgets what you don't use, strengthens what you repeat |
In short:
- Knowledge Graph = encyclopedia (static facts, explicit relationships)
- NeuralMemory = biological memory (learns, forgets, associates, consolidates)
NeuralMemory can use a knowledge graph DB as its backend (FalkorDB opt-in is shipped), but the memory system itself adds biological behaviors on top: decay, Hebbian learning, sleep consolidation, spreading activation, and memory stages (episodic → semantic).
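Two of these biological behaviors, exponential decay and Hebbian strengthening, can be sketched in a few lines. The constants are illustrative, not NeuralMemory's actual parameters:

```python
# Toy sketch: unused synapses decay exponentially over time, while each
# co-activation nudges a synapse toward its maximum weight.
def decayed_weight(weight: float, days_idle: float,
                   half_life_days: float = 30.0) -> float:
    # After one half-life of no use, the weight halves.
    return weight * 0.5 ** (days_idle / half_life_days)

def hebbian_update(weight: float, lr: float = 0.1,
                   w_max: float = 1.0) -> float:
    # Repeated use strengthens the connection, saturating at w_max.
    return weight + lr * (w_max - weight)
```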
Technical & Architecture¶
Q: Does NeuralMemory use LLMs or embeddings?¶
No. NeuralMemory does not call any LLM or embedding API for encoding or retrieval. This is a common misconception.
| Operation | How it works |
|---|---|
| Encoding | Rule-based decomposition: keyword extraction, regex patterns, entity detection, relation extraction |
| Retrieval | Spreading activation on a neural graph — signal propagates through weighted synapses |
| Ranking | Multi-factor scoring: activation level, freshness, frequency, conductivity, memory stage |
This means:
- Zero API cost — no OpenAI/Anthropic calls for memory operations
- Zero hallucination risk at the retrieval layer — what you stored is what you get back
- Fully offline — works without internet after install
- Deterministic — same query on same brain = same results (no model randomness)
The only AI involvement is the agent itself (Claude, GPT, etc.) deciding when to call NeuralMemory tools. The memory system is pure algorithmic graph traversal.
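To make "spreading activation" concrete, here is a toy version: seed neurons start fully active and the signal attenuates as it crosses weighted synapses, stopping below a threshold. The decay factor and threshold are illustrative, not NeuralMemory's:

```python
# graph maps each node to a list of (neighbor, synapse_weight) pairs.
def spread(graph, seeds, decay=0.5, threshold=0.05):
    activation = {n: 1.0 for n in seeds}   # seeds start fully active
    frontier = list(seeds)
    while frontier:
        nxt = []
        for node in frontier:
            for neighbor, weight in graph.get(node, []):
                # Signal attenuates by synapse weight and a per-hop decay.
                signal = activation[node] * weight * decay
                if signal > threshold and signal > activation.get(neighbor, 0.0):
                    activation[neighbor] = signal
                    nxt.append(neighbor)
        frontier = nxt
    return activation
```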
Q: What are the performance benchmarks?¶
NeuralMemory includes reproducible benchmarks. Key results (median, 10 runs each):
Activation Engine (synthetic graphs):
| Graph Size | Hybrid Latency | Speedup vs Classic |
|---|---|---|
| 100 neurons | 0.34ms | 4.6x faster |
| 1,000 neurons | 0.40ms | 5.5x faster |
| 5,000 neurons | 0.45ms | 5.1x faster |
Full Pipeline (15 memories, 5 queries):
| Query | Latency | Confidence |
|---|---|---|
| "What did Alice suggest?" | 1.6ms | 1.0 |
| "Why did we choose PostgreSQL?" | 0.9ms | 1.0 |
Sub-millisecond retrieval even at 5,000 neurons. See full benchmarks for methodology.
To regenerate on your hardware:
Q: How does consolidation work? When should I run it?¶
Consolidation maintains brain health through three strategies:
| Strategy | What it does | When to use |
|---|---|---|
| mature | Advances episodic memories to semantic stage | Weekly — promotes stable patterns |
| merge | Combines overlapping fibers, prunes weak synapses | When brain grows large (1000+ fibers) |
| full | All strategies + topic summarization | Monthly deep cleanup |
# Recommended schedule
nmem consolidate --strategy mature # Weekly
nmem consolidate --strategy full # Monthly
# Check what would be affected first
nmem brain health
Important: Consolidation never deletes memories outright. It:
- Prunes weak synapses (connections with very low weight) — the neurons remain
- Merges overlapping fibers — combines near-duplicate memory traces
- Summarizes topic clusters — creates summary neurons linking related memories
If you're unsure, run nmem brain health first to see diagnostics and recommendations.
Q: How does multi-device sync handle conflicts?¶
NeuralMemory uses a local-first architecture with hub-and-spoke sync:
Conflict resolution strategies:
| Strategy | Behavior |
|---|---|
| prefer_recent | Most recently modified memory wins (default) |
| prefer_local | Local device always wins |
| prefer_remote | Hub version always wins |
| prefer_stronger | Memory with higher activation/confidence wins |
Offline support: Full. Each device has a complete SQLite brain. Sync is incremental — only changed fibers/neurons are transmitted. Changes queue locally and sync when connectivity resumes.
Concurrent writes: Each device tracks a vector clock. When two devices modify the same neuron, the configured strategy resolves the conflict automatically. No manual merge required.
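The vector-clock comparison can be sketched as follows. This is the generic textbook formulation, not NeuralMemory's implementation:

```python
# Each clock maps a device id to a counter of how many changes it has made.
def compare(a: dict, b: dict) -> str:
    keys = set(a) | set(b)
    a_ge = all(a.get(k, 0) >= b.get(k, 0) for k in keys)
    b_ge = all(b.get(k, 0) >= a.get(k, 0) for k in keys)
    if a_ge and b_ge:
        return "equal"
    if a_ge:
        return "a_newer"       # a causally follows b
    if b_ge:
        return "b_newer"       # b causally follows a
    return "concurrent"        # conflict: resolved by the configured strategy
```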
# Configure sync
nmem sync config set --hub-url https://your-hub:8000 --strategy prefer_recent
# Manual sync
nmem sync push # Send local changes
nmem sync pull # Get remote changes
nmem sync full # Bidirectional
Q: Is NeuralMemory production-ready?¶
NeuralMemory is designed for AI agent memory — not as a general-purpose database. Here's an honest assessment:
| Aspect | Status |
|---|---|
| Test suite | 584+ tests, 70%+ coverage enforced by CI |
| Security | Input validation, ReDoS protection, activation queue caps, sensitive content detection |
| Stability | 51+ releases, used daily by the maintainers in production AI workflows |
| Scalability | Tested up to 5,000 neurons with sub-ms latency; designed for agent-scale data, not big data |
| Third-party audit | Not yet — contributions welcome |
| SLA | None — this is open-source MIT software |
| Contributors | Small team — bus factor risk mitigated by clean architecture + MIT license |
Good fit for:
- AI coding assistants (Claude Code, Cursor, Windsurf)
- Personal knowledge management
- Small team agent workflows
- Research and experimentation
Not designed for:
- Enterprise databases with SLA requirements
- High-throughput write-heavy workloads (100K+ writes/sec)
- Multi-tenant SaaS platforms
- Replacing production search infrastructure
Q: How does NeuralMemory handle concurrent access from multiple agents?¶
SQLite with WAL (Write-Ahead Logging) mode enables:
- Multiple concurrent readers — agents can query simultaneously
- Single writer at a time — writes are serialized by SQLite's locking
- No corruption risk — WAL mode prevents reader-writer conflicts
For multi-agent write scenarios, NeuralMemory uses a deferred write queue — non-critical writes (Hebbian weight updates, conductivity changes) are batched and flushed after the response, reducing lock contention.
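The deferred write queue can be sketched like this; the class and method names are illustrative, not NeuralMemory's API:

```python
# Non-critical writes are buffered and flushed in one batch after the
# response, so they never hold the SQLite write lock mid-request.
class DeferredWrites:
    def __init__(self):
        self._pending = []

    def defer(self, fn, *args):
        # Queue a write (e.g. a Hebbian weight update) without executing it.
        self._pending.append((fn, args))

    def flush(self):
        # Execute all queued writes in one batch; returns how many ran.
        batch, self._pending = self._pending, []
        for fn, args in batch:
            fn(*args)
        return len(batch)
```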
If you need true multi-writer concurrency across processes, use the FastAPI server as a central write coordinator.