NeuralMemory FAQ¶
Installation¶
Q: What does pip install neural-memory install?¶
- Core: pydantic, networkx, python-dateutil, typer, aiohttp, aiosqlite, rich
- CLI tools: nmem, neural-memory, nmem-mcp
- Optional extras:
  - [server] — FastAPI + Uvicorn
  - [neo4j] — Neo4j graph database
  - [nlp-en] — English NLP (spaCy)
  - [nlp-vi] — Vietnamese NLP (underthesea, pyvi)
  - [all] — All of the above
  - [dev] — Development tools (pytest, ruff, mypy, etc.)
Q: pip not working on Windows?¶
Use python -m pip instead:
python -m pip install neural-memory
Q: How to install from source?¶
git clone https://github.com/nhadaututtheky/neural-memory.git
cd neural-memory
pip install -e ".[all,dev]"
The -e flag enables editable mode — code changes take effect immediately without reinstalling.
Q: Too many commands — is there a simpler UI/UX approach?¶
Yes. If you use VS Code, the NeuralMemory extension provides a full GUI — no terminal commands needed:
- Encode memory: Select text → Ctrl+Shift+M E
- Query memory: Ctrl+Shift+M Q → type your question
- Start/stop server: Ctrl+Shift+P → NeuralMemory: Start Server
- Switch brain: Ctrl+Shift+P → NeuralMemory: Switch Brain
- View graph: Ctrl+Shift+P → NeuralMemory: Open Graph View
The sidebar panel also shows neurons, fibers, and brain stats at a glance.
Q: How to update to the latest version?¶
If installed from source:
git pull
pip install -e ".[all,dev]"
VS Code Extension¶
Q: How to use NeuralMemory without writing Python code?¶
Install the VS Code extension:
- Open VS Code
- Go to Extensions (Ctrl+Shift+X)
- Search NeuralMemory
- Click Install

Note: The extension still requires Python and the neural-memory package installed on the machine as its backend.
Q: Extension not showing data?¶
- Start the server: Ctrl+Shift+P → NeuralMemory: Start Server
- Switch brain if needed: Ctrl+Shift+P → NeuralMemory: Switch Brain → select your brain
- Click refresh
Q: Server running on a different port than the extension expects?¶
The extension defaults to port 8000. If your server runs on a different port, update the setting:
- Open VS Code Settings (Ctrl+,)
- Search neuralmemory.serverUrl
- Set it to your server's URL (e.g. http://127.0.0.1:8080)
Q: Server is running, correct port set, but extension still shows nothing?¶
A new brain starts empty — there is no data to display yet. You need to encode at least one memory first:
- Select any text in your editor
- Press Ctrl+Shift+M E to encode it
- Click refresh in the NeuralMemory sidebar
After encoding, neurons and fibers will appear in the sidebar.
Q: I opened the server URL in the browser but only see JSON info, not my data?¶
The root URL (/) only shows basic API info. To view your data:
- Graph visualization (UI): go to /ui (e.g. http://127.0.0.1:8000/ui)
- API documentation (Swagger): go to /docs (e.g. http://127.0.0.1:8000/docs)
- Neurons list (API): go to /memory/neurons (requires the X-Brain-ID header)
The VS Code sidebar is the main way to browse your neurons and fibers.
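As a concrete illustration, the neurons endpoint can also be called from Python. The base URL and the brain name "default" below are assumptions; the X-Brain-ID header is the one named above:

```python
import urllib.request

# Build a request against the /memory/neurons endpoint described above.
# "default" is an assumed brain name; X-Brain-ID is the required header.
def neurons_request(base_url: str, brain_id: str) -> urllib.request.Request:
    return urllib.request.Request(
        f"{base_url}/memory/neurons",
        headers={"X-Brain-ID": brain_id},
    )

# With a server running, urllib.request.urlopen(neurons_request(...)) would
# return the JSON list of neurons.
```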
Per-Project Configuration¶
Q: How do I keep memories from different projects separate?¶
You don't need to do anything. NeuralMemory auto-detects which git repo and branch you're in, then tags your memories accordingly. When you recall, it prioritizes memories from your current context.
Just use NeuralMemory normally — the right memories surface for the right project.
Q: What if I want complete separation between projects?¶
Think of a brain as a separate notebook. By default, everything goes into one notebook called default. If you want a completely separate notebook for a different project, just tell the AI:
"Create a new brain called work-api and switch to it"
To switch back later:
"Switch to my default brain"
Memories in work-api and default never mix.
Q: Can I make Claude Code auto-pick the right brain per project?¶
Yes. Drop a .mcp.json file in each project root:
// ~/projects/work-api/.mcp.json
{
"mcpServers": {
"neuralmemory": {
"command": "python",
"args": ["-m", "neural_memory.mcp"],
"env": { "NEURALMEMORY_BRAIN": "work-api" }
}
}
}
Now when you open that project in Claude Code, it automatically uses the work-api brain.
Q: Can I save project rules that the AI always follows?¶
Yes. Just tell the AI in plain language:
"Remember: this project uses PostgreSQL, never SQLite"
"Remember: all API responses must use the {data, error, meta} format"
These are saved in your current brain and automatically surface when relevant — you don't need to know any commands.
For bigger project context, just say:
"Save this project context: name is work-api, tech stack is Python, FastAPI, PostgreSQL"
Next session, say "recap" or "what were we working on?" and everything reloads instantly.
Database Training¶
Q: Can NeuralMemory learn from my database?¶
Yes. The nmem_train_db tool teaches your brain to understand database structure:
This extracts schema knowledge (tables, relationships, patterns) — NOT raw data rows.
Q: Which databases are supported?¶
SQLite only in v1. The architecture supports adding PostgreSQL and MySQL dialects in the future.
Q: Will it read my actual data?¶
No. DB-to-Brain opens the database in read-only mode and only reads PRAGMA metadata (table definitions, foreign keys, indexes). No SELECT queries are run on your data tables (except COUNT(*) for row count estimates).
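The read-only, metadata-only access described above can be sketched with the standard sqlite3 module. This is an illustration of the approach, not NeuralMemory's actual code:

```python
import sqlite3

# Sketch of schema-only extraction: open the database read-only, read
# PRAGMA metadata, and never SELECT from user data tables.
def extract_schema(db_path: str) -> dict:
    conn = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
    try:
        tables = [r[0] for r in conn.execute(
            "SELECT name FROM sqlite_master WHERE type='table'")]
        schema = {}
        for t in tables:
            cols = conn.execute(f"PRAGMA table_info({t})").fetchall()
            fks = conn.execute(f"PRAGMA foreign_key_list({t})").fetchall()
            schema[t] = {
                # table_info rows: (cid, name, type, notnull, dflt, pk)
                "columns": [c[1] for c in cols],
                # foreign_key_list rows: (id, seq, table, from, to, ...)
                "foreign_keys": [(fk[2], fk[3], fk[4]) for fk in fks],
            }
        return schema
    finally:
        conn.close()
```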
Q: How do I check what schema knowledge is stored?¶
Use the nmem_stats MCP tool (or ask the agent for brain stats) — it shows how many schema entities have been trained into the current brain.
AI Agent Skills¶
Q: What are NeuralMemory skills?¶
Composable AI agent workflows that follow the ship-faster SKILL.md pattern. They give Claude Code structured methods for memory management:
| Skill | What it does |
|---|---|
| memory-intake | Converts messy notes into well-typed, tagged memories via 1-question-at-a-time clarification |
| memory-audit | 6-dimension quality review (purity, freshness, coverage, clarity, relevance, structure) with A-F grading |
| memory-evolution | Evidence-based optimization — consolidation, enrichment, pruning, tag normalization |
Q: How do I install the skills?¶
Copy from the NeuralMemory repo to your Claude Code skills directory:
# Linux/macOS
cp -r skills/memory-* ~/.claude/skills/
# Windows
xcopy /E /I skills\memory-intake %USERPROFILE%\.claude\skills\memory-intake
xcopy /E /I skills\memory-audit %USERPROFILE%\.claude\skills\memory-audit
xcopy /E /I skills\memory-evolution %USERPROFILE%\.claude\skills\memory-evolution
Q: How do I use them?¶
In Claude Code, invoke by name:
/memory-intake "Meeting notes: decided on Redis caching, Bob handles migration, deadline Friday"
/memory-audit
/memory-evolution "Focus on the auth topic"
Q: Do they require anything besides NeuralMemory?¶
Just the NeuralMemory MCP server configured in Claude Code. The skills use existing MCP tools (nmem_remember, nmem_recall, nmem_stats, nmem_health, etc.) — no new dependencies.
Data & Multi-tool Sharing¶
Q: Do I need to install NeuralMemory per project?¶
No. Install once globally and it works for the entire machine:
Data is stored in ~/.neuralmemory/ — not tied to any specific project. All tools (CLI, AntiGravity, Claude Code, VS Code extension) read from the same location.
To separate data per project, use different brains:
Q: How to share brain data between AntiGravity and Claude Code?¶
They already share the same brain automatically. Both read/write to the same files:
- Config: ~/.neuralmemory/config.toml
- Data: ~/.neuralmemory/brains/<name>.db
As long as both tools point to the same current_brain, all memories are synced. Verify with:
Q: How do I let Claude Code query my brain using natural language?¶
Add the NeuralMemory MCP server to ~/.claude.json:
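The config snippet is elided above; based on the per-project .mcp.json example earlier in this FAQ, the global entry likely looks like this (verify against the NeuralMemory docs):

```json
{
  "mcpServers": {
    "neuralmemory": {
      "command": "python",
      "args": ["-m", "neural_memory.mcp"]
    }
  }
}
```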
After restarting Claude Code, just ask naturally:
"What did we decide about the volume spike feature?"
The agent will automatically call the recall tool to search your brain.
NeuralMemory vs RAG¶
Q: I have 100 document files (law, specs, etc.) — should I use NeuralMemory or RAG?¶
RAG/Vector Search is better for document lookup. Here's why:
| Aspect | RAG | NeuralMemory |
|---|---|---|
| Best for | "Find the paragraph about X" | "Why did we decide X?" |
| Query style | Semantic similarity search | Associative recall through graph traversal |
| 100 law docs | Chunks + embeddings → sub-10ms exact match | SimHash + keyword anchors → weaker semantic matching |
| Embedding quality | LLM understands synonyms, paraphrasing | No embeddings — pure algorithmic, may miss paraphrases |
| Returns | Exact text chunks from source docs | Fiber summaries, not original verbatim text |
| Cost | ~$0.02/1K queries (embedding API) | $0.00 — fully offline |
When to use which:
| Scenario | Winner |
|---|---|
| "What does Article 42 say?" (document lookup) | RAG |
| "Which law covers remote work?" (search) | RAG |
| "Why did the team choose Article 42 over 43?" (decision chain) | NeuralMemory |
| "What pattern do we see in deploy failures?" (habit detection) | NeuralMemory |
| "What did Alice suggest in yesterday's meeting?" (episodic recall) | NeuralMemory |
Can I use both? Yes — and that's often the best approach:
- RAG for document corpus search (laws, specs, API docs)
- NeuralMemory for agent memory (decisions, errors, workflows, context across sessions)
They complement each other. RAG answers "what does the document say?", NeuralMemory answers "what did we learn and decide?"
Q: What about nmem_train for documents — doesn't that compete with RAG?¶
nmem_train ingests documents into the neural graph as permanent (pinned) knowledge. It works well for:
- Project docs you reference occasionally
- Onboarding guides where relationships matter
- Small doc sets (<20 files) where graph traversal adds value
For large document corpus search (100+ files, legal compliance, exact text retrieval), RAG with a proper vector database (Chroma, Pinecone, Weaviate) will outperform NeuralMemory significantly. NeuralMemory's strength is associative recall and relationship tracing, not semantic text search.
Q: Is NeuralMemory basically a Knowledge Graph database?¶
Close, but no. They share graph structure, but serve fundamentally different purposes:
| Aspect | Knowledge Graph DB (Neo4j, FalkorDB) | NeuralMemory |
|---|---|---|
| Purpose | Static knowledge map — store entities & relationships | Living agent memory — learn from experience |
| Schema | Explicit: entities, relationships defined upfront | Emergent: neurons form connections organically |
| Query | Cypher/GQL — precise graph queries | Spreading activation — signal propagates like a reflex |
| Lifecycle | Data persists unchanged until explicitly modified | Memories decay, strengthen with use (Hebbian learning), consolidate during "sleep" |
| Scale target | Millions of nodes, enterprise workloads | Agent-scale (thousands of neurons), sub-ms latency |
| Behavior | Database — stores what you tell it | Brain — forgets what you don't use, strengthens what you repeat |
In short:
- Knowledge Graph = encyclopedia (static facts, explicit relationships)
- NeuralMemory = biological memory (learns, forgets, associates, consolidates)
NeuralMemory can use a knowledge graph DB as its backend (FalkorDB opt-in is shipped), but the memory system itself adds biological behaviors on top: decay, Hebbian learning, sleep consolidation, spreading activation, and memory stages (episodic → semantic).
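Two of these biological behaviors, exponential decay and Hebbian strengthening, can be sketched in a few lines. The constants are illustrative, not NeuralMemory's actual parameters:

```python
# Toy sketch: unused synapses decay exponentially over time, while each
# co-activation nudges a synapse toward its maximum weight.
def decayed_weight(weight: float, days_idle: float,
                   half_life_days: float = 30.0) -> float:
    # After one half-life of no use, the weight halves.
    return weight * 0.5 ** (days_idle / half_life_days)

def hebbian_update(weight: float, lr: float = 0.1,
                   w_max: float = 1.0) -> float:
    # Repeated use strengthens the connection, saturating at w_max.
    return weight + lr * (w_max - weight)
```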
Technical & Architecture¶
Q: Does NeuralMemory use LLMs or embeddings?¶
No. NeuralMemory does not call any LLM or embedding API for encoding or retrieval. This is a common misconception.
| Operation | How it works |
|---|---|
| Encoding | Rule-based decomposition: keyword extraction, regex patterns, entity detection, relation extraction |
| Retrieval | Spreading activation on a neural graph — signal propagates through weighted synapses |
| Ranking | Multi-factor scoring: activation level, freshness, frequency, conductivity, memory stage |
This means:
- Zero API cost — no OpenAI/Anthropic calls for memory operations
- Zero hallucination risk at the retrieval layer — what you stored is what you get back
- Fully offline — works without internet after install
- Deterministic — same query on same brain = same results (no model randomness)
The only AI involvement is the agent itself (Claude, GPT, etc.) deciding when to call NeuralMemory tools. The memory system is pure algorithmic graph traversal.
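To make "spreading activation" concrete, here is a toy version: seed neurons start fully active and the signal attenuates as it crosses weighted synapses, stopping below a threshold. The decay factor and threshold are illustrative, not NeuralMemory's:

```python
# graph maps each node to a list of (neighbor, synapse_weight) pairs.
def spread(graph, seeds, decay=0.5, threshold=0.05):
    activation = {n: 1.0 for n in seeds}   # seeds start fully active
    frontier = list(seeds)
    while frontier:
        nxt = []
        for node in frontier:
            for neighbor, weight in graph.get(node, []):
                # Signal attenuates by synapse weight and a per-hop decay.
                signal = activation[node] * weight * decay
                if signal > threshold and signal > activation.get(neighbor, 0.0):
                    activation[neighbor] = signal
                    nxt.append(neighbor)
        frontier = nxt
    return activation
```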
Q: What are the performance benchmarks?¶
NeuralMemory includes reproducible benchmarks. Key results (median, 10 runs each):
Activation Engine (synthetic graphs):
| Graph Size | Hybrid Latency | Speedup vs Classic |
|---|---|---|
| 100 neurons | 0.34ms | 4.6x faster |
| 1,000 neurons | 0.40ms | 5.5x faster |
| 5,000 neurons | 0.45ms | 5.1x faster |
Full Pipeline (15 memories, 5 queries):
| Query | Latency | Confidence |
|---|---|---|
| "What did Alice suggest?" | 1.6ms | 1.0 |
| "Why did we choose PostgreSQL?" | 0.9ms | 1.0 |
Sub-millisecond retrieval even at 5,000 neurons. See full benchmarks for methodology.
To regenerate on your hardware:
Q: How does consolidation work? When should I run it?¶
Consolidation maintains brain health through three strategies:
| Strategy | What it does | When to use |
|---|---|---|
| mature | Advances episodic memories to semantic stage | Weekly — promotes stable patterns |
| merge | Combines overlapping fibers, prunes weak synapses | When brain grows large (1000+ fibers) |
| full | All strategies + topic summarization | Monthly deep cleanup |
# Recommended schedule
nmem consolidate --strategy mature # Weekly
nmem consolidate --strategy full # Monthly
# Check what would be affected first
nmem brain health
Important: Consolidation never deletes memories outright. It:
- Prunes weak synapses (connections with very low weight) — the neurons remain
- Merges overlapping fibers — combines near-duplicate memory traces
- Summarizes topic clusters — creates summary neurons linking related memories
If you're unsure, run nmem brain health first to see diagnostics and recommendations.
Q: How does multi-device sync handle conflicts?¶
NeuralMemory uses a local-first architecture with hub-and-spoke sync:
Conflict resolution strategies:
| Strategy | Behavior |
|---|---|
| prefer_recent | Most recently modified memory wins (default) |
| prefer_local | Local device always wins |
| prefer_remote | Hub version always wins |
| prefer_stronger | Memory with higher activation/confidence wins |
Offline support: Full. Each device has a complete SQLite brain. Sync is incremental — only changed fibers/neurons are transmitted. Changes queue locally and sync when connectivity resumes.
Concurrent writes: Each device tracks a vector clock. When two devices modify the same neuron, the configured strategy resolves the conflict automatically. No manual merge required.
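The vector-clock comparison can be sketched as follows. This is the generic textbook formulation, not NeuralMemory's implementation:

```python
# Each clock maps a device id to a counter of how many changes it has made.
def compare(a: dict, b: dict) -> str:
    keys = set(a) | set(b)
    a_ge = all(a.get(k, 0) >= b.get(k, 0) for k in keys)
    b_ge = all(b.get(k, 0) >= a.get(k, 0) for k in keys)
    if a_ge and b_ge:
        return "equal"
    if a_ge:
        return "a_newer"       # a causally follows b
    if b_ge:
        return "b_newer"       # b causally follows a
    return "concurrent"        # conflict: resolved by the configured strategy
```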
# Configure sync
nmem sync config set --hub-url https://your-hub:8000 --strategy prefer_recent
# Manual sync
nmem sync push # Send local changes
nmem sync pull # Get remote changes
nmem sync full # Bidirectional
Q: Is NeuralMemory production-ready?¶
NeuralMemory is designed for AI agent memory — not as a general-purpose database. Here's an honest assessment:
| Aspect | Status |
|---|---|
| Test suite | 584+ tests, 70%+ coverage enforced by CI |
| Security | Input validation, ReDoS protection, activation queue caps, sensitive content detection |
| Stability | 51+ releases, used daily by the maintainers in production AI workflows |
| Scalability | Tested up to 5,000 neurons with sub-ms latency; designed for agent-scale data, not big data |
| Third-party audit | Not yet — contributions welcome |
| SLA | None — this is open-source MIT software |
| Contributors | Small team — bus factor risk mitigated by clean architecture + MIT license |
Good fit for:
- AI coding assistants (Claude Code, Cursor, Windsurf)
- Personal knowledge management
- Small team agent workflows
- Research and experimentation
Not designed for:
- Enterprise databases with SLA requirements
- High-throughput write-heavy workloads (100K+ writes/sec)
- Multi-tenant SaaS platforms
- Replacing production search infrastructure
Q: How does NeuralMemory handle concurrent access from multiple agents?¶
SQLite with WAL (Write-Ahead Logging) mode enables:
- Multiple concurrent readers — agents can query simultaneously
- Single writer at a time — writes are serialized by SQLite's locking
- No corruption risk — WAL mode prevents reader-writer conflicts
For multi-agent write scenarios, NeuralMemory uses a deferred write queue — non-critical writes (Hebbian weight updates, conductivity changes) are batched and flushed after the response, reducing lock contention.
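The deferred write queue can be sketched like this; the class and method names are illustrative, not NeuralMemory's API:

```python
# Non-critical writes are buffered and flushed in one batch after the
# response, so they never hold the SQLite write lock mid-request.
class DeferredWrites:
    def __init__(self):
        self._pending = []

    def defer(self, fn, *args):
        # Queue a write (e.g. a Hebbian weight update) without executing it.
        self._pending.append((fn, args))

    def flush(self):
        # Execute all queued writes in one batch; returns how many ran.
        batch, self._pending = self._pending, []
        for fn, args in batch:
            fn(*args)
        return len(batch)
```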
If you need true multi-writer concurrency across processes, use the FastAPI server as a central write coordinator.