Two layers of persistent memory: a knowledge graph powered by wikilinks, and on-demand vector search over your codebase. Both run locally. No server. No embeddings pipeline.
You have tens of notes, not millions. Vector databases are overkill — and they hide what your agents actually know.
[Diagram: `.md` nodes connected by `[[wikilinks]]`; e.g. `[[this-agent]]` links a piece of knowledge to an agent]
Each memory is a node. Each [[wikilink]] is an edge. Agents read only what's linked to them — 18 entries (~72 tokens) instead of everything (~5000 tokens).
Memories live as `.md` files in `memory-vault/`. Each references the agents it applies to with `[[wikilinks]]`. Before each task, agents grep for their own name and load only those files.
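The lookup above is plain grep over the vault. A minimal sketch, assuming a hypothetical vault layout and the agent name `backend-dev` (both illustrative, not the shipped defaults):

```shell
# Build a tiny example vault (hypothetical filenames and content).
mkdir -p memory-vault
printf 'Always use parameterized queries. [[backend-dev]] [[security]]\n' \
  > memory-vault/sql-injection.md
printf 'Prefer CSS grid over floats. [[frontend-dev]]\n' \
  > memory-vault/layout.md

# Before a task, an agent lists only the notes tagged with its own name.
# Brackets are escaped so grep matches the literal [[wikilink]] text.
grep -rl '\[\[backend-dev\]\]' memory-vault/
```

Only `sql-injection.md` is returned; the frontend note never enters the backend agent's context.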
Layer 1 (wikilinks) is the fast index — always loaded, zero cost. Layer 2 uses a real vector store via MCP, but only when the agent needs deep context. The key difference: you choose the backend, not us.
`tasuki vault sync` indexes your entire project into the vector store. It runs automatically on onboard and whenever agents write new memories.

The backend is configured in `.mcp.json`. Replace `rag-memory-mcp` with Qdrant or pgvector; the agent query stays the same.

The vault is auto-initialized during `tasuki onboard`. Agents write to it when they learn something non-obvious. You can open it in Obsidian.
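As a sketch of what the `.mcp.json` swap might look like: the server name and `command`/`args` below are illustrative assumptions, not the exact shipped config.

```json
{
  "mcpServers": {
    "rag-memory-mcp": {
      "command": "npx",
      "args": ["-y", "rag-memory-mcp"]
    }
  }
}
```

Swapping the backend means replacing that one entry (for example with a `qdrant-mcp` server definition). The agents never see the change, because they talk to the MCP interface, not to the store.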
Memory isn't decorative — it's a feedback loop. Each pipeline run reads from the vault, and Stage 9 writes back.
Each agent greps for `[[its-name]]` before any task, loading ~18 entries (~72 tokens), not 200. Running `tasuki error "used print() not logger"` creates a node AND adds the lesson to the "Do NOT" section of `project-facts.md`. Entry counts are measured with `ls`, not guessed. Without the cap, 6 months of active use would give each agent 200+ entries: context fills with noise, and valuable insights get buried.
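A memory node written after the `print()` incident might look like this. The filename, folder, and field layout are assumptions for illustration; only the `[[wikilink]]` tagging convention comes from the docs above.

```markdown
<!-- memory-vault/lessons/use-logger-not-print.md (hypothetical) -->
Used print() instead of the configured logger in request handlers;
output was lost in production.

Rule: always log through the project logger.

Agents: [[backend-dev]]
```

Because the note carries the `[[backend-dev]]` tag, it surfaces in that agent's pre-task grep and counts toward its 20-entry cap.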
| Decision | Alternative evaluated | Why this option |
|---|---|---|
| Individual .md nodes + wikilinks | Single MEMORY.md per agent with appended lines | Flat files become noise after 50 entries. Individual nodes are navigable. Compatible with Obsidian. |
| Behavioral memory (agent writes it) | PostToolUse hook that auto-extracts learnings | A shell script can't evaluate significance. "Root cause required 3 investigation rounds" needs LLM reasoning, not regex. |
| No RAG/Vector DB by default | ChromaDB or Pinecone from the start | Grep works for tens of notes. Vector DBs are optimized for millions. Over-engineering creates dependencies and infra costs. |
| Max 20 entries per agent | Unlimited entries | Without the limit, 6 months = 200+ entries per agent. Context fills with noise. The limit forces curation. |
| Local SQLite for Layer 2 | External vector API (OpenAI embeddings) | $0, offline, <50ms queries, no API key required. Works in air-gapped environments. |
| Agent-specific filtering ([[wikilinks]]) | Load all memory for every agent | Backend Dev doesn't need Frontend Dev's heuristics. Filtering reduces context from ~5000 tokens to ~700 tokens per agent. |
Numbers from a real Django + PostgreSQL project after onboarding and one pipeline run.
Memories are plain `.md` files. Open `memory-vault/` in Obsidian or any text editor. Edit, delete, add `[[wikilinks]]` to connect nodes. The vault is fully human-readable and human-editable.

If you delete a note, any `[[wikilink]]` pointing to it becomes a dead link, same as in Obsidian. You can run `tasuki vault sync` to update the RAG index after deleting.

The default backend (`rag-memory-mcp`) uses SQLite locally. Everything runs on your machine: no API calls, no cloud, no embeddings service. Works in air-gapped environments.

When the same pattern recurs across `bugs/` or `lessons/`, the agent creates a permanent entry in `heuristics/`. The original episodic entries remain for historical context, but the heuristic becomes the canonical rule. Example: two separate SQL injection bugs become one "always use parameterized queries" heuristic.

To swap backends, edit `.mcp.json`. Replace `rag-memory-mcp` with `qdrant-mcp` or pgvector. The agent query never changes, only the backend. See the Growth Path table above for options.

Each memory is tagged with an agent's `[[wikilink]]`. Backend Dev reads `[[backend-dev]]` entries (~18 memories, ~72 tokens). Security reads `[[security]]` entries. A memory can be tagged with multiple agents if it's relevant to more than one.

Onboard once. The knowledge graph initializes automatically.