Storage & Persistence
Synapses uses SQLite for all persistence, split across two databases with distinct responsibilities.
Dual-Database Design
- Graph DB: the structural representation of the codebase. Rebuilt frequently as files change.
- Knowledge DB: long-lived agent state (tasks, memories, rules, and session data). Accumulates over time and survives full graph rebuilds.
Both databases are opened and managed independently by the store package.
Graph DB Tables
| Table | Purpose |
|---|---|
| nodes | All code entities (functions, structs, files, packages, etc.) |
| edges | Relationships between nodes (calls, contains, imports, implements) |
| manual_edges | User-defined edges that survive re-parsing |
| edge_learned_weights | Dynamically adjusted edge weights from usage patterns |
| call_sites | Unresolved and resolved function call sites |
| file_hashes | Content hashes for change detection |
| node_embeddings | Vector embeddings for semantic search |
| nodes_fts | FTS5 full-text search index over node names and metadata |
Knowledge DB Tables
| Table | Purpose |
|---|---|
| plans | Agent work plans |
| tasks | Individual tasks within plans |
| session_state | Per-session state (current file, active plan, etc.) |
| dynamic_rules | Architectural rules added at runtime |
| violation_log | History of rule violations detected |
| agents | Registered agent identities for coordination |
| events | Event log for agent activity |
| web_cache | Cached web content fetches |
| messages | Agent message history |
| memories | Long-term memory entries |
| episodes | Episodic memory (session-scoped recollections) |
| memory_embeddings | Vector embeddings for memory search |
| memory_anchors | Links between memories and code entities |
| memory_versions | Version history for memory entries |
| annotations | User and agent annotations on code entities |
| gaps | Detected knowledge gaps |
| adr | Architecture Decision Records |
Index Strategy
The edges table is queried more often than any other, so covering indexes are defined on its key column combinations:
- (from_id, type, to_id): outgoing edge lookups
- (to_id, type, from_id): incoming edge lookups (caller queries)
- (type, from_id): type-filtered scans
These covering indexes mean most edge queries are served entirely from the index without touching the main table.
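As a rough illustration, the three indexes could be declared with DDL along the following lines, held here as Go string constants. The index names are assumptions made for this sketch; only the table name and column orderings come from the list above.

```go
package main

import "fmt"

// edgeIndexDDL holds hypothetical DDL for the three covering indexes.
// The index names (idx_edges_*) are illustrative assumptions; the column
// orderings follow the combinations listed in the text.
var edgeIndexDDL = []string{
	// Outgoing edge lookups: WHERE from_id = ? [AND type = ?]
	"CREATE INDEX IF NOT EXISTS idx_edges_out ON edges (from_id, type, to_id)",
	// Incoming edge lookups (caller queries): WHERE to_id = ? [AND type = ?]
	"CREATE INDEX IF NOT EXISTS idx_edges_in ON edges (to_id, type, from_id)",
	// Type-filtered scans: WHERE type = ?
	"CREATE INDEX IF NOT EXISTS idx_edges_type ON edges (type, from_id)",
}

func main() {
	for _, stmt := range edgeIndexDDL {
		fmt.Println(stmt + ";")
	}
}
```

A query such as `SELECT to_id FROM edges WHERE from_id = ? AND type = ?` reads only columns present in the first index, which is the condition under which SQLite can answer from the index alone.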
Graph Serialization
SaveGraph writes the full in-memory graph to the database:
- Clears existing nodes and edges
- Batch-inserts all nodes and edges in a transaction
- Preserves manual edges (they are not cleared)
- Writes file hashes for change detection
LoadGraph reads the database back into the in-memory representation at startup.
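The save steps listed above can be sketched as follows. This is not the actual SaveGraph implementation: the `execer` interface is a simplified stand-in for a database transaction so the example runs without a SQLite driver, and the struct fields, column names, and `recorder` fake are all assumptions made for illustration.

```go
package main

import "fmt"

// execer is a hypothetical, simplified stand-in for a transaction's
// Exec method. A real transaction would also return a result value.
type execer interface {
	Exec(query string, args ...any) error
}

// Node, Edge, and Graph are assumed shapes, not the project's real types.
type Node struct{ ID, Kind, Name string }
type Edge struct{ FromID, Type, ToID string }

type Graph struct {
	Nodes      []Node
	Edges      []Edge
	FileHashes map[string]string // path -> content hash
}

// saveGraph follows the listed steps. It never issues a statement against
// manual_edges, which is how manual edges survive a save.
func saveGraph(tx execer, g Graph) error {
	// Clear the derived tables only.
	for _, q := range []string{"DELETE FROM nodes", "DELETE FROM edges"} {
		if err := tx.Exec(q); err != nil {
			return err
		}
	}
	// Insert all nodes and edges inside the same transaction.
	for _, n := range g.Nodes {
		q := "INSERT INTO nodes (id, kind, name) VALUES (?, ?, ?)"
		if err := tx.Exec(q, n.ID, n.Kind, n.Name); err != nil {
			return err
		}
	}
	for _, e := range g.Edges {
		q := "INSERT INTO edges (from_id, type, to_id) VALUES (?, ?, ?)"
		if err := tx.Exec(q, e.FromID, e.Type, e.ToID); err != nil {
			return err
		}
	}
	// Record file hashes for later change detection.
	for path, hash := range g.FileHashes {
		q := "INSERT OR REPLACE INTO file_hashes (path, hash) VALUES (?, ?)"
		if err := tx.Exec(q, path, hash); err != nil {
			return err
		}
	}
	return nil
}

// recorder is a fake execer that records the statements it receives.
type recorder struct{ queries []string }

func (r *recorder) Exec(query string, args ...any) error {
	r.queries = append(r.queries, query)
	return nil
}

func main() {
	rec := &recorder{}
	g := Graph{
		Nodes:      []Node{{ID: "n1", Kind: "function", Name: "Foo"}},
		Edges:      []Edge{{FromID: "n1", Type: "calls", ToID: "n2"}},
		FileHashes: map[string]string{"a.go": "abc123"},
	}
	if err := saveGraph(rec, g); err != nil {
		panic(err)
	}
	for _, q := range rec.queries {
		fmt.Println(q)
	}
}
```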
SmartReindex
Rather than re-parsing the entire codebase on startup, SmartReindex detects changes from file modification times and confirms them with a content hash:
- Compare each file's modification time against the stored file_hashes entry
- If the mtime changed, hash the content and compare
- Only re-parse files whose content actually changed
- Remove nodes/edges for deleted files
This keeps startup fast even on large codebases, because only changed files pay the parse cost.
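The decision logic described above can be sketched as a pure function over the stored entries and the current files. The type names, the choice of SHA-256, and the treatment of previously unseen files as needing a parse are assumptions for this example, not details confirmed by the project.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// storedEntry is a hypothetical shape for one file_hashes row.
type storedEntry struct {
	ModTime int64  // modification time recorded at the last index
	Hash    string // hex content hash recorded at the last index
}

// fileOnDisk is a hypothetical view of a current file. Read is only
// called when the mtime check cannot rule out a change.
type fileOnDisk struct {
	ModTime int64
	Read    func() []byte
}

// hashContent uses SHA-256 as an assumed hash function.
func hashContent(b []byte) string {
	sum := sha256.Sum256(b)
	return hex.EncodeToString(sum[:])
}

// planReindex returns the paths to re-parse and the paths whose
// nodes and edges should be removed, both sorted.
func planReindex(stored map[string]storedEntry, disk map[string]fileOnDisk) (reparse, removed []string) {
	for path, f := range disk {
		old, known := stored[path]
		if known && old.ModTime == f.ModTime {
			continue // mtime unchanged: skip without reading the file
		}
		if known && old.Hash == hashContent(f.Read()) {
			continue // touched, but the content is identical
		}
		reparse = append(reparse, path) // changed or previously unseen
	}
	for path := range stored {
		if _, ok := disk[path]; !ok {
			removed = append(removed, path) // file was deleted
		}
	}
	sort.Strings(reparse)
	sort.Strings(removed)
	return reparse, removed
}

// content returns a reader that yields a fixed string, for the demo.
func content(s string) func() []byte {
	return func() []byte { return []byte(s) }
}

func main() {
	stored := map[string]storedEntry{
		"a.go": {ModTime: 1, Hash: hashContent([]byte("A"))},
		"b.go": {ModTime: 1, Hash: hashContent([]byte("B"))},
		"c.go": {ModTime: 1, Hash: hashContent([]byte("C"))},
		"d.go": {ModTime: 1, Hash: hashContent([]byte("D"))},
	}
	disk := map[string]fileOnDisk{
		"a.go": {ModTime: 1, Read: content("A")},  // untouched
		"b.go": {ModTime: 2, Read: content("B")},  // touched, same content
		"c.go": {ModTime: 2, Read: content("C2")}, // edited
		"e.go": {ModTime: 1, Read: content("E")},  // new file
	}
	reparse, removed := planReindex(stored, disk)
	fmt.Println("re-parse:", reparse)
	fmt.Println("remove:", removed)
}
```

In this example only c.go and e.go are re-parsed and d.go is removed; b.go is hashed but skipped because its content did not change.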
Vector Search (HNSW)
Both node embeddings and memory embeddings use an HNSW (Hierarchical Navigable Small World) index for approximate nearest-neighbor search. This powers:
- Semantic code search (find nodes similar to a natural language query)
- Memory retrieval (find relevant past experiences for a current context)
The HNSW index is built in-memory from the SQLite-stored vectors at startup.
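HNSW itself is too involved to show in a short example. The sketch below is instead the exact brute-force search that an HNSW index approximates: it scores every stored vector against the query and returns the top matches. Cosine similarity as the metric, and every name in the code, are assumptions for illustration.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// embedding pairs an entity ID with its vector (hypothetical shape).
type embedding struct {
	ID  string
	Vec []float32
}

// cosine returns the cosine similarity of two equal-length vectors,
// or 0 if either vector has zero magnitude.
func cosine(a, b []float32) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += float64(a[i]) * float64(b[i])
		na += float64(a[i]) * float64(a[i])
		nb += float64(b[i]) * float64(b[i])
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// nearest returns the IDs of the k most similar items, best first.
// This is an exact linear scan, not HNSW.
func nearest(query []float32, items []embedding, k int) []string {
	type scored struct {
		id    string
		score float64
	}
	scores := make([]scored, 0, len(items))
	for _, it := range items {
		if len(it.Vec) != len(query) {
			continue // skip vectors with a mismatched dimension
		}
		scores = append(scores, scored{it.ID, cosine(query, it.Vec)})
	}
	sort.SliceStable(scores, func(i, j int) bool {
		return scores[i].score > scores[j].score
	})
	if k < 0 {
		k = 0
	}
	if k > len(scores) {
		k = len(scores)
	}
	ids := make([]string, k)
	for i := range ids {
		ids[i] = scores[i].id
	}
	return ids
}

func main() {
	items := []embedding{
		{ID: "parseFile", Vec: []float32{1, 0, 0}},
		{ID: "saveGraph", Vec: []float32{0, 1, 0}},
		{ID: "loadGraph", Vec: []float32{0, 0.9, 0.1}},
	}
	fmt.Println(nearest([]float32{0, 1, 0}, items, 2))
}
```

A linear scan like this costs time proportional to the number of stored vectors per query, which is the cost an approximate index such as HNSW is meant to avoid.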