Storage & Persistence

Synapses uses SQLite for all persistence, split across two databases with distinct responsibilities.

Dual-Database Design

Graph DB — The structural representation of the codebase. Rebuilt frequently as files change.

Knowledge DB — Long-lived agent state: tasks, memories, rules, and session data. Accumulates over time and survives full graph rebuilds.

Both databases are opened and managed independently by the store package.
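As a rough illustration of the split, the sketch below opens two independent SQLite connections and shows that rebuilding graph data leaves knowledge data untouched. It is written in Python for brevity. The file names `graph.db` and `knowledge.db`, the helpers `open_stores` and `rebuild_graph`, and the toy schemas are assumptions made for this example, not the store package's actual API.

```python
import os
import sqlite3
import tempfile


def open_stores(directory):
    """Open the two databases as separate connections (hypothetical file names)."""
    graph = sqlite3.connect(os.path.join(directory, "graph.db"))
    knowledge = sqlite3.connect(os.path.join(directory, "knowledge.db"))
    # Toy schemas: the real tables carry many more columns.
    graph.execute("CREATE TABLE IF NOT EXISTS nodes (id INTEGER PRIMARY KEY, name TEXT)")
    knowledge.execute("CREATE TABLE IF NOT EXISTS memories (id INTEGER PRIMARY KEY, body TEXT)")
    return graph, knowledge


def rebuild_graph(graph, names):
    """Discard all graph rows and insert a fresh set in one transaction."""
    with graph:
        graph.execute("DELETE FROM nodes")
        graph.executemany("INSERT INTO nodes (name) VALUES (?)", [(n,) for n in names])


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        graph, knowledge = open_stores(tmp)
        with knowledge:
            knowledge.execute("INSERT INTO memories (body) VALUES (?)", ("prefer small PRs",))
        rebuild_graph(graph, ["main", "parse"])
        rebuild_graph(graph, ["main"])  # a second full rebuild
        node_count = graph.execute("SELECT COUNT(*) FROM nodes").fetchone()[0]
        memory_count = knowledge.execute("SELECT COUNT(*) FROM memories").fetchone()[0]
        graph.close()
        knowledge.close()
        return node_count, memory_count
```

Calling `demo()` rebuilds the graph twice while the single memory row stays in place, which is the property the two-database split is meant to guarantee.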

Graph DB Tables

| Table | Purpose |
| --- | --- |
| `nodes` | All code entities (functions, structs, files, packages, etc.) |
| `edges` | Relationships between nodes (calls, contains, imports, implements) |
| `manual_edges` | User-defined edges that survive re-parsing |
| `edge_learned_weights` | Dynamically adjusted edge weights from usage patterns |
| `call_sites` | Unresolved and resolved function call sites |
| `file_hashes` | Content hashes for change detection |
| `node_embeddings` | Vector embeddings for semantic search |
| `nodes_fts` | FTS5 full-text search index over node names and metadata |

Knowledge DB Tables

| Table | Purpose |
| --- | --- |
| `plans` | Agent work plans |
| `tasks` | Individual tasks within plans |
| `session_state` | Per-session state (current file, active plan, etc.) |
| `dynamic_rules` | Architectural rules added at runtime |
| `violation_log` | History of rule violations detected |
| `agents` | Registered agent identities for coordination |
| `events` | Event log for agent activity |
| `web_cache` | Cached web content fetches |
| `messages` | Agent message history |
| `memories` | Long-term memory entries |
| `episodes` | Episodic memory (session-scoped recollections) |
| `memory_embeddings` | Vector embeddings for memory search |
| `memory_anchors` | Links between memories and code entities |
| `memory_versions` | Version history for memory entries |
| `annotations` | User and agent annotations on code entities |
| `gaps` | Detected knowledge gaps |
| `adr` | Architecture Decision Records |

Index Strategy

The `edges` table is queried more often than any other, so covering indexes are defined on its key column combinations:

- `(from_id, type, to_id)`: outgoing edge lookups
- `(to_id, type, from_id)`: incoming edge lookups (caller queries)
- `(type, from_id)`: type-filtered scans

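Because each index holds every column these lookups read, most edge queries are served entirely from the index without touching the main table. A query that also needs a column outside the index still falls back to a table lookup.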
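One way to check this behavior is to ask SQLite for its query plan. The sketch below, in Python with the standard `sqlite3` module, creates a toy `edges` table with the three indexes listed above and inspects `EXPLAIN QUERY PLAN` output. The index names, the column types, and the `weight` column are assumptions for illustration; the real schema may differ.

```python
import sqlite3


def build_edges_db():
    """Create a toy edges table with the three indexes described above."""
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE edges (
            id      INTEGER PRIMARY KEY,
            from_id INTEGER NOT NULL,
            to_id   INTEGER NOT NULL,
            type    TEXT    NOT NULL,
            weight  REAL    NOT NULL DEFAULT 1.0  -- hypothetical column, not in any index
        );
        CREATE INDEX idx_edges_out  ON edges (from_id, type, to_id);
        CREATE INDEX idx_edges_in   ON edges (to_id, type, from_id);
        CREATE INDEX idx_edges_type ON edges (type, from_id);
        """
    )
    return db


def query_plan(db, sql, params=()):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = db.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)


db = build_edges_db()
# Caller lookup: every column it reads is inside idx_edges_in.
callers_plan = query_plan(
    db, "SELECT from_id FROM edges WHERE to_id = ? AND type = ?", (7, "calls")
)
# Same filter, but it also reads a column that no index contains.
weighted_plan = query_plan(
    db, "SELECT weight FROM edges WHERE to_id = ? AND type = ?", (7, "calls")
)
print(callers_plan)
print(weighted_plan)
```

The first plan reports a covering index, while the second uses the same index without the covering label because SQLite has to visit the table row to read `weight`.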

Graph Serialization

SaveGraph writes the full in-memory graph to the database:

- Clears existing nodes and edges
- Batch-inserts all nodes and edges in a transaction
- Preserves manual edges (they are not cleared)
- Writes file hashes for change detection

LoadGraph reads the database back into the in-memory representation at startup.
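A minimal sketch of this save and load cycle, in Python with the standard `sqlite3` module, might look like the following. The column layouts, the `save_graph` and `load_graph` signatures, and the choice to run the clear and the inserts in a single transaction are assumptions for illustration rather than the actual SaveGraph and LoadGraph implementations.

```python
import sqlite3

# Hypothetical, heavily simplified schema.
SCHEMA = """
CREATE TABLE nodes (id TEXT PRIMARY KEY, kind TEXT, name TEXT);
CREATE TABLE edges (from_id TEXT, to_id TEXT, type TEXT);
CREATE TABLE manual_edges (from_id TEXT, to_id TEXT, type TEXT);
CREATE TABLE file_hashes (path TEXT PRIMARY KEY, hash TEXT);
"""


def save_graph(db, nodes, edges, file_hashes):
    """Replace parsed graph contents atomically; manual_edges is never touched."""
    with db:  # one transaction: commit on success, roll back on error
        db.execute("DELETE FROM edges")
        db.execute("DELETE FROM nodes")
        db.executemany("INSERT INTO nodes (id, kind, name) VALUES (?, ?, ?)", nodes)
        db.executemany("INSERT INTO edges (from_id, to_id, type) VALUES (?, ?, ?)", edges)
        db.executemany(
            "INSERT OR REPLACE INTO file_hashes (path, hash) VALUES (?, ?)",
            file_hashes.items(),
        )


def load_graph(db):
    """Read nodes and edges back into plain Python structures."""
    nodes = {
        row[0]: {"kind": row[1], "name": row[2]}
        for row in db.execute("SELECT id, kind, name FROM nodes")
    }
    edges = db.execute("SELECT from_id, to_id, type FROM edges").fetchall()
    return nodes, edges


db = sqlite3.connect(":memory:")
db.executescript(SCHEMA)
db.execute("INSERT INTO manual_edges VALUES ('a', 'b', 'calls')")
db.commit()

save_graph(db, [("a", "function", "A"), ("b", "function", "B")], [("a", "b", "calls")], {"a.go": "h1"})
save_graph(db, [("a", "function", "A")], [], {"a.go": "h2"})  # second save replaces the first

nodes, edges = load_graph(db)
manual_count = db.execute("SELECT COUNT(*) FROM manual_edges").fetchone()[0]
stored_hash = db.execute("SELECT hash FROM file_hashes WHERE path = 'a.go'").fetchone()[0]
```

After the second save, only node `a` remains, the parsed edge is gone, and the manually added edge is still present.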

SmartReindex

Rather than re-parsing the entire codebase on startup, SmartReindex detects changes in two stages, a cheap mtime check followed by a content-hash comparison:

1. Compare each file’s modification time against the stored `file_hashes` entry.
2. If the mtime changed, hash the content and compare it with the stored hash.
3. Re-parse only the files whose content actually changed.
4. Remove nodes and edges for deleted files.

This makes startup fast even on large codebases — only changed files pay the parse cost.

Vector Search (HNSW)

Both node embeddings and memory embeddings use an HNSW (Hierarchical Navigable Small World) index for approximate nearest-neighbor search. This powers:

- Semantic code search (find nodes similar to a natural language query)
- Memory retrieval (find relevant past experiences for a current context)

The HNSW index is built in memory from the SQLite-stored vectors at startup.
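To make the startup step concrete, the sketch below loads vectors from SQLite into memory and runs an exact nearest-neighbor search over them. It deliberately does not implement HNSW: it is the brute-force baseline whose ranking an HNSW index approximates with far fewer comparisons. The float32 BLOB encoding, the `node_id` and `vector` column names, and cosine similarity as the metric are all assumptions for this example.

```python
import math
import sqlite3
import struct


def pack_vector(vec):
    """Encode floats as a little-endian float32 BLOB (assumed storage format)."""
    return struct.pack("<%df" % len(vec), *vec)


def unpack_vector(blob):
    """Decode a float32 BLOB back into a list of floats."""
    return list(struct.unpack("<%df" % (len(blob) // 4), blob))


def load_vectors(db):
    """Startup step: pull every stored vector into memory."""
    rows = db.execute("SELECT node_id, vector FROM node_embeddings")
    return {node_id: unpack_vector(blob) for node_id, blob in rows}


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def nearest(vectors, query, k=2):
    """Exact top-k by cosine similarity, scanning every vector."""
    ranked = sorted(vectors, key=lambda node_id: cosine(vectors[node_id], query), reverse=True)
    return ranked[:k]


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE node_embeddings (node_id TEXT PRIMARY KEY, vector BLOB)")
db.executemany(
    "INSERT INTO node_embeddings VALUES (?, ?)",
    [
        ("parse", pack_vector([1.0, 0.0])),
        ("lex", pack_vector([0.9, 0.1])),
        ("render", pack_vector([0.0, 1.0])),
    ],
)
vectors = load_vectors(db)
top = nearest(vectors, [1.0, 0.05], k=2)
```

The exact scan costs one comparison per stored vector for every query, which is the cost an approximate index such as HNSW is designed to avoid on large collections.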