Storage & Persistence

Synapses uses SQLite for all persistence, split across two databases with distinct responsibilities.

Dual-Database Design

Graph DB — The structural representation of the codebase. Rebuilt frequently as files change.

Knowledge DB — Long-lived agent state: tasks, memories, rules, and session data. Accumulates over time and survives full graph rebuilds.

Both databases are opened and managed independently by the store package.
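
As a rough illustration of this split, the sketch below opens two separate SQLite files and rebuilds a graph table without touching the knowledge side. It is not the store package's actual API: the `Store` wrapper, the `open_store` and `rebuild_graph` helpers, the file names, and the `nodes` column layout are all assumptions made for the example.

```python
import sqlite3
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Store:
    """Hypothetical wrapper holding two independent SQLite connections."""
    graph: sqlite3.Connection      # structural data, safe to rebuild
    knowledge: sqlite3.Connection  # long-lived agent state


def open_store(data_dir: str) -> Store:
    # File names are assumptions for illustration only.
    root = Path(data_dir)
    root.mkdir(parents=True, exist_ok=True)
    return Store(
        graph=sqlite3.connect(str(root / "graph.db")),
        knowledge=sqlite3.connect(str(root / "knowledge.db")),
    )


def rebuild_graph(store: Store) -> None:
    # Only the graph connection is used here, so a full rebuild cannot
    # affect anything stored in the knowledge database.
    with store.graph:
        store.graph.execute("DROP TABLE IF EXISTS nodes")
        store.graph.execute("CREATE TABLE nodes (id TEXT PRIMARY KEY, name TEXT)")
```

Keeping the two connections apart means a graph rebuild can drop and recreate its tables freely, while tasks and memories stay on a file that the rebuild never opens for writing.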

Graph DB Tables

Table                  Purpose
nodes                  All code entities (functions, structs, files, packages, etc.)
edges                  Relationships between nodes (calls, contains, imports, implements)
manual_edges           User-defined edges that survive re-parsing
edge_learned_weights   Dynamically adjusted edge weights from usage patterns
call_sites             Unresolved and resolved function call sites
file_hashes            Content hashes for change detection
node_embeddings        Vector embeddings for semantic search
nodes_fts              FTS5 full-text search index over node names and metadata

Knowledge DB Tables

Table                  Purpose
plans                  Agent work plans
tasks                  Individual tasks within plans
session_state          Per-session state (current file, active plan, etc.)
dynamic_rules          Architectural rules added at runtime
violation_log          History of rule violations detected
agents                 Registered agent identities for coordination
events                 Event log for agent activity
web_cache              Cached web content fetches
messages               Agent message history
memories               Long-term memory entries
episodes               Episodic memory (session-scoped recollections)
memory_embeddings      Vector embeddings for memory search
memory_anchors         Links between memories and code entities
memory_versions        Version history for memory entries
annotations            User and agent annotations on code entities
gaps                   Detected knowledge gaps
adr                    Architecture Decision Records

Index Strategy

The edges table is the most frequently queried table. Covering indexes are defined on three column combinations:

  • (from_id, type, to_id) — outgoing edge lookups
  • (to_id, type, from_id) — incoming edge lookups (caller queries)
  • (type, from_id) — type-filtered scans

Because each index contains every column its lookups read, most edge queries are served entirely from the index without touching the main table.
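
To make the covering behaviour concrete, the following sketch creates the three indexes on a simplified edges table and asks SQLite for its query plan. The index names, the column types, and the extra `weight` column are assumptions for illustration. Only the three indexed column combinations come from the description above.

```python
import sqlite3

SCHEMA = """
CREATE TABLE edges (
    from_id TEXT NOT NULL,
    to_id   TEXT NOT NULL,
    type    TEXT NOT NULL,
    weight  REAL DEFAULT 1.0   -- hypothetical extra column, not in any index
);
CREATE INDEX idx_edges_out  ON edges (from_id, type, to_id);
CREATE INDEX idx_edges_in   ON edges (to_id, type, from_id);
CREATE INDEX idx_edges_type ON edges (type, from_id);
"""


def query_plan(conn, sql, params=()):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute(
    "INSERT INTO edges (from_id, to_id, type) VALUES ('main', 'parse', 'calls')"
)

# Caller query: who calls 'parse'? Every referenced column is in idx_edges_in.
callers_sql = "SELECT from_id FROM edges WHERE to_id = ? AND type = ?"
callers_plan = query_plan(conn, callers_sql, ("parse", "calls"))

# Reading weight as well forces SQLite back to the main table for each match.
weighted_sql = "SELECT from_id, weight FROM edges WHERE to_id = ? AND type = ?"
weighted_plan = query_plan(conn, weighted_sql, ("parse", "calls"))

for line in callers_plan + weighted_plan:
    print(line)
```

The plan for the first query should mention a covering index, while the second should fall back to an ordinary index search, which shows that the benefit only holds for queries that stay within the indexed columns.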

Graph Serialization

SaveGraph writes the full in-memory graph to the database:

  1. Clears existing nodes and edges
  2. Batch-inserts all nodes and edges in a transaction
  3. Preserves manual edges (they are not cleared)
  4. Writes file hashes for change detection

LoadGraph reads the database back into the in-memory representation at startup.
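
The save and load cycle might look roughly like the sketch below. It is an illustration under assumptions, not the actual SaveGraph or LoadGraph code: the column layouts, the Python function names, and the choice to put the clear and the inserts in a single transaction are all hypothetical.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS nodes (id TEXT PRIMARY KEY, kind TEXT, name TEXT);
CREATE TABLE IF NOT EXISTS edges (from_id TEXT, to_id TEXT, type TEXT);
CREATE TABLE IF NOT EXISTS manual_edges (from_id TEXT, to_id TEXT, type TEXT);
CREATE TABLE IF NOT EXISTS file_hashes (path TEXT PRIMARY KEY, hash TEXT);
"""


def save_graph(conn, nodes, edges, file_hashes):
    """Replace nodes and edges atomically; manual_edges is never touched."""
    with conn:  # one transaction: commit on success, roll back on error
        conn.execute("DELETE FROM nodes")
        conn.execute("DELETE FROM edges")
        conn.executemany("INSERT INTO nodes VALUES (?, ?, ?)", nodes)
        conn.executemany("INSERT INTO edges VALUES (?, ?, ?)", edges)
        conn.executemany(
            "INSERT OR REPLACE INTO file_hashes VALUES (?, ?)",
            file_hashes.items(),
        )


def load_graph(conn):
    """Read nodes and parsed edges back into plain Python lists."""
    nodes = conn.execute("SELECT id, kind, name FROM nodes ORDER BY id").fetchall()
    edges = conn.execute(
        "SELECT from_id, to_id, type FROM edges ORDER BY from_id, to_id"
    ).fetchall()
    return nodes, edges
```

Running the delete and the batch inserts inside one transaction means a reader never observes a half-written graph, and a failure midway leaves the previous graph in place.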

SmartReindex

Rather than re-parsing the entire codebase on startup, SmartReindex uses mtime-based change detection, with a content hash as the final check:

  1. Compare each file’s modification time against the stored file_hashes entry
  2. If the mtime changed, hash the content and compare
  3. Only re-parse files whose content actually changed
  4. Remove nodes/edges for deleted files

This makes startup fast even on large codebases — only changed files pay the parse cost.
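
The four steps above can be sketched as follows. This is a minimal illustration, not the real SmartReindex: it assumes file_hashes stores a nanosecond mtime alongside a SHA-256 hash, and the `reparse` and `remove_file` callbacks are hypothetical stand-ins for the parser and for node/edge cleanup.

```python
import hashlib
import os
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS file_hashes (
    path TEXT PRIMARY KEY, mtime_ns INTEGER, hash TEXT
);
"""


def content_hash(path):
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


def smart_reindex(conn, paths, reparse, remove_file):
    """Call reparse(path) for changed files, remove_file(path) for deleted ones."""
    stored = {
        p: (m, h)
        for p, m, h in conn.execute("SELECT path, mtime_ns, hash FROM file_hashes")
    }
    with conn:
        for path in paths:
            mtime = os.stat(path).st_mtime_ns
            old = stored.pop(path, None)
            if old is not None and old[0] == mtime:
                continue  # step 1: mtime unchanged, skip without reading the file
            digest = content_hash(path)  # step 2: mtime differs, hash the content
            if old is None or old[1] != digest:
                reparse(path)  # step 3: content really changed, or the file is new
            conn.execute(
                "INSERT OR REPLACE INTO file_hashes VALUES (?, ?, ?)",
                (path, mtime, digest),
            )
        for path in stored:  # step 4: still in the DB but no longer on disk
            remove_file(path)
            conn.execute("DELETE FROM file_hashes WHERE path = ?", (path,))
```

The mtime comparison is a cheap filter that avoids reading most files at all, and the hash comparison prevents a re-parse when a file was merely touched or rewritten with identical content.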

Vector Search (HNSW)

Both node embeddings and memory embeddings use an HNSW (Hierarchical Navigable Small World) index for approximate nearest-neighbor search. This powers:

  • Semantic code search (find nodes similar to a natural language query)
  • Memory retrieval (find relevant past experiences for a current context)

The HNSW index is built in memory from the SQLite-stored vectors at startup.
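
The startup path, loading stored vectors into memory and answering nearest-neighbor queries, can be illustrated with the sketch below. Note that it deliberately swaps in an exact brute-force cosine search rather than HNSW, to keep the example short. An HNSW index approximates the same ranking without scanning every vector. The float32 BLOB encoding, the `id` and `vector` column names, and all function names are assumptions.

```python
import math
import sqlite3
import struct


def pack_vector(values):
    """Encode floats as a little-endian float32 BLOB (assumed storage format)."""
    return struct.pack(f"<{len(values)}f", *values)


def unpack_vector(blob):
    return list(struct.unpack(f"<{len(blob) // 4}f", blob))


def load_vectors(conn, table):
    """Read every stored vector into memory, as a startup step would.

    The table name is interpolated directly, so pass trusted names only.
    """
    rows = conn.execute(f"SELECT id, vector FROM {table}")
    return {row_id: unpack_vector(blob) for row_id, blob in rows}


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def nearest(vectors, query, k=3):
    """Exact top-k by cosine similarity (brute force, not HNSW)."""
    ranked = sorted(vectors, key=lambda i: cosine(vectors[i], query), reverse=True)
    return ranked[:k]
```

The same loader could feed either node_embeddings or memory_embeddings. Brute force costs one comparison per stored vector per query, which is the cost an approximate index such as HNSW is meant to avoid on large collections.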