Rate Limits
Token-bucket rate limiting prevents runaway agents from overwhelming Synapses. Each limit is per-session.
    {
      "rate_limits": {
        "write_ops_per_minute": 30,
        "expensive_reads_per_minute": 20,
        "cross_project_per_minute": 60
      }
    }

Fields
| Field | Type | Default | Description |
|---|---|---|---|
| `write_ops_per_minute` | int | 30 | Limit for write operations: `memory(action="save")`, `annotate(action="add")` |
| `expensive_reads_per_minute` | int | 20 | Limit for `memory(action="search")` calls, which involve embedding and search |
| `cross_project_per_minute` | int | 60 | Limit for cross-project federation queries |
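
As an illustration of how the defaults in the table combine with a user-supplied config, here is a minimal Python sketch. The `load_rate_limits` helper and its validation rules (rejecting unknown fields and non-positive values) are assumptions made for this example; they do not describe how Synapses itself parses its configuration.

```python
import json

# Defaults taken from the Fields table above.
DEFAULTS = {
    "write_ops_per_minute": 30,
    "expensive_reads_per_minute": 20,
    "cross_project_per_minute": 60,
}


def load_rate_limits(config_text):
    """Hypothetical helper: merge the "rate_limits" object over the defaults."""
    config = json.loads(config_text)
    overrides = config.get("rate_limits", {})
    limits = dict(DEFAULTS)
    for key, value in overrides.items():
        # Assumption: unknown fields are treated as errors in this sketch.
        if key not in DEFAULTS:
            raise ValueError(f"unknown rate limit field: {key}")
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
            raise ValueError(f"{key} must be a positive integer")
        limits[key] = value
    return limits


if __name__ == "__main__":
    text = '{"rate_limits": {"expensive_reads_per_minute": 40}}'
    print(load_rate_limits(text))
```

Fields omitted from the config keep their default values, so a config only needs to list the limits it changes.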
How It Works
Rate limits use a token-bucket algorithm:
- Each bucket starts full (at the per-minute limit)
- Each operation consumes one token
- Tokens refill at a rate of `limit / 60` per second
- When the bucket is empty, the operation returns a rate limit error
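
The behavior described in the list above can be sketched in a few lines of Python. This is a generic token bucket written for illustration, not Synapses' actual implementation: the `TokenBucket` and `FakeClock` names are hypothetical, and the sketch returns `False` where the real system would return a rate limit error.

```python
import time


class TokenBucket:
    """Illustrative token bucket: starts full, refills at limit / 60 per second."""

    def __init__(self, per_minute_limit, clock=time.monotonic):
        self.capacity = float(per_minute_limit)
        self.refill_rate = per_minute_limit / 60.0  # tokens per second
        self.tokens = self.capacity  # the bucket starts full
        self.clock = clock
        self.last = clock()

    def try_consume(self):
        """Consume one token if available; return False when the bucket is empty."""
        now = self.clock()
        elapsed = now - self.last
        self.last = now
        # Add tokens for the elapsed time, never exceeding the capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class FakeClock:
    """Hypothetical manual clock so the example runs without sleeping."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now

    def advance(self, seconds):
        self.now += seconds


if __name__ == "__main__":
    clock = FakeClock()
    bucket = TokenBucket(30, clock=clock)  # write_ops_per_minute default
    results = [bucket.try_consume() for _ in range(31)]
    print(results.count(True), "allowed;", results.count(False), "rejected")
    clock.advance(2.0)  # at 30/min, one token refills every 2 seconds
    print("after 2 seconds:", bucket.try_consume())
```

Because the bucket starts full, a session can burst up to the full per-minute limit at once; after that, throughput is bounded by the refill rate.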
When to Adjust
- High-frequency agents: If an agent makes many `memory(action="search")` calls, increase `expensive_reads_per_minute` to 40
- Multi-agent environments: Lower `write_ops_per_minute` to 15 per agent to prevent contention
- Heavy federation use: Increase `cross_project_per_minute` to 120 for monorepo workflows
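
To get a feel for what these adjustments mean in practice, the following Python sketch applies the `limit / 60` refill rule to the default and suggested values. The `describe` helper is hypothetical and only does the arithmetic; it assumes the bucket capacity equals the per-minute limit, as described under How It Works.

```python
def describe(per_minute_limit):
    """Return (burst size, tokens refilled per second, seconds per token)."""
    burst = per_minute_limit                 # the bucket starts full at the limit
    tokens_per_second = per_minute_limit / 60.0
    seconds_per_token = 60.0 / per_minute_limit
    return burst, tokens_per_second, seconds_per_token


# (field, default value, suggested value from the list above)
scenarios = [
    ("expensive_reads_per_minute", 20, 40),
    ("write_ops_per_minute", 30, 15),
    ("cross_project_per_minute", 60, 120),
]

if __name__ == "__main__":
    for field, default, suggested in scenarios:
        _, _, old_gap = describe(default)
        burst, _, new_gap = describe(suggested)
        print(
            f"{field}: {default} -> {suggested}, "
            f"burst of {burst}, one token every {new_gap:g}s (was {old_gap:g}s)"
        )
```

Doubling a limit doubles both the burst size and the sustained rate, while halving it doubles the wait between operations once the bucket has drained.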