Hybrid recall
Recall runs entirely inside one profile database — three channels over the same rows, merged by reciprocal-rank fusion. There is no cross-profile search, structurally.
POST /v1/memory/{ns}/{profile}/recall requires at least one of query, embedding,
topic_key.
The three channels
Section titled “The three channels”| Channel | Mechanism | Driven by | RRF weight |
|---|---|---|---|
| topic | exact topic_key lookup over active memories | topic_key | 2.0 |
| keyword | FTS5 (BM25) over summary + keywords | query | 1.0 |
| vector | DiskANN ANN over memory embeddings | embedding (or query with auto-embedding) | 1.0 |
The vector channel is best-effort: a missing vector table or a dimension mismatch degrades recall to keyword + topic rather than failing the request. Task memories skip the vector channel entirely — see Typed memories.
Rank fusion
Section titled “Rank fusion”Channel results merge by reciprocal-rank fusion: score = Σ w / (60 + rank) across the channels
that returned the memory. The merged list drops superseded and expired rows, tiebreaks on
recency, and truncates to k (default 8).
Each hit reports which channels found it (channels), so an agent can weigh an exact topic match
differently from a fuzzy semantic one.
Empty is a valid answer. Recall never pads results to reach k, and recall against a profile
that does not exist returns {"memories": [], "txid": 0} without creating anything.
Hybrid recall over 10k memories (FTS5 + topic + ANN, rank-fused) measured 11.7 ms p50 on the prototype.
Request
Section titled “Request”curl -X POST http://localhost:8080/v1/memory/acme/alice/recall \ -H 'content-type: application/json' \ -d '{"query": "what can this user eat?", "topic_key": "user.diet", "types": ["fact"], "k": 8}'| Field | Meaning |
|---|---|
query | free text for the keyword channel (and the vector channel when auto-embedding is configured) |
embedding | f32 vector for the vector channel; bring-your-own by default |
topic_key | exact key for the topic channel |
types | filter to a subset of fact / event / instruction / task |
session_id | restrict to memories ingested with this session |
source | restrict to memories one agent ingested (e.g. "claude-code") — see provenance |
k | result cap, default 8 |
include_superseded | include superseded rows (default false) |
include_turns | also search the raw transcript (see below) |
Response
Section titled “Response”{ "memories": [ { "id": "mem_b2…", "type": "fact", "topic_key": "user.diet", "summary": "vegan since 2026", "content": {"diet": "vegan"}, "keywords": "food preference", "session_id": null, "source": "claude-code", "created_at": 1781136000000, "superseded_by": null, "score": 0.04918, "channels": ["topic", "keyword"] } ], "txid": 43}The vegetarian fact this one superseded does not appear; pass
"include_superseded": true or fetch it by id to see the chain. Every response carries txid
(also as the Memoturn-Txid header) — send Memoturn-Min-Txid for read-your-writes after an
ingest. See Consistency.
The raw-turn channel
Section titled “The raw-turn channel”Typed memories are distilled; the verbatim transcript is still searchable. With
"include_turns": true (requires an embedding, or a query plus a configured embedder),
recall additionally runs a brute-force cosine search over the transcript layer — optionally
scoped by session_id — and returns matching turns in a separate turns array:
{ "memories": [ … ], "turns": [ {"session_id": "s-417", "seq": 12, "role": "user", "content": {"text": "I went vegan in January"}, "distance": 0.18} ], "txid": 43}Turns are not memories, so they are reported alongside the fused ranking, never mixed into it. The transcript layer itself is covered in Sessions.
Targeted lookups
Section titled “Targeted lookups”Two non-fused reads complement recall:
# one memory by id, with its full supersession chaincurl http://localhost:8080/v1/memory/acme/alice/memories/mem_b2…
# the profile's sessionscurl http://localhost:8080/v1/memory/acme/alice/sessionsRelated
Section titled “Related”- Typed memories — what recall searches
- Embeddings — auto-embedding bare
querystrings - API reference — the full memory route list
- SDKs and MCP —
memory_recallas a tool