Observability
Memoturn’s operability contract is a small set of latency SLOs plus one queue-depth signal (segment-shipping backlog). This page lists the targets, what the deployment ships to watch them, and the measured numbers behind them.
SLO targets
Section titled “SLO targets”The targets the architecture is designed and dashboarded against:
| SLO | Target |
|---|---|
| provision latency p50 | < 100 ms (target ~10 ms) |
| cold-wake p50 / p99 (≤16 MB) | < 200 ms / < 1 s |
| hot KV read p50 | < 1 ms |
| SQL / document write p50 | < 5 ms |
| branch create/rewind p50 | < 50 ms |
| replication lag p99 (in-region) | < 1 s |
| lease failover (kill → writes resume) | < 15 s |
| segment-shipping backlog | ~0 sustained |
Segment-shipping backlog is the one to alarm on: sustained backlog means object storage is not
keeping up, which widens the Standard-durability RPO window (see Architecture).
Durable-mode writes — MEMOTURN_DURABILITY=durable, or the per-request Memoturn-Durability
header — are immune to backlog by construction: they ack only after the ship completes.
What the deployment ships
Section titled “What the deployment ships”The Helm deployment design includes an optional observability subchart: kube-prometheus-stack, an OpenTelemetry collector, and Grafana dashboards with the SLO panel above. The current prototype chart ships the data plane and MinIO; the observability subchart lands with the full cell (gateway, control API, etcd) — see Deployment and the Roadmap.
Available today on every node:
- Structured logs via
tracing, filtered withRUST_LOG(defaultmemoturnd=info,memoturn_api=info,memoturn_replication=info). /healthz— the readiness probe the chart wires up.txidon every read response — replication lag is directly observable from the client side by comparing writer and replica txids (see Consistency).- Request-surface guardrails — body cap (
413), request timeout, global concurrency cap, a control-endpoint rate limit (429), and a per-database write-queue cap (429+Retry-After), all tunable per node — see Configuration. - Write-pressure logs — a database whose write queue is backing up (or shedding) gets a
per-database
write pressurewarning every 30 s with queue depth, sheds, and the group-commit coalescing factor — see Scaling.
Measured: prototype, single node
Section titled “Measured: prototype, single node”Measured, not promised — reference points from the working prototype (in-process object store, so these are engine costs without network):
| Metric | Target | p50 |
|---|---|---|
| memory ingest (typed fact, 256-dim embedding, supersession) | <10 ms | 3.9 ms |
| hybrid recall over 10k memories (FTS5 + topic + ANN, rank-fused) | <25 ms | 11.7 ms |
| provision database | <100 ms | 17 µs |
| hot KV read / SQL write / doc insert | <1 ms / <5 ms / <5 ms | 3 µs / 16 µs / 15 µs |
| segment ship (write + WAL capture + PUT) | <10 ms | 61 µs |
| branch create (copy-on-write) | <50 ms | 47 µs |
| cold wake (restore + open + query) | <200 ms | 0.7 ms (+object-store RTT in prod) |
| 10k databases provisioned | — | 93 ms (107k/s), hot pool flat |
Measured: full stack on Kubernetes
Section titled “Measured: full stack on Kubernetes”The same operations through a real cell — kind, Helm chart, auth on, MinIO as the object store,
measured through kubectl port-forward (which sets a ~1.6 ms network floor):
| Metric | p50 | p99 |
|---|---|---|
| memory ingest (typed fact, namespace token) | 2.81 ms | 7.53 ms |
| hybrid recall @1k memories | 4.08 ms | 5.97 ms |
| provision database | 1.61 ms | 3.43 ms |
| hot SQL write | 1.59 ms | 2.83 ms |
| hot KV write / read | 1.66 / 1.63 ms | 3.53 / 2.94 ms |
| branch create (copy-on-write) | 3.01 ms | 3.69 ms |
| write + segment ship (to MinIO) | 6.54 ms | 8.67 ms |
The segment-ship row reflects a real object-storage PUT round-trip. On cloud object storage, expect cold wake and segment ship to gain same-region RTTs (~10–40 ms) — still inside targets.
Reproducing the numbers
Section titled “Reproducing the numbers”# the benchmark harness behind the prototype tablecargo run --release -p memoturn-bench
# the end-to-end agent-story walkthrough against a running nodescripts/demo.shFor the Kubernetes numbers, deploy the chart per Deployment and run the HTTP benchmark against the forwarded service:
python3 scripts/bench-http.py http://127.0.0.1:8080 --platform-key ... --n 200Sources: the README and docs/deployment-proof.md in the
repository.