The recorder behind L4's coalesce, L9's lanes & DLQ, and L31's non-coalescable WITH. ~13 min.
Three earlier lessons leaned on this one package without opening it: L4 said writes are "coalesced" into UNWINDs,
L9 said they flow through "two lanes with a DLQ + halt," and L31 said a statement with a WITH is "non-coalescable."
All three live in pkg/graphwrite. This lesson opens it — and the central trick is a small piece of misdirection: a
transaction object that doesn't actually run anything.
Sent one per event, 500 MERGEs to Memgraph would be 500 round-trips of parse +
plan + execute. You could batch them by hand into one UNWIND $rows, but then every handler has to know about batching.
The recorder lets handlers stay one-event-simple and does the batching for them.
Recorder implements the graphstore.Tx interface — but instead of executing, every neo4jExec call just
appends the statement to a slice. Handlers call it once per event, writing clean one-event-at-a-time code; the
recorder accumulates. When the batch closes, Statements() returns the buffer coalesced.
    // handler code stays simple: one neo4jExec per event
    neo4jExec(ctx, tx, "MERGE (t {id:$tok})-[h:HOLDS]->(o {id:$holder}) SET h.qty=$q", params)
    // …but tx is a *Recorder: that call RECORDS, it doesn't hit Memgraph.
    // Statements() later folds all the recorded MERGEs into one UNWIND.
Only the calls the handler path needs are implemented (Query via neo4jExec, plus SetCursor). UpsertNode,
ApplyMutations, Commit, etc. panic, deliberately. The recorder isn't a general Tx; it's a narrow
capture surface for the one code path that uses it. A panic on misuse is louder and safer than a silent no-op that would
mask a write going nowhere.
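The record-instead-of-execute idea can be sketched in a few lines of Go. This is a minimal illustration, not the package's code: the trimmed `Tx` interface, the `Recorded` accessor, and `handleTransfer` are hypothetical stand-ins, and the real `graphstore.Tx` has more methods than shown here.

```go
package main

import "fmt"

// Statement is one recorded Cypher call: the text plus its parameters.
type Statement struct {
	Text   string
	Params map[string]any
}

// Tx is a hypothetical, trimmed stand-in for graphstore.Tx.
type Tx interface {
	Query(text string, params map[string]any) error
	UpsertNode(id string) error
}

// Recorder satisfies Tx but never talks to the database.
type Recorder struct {
	stmts []Statement
}

// Query appends the statement to the buffer instead of executing it.
func (r *Recorder) Query(text string, params map[string]any) error {
	r.stmts = append(r.stmts, Statement{Text: text, Params: params})
	return nil
}

// UpsertNode is outside the capture surface, so misuse panics loudly
// rather than silently dropping a write.
func (r *Recorder) UpsertNode(id string) error {
	panic("graphwrite sketch: Recorder does not support UpsertNode")
}

// Recorded returns the buffer exactly as recorded (no coalescing here).
func (r *Recorder) Recorded() []Statement { return r.stmts }

// handleTransfer is hypothetical handler code: one call per event,
// with no knowledge of batching.
func handleTransfer(tx Tx, tok, holder string, qty int) error {
	return tx.Query(
		"MERGE (t {id:$tok})-[h:HOLDS]->(o {id:$holder}) SET h.qty=$q",
		map[string]any{"tok": tok, "holder": holder, "q": qty},
	)
}

func main() {
	rec := &Recorder{}
	for i := 0; i < 3; i++ {
		if err := handleTransfer(rec, "tok-1", fmt.Sprintf("holder-%d", i), i); err != nil {
			panic(err)
		}
	}
	fmt.Println(len(rec.Recorded())) // three statements buffered, none executed
}
```

The handler only sees the `Tx` interface, so swapping a real transaction for the recorder needs no handler changes.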
coalesceByTemplate buckets the recorded statements by (Cypher text, sorted param-key set). Every statement
sharing a template lands in one bucket; the bucket's rows are the per-statement params. At emit time each bucket becomes a
single UNWIND $rows AS row <template>:
    500 × "MERGE …HOLDS… SET h.qty=$q"   (one per Transfer)
              │ coalesceByTemplate
              ▼
    1 × "UNWIND $rows AS row MERGE …HOLDS… SET h.qty=row.q"   with rows=[…500…]
| Case | Emit |
|---|---|
| N>1 statements, same template | one UNWIND, N rows (the big win) |
| singleton (1 row) | pass through unrewritten — no pointless 1-row UNWIND, and no rewrite cost |
| order | buckets emit at their FIRST occurrence; SetCursor recorded last stays last |
Reordering is safe because handler statements are MERGEs
keyed on stable ids. Coalescing leans entirely on that: it can shuffle freely precisely because each statement's effect
doesn't depend on its neighbours. Break the independence invariant in a handler and the coalesce silently changes results.
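As a rough sketch of the bucketing described above, the Go below groups statements by (text, sorted param keys), passes singletons through untouched, and emits buckets in first-occurrence order. It is an illustration under assumptions, not the package's implementation: it treats every statement as coalescable, and its `rewriteForUnwind` is a naive string replacement that only shares its name with the real helper.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

type Statement struct {
	Text   string
	Params map[string]any
}

// templateKey identifies a bucket: Cypher text plus the sorted param-key set.
func templateKey(s Statement) string {
	keys := make([]string, 0, len(s.Params))
	for k := range s.Params {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return s.Text + "\x00" + strings.Join(keys, ",")
}

// rewriteForUnwind is a naive stand-in: it turns $name into row.name by plain
// string replacement, then prefixes the UNWIND.
func rewriteForUnwind(s Statement) string {
	keys := make([]string, 0, len(s.Params))
	for k := range s.Params {
		keys = append(keys, k)
	}
	// Longest names first, so $q cannot clobber the front of $qty.
	sort.Slice(keys, func(i, j int) bool { return len(keys[i]) > len(keys[j]) })
	text := s.Text
	for _, k := range keys {
		text = strings.ReplaceAll(text, "$"+k, "row."+k)
	}
	return "UNWIND $rows AS row " + text
}

// coalesceByTemplate folds same-template statements into one UNWIND each.
func coalesceByTemplate(in []Statement) []Statement {
	type bucket struct {
		first Statement
		rows  []map[string]any
	}
	index := map[string]int{} // template key -> position in buckets
	var buckets []*bucket
	for _, s := range in {
		k := templateKey(s)
		i, ok := index[k]
		if !ok {
			i = len(buckets) // first occurrence fixes the emit position
			index[k] = i
			buckets = append(buckets, &bucket{first: s})
		}
		buckets[i].rows = append(buckets[i].rows, s.Params)
	}
	out := make([]Statement, 0, len(buckets))
	for _, b := range buckets {
		if len(b.rows) == 1 {
			out = append(out, b.first) // singleton: pass through unrewritten
			continue
		}
		out = append(out, Statement{
			Text:   rewriteForUnwind(b.first),
			Params: map[string]any{"rows": b.rows},
		})
	}
	return out
}

func main() {
	merge := "MERGE (t {id:$tok})-[h:HOLDS]->(o {id:$holder}) SET h.qty=$q"
	var in []Statement
	for i := 0; i < 500; i++ {
		in = append(in, Statement{Text: merge, Params: map[string]any{
			"tok": "t1", "holder": fmt.Sprint(i), "q": i,
		}})
	}
	out := coalesceByTemplate(in)
	fmt.Println(len(out), out[0].Text)
}
```

Because emit order follows each bucket's first occurrence, a statement that appears once at the end of the batch stays at the end.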
coalescable(s) is a four-gate predicate. A statement coalesces only if all hold:
| Gate | Reason |
|---|---|
| has params | nothing to fold otherwise |
| not already an UNWIND | the big batch_writer UNWINDs skip: don't nest $rows inside $rows |
| references a $param | a literal statement is singleton by definition |
| no WITH clause | this is L31's fact, in code |
L31 said the temporal guard puts WHERE coalesce(r.updated_block,0) <= $cursor behind a WITH barrier, and that
this makes it "non-coalescable, so it publishes as a singleton." Here's the actual gate: hasWithClause(s.Text) →
not coalescable. A WITH introduces a query-scope boundary that a blind UNWIND $rows wrapper can't safely
re-scope, so those guarded writes are emitted one by one. The temporal guard and the singleton-publish are the same
clause doing double duty, exactly as L31 claimed, now verified.
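The four gates can be sketched as a single boolean expression. This is a hedged illustration: the names `coalescable`, `hasWithClause`, and `alreadyUnwoundRe` follow the lesson, but the regular expressions, `paramRefRe`, `withRe`, and the sample guarded statement are assumptions made for the example, and the real checks may be stricter.

```go
package main

import (
	"fmt"
	"regexp"
)

type Statement struct {
	Text   string
	Params map[string]any
}

var (
	// alreadyUnwoundRe matches statements that already begin with UNWIND.
	alreadyUnwoundRe = regexp.MustCompile(`(?i)^\s*UNWIND\b`)
	// paramRefRe matches a $name reference anywhere in the text.
	paramRefRe = regexp.MustCompile(`\$[A-Za-z_][A-Za-z0-9_]*`)
	// withRe matches WITH as a whole word. It is deliberately blunt: it also
	// fires on STARTS WITH or on the word inside a string literal, which only
	// errs toward emitting a singleton.
	withRe = regexp.MustCompile(`(?i)\bWITH\b`)
)

func hasWithClause(text string) bool { return withRe.MatchString(text) }

// coalescable reports whether a statement may be folded into an UNWIND.
func coalescable(s Statement) bool {
	return len(s.Params) > 0 && // gate 1: has params
		!alreadyUnwoundRe.MatchString(s.Text) && // gate 2: not already an UNWIND
		paramRefRe.MatchString(s.Text) && // gate 3: references a $param
		!hasWithClause(s.Text) // gate 4: no WITH clause
}

func main() {
	plain := Statement{
		Text:   "MERGE (t {id:$tok})-[h:HOLDS]->(o {id:$holder}) SET h.qty=$q",
		Params: map[string]any{"tok": "t1", "holder": "h1", "q": 5},
	}
	// Hypothetical guarded write in the shape the temporal guard takes.
	guarded := Statement{
		Text: "MATCH (a {id:$a})-[r:HOLDS]->(b) WITH r " +
			"WHERE coalesce(r.updated_block,0) <= $cursor SET r.qty=$q",
		Params: map[string]any{"a": "t1", "cursor": 100, "q": 5},
	}
	fmt.Println(coalescable(plain), coalescable(guarded))
}
```

A false positive on the WITH check costs only a missed batching opportunity, while a false negative could wrap a scoped query in UNWIND, so a conservative matcher is the safer side to err on.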
The same function runs at two scopes. Intra-batch: within one applyBatch, fold repeated templates.
Cross-entry (CoalesceAcrossEntries): the BatchAccumulator already runs N consecutive same-lane stream
entries inside one ExecuteWrite (amortizing BEGIN/COMMIT); this flattens all their statements and runs the same
coalesceByTemplate over the union — so a template repeated across entries collapses too. One coalescer, two
fan-in levels.
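The difference between the two scopes shows up in a small count. In the sketch below, `coalesce` is a simplified stand-in for coalesceByTemplate that buckets by Cypher text only, and `Entry`, `Bucket`, `coalesceAcrossEntries`, and the OWNS template are hypothetical names introduced for the illustration.

```go
package main

import "fmt"

type Statement struct {
	Text   string
	Params map[string]any
}

// Entry is a hypothetical stand-in for one stream entry's recorded statements.
type Entry []Statement

// Bucket is one template and the rows folded into it.
type Bucket struct {
	Text string
	Rows []map[string]any
}

// coalesce buckets by text only (a simplification) in first-occurrence order.
func coalesce(in []Statement) []Bucket {
	index := map[string]int{}
	var out []Bucket
	for _, s := range in {
		i, ok := index[s.Text]
		if !ok {
			i = len(out)
			index[s.Text] = i
			out = append(out, Bucket{Text: s.Text})
		}
		out[i].Rows = append(out[i].Rows, s.Params)
	}
	return out
}

// coalesceAcrossEntries flattens the entries in order, then coalesces the union.
func coalesceAcrossEntries(entries []Entry) []Bucket {
	var all []Statement
	for _, e := range entries {
		all = append(all, e...)
	}
	return coalesce(all)
}

func main() {
	holds := Statement{Text: "MERGE (t {id:$tok})-[:HOLDS]->(o {id:$holder})",
		Params: map[string]any{"tok": "t1", "holder": "h1"}}
	owns := Statement{Text: "MERGE (o {id:$holder})-[:OWNS]->(w {id:$wallet})",
		Params: map[string]any{"holder": "h1", "wallet": "w1"}}
	entries := []Entry{{holds, holds, owns}, {holds, owns}, {holds}}

	intra := 0
	for _, e := range entries {
		intra += len(coalesce(e)) // each entry coalesced on its own
	}
	cross := coalesceAcrossEntries(entries)
	fmt.Println(intra, len(cross)) // 5 statements intra-batch, 2 after cross-entry
}
```

The same grouping function serves both calls; only the slice handed to it changes.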
L9's "two lanes" live in lane_params.go. A (lane, chainID) pair derives a coherent set of Redis +
lease coordinates (stream, consumer group, halt flag, lease key) via LaneParamsFor. The plan (FORTA-2843) is a
physical split into a state lane and a meta lane under separate writers, so high-churn state writes can't
head-of-line-block slower meta writes. Two guards keep it honest:
- The default lane ("") must derive bit-identical keys to the pre-lane wire, so shipping the lane machinery changes nothing until the split is deliberately turned on.
- AssertCoherent fails loudly at startup if a hand-assembled LaneParams has keys disagreeing on the lane: a misconfiguration caught before it can split-brain the stream.

Finally, L9's "DLQ + halt," which directly enforces L8's block-determinism. The bug it fixed
(dlq.go, FORTA-2642):
| | Before | After |
|---|---|---|
| a poison entry (Cypher parse fail) past retry | XACK + log at WARN → the mutation is lost | atomically XADD to graph-writes-dlq:{chain} + remove from pending, in one MULTI/EXEC, then halt the Consumer |
| consequence | state at cursor N is no longer a pure function of blocks 0..N — determinism broken (L8) | nothing is lost; the entry is preserved with dlq_reason / source_id / retries / moved_at for a human |
The move runs as one MULTI/EXEC: on EXEC failure the source stays pending
(retried next drain), so there's no "acked but not in the DLQ" gap. It's L8's "fail loud, never corrupt silently," reduced
to a Redis transaction.
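The all-or-nothing property can be modelled without Redis. The sketch below is an in-memory stand-in, not the dlq.go code: `store`, `failExec`, and the `payload` field are assumptions made for the illustration, while the four metadata field names come from the table above.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

// Entry is a poison stream entry that has exhausted its retries.
type Entry struct {
	ID      string
	Payload string
	Retries int
}

// store is an in-memory stand-in for the Redis stream state.
type store struct {
	pending  map[string]Entry
	dlq      []map[string]string
	halted   bool
	failExec bool // test hook: simulate EXEC failing
}

var errExec = errors.New("EXEC failed")

// moveToDLQ applies "add to DLQ" and "remove from pending" together or not at
// all, which is the guarantee MULTI/EXEC provides in the real code path.
func (s *store) moveToDLQ(id, reason, movedAt string) error {
	e, ok := s.pending[id]
	if !ok {
		return fmt.Errorf("entry %s is not pending", id)
	}
	record := map[string]string{
		"dlq_reason": reason,
		"source_id":  e.ID,
		"retries":    strconv.Itoa(e.Retries),
		"moved_at":   movedAt,
		"payload":    e.Payload, // assumed field, so the mutation itself survives
	}
	if s.failExec {
		return errExec // nothing applied: the entry stays pending for the next drain
	}
	s.dlq = append(s.dlq, record) // the XADD half
	delete(s.pending, id)         // the remove-from-pending half
	s.halted = true               // halt the consumer once the move has landed
	return nil
}

func main() {
	s := &store{pending: map[string]Entry{
		"1-0": {ID: "1-0", Payload: "MERGE (", Retries: 3},
	}}
	if err := s.moveToDLQ("1-0", "cypher parse fail", "2024-01-01T00:00:00Z"); err != nil {
		panic(err)
	}
	fmt.Println(len(s.pending), len(s.dlq), s.halted)
}
```

There is no state in this model where the entry has left pending without appearing in the DLQ, which is the gap the old XACK-and-log path left open.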
That closes three loops: L4's coalesce (coalesceByTemplate, intra + cross-entry),
L9's lanes + DLQ (lane_params.go, dlq.go), and L31's non-coalescable WITH (coalescable). The recorder
is the quiet seam where clean per-event handler code is turned into an efficient, ordered, durable write stream, without
the handlers ever knowing.
1. What does Recorder do when a handler calls neo4jExec on it?
2. coalesceByTemplate buckets statements by what key?
3. A statement contains a WITH clause. How does the recorder treat it, and why does L31 care?
4. What does CoalesceAcrossEntries add over the intra-batch coalesce?

- Explain coalesceByTemplate (bucket by template+keys → one UNWIND) and the independence invariant it relies on.
- List the coalescable gates and explain why WITH excludes a statement (closing L31).

Grounded in: pkg/graphwrite/recorder.go (Recorder buffering Tx + deliberate panics, coalesceByTemplate bucket-by-(text,keys)→UNWIND + singleton passthrough + first-occurrence order, coalescable 4 gates incl. hasWithClause, CoalesceAcrossEntries, rewriteForUnwind/alreadyUnwoundRe), lane_params.go (LaneParamsFor, default-lane bit-identical, AssertCoherent, FORTA-2843 state/meta split), dlq.go (moveToDLQ MULTI/EXEC + halt, DLQMaxLen=10k, dlq_reason/source_id/retries/moved_at, FORTA-2642 determinism fix). Verify against source; the code is the truth.