Lesson 41 · Hot-Path Mechanisms · Deeper Track

Many writes, one UNWIND

The recorder behind L4's coalesce, L9's lanes & DLQ, and L31's non-coalescable WITH. ~13 min.

Builds on: L4 · L9 · L31 Anchor: 500 Transfers → one batch New: a Tx that records, not executes New: coalesce by template

Three earlier lessons leaned on this one package without opening it: L4 said writes are "coalesced" into UNWINDs, L9 said they flow through "two lanes with a DLQ + halt," and L31 said a statement with a WITH is "non-coalescable." All three live in pkg/graphwrite. This lesson opens it — and the central trick is a small piece of misdirection: a transaction object that doesn't actually run anything.

Your anchor: a block is a flood of near-identical writes
One block can carry hundreds of ERC-20 Transfers. Each one, in the handler code, is "upsert a HOLDS edge" — the same Cypher template, different addresses. Sending 500 separate MERGEs to Memgraph would be 500 round-trips of parse + plan + execute. You'd batch them by hand into one UNWIND $rows — but then every handler has to know about batching. The recorder lets handlers stay one-event-simple and does the batching for them.

1 · The recorder is a Tx that buffers

Recorder implements the graphstore.Tx interface — but instead of executing, every neo4jExec call just appends the statement to a slice. Handlers call it once per event, writing clean one-event-at-a-time code; the recorder accumulates. When the batch closes, Statements() returns the buffer coalesced.

// handler code stays simple — one neo4jExec per event:
neo4jExec(ctx, tx, "MERGE (t {id:$tok})-[h:HOLDS]->(o {id:$holder}) SET h.qty=$q", params)
// …but tx is a *Recorder: that call RECORDS, it doesn't hit Memgraph.
// Statements() later folds all the recorded MERGEs into one UNWIND.
Why most Tx methods panic
The recorder only supports the hot path (Query via neo4jExec, plus SetCursor). UpsertNode, ApplyMutations, Commit, etc. panic — deliberately. The recorder isn't a general Tx; it's a narrow capture surface for the one code path that uses it. A panic on misuse is louder and safer than a silent no-op that would mask a write going nowhere.

2 · coalesce by template — the core

coalesceByTemplate buckets the recorded statements by (Cypher text, sorted param-key set). Every statement sharing a template lands in one bucket; the bucket's rows are the per-statement params. At emit time each bucket becomes a single UNWIND $rows AS row <template>:

500 × "MERGE …HOLDS… SET h.qty=$q"  (one per Transfer)
        │  coalesceByTemplate
        ▼
1 × "UNWIND $rows AS row MERGE …HOLDS… SET h.qty=row.q"   with rows=[…500…]
CaseEmit
N>1 statements, same templateone UNWIND, N rows (the big win)
singleton (1 row)pass through unrewritten — no pointless 1-row UNWIND, and no rewrite cost
orderbuckets emit at their FIRST occurrence; SetCursor recorded last stays last
The invariant that makes reordering safe
Coalescing groups statements by template, which reorders them within the batch. That's only sound because the handler invariant (L4) requires recorded statements in one batch to be independent — order-free, idempotent MERGEs keyed on stable ids. Coalescing leans entirely on that: it can shuffle freely precisely because each statement's effect doesn't depend on its neighbours. Break the independence invariant in a handler and the coalesce silently changes results.

3 · What's not coalescable — and L31's WITH

coalescable(s) is a four-gate predicate. A statement coalesces only if all hold:

GateReason
has paramsnothing to fold otherwise
not already an UNWINDthe big batch_writer UNWINDs skip — don't nest $rows inside $rows
references a $parama literal statement is singleton by definition
no WITH clausethis is L31's fact, in code
Now L31 closes
L31 said a reconcile heal carries WHERE coalesce(r.updated_block,0) <= $cursor behind a WITH barrier, and that this makes it "non-coalescable, so it publishes as a singleton." Here's the actual gate: hasWithClause(s.Text) → not coalescable. A WITH introduces a query-scope boundary that a blind UNWIND $rows wrapper can't safely re-scope, so those guarded writes are emitted one-by-one. The temporal guard and the singleton-publish are the same clause doing double duty — exactly as L31 claimed, now verified.

Two levels of coalescing

The same function runs at two scopes. Intra-batch: within one applyBatch, fold repeated templates. Cross-entry (CoalesceAcrossEntries): the BatchAccumulator already runs N consecutive same-lane stream entries inside one ExecuteWrite (amortizing BEGIN/COMMIT); this flattens all their statements and runs the same coalesceByTemplate over the union — so a template repeated across entries collapses too. One coalescer, two fan-in levels.

4 · Lanes — partitioning the one writer

L9's "two lanes" live in lane_params.go. A (lane, chainID) pair derives a coherent set of Redis + lease coordinates (stream, consumer group, halt flag, lease key) via LaneParamsFor. The plan (FORTA-2843) is a physical split into a state lane and a meta lane under separate writers, so high-churn state writes can't head-of-line-block slower meta writes. Two guards keep it honest:

5 · The DLQ — durability over XACK-and-forget

Finally L9's "DLQ + halt," and it's a direct enforcement of L8's block-determinism. The bug it fixed (dlq.go, FORTA-2642):

BeforeAfter
a poison entry (Cypher parse fail) past retryXACK + log at WARN → the mutation is lostatomically XADD to graph-writes-dlq:{chain} + remove from pending, in one MULTI/EXEC, then halt the Consumer
consequencestate at cursor N is no longer a pure function of blocks 0..N — determinism broken (L8)nothing is lost; the entry is preserved with dlq_reason / source_id / retries / moved_at for a human
Halt, don't limp
Moving a poison entry to the DLQ halts the writer rather than skipping ahead. Skipping would let block N+1 apply on top of an N that's missing a mutation — silent corruption that violates "state = f(blocks 0..N)." Halting freezes the pipeline loudly so an operator intervenes. And the move is one MULTI/EXEC: on EXEC failure the source stays pending (retried next drain), so there's no "acked but not in the DLQ" gap. It's L8's "fail loud, never corrupt silently," reduced to a Redis transaction.
Three invariants, one package
You've now opened the machinery three lessons pointed at: L4's coalesce (coalesceByTemplate, intra + cross-entry), L9's lanes + DLQ (lane_params.go, dlq.go), and L31's non-coalescable WITH (coalescable). The recorder is the quiet seam where clean per-event handler code is turned into an efficient, ordered, durable write stream — without the handlers ever knowing.

Check yourself

1. What does the Recorder do when a handler calls neo4jExec on it?
2. Why do most of the Recorder's Tx methods (UpsertNode, Commit, …) panic instead of being implemented?
3. coalesceByTemplate buckets statements by what key?
4. Coalescing reorders statements within a batch. What makes that safe?
5. A statement contains a WITH clause. How does the recorder treat it, and why does L31 care?
6. What does CoalesceAcrossEntries add over the intra-batch coalesce?
7. The DLQ replaced an earlier "XACK + log at WARN" on a poison entry. What was wrong with the old behavior?
8. Why does moving an entry to the DLQ also halt the Consumer rather than skip ahead?
↳ Ask your teacher
Try: "Show me rewriteForUnwind — how is a MERGE turned into an UNWIND body?" · "What's alreadyUnwoundRe matching, exactly?" · "How does the Consumer resume after a DLQ halt is cleared?" · "What's the state-vs-meta lane split meant to separate?" · "How does SetCursor stay last through coalescing?"

What you can now do

Grounded in: pkg/graphwrite/recorder.go (Recorder buffering Tx + deliberate panics, coalesceByTemplate bucket-by-(text,keys)→UNWIND + singleton passthrough + first-occurrence order, coalescable 4 gates incl. hasWithClause, CoalesceAcrossEntries, rewriteForUnwind/alreadyUnwoundRe), lane_params.go (LaneParamsFor, default-lane bit-identical, AssertCoherent, FORTA-2843 state/meta split), dlq.go (moveToDLQ MULTI/EXEC + halt, DLQMaxLen=10k, dlq_reason/source_id/retries/moved_at, FORTA-2642 determinism fix). Verify against source — the code is the truth.