Where "single writer," "Redis is rebuildable," and "heal safely" are actually implemented. ~14 min.
The last corners. L9 asserted "all writes go through one leased pod"; L8 said "Redis is a rebuildable
projection"; L30 said healers "ship shadow-first." Each was a principle. This capstone opens the small packages where
those principles are actually code — and the lowest-level one, statetracker, turns out to hold the sharpest
distributed-systems idea in the whole repo: fencing tokens.
statetracker — the coordination primitives
This tiny package is the Redis-backed coordination layer under everything: the lease, the writer epoch, the cursor cache, and the transient-error classifier. It's where "one writer" stops being a diagram and becomes a lock.
AcquireLease is a Redis SET NX with a TTL: the first pod to set lease:graph-writer wins, others
get ErrLeaseHeld. A keepalive renews it every ttl/3. The subtlety is the renew:
-- renew ONLY if the key still holds OUR id: atomic GET-and-PEXPIRE in one Lua script
if redis.call("get", KEYS[1]) == ARGV[1] then
  return redis.call("pexpire", KEYS[1], ARGV[2])
else
  return 0
end
A naive EXPIRE would extend whatever id sits at the key — including a thief's, after a Redis brown-out let a
second pod steal the slot. The atomic check returns 0 = "you lost it"; the keepalive then stops and never re-acquires
(the legitimate next holder is whoever wins the next SET NX). IsHeld is the authoritative check, wired to
/healthz so a pod that lost the lease gets recycled.
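The acquire and renew rules above can be modeled without Redis at all. The following is a minimal sketch, not the code in lease.go: `fakeStore`, its methods, and the millisecond-integer clock are hypothetical stand-ins chosen so the example runs on its own. Only `ErrLeaseHeld` and the `lease:graph-writer` key come from the text.

```go
package main

import (
	"errors"
	"sync"
)

// ErrLeaseHeld is named in the text; this definition is illustrative only.
var ErrLeaseHeld = errors.New("lease held by another pod")

// fakeStore is a hypothetical in-memory stand-in for Redis.
// The mutex plays the role of Redis's single-threaded command execution.
type fakeStore struct {
	mu       sync.Mutex
	holder   map[string]string
	expireAt map[string]int64 // absolute expiry in milliseconds
}

func newFakeStore() *fakeStore {
	return &fakeStore{holder: map[string]string{}, expireAt: map[string]int64{}}
}

// live reports whether key exists and has not expired. Caller must hold mu.
func (s *fakeStore) live(key string, nowMs int64) bool {
	_, ok := s.holder[key]
	return ok && nowMs < s.expireAt[key]
}

// acquire models SET key id NX PX ttl: it succeeds only when no live key exists.
func (s *fakeStore) acquire(key, id string, ttlMs, nowMs int64) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.live(key, nowMs) {
		return ErrLeaseHeld
	}
	s.holder[key] = id
	s.expireAt[key] = nowMs + ttlMs
	return nil
}

// renewIfHeld models the atomic script: extend the TTL only if the key
// still holds id. A false result means "you lost it".
func (s *fakeStore) renewIfHeld(key, id string, ttlMs, nowMs int64) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if !s.live(key, nowMs) || s.holder[key] != id {
		return false
	}
	s.expireAt[key] = nowMs + ttlMs
	return true
}
```

In this model a pod whose lease expired and was taken by another pod gets `false` from `renewIfHeld`, which is the signal the keepalive uses to stop rather than extend someone else's lease.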
IncrWriterEpoch does a Redis INCR once per lease acquisition, so each writer generation gets a strictly
higher number. That epoch is stamped on BlockCursor.writer_epoch on every commit, and the database rejects a
write carrying an epoch lower than what it already persisted. So even a zombie that wrongly thinks it holds the lease
is fenced out at the DB — its stale epoch loses. Two layers: the lease elects, the epoch enforces.
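To make the fence concrete, here is a small sketch of the two halves: a counter that hands out one epoch per acquisition, and a store that refuses commits from an older epoch. The types `epochCounter` and `cursorStore`, the `ErrStaleEpoch` error, and the height field are all hypothetical; the real check lives in the database and may be expressed quite differently.

```go
package main

import "errors"

// ErrStaleEpoch is a hypothetical error for a fenced-out writer.
var ErrStaleEpoch = errors.New("stale writer epoch")

// epochCounter models the Redis INCR behind IncrWriterEpoch:
// every call returns a strictly higher number.
type epochCounter struct{ n uint64 }

func (c *epochCounter) incr() uint64 {
	c.n++
	return c.n
}

// cursorStore is a hypothetical stand-in for the persisted BlockCursor.
type cursorStore struct {
	writerEpoch uint64
	height      uint64
}

// commit rejects any write whose epoch is lower than the persisted one,
// so a zombie writer loses even if it still believes it holds the lease.
func (d *cursorStore) commit(epoch, height uint64) error {
	if epoch < d.writerEpoch {
		return ErrStaleEpoch
	}
	d.writerEpoch = epoch
	d.height = height
	return nil
}
```

The point of the sketch is that the rejection happens at the store, not in the writer: the stale writer does not need to know it is stale for the fence to hold.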
IsTransientRedisErr classifies a Redis error as retryable (so the caller backs off and retries) or fatal
(bubble up). Retryable = sentinel brown-outs (LOADING, NOREPLICAS, CLUSTERDOWN), network errors, and
notably OOM (maxmemory cap → treat as backpressure, pause ingest, alert). This one predicate turned 2,599
production crashes in 9 days into a retry-with-backoff — the difference between a crashloop and a pod that rides out a
Redis hiccup. It's L8's "fail loud, don't corrupt" with a crucial refinement: distinguish a transient blip from a real
fault before you decide to die.
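A predicate of this shape might look like the sketch below. It is not the code in transient.go: matching on reply prefixes is an assumption about how the classification is done, and `isTransientSketch`, `replyErr`, and `dialTimeout` are hypothetical names. The prefixes themselves are the ones listed above.

```go
package main

import (
	"errors"
	"net"
	"strings"
)

// transientPrefixes lists the Redis reply prefixes the text treats as retryable.
var transientPrefixes = []string{"LOADING", "NOREPLICAS", "CLUSTERDOWN", "OOM"}

// replyErr is a hypothetical type for an error reply returned by Redis.
type replyErr string

func (e replyErr) Error() string { return string(e) }

// dialTimeout is a hypothetical network failure that satisfies net.Error.
type dialTimeout struct{}

func (dialTimeout) Error() string   { return "dial tcp: i/o timeout" }
func (dialTimeout) Timeout() bool   { return true }
func (dialTimeout) Temporary() bool { return true }

// isTransientSketch reports whether the caller should back off and retry.
// Anything it does not recognize is treated as fatal and bubbles up.
func isTransientSketch(err error) bool {
	if err == nil {
		return false
	}
	var netErr net.Error
	if errors.As(err, &netErr) { // also matches wrapped network errors
		return true
	}
	msg := err.Error()
	for _, p := range transientPrefixes {
		if strings.HasPrefix(msg, p) {
			return true
		}
	}
	return false
}
```

Defaulting to fatal keeps the "fail loud" behavior for everything not explicitly known to be a blip, so a new kind of error crashes the pod instead of being retried forever.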
CacheBlockCursor writes cursor:{chain}:committed to Redis after the Neo4j tx (which holds the
authoritative cursor) commits. So the Redis cursor is a fast, rebuildable mirror of a source-of-truth that lives in
the graph: exactly L8's "Redis is a projection." If it's lost, it's re-derived from the graph on startup.
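The ordering can be shown with a toy model: commit to the graph first, mirror to the cache second, and rebuild the mirror from the graph if it disappears. Everything here except the `cursor:{chain}:committed` key format is hypothetical, including `graphDB`, `commitThenCache`, `rebuildCursor`, and the `failNext` switch used to simulate a failed transaction.

```go
package main

import "errors"

var errCommitFailed = errors.New("graph commit failed")

// graphDB is a hypothetical stand-in for the graph that holds the
// authoritative cursor. failNext simulates one failed transaction.
type graphDB struct {
	cursor   map[string]uint64
	failNext bool
}

func (g *graphDB) commitCursor(chain string, height uint64) error {
	if g.failNext {
		g.failNext = false
		return errCommitFailed
	}
	g.cursor[chain] = height
	return nil
}

// cursorKey builds the cache key format named in the text.
func cursorKey(chain string) string { return "cursor:" + chain + ":committed" }

// commitThenCache writes the mirror only after the authoritative commit
// succeeds, so the cache can never run ahead of the graph.
func commitThenCache(g *graphDB, cache map[string]uint64, chain string, height uint64) error {
	if err := g.commitCursor(chain, height); err != nil {
		return err // cache untouched
	}
	cache[cursorKey(chain)] = height
	return nil
}

// rebuildCursor re-derives a lost cache entry from the source of truth.
func rebuildCursor(g *graphDB, cache map[string]uint64, chain string) {
	cache[cursorKey(chain)] = g.cursor[chain]
}
```

With this order the worst case is a cache that lags the graph, which a rebuild repairs; the reverse order could leave the cache claiming a block that was never committed.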
genesis — birth and self-heal
L8 told you "Redis is a rebuildable projection; the system self-heals on startup." Here's the literal function:
PopulateBalancesFromNeo4j rebuilds the entire Redis balance cache from Neo4j HOLDS.quantity_raw — the
canonical balance as of the committed cursor. It exists to recover from a crash between the Neo4j commit and the
deferred Redis SET, and it's idempotent: safe to run every startup, free when nothing's missing, cheap when
the cache is hot.
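Idempotence here means a second run changes nothing. A rough sketch of that property, assuming the rebuild skips entries that already match: the `holding` type, the `balance:` key format, and `populateBalances` are hypothetical, and the real function reads from the graph and writes to Redis rather than working on a slice and a map.

```go
package main

// holding is a hypothetical flattened view of one HOLDS edge,
// with quantityRaw standing in for HOLDS.quantity_raw.
type holding struct {
	account     string
	token       string
	quantityRaw string
}

// populateBalances copies canonical balances into the cache and returns
// how many entries it had to write. Entries that already match are skipped.
func populateBalances(graph []holding, cache map[string]string) int {
	writes := 0
	for _, h := range graph {
		key := "balance:" + h.account + ":" + h.token // hypothetical key format
		if cur, ok := cache[key]; ok && cur == h.quantityRaw {
			continue // already correct
		}
		cache[key] = h.quantityRaw
		writes++
	}
	return writes
}
```

Run against a cache that is missing one entry, the sketch writes exactly that entry; run again immediately, it writes nothing, which is what makes it safe to call on every startup.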
vaultheal — the healer pattern, and when to skip shadow mode
The last corner is a second chainref healer (VAULT_ASSET / RECEIPT_FOR), riding the same reconcile
transport from L31. By itself that just confirms the pattern generalizes — but it carries one sharp new idea: it ships
live, with no shadow mode. Why is it allowed to write immediately when the OWNS / ADMIN_CTRL healers (L30/L31) couldn't?
asset() and an aToken's UNDERLYING_ASSET_ADDRESS() are immutable on chain — they never
change. So a chain-confirmed (src → asset) pair that disagrees with the graph is unambiguously wrong right
now, with no race against a newer block. Contrast HOLDS balances, which move every block — there a healer must shadow-first
because "stale" is ambiguous. Whether you can safely auto-write depends on whether the truth you're checking can change
under you. That's the real lesson L30's shadow mode was pointing at.
One more detail: because asset() is single-valued, fixing a drift takes two legs — PRUNE the stale
(src)→(old asset) edge AND HEAL toward (src)→(correct asset). Healing alone would leave the vault with two
VAULT_ASSET edges, which every reader sees and which re-surfaces as a drift forever. The prune+heal pair is the
convergence invariant.
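The two-leg rule is easy to check in miniature. In this sketch, `planVaultAsset` compares the graph's edges for one vault against the chain-confirmed asset and emits both legs, and `applyPlan` applies them. The `edge` and `plan` types and both function names are hypothetical, not taken from healer.go.

```go
package main

// edge is a hypothetical (src)->(asset) VAULT_ASSET edge.
type edge struct{ src, asset string }

// plan holds the two legs of a repair.
type plan struct {
	prune []edge // stale edges to delete
	heal  []edge // correct edge to create, if absent
}

// planVaultAsset compares the graph's assets for src with the
// chain-confirmed asset and returns what must change.
func planVaultAsset(src, chainAsset string, graphAssets []string) plan {
	var p plan
	found := false
	for _, a := range graphAssets {
		if a == chainAsset {
			found = true
			continue
		}
		p.prune = append(p.prune, edge{src, a})
	}
	if !found {
		p.heal = append(p.heal, edge{src, chainAsset})
	}
	return p
}

// applyPlan returns the graph's assets for src after the plan is applied.
func applyPlan(graphAssets []string, p plan) []string {
	pruned := map[string]bool{}
	for _, e := range p.prune {
		pruned[e.asset] = true
	}
	var out []string
	for _, a := range graphAssets {
		if !pruned[a] {
			out = append(out, a)
		}
	}
	for _, e := range p.heal {
		out = append(out, e.asset)
	}
	return out
}
```

Applying the full plan leaves one edge and an empty follow-up plan. Applying only the heal leg leaves two edges, and the next pass still reports a prune, which is the drift that never converges.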
IsTransientRedisErr classifies OOM (maxmemory cap) as transient, not fatal. Why?
CacheBlockCursor writes the Redis cursor only after the Neo4j tx commits. What does that ordering reflect?
PopulateBalancesFromNeo4j runs on every startup and is idempotent. What failure does it recover from?
Grounded in: pkg/statetracker/lease.go (AcquireLease SET NX, renewIfHeldScript atomic GET-and-PEXPIRE, IsHeld authoritative, lost-lease-never-reacquire), writer_epoch.go (IncrWriterEpoch once-per-acquire monotonic fence, stamped on BlockCursor.writer_epoch), transient.go (IsTransientRedisErr — sentinels/network/OOM→backpressure, the 2,599-crash fix), cursor.go (CacheBlockCursor after Neo4j commit), pkg/genesis/balance_rebuild.go (PopulateBalancesFromNeo4j idempotent self-heal), pkg/reconcile/vaultheal/healer.go (live-not-shadow on immutable asset()/underlying(), prune+heal two-leg convergence). Verify against source — the code is the truth.