The three loose ends: what fires a run, the cells you haven't seen, and how it's all kept honest. ~14 min.
You've seen how at_risk computes (L19–L22). Three operational questions remain: What kicks off a run? What do the boring, non-oracle cells look like? And how does anyone know the answer is right? They're the bookends of one lifecycle — trigger → compute → proof — so we'll do all three here and close the book on at_risk.
AtRiskScheduler (at_risk_scheduler.go) drives the metric on a periodic cadence:
load the full risk graph, run EnrichAtRisk over the live focus-token list, write the per-token
at_risk_summary / at_risk_cells / at_risk_updated_at back to Memgraph. Default tick: 30 min.
AT_RISK_PARTIAL_LOAD), but only after parity is verified.
(Don't over-generalize "the risk engine is incremental" — at_risk's flagship cells are a periodic full recompute.)
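The cadence described above can be sketched as a single-goroutine loop. This is a minimal illustration, not the code in at_risk_scheduler.go: `runLoop`, `simulate`, `withHeavy`, `heavyMu` and the `stallAfter` parameter are hypothetical names, and the "stall alarm" is reduced to a counter. It shows one cycle per tick, run strictly one at a time behind a shared mutex, with failures logged and the loop continuing.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// heavyMu stands in for a process-wide gate shared by heavy computations
// (hypothetical name, not the real HeavyMu).
var heavyMu sync.Mutex

// withHeavy runs fn while holding the gate, so heavy jobs never overlap.
func withHeavy(fn func() error) error {
	heavyMu.Lock()
	defer heavyMu.Unlock()
	return fn()
}

// runLoop runs one cycle per tick, strictly one at a time, until ticks is
// closed. It returns the number of cycles run and how many times the
// consecutive-failure count reached stallAfter.
func runLoop(ticks <-chan time.Time, cycle func() error, stallAfter int) (cycles, stalls int) {
	consecutive := 0
	for range ticks {
		cycles++
		if err := withHeavy(cycle); err != nil {
			consecutive++
			fmt.Println("WARN cycle error:", err) // log and keep looping
			if consecutive == stallAfter {
				stalls++ // a real scheduler would raise its stall alarm here
			}
			continue
		}
		consecutive = 0 // any success resets the failure streak
	}
	return cycles, stalls
}

// simulate feeds n pre-queued ticks into runLoop; cycle k (1-based) fails
// when failOn[k] is true.
func simulate(n int, failOn map[int]bool, stallAfter int) (cycles, stalls int) {
	ticks := make(chan time.Time, n)
	for i := 0; i < n; i++ {
		ticks <- time.Time{}
	}
	close(ticks)
	call := 0
	cycle := func() error {
		call++
		if failOn[call] {
			return errors.New("graph load failed")
		}
		return nil
	}
	return runLoop(ticks, cycle, stallAfter)
}

func main() {
	cycles, stalls := simulate(5, map[int]bool{2: true, 3: true, 4: true}, 3)
	fmt.Printf("cycles=%d stalls=%d\n", cycles, stalls)
}
```

In a long-running process the `ticks` channel would typically be the `C` field of a `time.Ticker`. Go's ticker drops ticks for slow receivers rather than queueing them, which is one way to get "a slow cycle delays the next tick, never stacks".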
Three properties tie straight back to earlier lessons:
- The Run() loop is the only caller, so there are never concurrent at_risk writes. Off by default (AT_RISK_SCHEDULER_ENABLED=false); cycles are sequential: a slow cycle delays the next tick, never stacks.
- A failed cycle is reported as at_risk.cycle.errors at WARN and the loop continues; three consecutive failures tip the AtRiskSchedulerStalled alarm at 90 min (3× the 30-min cadence), observability you met in L11.
- HeavyMu / WithHeavy (heavy_gate.go) is a process-wide mutex: at_risk shares the box with centrality, DebtRank, and other heavy computers, so the gate serializes them instead of letting 7+ fan out and thrash memory.

L20 showed the gnarliest path. These are the opposite: the plain majority. Path 3
(emitHoldsBasedCells) walks every venue that HOLDS the focus token and emits a cell, with
target_role read straight off the venue's subtype:
    switch vKind {
    case "bridge":
        targetRole = TargetRoleBridge
    case "pool":
        targetRole = TargetRolePool
    default:
        targetRole = TargetRoleVault // catch-all
    }
    // at_stake = focusUSD (USD of T this venue holds); + the usual admin & contract cells
Path 3 deliberately skips user holders such as EOAs, exchange hot wallets, and ops multisigs (userHolderSubtypes). The comment says it best: "a hack of Bitfinex hot wallet is not an at_risk attack surface
on the token." at_risk measures risk in the protocol contracts that hold T (where a contract/admin compromise
drains value systemically), not wherever tokens happen to sit. This single filter is the line between "protocol
risk" and "someone got phished."
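Putting the role switch and the user-holder filter together, Path 3 can be sketched roughly as follows. This is a simplified illustration under assumptions: the `venue` and `cell` types, the subtype strings in `userHolderSubtypes`, and the role values are invented for the example, and the admin and contract-failure cell pairing is omitted.

```go
package main

import "fmt"

// venue is a hypothetical stand-in for a node that HOLDS the focus token.
type venue struct {
	ID       string
	Subtype  string
	FocusUSD float64 // USD of the focus token this venue holds
}

// cell is a hypothetical, trimmed-down at_risk cell.
type cell struct {
	VenueID    string
	TargetRole string
	AtStakeUSD float64
}

// userHolderSubtypes lists holders that are not protocol contracts.
// The exact subtype strings are assumptions for this sketch.
var userHolderSubtypes = map[string]bool{
	"eoa":                 true,
	"exchange_hot_wallet": true,
	"ops_multisig":        true,
}

// targetRoleFor maps a venue subtype to a target role, with vault as the
// catch-all, mirroring the shape of the switch shown above.
func targetRoleFor(subtype string) string {
	switch subtype {
	case "bridge":
		return "bridge"
	case "pool":
		return "pool"
	default:
		return "vault"
	}
}

// emitHoldsCells emits one cell per protocol venue and skips user holders.
func emitHoldsCells(holders []venue) []cell {
	var cells []cell
	for _, v := range holders {
		if userHolderSubtypes[v.Subtype] {
			continue // a phished wallet is not protocol risk
		}
		cells = append(cells, cell{
			VenueID:    v.ID,
			TargetRole: targetRoleFor(v.Subtype),
			AtStakeUSD: v.FocusUSD,
		})
	}
	return cells
}

func main() {
	holders := []venue{
		{ID: "bridge-1", Subtype: "bridge", FocusUSD: 4_000_000},
		{ID: "cex-hot", Subtype: "exchange_hot_wallet", FocusUSD: 90_000_000},
		{ID: "market-7", Subtype: "lending_market", FocusUSD: 1_500_000},
	}
	for _, c := range emitHoldsCells(holders) {
		fmt.Printf("%s role=%s at_stake=%.0f\n", c.VenueID, c.TargetRole, c.AtStakeUSD)
	}
}
```

Note how the largest balance in the sample (the exchange hot wallet) contributes nothing: the filter runs before any cell is built, so user holdings never inflate at_stake.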
Path 3b (emitVaultAssetOnlyCells) is a small but instructive fix-up: MetaMorpho-style vaults
dropped their (phantom) HOLDS edge in PR #168 but still carry a VAULT_ASSET edge. Path 3 would miss
them, so 3b catches vaults reachable only via VAULT_ASSET, using vault.attrs.tvl_usd as the balance
proxy and skipping any vault Path 3 already covered. Same admin + contract-failure cell pairing as everywhere else.
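The 3b fix-up amounts to a de-duplicated second pass. A minimal sketch, assuming hypothetical `vault` and `cell` types, a `coveredByHolds` set built during Path 3, and a vault target role (the source text does not state which role 3b assigns):

```go
package main

import "fmt"

// vault is a hypothetical node reachable from the focus token via VAULT_ASSET.
type vault struct {
	ID     string
	TVLUSD float64 // stands in for vault.attrs.tvl_usd
}

// cell is a hypothetical, trimmed-down at_risk cell.
type cell struct {
	VenueID    string
	TargetRole string
	AtStakeUSD float64
}

// emitVaultAssetOnly emits cells for vaults that the HOLDS-based pass did
// not already cover, using TVL as the balance proxy.
func emitVaultAssetOnly(vaults []vault, coveredByHolds map[string]bool) []cell {
	var cells []cell
	for _, v := range vaults {
		if coveredByHolds[v.ID] {
			continue // already emitted by the HOLDS-based pass
		}
		cells = append(cells, cell{
			VenueID:    v.ID,
			TargetRole: "vault", // assumed role for this sketch
			AtStakeUSD: v.TVLUSD,
		})
	}
	return cells
}

func main() {
	vaults := []vault{
		{ID: "vault-a", TVLUSD: 2_000_000},
		{ID: "vault-b", TVLUSD: 750_000},
	}
	covered := map[string]bool{"vault-a": true}
	for _, c := range emitVaultAssetOnly(vaults, covered) {
		fmt.Printf("%s at_stake=%.0f\n", c.VenueID, c.AtStakeUSD)
	}
}
```

The skip check is what keeps a vault with both edge types from being counted twice.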
Every lesson since L6 has dropped phrases like "mirrors at_risk.py:2689," "any change here is a parity
break," and the KeyedSum/sorted-iteration ulp discipline. Here's the machine behind those words. The entire Go
at_risk engine is a port of a Python original (at_risk.py), and correctness is defined as
matching Python's output. Two mechanisms enforce it:
DiffEnrichedGraphs: run both implementations on the same graph and compare per token, field by field. USD floats compare within a tolerance; counts must match exactly:

    USDRelativeTolerance: 1e-4 // 0.01% relative drift allowed on $ fields
    USDAbsoluteTolerance: 1e-2 // or 1 cent absolute, whichever is kinder
    // count fields (n_cells, n_anchor, …): exact match required
This is exactly why the engine is so fussy about float determinism: a ulp of drift from an unsorted map walk or a
naive sum could push a token past tolerance and fail the harness. cmd/at-risk-diff is the CLI front end;
the tolerances track docs/at_risk_io_schema.md §9.3.
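To make the tolerance rule and the determinism concern concrete, here is a small sketch. It is not the code in at_risk_diff.go: `usdClose` and `sortedSum` are hypothetical helpers, "whichever is kinder" is read as "pass if either bound holds", and scaling the relative bound by the larger magnitude is an assumption.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

const (
	usdRelativeTolerance = 1e-4 // 0.01% relative drift
	usdAbsoluteTolerance = 1e-2 // 1 cent
)

// usdClose reports whether two USD values agree within either tolerance.
func usdClose(goVal, pyVal float64) bool {
	diff := math.Abs(goVal - pyVal)
	if diff <= usdAbsoluteTolerance {
		return true
	}
	scale := math.Max(math.Abs(goVal), math.Abs(pyVal))
	return diff <= usdRelativeTolerance*scale
}

// sortedSum adds map values in sorted key order, so the floating-point
// result does not depend on Go's randomized map iteration order.
func sortedSum(byKey map[string]float64) float64 {
	keys := make([]string, 0, len(byKey))
	for k := range byKey {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	total := 0.0
	for _, k := range keys {
		total += byKey[k]
	}
	return total
}

func main() {
	fmt.Println(usdClose(1_000_000, 1_000_050)) // $50 drift on $1M
	fmt.Println(usdClose(1_000_000, 1_000_200)) // $200 drift on $1M
	fmt.Println(usdClose(0.500, 0.505))         // sub-cent drift on a tiny value
	fmt.Println(sortedSum(map[string]float64{"pool-b": 2, "pool-a": 1, "pool-c": 3}))
}
```

Count fields need no helper: a plain `==` on integers is the whole check. The sorted walk matters because floating-point addition is not associative, so summing the same values in a different order can change the last bits of the result.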
CheckInvariants (independent of Python): this is the satisfying capstone. Four invariants run against the Go output alone, and every one of them is a fact you learned in a previous lesson, now encoded as a machine-checked guarantee:
| # | Invariant | You learned it in… |
|---|---|---|
| 1 | extractable_usd ≤ at_stake_usd × 1.001 | L19 — the extractable≤at_stake cap (and the 0.1% slack). |
| 2 | no oracle cell with outcome_class != "trigger" | L20 — oracle attacks trigger a mispricing, a distinct outcome class. |
| 3 | no role_class == "dos" cell with extractable > 0 | L20/L21 — a denial-of-service drains nothing; extractable must be 0. |
| 4 | no deployer_fallback_no_admin cell with extractable > 0 | L19/L20 — the $8B deployer-fallback guard. |
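
The four rules in the table translate almost directly into code. The sketch below is illustrative only: the `cell` fields (`Kind`, `Basis` and so on) are assumptions about how a cell might expose "oracle" and "deployer_fallback_no_admin", and `checkInvariants` is a hypothetical stand-in for the real CheckInvariants.

```go
package main

import "fmt"

// cell is a hypothetical, trimmed-down at_risk cell; field names are assumed.
type cell struct {
	Kind           string // e.g. "oracle", "admin", "contract"
	OutcomeClass   string // e.g. "trigger"
	RoleClass      string // e.g. "dos"
	Basis          string // e.g. "deployer_fallback_no_admin"
	AtStakeUSD     float64
	ExtractableUSD float64
}

// checkInvariants returns one message per violated rule, per cell.
func checkInvariants(cells []cell) []string {
	var violations []string
	for i, c := range cells {
		// 1: extractable is capped by at_stake, with 0.1% slack.
		if c.ExtractableUSD > c.AtStakeUSD*1.001 {
			violations = append(violations, fmt.Sprintf("cell %d: extractable exceeds at_stake cap", i))
		}
		// 2: oracle cells must carry the trigger outcome class.
		if c.Kind == "oracle" && c.OutcomeClass != "trigger" {
			violations = append(violations, fmt.Sprintf("cell %d: oracle cell is not a trigger", i))
		}
		// 3: a denial-of-service cell drains nothing.
		if c.RoleClass == "dos" && c.ExtractableUSD > 0 {
			violations = append(violations, fmt.Sprintf("cell %d: dos cell has extractable > 0", i))
		}
		// 4: the deployer-fallback guard.
		if c.Basis == "deployer_fallback_no_admin" && c.ExtractableUSD > 0 {
			violations = append(violations, fmt.Sprintf("cell %d: deployer fallback has extractable > 0", i))
		}
	}
	return violations
}

func main() {
	cells := []cell{
		{Kind: "admin", OutcomeClass: "drain", AtStakeUSD: 100, ExtractableUSD: 100.05},
		{Kind: "oracle", OutcomeClass: "drain", AtStakeUSD: 50},
		{Kind: "contract", RoleClass: "dos", AtStakeUSD: 10, ExtractableUSD: 1},
	}
	for _, v := range checkInvariants(cells) {
		fmt.Println(v)
	}
}
```

Because these checks need only the Go output, they can run on every cycle, even when no Python reference run is available to diff against.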
- Run() is the only caller that writes at_risk. Which earlier principle is that?
- HeavyMu / WithHeavy exists because…
- Path 3 (emitHoldsBasedCells) deliberately skips EOAs, exchange hot wallets, and ops multisigs. Why?
- What does Path 3b (emitVaultAssetOnlyCells) catch that Path 3 misses?
- In DiffEnrichedGraphs, USD fields compare within a tolerance but count fields must match exactly. The tolerance is roughly…
- CheckInvariants runs 4 checks against the Go output WITHOUT Python. One is "no dos cell with extractable > 0." What kind of guarantee is this?
- Invariant 1 is extractable ≤ at_stake × 1.001. Where have you seen that exact rule before?

DiffEnrichedGraphs (tolerance on $, exact on counts) and why float determinism matters to it.

Grounded in: pkg/risk/at_risk_scheduler.go (periodic full-graph cadence 30 min, single-goroutine writer, fail-loop + AtRiskSchedulerStalled, AT_RISK_PARTIAL_LOAD, PR #474 16.2-min measure), heavy_gate.go (HeavyMu/WithHeavy), at_risk_cells.go (emitHoldsBasedCells Path 3 + user-holder exclusion, emitVaultAssetOnlyCells Path 3b / PR #168 / tvl_usd), at_risk_diff.go (DiffEnrichedGraphs tol 1e-4/1e-2, CheckInvariants 4 rules §9.4), cmd/at-risk-diff. Verify against source: the code is the truth.