KeyedSum, ×1.001, round2, big.Int — one discipline, five lessons. ~13 min.
Synthesizes: L19 · L29 · L32 · L39
Anchor: a test green locally, red in CI
New: exact-in, tolerant-out
Five lessons hit the same odd details: a sum routed through floats.KeyedSum, a comparison with a
×1.001 slack, outputs passed through round2, balances summed in big.Int. In most software a 17th-decimal wobble is
noise you'd never think about. Here it's load-bearing, and the reasons tie together into one discipline you can
state in a sentence: be exact where error accumulates, tolerant where you compare.
Your anchor: the test that's green locally and red in CI
Every engineer has hit it: a calculation passes on your machine, fails in CI, passes again on rerun. The usual culprit
is non-determinism: the same inputs produce a different answer run-to-run, by a hair. In a system that values
billions and must match a reference implementation bit-for-bit, that hair is the whole problem. This lesson covers the
codebase's defenses against it.
1 · The root cause: float math isn't associative, and maps aren't ordered
Two facts combine into a bug:
// (1) float64 addition is non-associative: rounding makes order matter.
(a + b) + c != a + (b + c) // can differ in the last bit(s)

// (2) Go map iteration order is unspecified and RANDOMIZED, even between two loops in the same run.
total := 0.0
for k := range someMap { total += someMap[k] } // may sum in a different order on every pass
Put them together and a plain += over a map can produce a different last-ulp answer each run, from identical
inputs. Most of the time nobody notices. In this system, three things make it matter.
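To make the two facts concrete, here is a minimal sketch that fixes the summation order by hand instead of relying on map iteration, so the effect is reproducible. The helper name sumInOrder and the values 0.1, 0.2, 0.3 are illustrative assumptions, not taken from the codebase.

```go
package main

import "fmt"

// sumInOrder adds the map's values in exactly the order given by keys.
// It stands in for "whatever order the map happened to iterate in".
func sumInOrder(m map[string]float64, keys []string) float64 {
	total := 0.0
	for _, k := range keys {
		total += m[k]
	}
	return total
}

func main() {
	// Use variables, not constants: Go evaluates constant expressions
	// exactly at compile time, which would hide the rounding.
	a, b, c := 0.1, 0.2, 0.3
	fmt.Println((a+b)+c == a+(b+c)) // false

	m := map[string]float64{"a": a, "b": b, "c": c}
	forward := sumInOrder(m, []string{"a", "b", "c"})
	backward := sumInOrder(m, []string{"c", "b", "a"})
	fmt.Println(forward == backward) // false: same inputs, different order
}
```

A range over the map picks one of these orders for you, and not necessarily the same one next time.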
2 · Why it's load-bearing here (not aesthetics)
Pressure | Why a ulp flips an outcome
Byte-parity vs Python (L23/L29) | the engine is a port; the diff harness compares to Python within 1e-4/1e-2 tolerance, but a drift that crosses a threshold or equality flips a token to "mismatch" and fails CI
Outputs feed decisions | a value is compared to a rule threshold (fire or not, L45), a cap (capped or not, L19), a changed-only guard (write or not, L38); equality/threshold checks turn a ulp into a different action
Determinism = reproducibility (L8) | replay, debugging, and parity all assume same-inputs→same-output; a non-reproducible number can't be diffed, replayed, or trusted
The reframe: a flipped ulp isn't "slightly wrong"
It's wrong by being non-reproducible. A value that's off by 1e-15 but stable is fine — tolerances absorb it. A
value that changes run-to-run can't be matched against Python, can't be replayed to the same state, and can flip a
threshold inconsistently. The enemy isn't inaccuracy; it's non-determinism.
3 · The discipline — two complementary halves
Exact / deterministic where error ACCUMULATES
Summing many values is where order-dependence bites, so make it order-independent or exact.
floats.KeyedSum · big.Int · sorted iteration
Tolerant where you COMPARE
Comparing a result to a bound or another value is where you must not let float noise trip a flag.
×1.001 · 1e-4 / 1e-2 · scoreEpsilon
The exact-in half: the gallery
Tool | What it does | Seen in
floats.KeyedSum | sort the map keys, then Neumaier (compensated) sum: a deterministic order, plus recovery of lost low bits |
The tolerant-out half: the gallery
Tool | What it does | Seen in
×1.001 (capSlackTolerance) | a cap "fired" only if the raw aggregate exceeds the bound by >0.1%, so rounding noise doesn't raise a spurious cap flag | at_risk caps (L19)
1e-4 rel / 1e-2 abs | two USD values "match" if within tolerance; float drift below this isn't a parity failure | parity diff (L29), conservation band (L32)
scoreEpsilon = 5e-5 | half the 4-dp rounding quantum: a recomputed score within it is "unchanged" → skip the write | node_risk_score changed-only writes (L38)
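The "sort the keys, then Neumaier sum" recipe can be sketched like this. It is not the real floats.KeyedSum, whose signature this lesson does not show; the lowercase name keyedSum and the string key type are assumptions.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// keyedSum is a hypothetical stand-in for floats.KeyedSum.
// Part 1: sorting the keys fixes the order, so the result is reproducible.
// Part 2: Neumaier compensation carries the low bits that each += drops.
func keyedSum(m map[string]float64) float64 {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	sum, comp := 0.0, 0.0
	for _, k := range keys {
		v := m[k]
		t := sum + v
		if math.Abs(sum) >= math.Abs(v) {
			comp += (sum - t) + v // low bits of v were lost
		} else {
			comp += (v - t) + sum // low bits of sum were lost
		}
		sum = t
	}
	return sum + comp
}

func main() {
	m := map[string]float64{"a": 1e100, "b": 1.0, "c": -1e100}
	fmt.Println(keyedSum(m)) // 1
}
```

A plain += over the same values in a, b, c order returns 0, because the 1.0 vanishes next to 1e100. Sorting alone would make that wrong answer reproducible; the compensation is what also makes it accurate.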
And the output quantum: round2 / round4
Outputs are rounded to a fixed number of decimals (2 for USD, 4 for ratios) before they're stored. That makes the stored
value stable (a sub-quantum recompute lands on the same figure) and gives the tolerant-out slacks a clean grid to
work against — scoreEpsilon is literally half the rounding quantum. Rounding is the bridge between the exact-in
computation and the tolerant-out comparison.
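The tolerant-out checks and the rounding quantum fit in a few lines. The constant values come from this lesson; everything else is an assumption made for the sketch: the function names capFired, withinTolerance and scoreUnchanged, the math.Round-based rounding, the strictness of each comparison, and the rule that either the relative or the absolute tolerance is enough for a match.

```go
package main

import (
	"fmt"
	"math"
)

const (
	capSlackTolerance = 1.001 // ×1.001 slack on cap bounds
	relTol            = 1e-4  // relative parity tolerance
	absTol            = 1e-2  // absolute parity tolerance (USD)
	scoreEpsilon      = 5e-5  // half the 4-dp rounding quantum
)

// round2 and round4 snap a value to the 2-dp and 4-dp output grids.
func round2(x float64) float64 { return math.Round(x*100) / 100 }
func round4(x float64) float64 { return math.Round(x*1e4) / 1e4 }

// capFired reports a cap only when raw exceeds the bound by more than 0.1%.
func capFired(raw, bound float64) bool {
	return raw > bound*capSlackTolerance
}

// withinTolerance treats two values as matching if either tolerance holds
// (assumed combination rule).
func withinTolerance(a, b float64) bool {
	diff := math.Abs(a - b)
	return diff <= absTol || diff <= relTol*math.Max(math.Abs(a), math.Abs(b))
}

// scoreUnchanged is the changed-only guard: a sub-epsilon move skips the write.
func scoreUnchanged(stored, recomputed float64) bool {
	return math.Abs(recomputed-stored) < scoreEpsilon
}

func main() {
	fmt.Println(capFired(1000.5, 1000))                // false: inside the slack
	fmt.Println(capFired(1002, 1000))                  // true
	fmt.Println(withinTolerance(1_000_000, 1_000_050)) // true: within 1e-4 relative
	fmt.Println(scoreUnchanged(0.1234, 0.12342))       // true: below half a quantum
	fmt.Println(round2(12.3449), round4(0.123449))     // 12.34 0.1234
}
```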
4 · It's enforced, not trusted to memory
This discipline isn't a code-review hope — it's mechanized at both ends:
A lint guard forbids the footgun. A raw += accumulating over a map/slice fails lint unless it carries a // floats:ok (…deterministic order…) annotation justifying why the order is fixed. So you can't merge a non-deterministic sum without either using KeyedSum or proving the iteration order is stable.
The parity harness is the gate. DiffEnrichedGraphs (L29) runs the Go output against Python every cycle; a determinism slip that drifts a token past tolerance shows up as a mismatch finding. The discipline is tested, continuously, against the reference.
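As a rough illustration of what the lint guard asks for, here is a raw += whose order is fixed by sorting the keys first. The helper sumSorted is hypothetical, the annotation text is copied from above with its elided reason left as-is, and placing the annotation on the accumulating line is an assumption about how the linter reads it.

```go
package main

import (
	"fmt"
	"sort"
)

// sumSorted is a hypothetical helper, not from the codebase. It keeps the
// raw += but makes the iteration order stable by sorting the keys first.
func sumSorted(m map[string]float64) float64 {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	total := 0.0
	for _, k := range keys {
		total += m[k] // floats:ok (…deterministic order…)
	}
	return total
}

func main() {
	m := map[string]float64{"a": 0.1, "b": 0.2, "c": 0.3}
	first := sumSorted(m)
	for i := 0; i < 1000; i++ {
		if sumSorted(m) != first {
			panic("iteration order leaked into the sum")
		}
	}
	fmt.Println("stable across 1000 calls:", first)
}
```

Unlike a compensated sum this does not recover lost low bits; it only guarantees that the same inputs give the same answer.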
One sentence, five lessons
Be exact (or order-deterministic) where error accumulates; be tolerant where you compare; round to a fixed quantum
in between; and let the linter and the parity harness enforce it. Every KeyedSum, every ×1.001, every
big.Int sum, every round2 you saw was one move in that single discipline — the price of being a billions-valuing
engine that must reproduce a reference bit-for-bit.
Check yourself
1. What's the root cause that makes a plain += over a Go map non-deterministic?
2. Why is a non-deterministic sum a real problem here, when in most apps it's ignorable noise?
3. The lesson reframes the danger as non-determinism, not inaccuracy. What follows from that?
4. What does floats.KeyedSum do, and why both parts?
5. Conservation (L32) sums HOLDS in big.Int but forms the final ratio in float64. Which half of the discipline is each?
6. The capSlackTolerance = 1.001 in the at_risk caps (L19) is which half of the discipline?
7. Why does the codebase round outputs to a fixed quantum (round2 / round4)?
8. How is the "use KeyedSum, not a raw +=" rule actually enforced?
↳ Ask your teacher
Try: "Show me a real // floats:ok annotation and what justifies it." ·
"What exactly is Neumaier summation vs naive Kahan?" ·
"Could integer/decimal types eliminate the whole problem — why not use them everywhere?" ·
"How does the parity harness pick the 1e-4 / 1e-2 tolerances?" ·
"Where would a determinism bug most plausibly still slip through today?"
What you can now do
Explain the root cause: non-associative float addition × randomized Go map order ⇒ non-deterministic sums.
Say why it's load-bearing here: byte-parity vs Python, outputs feeding threshold/equality checks, replay determinism.
State the discipline in one line — exact/deterministic where error accumulates, tolerant where you compare.
Place each tool: KeyedSum / big.Int / sorted iteration (in) vs ×1.001 / 1e-4 / scoreEpsilon (out), with round2 bridging.
Recognize the floats:ok lint guard and the parity harness as the enforcement, and spot a raw map-+= as a parity bug.
Two syntheses down
The combinator family (L46) and the float-determinism discipline (here) are the two big cross-cutting lenses on the risk
engine: what to combine (collapse vs add, worst vs best) and how to combine it safely (deterministic, tolerant,
enforced). Together they're most of what separates "reads the risk code" from "could change it without breaking parity."