Cross-Cutting Synthesis · Deeper Track

One scarce resource, shaping everything

Why the architecture is the way it is: the economics of RPC. ~13 min.

Synthesizes: L24 · L25 · L29 · L35
Anchor: archive node + explorer rate limits
New: cost shapes architecture

"Periodic, not per-block." "Sample, don't sweep." "Multicall3: 50 calls → 2." "Cache the balance." "Pure graph, no RPC." You've met these as local choices in a dozen lessons. They're not local — they're all downstream of one force. The system has a single binding constraint, and an enormous fraction of its design is the economics of spending that resource carefully. Name the resource and the architecture stops looking arbitrary.

Your anchor: you've felt this constraint

Anyone who's built on-chain knows it — a self-hosted archive node is finite and precious, and Etherscan / Blockscout are rate-limited and metered. An eth_call or getLogs is orders of magnitude more expensive (latency, rate limit, infra) than a Memgraph read. So the whole system tilts one way: read the graph, not the chain — and when you must touch the chain, touch it as little and as cleverly as possible. RPC is the budget everything is drawn against.

1 · The seven moves for spending less RPC

Almost every efficiency mechanism you've seen is one of these seven strategies:

Strategy	How	Seen in
Filter early	the monitored-set `SISMEMBER` drops ~95% of events before any enrichment; discovery is value-gated (≥$1M, structural deps only) so the crawl can't swallow the chain	L1/L2, L24
Cache & reuse	the balance hot cache, NAV/price stamps recomputed via `price-dirty` not re-read, block prefetch + 7-day cache	L36, L26, L1
Batch	Multicall3 collapses ~50 `eth_call`s into 2 `aggregate3`s; the RPC pool fans out reads concurrently	L25, L26
Sample, don't sweep	chainref verifies a subset per cycle — coverage trends statistically rather than re-reading 1.5M nodes from chain	L29/L34
Work a slice	per-token partial graphs (3.07 GB → 150 MB) load only the spine neighborhood, not the whole graph	L23, L38
Pace it (cadence)	nothing chain-reading runs per-block: at_risk 30 min, refreshers 30–60 min, the slow validator tier hourly vs the fast graph tier every 10 min	L23, L24, L34
Spread & degrade	round-robin across RPC endpoints; every read is best-effort (a failed probe skips, never crashes); `OOM → backpressure` protects the shared box	L26, L24/L32, L37
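Two of these moves, filter early and batch, compose naturally, and a rough sketch makes the arithmetic visible. This is a minimal illustration, not the codebase's implementation: a Python set stands in for the Redis monitored set behind `SISMEMBER`, the 5-in-100 event mix mirrors the ~95% drop, and the batch size of 25 is an assumption chosen so that 50 calls land in 2 batches. The names `filter_early` and `chunk` are hypothetical.

```python
from typing import Iterable, List, Set


def filter_early(events: Iterable[dict], monitored: Set[str]) -> List[dict]:
    """Keep only events on monitored addresses (a local check, no RPC spent)."""
    return [e for e in events if e["address"] in monitored]


def chunk(items: List[str], size: int) -> List[List[str]]:
    """Split pending calls into batches of at most `size` calls each."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [items[i:i + size] for i in range(0, len(items), size)]


# Hypothetical data: 100 events, only 5 of them on monitored addresses.
monitored = {f"0xmon{i}" for i in range(5)}
events = [{"address": f"0xmon{i}"} for i in range(5)] + [
    {"address": f"0xother{i}"} for i in range(95)
]

survivors = filter_early(events, monitored)

# Assumption: each surviving contract needs 50 reads, batched 25 per round trip.
calls_per_contract = [f"call{i}" for i in range(50)]
batches = chunk(calls_per_contract, 25)

unbatched_round_trips = len(survivors) * len(calls_per_contract)  # one trip per call
batched_round_trips = len(survivors) * len(batches)               # one trip per batch
```

Under these assumptions the filter leaves 5 contracts to enrich, and batching cuts their cost from 250 round trips to 10. The two savings multiply, which is why the cheap local check runs first.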

The cadence IS a budget decision

Notice how many mechanisms are "run less often." A 30-minute at_risk cycle, a 60-minute LP refresher, an hourly slow tier — these aren't arbitrary intervals. Each is the answer to "how stale can this data be before correctness suffers?", set as infrequently as correctness allows, because every cycle costs reads. When you see a periodic loop in this codebase, read its interval as a price tag.
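To make "interval as price tag" concrete, here is a small worked calculation. The figure of 1,000 reads per cycle is invented for illustration, and the 12-second block time assumes an Ethereum-mainnet-like chain; neither number comes from the codebase.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def reads_per_day(interval_seconds: int, reads_per_cycle: int) -> int:
    """Daily chain reads for a loop that runs once every `interval_seconds`."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    whole_cycles = SECONDS_PER_DAY // interval_seconds  # completed cycles per day
    return whole_cycles * reads_per_cycle


# Hypothetical loop cost: 1,000 chain reads per cycle.
READS_PER_CYCLE = 1_000

budget = {
    "per-block (assumed 12 s)": reads_per_day(12, READS_PER_CYCLE),
    "every 10 min": reads_per_day(10 * 60, READS_PER_CYCLE),
    "every 30 min": reads_per_day(30 * 60, READS_PER_CYCLE),
    "every 60 min": reads_per_day(60 * 60, READS_PER_CYCLE),
}

for label, reads in budget.items():
    print(f"{label:>26}: {reads:>9,} reads/day")
```

With these made-up inputs, the same loop costs 7,200,000 reads a day per-block, 144,000 at 10 minutes, 48,000 at 30 minutes, and 24,000 hourly. Moving from per-block to a 30-minute cadence is a 150× reduction before any other optimization is applied.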

2 · The big idea: the graph is a cache of the chain

Step back and the whole design is one move: the Memgraph graph is an elaborate, queryable cache of on-chain state, built so the hot paths read it (cheap) instead of the chain (dear).

Hot paths read the graph. Risk computation, rule evaluation, the read surface, even whole subsystems — the oracle bridger is pure graph, zero RPC (L27) — operate on cached state.
Periodic re-readers are the budgeted re-sync. Refreshers (L24), chainref verifiers (L29), conservation (L32) are exactly the controlled, paced, sampled touches of the source of truth that keep the cache honest — the only things that routinely spend RPC, and all of them rationed.
Ingest fills the cache push-style. Block events flow in once and update the graph; the system then serves from the graph until a refresher decides a value has drifted enough to re-read.
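The three roles above can be sketched with a toy in-memory model. This is an illustration under stated assumptions, not how Memgraph or the refreshers are implemented: the class `GraphCache`, its method names, and the age-based staleness rule with a per-cycle `budget` are all hypothetical stand-ins for "paced, rationed re-reads."

```python
from typing import Callable, Dict, List, Tuple


class GraphCache:
    """Toy stand-in for the graph: hot-path reads never touch the chain."""

    def __init__(self) -> None:
        self._store: Dict[str, Tuple[int, float]] = {}  # key -> (value, synced_at)
        self.chain_reads = 0  # how much RPC budget has been spent

    def ingest(self, key: str, value: int, now: float) -> None:
        """Push-style fill: a block event updates the cached value."""
        self._store[key] = (value, now)

    def read(self, key: str) -> int:
        """Hot path: served from cached state only."""
        return self._store[key][0]

    def refresh(self, fetch: Callable[[str], int], now: float,
                max_age: float, budget: int) -> List[str]:
        """Paced re-sync: re-read at most `budget` stale entries, stalest first."""
        stale = sorted(
            (k for k, (_, ts) in self._store.items() if now - ts > max_age),
            key=lambda k: self._store[k][1],
        )
        refreshed = stale[:budget]
        for key in refreshed:
            self._store[key] = (fetch(key), now)  # the only line that costs RPC
            self.chain_reads += 1
        return refreshed


# Hypothetical chain state and ingest history.
chain = {"a": 10, "b": 20, "c": 30}
cache = GraphCache()
cache.ingest("a", 10, now=0.0)
cache.ingest("b", 20, now=5.0)
cache.ingest("c", 30, now=100.0)

chain["a"] = 11  # the chain drifts; the cache has not seen it yet

hot_reads = [cache.read("a") for _ in range(1000)]  # a thousand reads, zero RPC
reads_before_refresh = cache.chain_reads

refreshed = cache.refresh(chain.__getitem__, now=120.0, max_age=60.0, budget=1)
```

A thousand hot-path reads spend nothing; the single budgeted refresh spends exactly one chain read, picks the stalest entry (`"a"`), and brings it back in line with the chain. Entry `"b"` is also stale but waits for a later cycle, which is the rationing the lesson describes.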

"Read the graph, not the chain" is the reflex

Whenever a computation can be answered from stored graph state, it is. The oracle bridger could have RPC-probed each market's oracle; instead it derives the dependency from existing edges in pure Cypher. That instinct — prefer the cached derivation over a fresh read — is the single most repeated efficiency decision in the codebase.
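The shape of that derivation can be shown without Cypher. The sketch below is an assumption-heavy illustration: the edge types `PRICED_BY` and `REPORTED_BY`, the node names, and the function `derive_market_oracles` are invented here and are not the schema or the query the oracle bridger uses. The point is only that a two-hop join over edges already stored answers the question with zero chain reads.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str, str]  # (source, edge_type, target)


def derive_market_oracles(edges: List[Edge]) -> Dict[str, Set[str]]:
    """Two-hop join over stored edges: market -> feed -> oracle. No RPC involved."""
    feeds_of_market: Dict[str, Set[str]] = defaultdict(set)
    oracles_of_feed: Dict[str, Set[str]] = defaultdict(set)
    for src, kind, dst in edges:
        if kind == "PRICED_BY":        # hypothetical edge type
            feeds_of_market[src].add(dst)
        elif kind == "REPORTED_BY":    # hypothetical edge type
            oracles_of_feed[src].add(dst)

    result: Dict[str, Set[str]] = {}
    for market, feeds in feeds_of_market.items():
        oracles: Set[str] = set()
        for feed in feeds:
            oracles |= oracles_of_feed.get(feed, set())
        result[market] = oracles
    return result


# Hypothetical edges already present in the graph from earlier ingest.
edges: List[Edge] = [
    ("market1", "PRICED_BY", "feedA"),
    ("feedA", "REPORTED_BY", "oracleX"),
    ("market2", "PRICED_BY", "feedB"),
    ("feedB", "REPORTED_BY", "oracleX"),
    ("market2", "PRICED_BY", "feedC"),
    ("feedC", "REPORTED_BY", "oracleY"),
]

deps = derive_market_oracles(edges)
```

Here `deps` maps `market1` to `{"oracleX"}` and `market2` to `{"oracleX", "oracleY"}`, computed entirely from state the system had already paid to ingest.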

3 · The constraint is also the bill

Here's the satisfying loop back to L35. The cost-allocation model bills customers on four signals — and one of them, PROTO edges, is explicitly a proxy for "refresh-worker multicall RPC across the structural edge types," while HOLDS proxies balance lookups. In other words, the scarce resource the architecture is organized around is the very thing the business charges for. RPC budget isn't just an engineering constraint — it's the unit of value the platform sells. Spend it well and you both run cheaper and bill more fairly.

Why naming the resource demystifies the design

Once "RPC is the binding constraint" is in your head, the architecture reads as a series of obvious answers. Why periodic? Reads cost money. Why sample? You can't afford a full chain sweep. Why a balance cache? Don't re-read what hasn't changed. Why Multicall3? Amortize the round-trip. Why pure-graph where possible? The graph is the cache. Nothing is arbitrary: it's all one economy.

4 · The reflex for new work

This synthesis turns into a habit. Faced with any new feature, the first design question becomes: "what's its RPC cost, and how do I bound it?" — answered with the seven moves. Need fresh on-chain data? Can you read it from the graph instead (cache)? If you must call, can you batch (Multicall3), sample, or pace it? Can you bound the set (partial / filter)? Is the read best-effort so a failure degrades rather than crashes? A feature that ignores the RPC budget is the one that takes down the archive node — and a reviewer who's internalized this catches it on sight.
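The last question on that checklist, best-effort reads spread across endpoints, is easy to sketch. What follows is a minimal illustration and not the real `rpcPool`: the class `RoundRobinPool`, the function `best_effort_read`, and the endpoint names are hypothetical, and returning `None` on failure is one assumed way to express "a failed probe skips, never crashes."

```python
import itertools
from typing import Callable, Optional, Sequence, TypeVar

T = TypeVar("T")


class RoundRobinPool:
    """Hands out endpoints in strict rotation so no single one takes every read."""

    def __init__(self, endpoints: Sequence[str]) -> None:
        if not endpoints:
            raise ValueError("need at least one endpoint")
        self._cycle = itertools.cycle(endpoints)

    def next_endpoint(self) -> str:
        return next(self._cycle)


def best_effort_read(pool: RoundRobinPool,
                     call: Callable[[str], T]) -> Optional[T]:
    """Spend one read on the next endpoint; on failure return None, never raise."""
    endpoint = pool.next_endpoint()
    try:
        return call(endpoint)
    except Exception:
        return None  # the caller skips this probe and keeps going


def fake_owner_probe(endpoint: str) -> str:
    """Hypothetical probe: endpoint 'rpc-a' is down, 'rpc-b' answers."""
    if endpoint == "rpc-a":
        raise ConnectionError("rpc-a unavailable")
    return "0xowner"


pool = RoundRobinPool(["rpc-a", "rpc-b"])
first = best_effort_read(pool, fake_owner_probe)   # lands on rpc-a, which fails
second = best_effort_read(pool, fake_owner_probe)  # lands on rpc-b, which answers
```

The first probe degrades to `None` and the second succeeds on the next endpoint in rotation. This sketch deliberately does not retry: a retry is another read, so whether to spend it is itself a budget decision.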

Check yourself

1. What is the single binding constraint that most of the architecture is organized around?

2. The monitored-set SISMEMBER filter drops ~95% of events. Which RPC-economy strategy is that?

3. Multicall3 turns ~50 eth_calls per contract into 2 aggregate3s (L25). Why does that matter so much?

4. chainref samples a subset of nodes per cycle rather than re-reading all 1.5M from chain. What's the trade-off it accepts?

5. The lesson frames the Memgraph graph as "a cache of the chain." What does that reframe explain?

6. The oracle bridger derives market→oracle dependencies in pure Cypher rather than RPC-probing each market (L27). Which reflex is that?

7. How does this constraint connect to the cost-allocation model (L35)?

8. You're adding a feature that needs a contract's current owner. What's the RPC-economy-minded first move?

↳ Ask your teacher

Try: "Roughly what's the RPC cost of one enrichment of a fresh contract?" · "Where would adding a naive per-block RPC read hurt the most?" · "How does the rpcPool decide which endpoint a read goes to?" · "Which refresher cadence is the most expensive, and why is it set where it is?" · "Is there anywhere the system over-reads and could be tightened?"

What you can now do

Name RPC (archive node + rate-limited explorers) as the binding constraint, and explain why it dwarfs graph-read cost.
Classify any efficiency mechanism into the seven moves: filter / cache / batch / sample / slice / pace / spread-degrade.
Explain "the graph is a cache of the chain," and why hot paths read the graph while paced re-readers touch the chain.
Read a periodic loop's interval as a budget decision, and explain why pure-graph computation is preferred.
Connect the constraint to billing (L35), and apply the "what's its RPC cost, how do I bound it?" reflex to new work.

Four syntheses — the system as a set of disciplines

Combinators (L46, what to combine) · float-determinism (L47, how reproducibly) · idempotency (L48, how to write safely) · RPC economics (here, what it all costs). Across these four lenses, the codebase's thousands of local decisions resolve into a handful of coherent principles — which is what "understanding it deeply, end to end" actually means.


Synthesizes code already cited in: monitored-set filter (L1/L2), value-gated discovery (L24), balance/price caches (L36/L26), Multicall3 admin probes (L25), readAllFeeds RPC-pool fan-out (L26), chainref sampling (L29/L34), per-token partial graphs (L23/L38), scheduler/refresher/validator cadences (L23/L24/L34), best-effort reads + OOM→backpressure (L24/L32/L37), pure-graph oracle bridger (L27); the PROTO/HOLDS billing signals (L35). Verify against source — the code is the truth.