How the graph grows itself — one address discovers the next, without a hardcoded map of DeFi. ~14 min.
DeFi has tens of thousands of contracts and no central registry. You cannot hardcode the map — it changes daily. So the indexer doesn't try. It starts from a few seeds and grows the graph by following the chain itself: every contract it enriches reveals the addresses it depends on, those get queued, and the loop repeats. That self-expanding loop is the discovery flywheel — the subsystem that turns "watch these 5 tokens" into "watch the entire dependency cone around them."
Is it a proxy? (the EIP-1967 slots). Is it a multisig? (getOwners()). Is it an ERC-4626 vault? (asset()). Each answer hands you more
addresses to investigate: the impl, the signers, the underlying. Discovery is exactly that instinct, automated
and run to a fixpoint.
L2 gave you the node lifecycle (bare → enriched). L5 gave you the 15-stage worker. The piece those left implicit is that stage 1 and stage 15 form a cycle:
The worker polls Memgraph for pending_enrichment=true nodes (a pull loop), runs the pipeline,
and at the end (stage 15) adds every address in RelatedAddrs to the monitored set. Those become new bare
nodes, get polled, and the cycle continues until it reaches a fixpoint: a pass that yields no new addresses. The graph discovers its
own boundary.
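The cycle described above can be sketched as a plain worklist loop. This is a minimal in-memory illustration, not the worker's actual code: `enrichFn` and `crawlToFixpoint` are hypothetical names, and a slice stands in for the Memgraph poll on pending_enrichment=true.

```go
package main

import "fmt"

// enrichFn stands in for the 15-stage pipeline: given an address, it returns
// the related addresses that enrichment discovered (hypothetical signature).
type enrichFn func(addr string) []string

// crawlToFixpoint grows the monitored set from the seeds until enrichment
// stops producing unseen addresses. It returns addresses in first-seen order.
func crawlToFixpoint(seeds []string, enrich enrichFn) []string {
	seen := make(map[string]bool)
	var order, pending []string

	// enqueue adds a bare node exactly once; repeats are ignored, which is
	// what guarantees termination on a finite graph.
	enqueue := func(a string) {
		if !seen[a] {
			seen[a] = true
			order = append(order, a)
			pending = append(pending, a)
		}
	}

	for _, s := range seeds {
		enqueue(s)
	}
	for len(pending) > 0 { // the pull loop
		addr := pending[0]
		pending = pending[1:]
		for _, rel := range enrich(addr) { // stage 15: propagate RelatedAddrs
			enqueue(rel)
		}
	}
	return order // fixpoint: nothing left to enrich
}

func main() {
	// Toy dependency map; "impl" points back at "vault" to show cycles are safe.
	deps := map[string][]string{
		"vault": {"impl", "asset"},
		"impl":  {"vault"},
		"asset": {"safe"},
		"safe":  {"signerA", "signerB"},
	}
	enrich := func(a string) []string { return deps[a] }
	fmt.Println(crawlToFixpoint([]string{"vault"}, enrich))
	// prints: [vault impl asset safe signerA signerB]
}
```

The `seen` set is what turns an open-ended crawl into a fixpoint computation: an address that points back at something already monitored adds no new work.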
Stage 1 is pure on-chain interrogation — and every probe that succeeds appends to RelatedAddrs. Here's the
proxy resolver, reading the three canonical EIP-1967 storage slots directly:
    // enricher.go: resolveProxy (slots from rpc_calls.go)
    implBytes, _ := telemetry.StorageAt(ctx, rpc, addr, implSlot, nil) // 0x360894…382bbc
    implAddr := common.BytesToAddress(implBytes)
    if implAddr != zeroAddr {
        c.IsProxy = true
        c.Implementation = &implAddr
        c.RelatedAddrs = append(c.RelatedAddrs, implAddr) // ← follow the impl
    }
    // then adminSlot (0xb531…6103) → ProxyAdmin, beaconSlot (0xa3f0…3d50) → ProxyBeacon, both appended too
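To make the decoding step concrete, here is a self-contained sketch of the same pattern without the RPC or go-ethereum dependencies. A storage slot is a 32-byte word and an address occupies its low 20 bytes, which is the conversion `common.BytesToAddress` performs. Everything named here (`Address`, `slotReader`, `probeProxySlots`, `demoReader`) is a hypothetical stand-in, and slots are keyed by label rather than by their real 32-byte hashes.

```go
package main

import (
	"errors"
	"fmt"
)

// Address is a 20-byte EVM address (local stand-in for common.Address).
type Address [20]byte

// slotReader is a hypothetical stand-in for an eth_getStorageAt client. It is
// keyed by slot label here; a real client would take the 32-byte slot hash.
type slotReader func(slot string) ([]byte, error)

// wordToAddress keeps the low 20 bytes of a storage word, right-aligned.
func wordToAddress(word []byte) Address {
	var a Address
	if len(word) > len(a) {
		word = word[len(word)-len(a):]
	}
	copy(a[len(a)-len(word):], word)
	return a
}

// probeProxySlots reads the impl, admin, and beacon slots and returns every
// non-zero address found. isProxy mirrors the snippet above: it is set only
// when the implementation slot is populated.
func probeProxySlots(read slotReader) (related []Address, isProxy bool) {
	var zero Address
	for _, slot := range []string{"impl", "admin", "beacon"} {
		word, err := read(slot)
		if err != nil {
			continue // unreadable slot: nothing to follow
		}
		addr := wordToAddress(word)
		if addr == zero {
			continue // empty slot: this contract does not use it
		}
		if slot == "impl" {
			isProxy = true
		}
		related = append(related, addr)
	}
	return related, isProxy
}

// demoReader fakes a contract whose impl slot is set, whose admin slot is
// empty, and whose beacon read fails.
func demoReader() slotReader {
	impl := make([]byte, 32)
	impl[12], impl[31] = 0x01, 0xab
	return func(slot string) ([]byte, error) {
		switch slot {
		case "impl":
			return impl, nil
		case "admin":
			return make([]byte, 32), nil
		default:
			return nil, errors.New("rpc timeout")
		}
	}
}

func main() {
	related, isProxy := probeProxySlots(demoReader())
	fmt.Printf("isProxy=%v related=%d first=%x\n", isProxy, len(related), related[0])
}
```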
| Probe | EVM mechanism | Discovers |
|---|---|---|
| Proxy resolution | eth_getStorageAt on EIP-1967 impl/admin/beacon slots | implementation, proxy admin, beacon |
| Wrapper detection | asset() (ERC-4626), underlying(), stETH() (wstETH) | the underlying token |
| Multisig detection | getOwners() on a Gnosis Safe | each signer key (→ L22's expansion!) |
| Curator/manager | curator() / manager() on vaults | the controlling entity (→ CURATES edge) |
| Deployer | Etherscan/Blockscout contract-creation lookup | the deploying EOA |
asset() resolution is how a receipt token finds the asset it wraps — the WRAP_UNWRAP edge from
L21's exit liquidity. getOwners() is how a Safe's signers get into the graph — the fan-out targets from L22's
multisig expansion. curator() feeds the admin attribution L19's cells are built on. Discovery is the
upstream that makes every downstream subsystem you've studied possible.
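As an illustration of the multisig fan-out, the sketch below decodes the raw return data of a call such as getOwners(), which returns an `address[]`. Under the standard Solidity ABI encoding, that payload is an offset word, a length word, and then one 32-byte word per address. The helper names (`decodeAddressArray`, `encodeAddressArray`, `readWord`) are hypothetical; the real enricher may well use generated bindings or go-ethereum's `abi` package instead.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// Address is a 20-byte EVM address (local stand-in for common.Address).
type Address [20]byte

// readWord reads the 32-byte big-endian word at off as a uint64 and rejects
// values that do not fit, which no sane offset or length would exceed.
func readWord(data []byte, off int) (uint64, error) {
	if off < 0 || off+32 > len(data) {
		return 0, errors.New("abi: word out of range")
	}
	for _, b := range data[off : off+24] {
		if b != 0 {
			return 0, errors.New("abi: value too large")
		}
	}
	return binary.BigEndian.Uint64(data[off+24 : off+32]), nil
}

// decodeAddressArray decodes the return data of a function returning address[].
func decodeAddressArray(data []byte) ([]Address, error) {
	offset, err := readWord(data, 0) // where the array body starts
	if err != nil {
		return nil, err
	}
	if offset > uint64(len(data)) {
		return nil, errors.New("abi: offset beyond payload")
	}
	n, err := readWord(data, int(offset)) // element count
	if err != nil {
		return nil, err
	}
	start := int(offset) + 32
	if n > uint64(len(data)-start)/32 {
		return nil, errors.New("abi: array length exceeds payload")
	}
	out := make([]Address, n)
	for i := range out {
		word := data[start+i*32 : start+(i+1)*32]
		copy(out[i][:], word[12:]) // address sits in the low 20 bytes
	}
	return out, nil
}

// encodeAddressArray builds a payload in the same layout, for the demo only.
func encodeAddressArray(addrs []Address) []byte {
	out := make([]byte, 64+32*len(addrs))
	binary.BigEndian.PutUint64(out[24:32], 32)
	binary.BigEndian.PutUint64(out[56:64], uint64(len(addrs)))
	for i, a := range addrs {
		copy(out[64+i*32+12:], a[:])
	}
	return out
}

func main() {
	signers := []Address{{0xaa}, {0xbb}, {0xcc}}
	owners, err := decodeAddressArray(encodeAddressArray(signers))
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	// Each decoded owner would be appended to RelatedAddrs and become a bare node.
	fmt.Println("signers discovered:", len(owners))
}
```

Bounds-checking the length word before allocating matters here: the payload comes from an arbitrary contract, so a hostile or buggy one can return any bytes it likes.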
The pull-loop discovers structure, but two other mechanisms run alongside it.
Some structure is too cheap to rediscover by crawling. A seeder bootstraps a protocol's known shape
directly. Discovery keeps a static map of knownFactories — Uniswap V2/V3/V4, Curve, Balancer, Aerodrome,
PancakeSwap, etc. — so when a pool's factory() resolves to one, the protocol is inferred instantly, and the
factory's PairCreated/PoolCreated events can be walked to enumerate every pool it ever made:
    var knownFactories = map[string]string{
        "0x5c69bee701ef814a2b6a3edd4b1652cb9cc5aa6f": "uniswap",  // V2
        "0xba12222222228d8ba445958a75a0704d566bf2c8": "balancer", // V2 Vault
        "0xb9fc157394af804a3578134a6585c0dc9cc990d4": "curve",    // StableSwap Factory
        // …Aerodrome, Velodrome, Camelot, Maverick, Trader Joe
    }
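One detail worth showing: the map keys are lowercase, while addresses returned by factory() often arrive in mixed-case (checksummed) form, so a lookup has to normalize first. The sketch below uses only the three entries quoted above; `inferProtocol` is a hypothetical helper name, not necessarily what discovery.go calls it.

```go
package main

import (
	"fmt"
	"strings"
)

// knownFactories holds just the three entries quoted in the text; the real
// map is longer.
var knownFactories = map[string]string{
	"0x5c69bee701ef814a2b6a3edd4b1652cb9cc5aa6f": "uniswap",  // V2
	"0xba12222222228d8ba445958a75a0704d566bf2c8": "balancer", // V2 Vault
	"0xb9fc157394af804a3578134a6585c0dc9cc990d4": "curve",    // StableSwap Factory
}

// inferProtocol maps a pool's factory address to a protocol name. The input
// is lowercased so checksummed addresses match the lowercase keys.
func inferProtocol(factory string) (protocol string, ok bool) {
	protocol, ok = knownFactories[strings.ToLower(strings.TrimSpace(factory))]
	return protocol, ok
}

func main() {
	// Mixed-case input still resolves.
	p, ok := inferProtocol("0x5C69bEe701ef814a2B6a3EDD4B1652CB9cc5aA6f")
	fmt.Printf("%q %v\n", p, ok) // prints: "uniswap" true

	// Unknown factories fall through to ordinary crawling.
	p, ok = inferProtocol("0x0000000000000000000000000000000000000001")
	fmt.Printf("%q %v\n", p, ok) // prints: "" false
}
```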
Protocol-specific seeders (curve_lending_seeder.go, balancer_seeder.go,
bridge_seeder.go) go further, encoding each protocol's particular topology.
Discovery finds that a vault holds an asset; the dollar value of that holding drifts every block. A refresher periodically re-reads on-chain state to keep edge USD values current — and this is the part that feeds straight into everything you learned about at_risk:
| Refresher | Re-reads | Keeps fresh |
|---|---|---|
| LP Reserve (~60 min) | getReserves() + totalSupply() on V2 AMMs | LP-holder HOLDS USD values |
| Receipt Token (~60 min) | protocol-specific total underlying (e.g. ERC-4626 totalAssets()) | receipt-token HOLDS USD values |
| Oracle Bridger (~30 min) | nothing (pure graph) — propagates transitive ORACLE_DEP | which markets inherit which oracles (L20!) |
Remember at_stake_usd, exit_v2_total, and the receipt-token TVL in Path 3b? Those numbers are kept
current by these refreshers. The Oracle Bridger is literally how the per-(oracle, market) eligibility from L20's oracle
path comes to exist. Discovery and at_risk aren't separate stories: discovery is at_risk's data supply chain.
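For a sense of what the LP Reserve refresher has to compute, here is the usual pro-rata valuation of a V2 LP position: the holder's share of totalSupply() times the USD value of both reserves. This is a sketch under assumptions. The type and function names are hypothetical, amounts are taken as already decimal-adjusted float64 values, and token prices are passed in from outside; the actual refresher may source prices and handle precision differently.

```go
package main

import "fmt"

// poolState is what getReserves() and totalSupply() yield after dividing by
// each token's decimals (hypothetical shape).
type poolState struct {
	Reserve0    float64 // token0 units held by the pair
	Reserve1    float64 // token1 units held by the pair
	TotalSupply float64 // LP tokens outstanding
}

// holderUSD values an LP balance as its pro-rata share of the pool's reserves.
func holderUSD(p poolState, lpBalance, price0USD, price1USD float64) float64 {
	if p.TotalSupply <= 0 {
		return 0 // empty or uninitialized pool: nothing to value
	}
	share := lpBalance / p.TotalSupply
	poolUSD := p.Reserve0*price0USD + p.Reserve1*price1USD
	return share * poolUSD
}

func main() {
	pool := poolState{Reserve0: 1000, Reserve1: 2_000_000, TotalSupply: 100}

	// 25 of 100 LP tokens; token0 at $2000, token1 at $1.
	fmt.Println(holderUSD(pool, 25, 2000, 1)) // prints: 1e+06

	// After a swap shifts the reserves, the same balance is worth something
	// different, which is why the value must be re-read periodically.
	pool.Reserve0, pool.Reserve1 = 800, 2_500_000
	fmt.Println(holderUSD(pool, 25, 2000, 1)) // prints: 1.025e+06
}
```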
A self-expanding crawl over a fully-connected financial graph could, in principle, swallow the whole chain. It doesn't, because expansion is value-gated, the same bounded-traversal discipline you met in L2. A failed StorageAt probe
just skips that related address with a WARN (fail-loop, not fail-stop, exactly like L8/L23). Coverage vs RPC budget is
the central tension of this subsystem.
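The fail-loop behaviour is simple to picture in code: a probe error is logged and skipped, and the remaining probes still run. The sketch below is illustrative only. `probe`, `classify`, and `errUnavailable` are hypothetical names, the standard `log` package stands in for whatever logger the worker uses, and value-gating is left out.

```go
package main

import (
	"errors"
	"fmt"
	"log"
)

// errUnavailable simulates an RPC failure in the demo.
var errUnavailable = errors.New("rpc unavailable")

// probe is one stage-1 interrogation (hypothetical shape): it returns the
// related addresses it found, or an error if the read failed.
type probe struct {
	name string
	run  func(addr string) ([]string, error)
}

// classify runs every probe against addr. A failing probe costs only its own
// results: it is logged at WARN level and the loop moves on (fail-loop),
// rather than aborting the whole classification (fail-stop).
func classify(addr string, probes []probe) []string {
	var related []string
	for _, p := range probes {
		found, err := p.run(addr)
		if err != nil {
			log.Printf("WARN probe=%s addr=%s skipped: %v", p.name, addr, err)
			continue
		}
		related = append(related, found...)
	}
	return related
}

func main() {
	probes := []probe{
		{"proxy", func(string) ([]string, error) { return nil, errUnavailable }},
		{"wrapper", func(string) ([]string, error) { return []string{"underlying"}, nil }},
		{"multisig", func(string) ([]string, error) { return []string{"signerA", "signerB"}, nil }},
	}
	// The proxy probe fails, yet the wrapper and multisig results survive.
	fmt.Println(classify("0xvault", probes)) // prints: [underlying signerA signerB]
}
```

The trade-off is coverage: a skipped probe means a related address goes undiscovered until the node is enriched again.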
Grounded in: docs/enrichment-pipeline.md (15-stage pipeline, stage 1 RPC classification, stage 15 discovery propagation, periodic refreshers), pkg/enrichment/enricher.go (resolveProxy EIP-1967 slots, detectWrapper asset()/underlying()/stETH(), RelatedAddrs accumulation), pkg/enrichment/discovery.go (knownFactories, KnownV2/V3Factories), rpc_calls.go (implSlot/adminSlot/beaconSlot), the protocol seeders + LP/receipt/oracle refreshers. Verify against source: the code is the truth.