Lesson 02 · The Data Model

Nodes, edges, and the shape of the risk graph

The graph's vocabulary — and the three properties that will bite you if you ignore them. ~10 minutes.

Builds on: Lesson 1 · New: property graph nodes · New: edge categories · New: reading Cypher

In Lesson 1 you learned the graph is the output: events in, edges out. Now we open the box. By the end you'll know what a node actually is, the eight families the edges fall into, and the three node properties every contributor must respect — get one wrong and your query silently returns garbage or scans the whole DB.

This lesson has a companion

Everything here is distilled into a printable cheat-sheet: 📋 the Glossary. Keep it open in a tab while you work — it's the canonical vocabulary we'll use in every future lesson.

1 · A node is an `:Entity`

Here's the first surprising thing. Almost every node in the graph carries the same primary label: :Entity. The kind of thing it is (token, vault, multisig…) lives in a property, and is also mirrored as a secondary label.

(:Entity:Token {
  id: "0xa0b8...eb48", // the address — the node's identity
  graph_id: "risk-graph-rt", // which graph partition
  category: "token", subcategory: "stablecoin",
  symbol: "USDC", usd_price: 1.0,
  pending_enrichment: false
})

The node's identity is its address (id). There are 19 node types — the full list is in pkg/types/schema.go as NodeType constants and lives in your glossary. A few you'll meet constantly:

Node type	What it is	Secondary label
`token`	An ERC-20 / focus token	`:Token`
`eoa`	An externally-owned account (a wallet)	`:EOA`
`contract`	A generic smart contract	`:Contract`
`multisig`	A Gnosis-Safe-style multisig	`:Multisig`
`vault`	An ERC-4626 / yield vault	`:Vault`
`lending_market`	A money market (Aave aToken, Compound cToken…)	`:LendingMarket`
`oracle`	A price feed (Chainlink etc.)	`:Oracle`

Source: pkg/types/schema.go — NodeType constants + NodeTypeToLabel map.
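To make the node shape concrete, here is a minimal sketch of fetching one token node by address. It assumes the node carries both the :Entity and :Token labels, as in the example above; the parameter names $addr and $g are illustrative, not taken from the repo.

```cypher
// Look up a single token node by its address inside one partition.
// $addr and $g are hypothetical parameter names chosen for this sketch.
MATCH (t:Entity:Token {id: $addr, graph_id: $g})
RETURN t.id, t.category, t.subcategory, t.symbol
LIMIT 1;
```

The id anchors the lookup and the secondary label narrows it to the kind you expect. If nothing comes back, the node may be missing, in another partition, or not yet labelled.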

2 · The three properties that will bite you

① `graph_id` — the partition key

One Memgraph instance holds multiple independent graphs side by side, separated only by the graph_id property on every node and every edge. The live one might be risk-graph-rt; a shadow/test one might be test_carlos. They share the database but must never mix in a query.

Contributor rule #1

Every query filters by graph_id. Forget it and you'll union two graphs together and get nonsense — or accidentally read another partition's data. It's a property, not a label, so the DB won't stop you. This is the #1 way new contributors get silently-wrong results.
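A sketch of what Rule #1 looks like in practice, assuming the same HOLDS shape used later in this lesson. $wallet and $g are parameters you bind at call time; the names are illustrative.

```cypher
// Unscoped: matches HOLDS edges from every partition at once. Don't do this.
// MATCH (w:Entity {id: $wallet})-[r:HOLDS]->(t:Entity) RETURN t.symbol;

// Scoped: both endpoints and the edge itself are pinned to one graph_id.
MATCH (w:Entity {id: $wallet, graph_id: $g})
      -[r:HOLDS {graph_id: $g}]->
      (t:Entity {graph_id: $g})
RETURN t.symbol, r.quantity_raw;
```

Filtering the edge as well as the nodes is the cautious choice here, since graph_id is described as living on every node and every edge.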

② `category` / `subcategory` — with a legacy trap

Current code stores a node's kind in category + subcategory. But older nodes carry type + subtype instead. The repo's own CLAUDE.md mandates the defensive read:

// Reading a node's kind safely across old + new nodes (from CLAUDE.md):
coalesce(n.type, n.subcategory, n.category)

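Source: risk-graph-indexer/CLAUDE.md § Sanity checks. coalesce returns its first non-null argument, so the read works whether the node is legacy (type) or current (category). Note the argument order: a current node with subcategory set yields its subcategory, and category is only the last fallback.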
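Here is one way the defensive read might sit inside a full query. This is a sketch only: the anchor parameters $addr and $g and the alias kind are illustrative.

```cypher
// Read a node's kind without caring whether it is a legacy or current node.
MATCH (n:Entity {id: $addr, graph_id: $g})
RETURN n.id,
       coalesce(n.type, n.subcategory, n.category) AS kind;
```

Aliasing the expression once keeps callers from reaching for n.category directly and silently getting null on legacy nodes.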

③ `pending_enrichment` — the lifecycle flag

From Lesson 1: the indexer creates bare nodes with pending_enrichment=true, and the enrichment-worker flips the flag to false once the node is classified. So this boolean tells you whether a node's metadata (symbol, decimals, labels) can be trusted yet.
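If a query depends on enriched metadata, one option is to filter on the flag explicitly. A minimal sketch, reusing the HOLDS shape from this lesson with illustrative parameter names:

```cypher
// Only return holdings whose token node has finished enrichment.
MATCH (w:Entity {id: $wallet, graph_id: $g})-[r:HOLDS]->(t:Entity {graph_id: $g})
WHERE t.pending_enrichment = false
RETURN t.symbol, t.usd_price, r.quantity_raw;
```

In Cypher, comparing a missing property to false yields null, so a node that lacks the flag entirely is dropped by this filter as well. Whether that is what you want depends on the query.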

3 · Edges roll up into eight categories

There are 19+ edge types (the table below lists 24), but you don't memorise them flat: they group into eight parent categories (the EdgeCategory map in schema.go). Learn the categories and the types slot in underneath:

Category	Meaning	Edge types in it
CONTAINS	X holds/contains a balance of Y	`HOLDS`, `POOL_ASSET`, `VAULT_ASSET`, `RESERVE_BACKING`
CONTROLLED_BY	X is controlled/approved/owned by Y	`ADMIN_CTRL`, `ADMIN_OF`, `OWNS`, `OWNS_ADMIN`, `CURATES`, `APPROVES`, `CUSTODY_VIA`
COLLATERALISED_BY	X is backed by collateral Y	`LENDING_COLLATERAL`, `VAULT_ALLOCATION`
DEPENDS_ON	X structurally depends on Y	`ORACLE_DEP`, `BRIDGE_BACKED_BY`, `DVN_VERIFIES`, `SUBORDINATE_TO`
DERIVED_FROM	X is a derivative/receipt of Y	`RECEIPT_FOR`, `DEBT_FOR`, `WRAP_UNWRAP`
OPERATED_BY	X is operated/deployed by Y	`DEPLOYED_BY`, `SERVICE_FOR`
AUDIT_TRAIL	A historical/governance record	`AFFECTS` (governance actions)
ANALYTICAL	Derived risk projection, not real topology	`AT_RISK`

Source: pkg/types/schema.go — the EdgeCategory map. The full edge list with directions + properties lives in your glossary.
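Because the category is a grouping of edge types, one way to query "a whole category" is to list its member types. The sketch below assumes each edge type in the table is stored as a relationship type with exactly that name; whether the category is also stored on the edge is not something this lesson states. The `|` alternation is standard openCypher, so check it against your Memgraph version.

```cypher
// Everything in the CONTROLLED_BY family touching one entity.
// Undirected on purpose: this sketch does not assume which way each type points.
MATCH (x:Entity {id: $addr, graph_id: $g})
      -[r:ADMIN_CTRL|ADMIN_OF|OWNS|OWNS_ADMIN|CURATES|APPROVES|CUSTODY_VIA]-
      (y:Entity {graph_id: $g})
RETURN type(r) AS edge_type, y.id
LIMIT 50;
```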

Architectural vs Analytical — a real distinction

The first seven categories are architectural: they describe the real on-chain topology (who holds, controls, backs what). ANALYTICAL edges like AT_RISK are computed by the risk engine and layered on top of the true graph: "if this admin key is compromised, these tokens are at risk." They're projections, not structure, and the graph-viz architecture view deliberately hides them (FORTA-2986).
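When you want only the architectural picture around a node, you can drop the analytical projection yourself. This is a sketch of the idea, not how graph-viz implements its view; parameter names are illustrative.

```cypher
// One-hop neighbourhood of a node, excluding the derived AT_RISK projection.
MATCH (x:Entity {id: $addr, graph_id: $g})-[r]-(y:Entity {graph_id: $g})
WHERE type(r) <> 'AT_RISK'
RETURN type(r) AS edge_type, y.id
LIMIT 50;
```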

Your EVM anchor, extended

You already read these relationships on-chain every day — the graph just makes them queryable. An Aave aToken's underlying reserve is a LENDING_COLLATERAL edge. A proxy's admin is an ADMIN_CTRL edge. A Safe's signers are OWNS edges. Same facts you'd pull from contract storage — now they're traversable.

4 · Reading your first Cypher

Cypher is "ASCII-art SQL for graphs." Nodes are (parens), edges are -[brackets]-> with a direction. Here's a real-shaped read — "what tokens does this wallet hold?":

MATCH (w:Entity {id: $wallet, graph_id: $g})-[r:HOLDS]->(t:Entity {graph_id: $g})
RETURN t.symbol, r.quantity_raw, t.usd_price

Read it left to right: start at the wallet node w (anchored by its id), follow an outgoing HOLDS edge r, to a token node t. Return the token's symbol, the raw balance on the edge, and the token's USD price.

Notice three things, all of which are rules, not style:

Both nodes are :Entity and both carry graph_id: $g — partition-scoped, per Rule #1.
It's anchored at a known id — it starts from one node, not the whole graph.
The edge has a direction (->). HOLDS goes wallet→token.

Contributor rules #2 and #3 (straight from CLAUDE.md)

#2 — No full-graph scans. Never write MATCH (n) with no anchor. The graph is huge; an unanchored scan = OOM + timeout. Always start from a known id or a label+index, bound depth, and LIMIT.

#3 — Query both directions. Different seeders may write the "same" logical edge in opposite directions. Never assume which way an edge points — check, or match both (-[r]- without an arrow) when you're unsure.

Source: risk-graph-indexer/CLAUDE.md § "Graph DB queries — no brute-force" and § "Before writing code". Real query shapes verified against pkg/ (e.g. the :HOLDS reads in the graphwrite/risk packages).
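Rules #2 and #3 combine naturally in one query shape. The sketch below anchors on a known id, caps the result size, and matches OWNS without an arrow so you can see which way the edge was actually written. OWNS is used only as an example type, and the parameter names are illustrative.

```cypher
// Anchored, partition-scoped, direction-agnostic, and bounded.
MATCH (a:Entity {id: $addr, graph_id: $g})-[r:OWNS]-(b:Entity {graph_id: $g})
RETURN startNode(r).id AS from_id,
       endNode(r).id   AS to_id
LIMIT 25;
```

Returning startNode and endNode turns "never assume the direction" into something you can check: if $addr shows up in both columns across rows, different seeders wrote the edge both ways.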

Check yourself

These questions target the exact things that trip up new contributors.

1. You write MATCH (t:Entity)-[:HOLDS]->(x) RETURN x and get results from two unrelated graphs mixed together. What did you forget?

2. An old node has type: "token" but no category. A newer node has category: "token" but no type. How do you read the kind safely for both?

3. The AT_RISK edge is in the ANALYTICAL category. What does that tell you?

4. A Gnosis Safe's signers, a proxy's admin, and a token approval all share which edge category?

5. Why is MATCH (n) RETURN n a fireable offense in this codebase?

↳ Ask your teacher

Try these in chat: "Open the real HOLDS write query in the code," · "What's the difference between ADMIN_CTRL and ADMIN_OF?" · "Walk me through a query that finds every vault exposed to USDC," · "Why is everything an :Entity instead of separate labels?"

What you can now do

Describe a node: an :Entity keyed by address id, scoped by graph_id.
Name the three properties that bite — graph_id, category/type, pending_enrichment — and the coalesce read.
Place any edge into one of the eight categories, and tell architectural from analytical.
Read a simple anchored, partition-scoped Cypher MATCH, and recite the three query rules.


Grounded in: pkg/types/schema.go (NodeType, EdgeType, EdgeCategory, NodeTypeToLabel), CLAUDE.md (graph-id partition, coalesce rule, no-scan / both-directions rules), docs/architecture.md. Verify against source — the code is the truth.
