Lesson 42 · Closing the Cypher Gap · Skills

Reading the Cypher

Decode any hot-path query cold — six idioms cover the whole codebase. ~14 min.

Builds on: L2 · L27 · L32 Anchor: SQL you already know New: read patterns, not memorize New: the hard rules in situ

You flagged Cypher as a gap, and forty lessons have quoted it at you. Here's the good news: the entire codebase's Cypher is built from about six recurring idioms. Learn to recognize them and you can read any hot-path query cold — and spot when one breaks a hard rule. This is a skills lesson: every query below is real, and you've already met most of them in context. Now we read them as Cypher.

Your anchor: it's SQL that matches shapes
Think SQL with one new superpower: instead of joining tables on keys, you draw the shape you want and Cypher finds it. (a)-[:REL]->(b) is "an a connected to a b by a REL edge." MATCH is SELECT … WHERE; MERGE is INSERT … ON CONFLICT (upsert); SET is UPDATE; RETURN is the projection. Nodes in (), edges in [], properties in {}. That's 90% of it.

Idiom 1 · Always anchored, always graph_id-scoped

Every single query starts by pinning a node by id and graph_id — never a bare MATCH (n). The read surface's point lookup (rules/node_lookup.go):

MATCH (n:Entity {id: $id, graph_id: $graph}) RETURN n.node_risk_score, n.symbol, …
The golden rule (CLAUDE.md)
graph_id scopes every query because multiple independent graphs share one Memgraph (L2). And the anchor on id is index-served — O(1), not a scan. A query that starts MATCH (n) with no anchor is the cardinal sin: it scans all ~2M nodes. If you read a hot-path query and it isn't pinned to a node or an edge type up front, that's the first thing to flag.

Idiom 2 · Read many by id — UNWIND a list

To read N nodes in one round-trip, UNWIND a parameter list into rows and match each (the batched form of idiom 1, from the same file):

UNWIND $ids AS nid
MATCH (n:Entity {id: nid, graph_id: $graph}) RETURN

UNWIND turns a list into rows — the inverse of collect(). You saw the write side of this everywhere (L41's coalescing produces exactly UNWIND $rows AS row <template>); here it's the read side. One idiom, both directions.

Idiom 3 · Bounded, both-directions traversal

The graph's signature query — the partial-graph spine walk that loads a focus token's neighborhood (L23/L38, risk/graph_partial.go):

MATCH (root:Entity {id: $root, graph_id: $graph})
OPTIONAL MATCH (root)-[r:<spine edge types>*1..6]-(n:Entity {graph_id: $graph})

Three things to read here, each a hard rule made concrete:

*1..6bounded variable-length path: follow the relationship 1 to 6 hops — never unbounded (* alone would walk the whole component). This is L2's "bound the depth."
-[r:…]-undirected (no >): traverse the edge in both directions. CLAUDE.md: "query both edge directions" — a directed -> here would silently miss half the neighborhood.
OPTIONAL MATCHa left-join: root is still returned even if it has no spine neighbors. A plain MATCH would drop isolated roots entirely.

Idiom 4 · Aggregate with collect()

L32's conservation read gathers every holder balance into one list (risk/verify_balance_conservation.go):

MATCH (t:Entity {graph_id: $graph, id: $token})-[r:HOLDS]->(:Entity)
RETURN collect(r.quantity_raw) AS qtys

Here the arrow is directed (->): HOLDS has a known direction (token→holder), so reading it one way is correct. collect() folds the matched rows into a single list returned in one row — the read-side inverse of UNWIND. The target node is (:Entity) with no variable because we only need the edge property, not the holder.

Idiom 5 · Idempotent writes — MERGE, never CREATE; and the anti-join

Every write is idempotent because delivery is at-least-once (L7/L9). Two flavours. The batched SET (L38's node_risk_score):

UNWIND $updates AS u
MATCH (n:Entity {id: u.id, graph_id: $graph})
SET n.node_risk_score = u.nrs, n.governance_risk = u.gov

And the anti-join idempotent edge create (L27's oracle bridger) — create the edge only if it's missing:

MATCH (token)-[:LENDING_COLLATERAL]->(market), (token)-[:ORACLE_DEP]->(oracle)
WHERE NOT EXISTS { MATCH (market)-[:ORACLE_DEP]->(oracle) }
MERGE (market)-[:ORACLE_DEP]->(oracle)
MERGE vs CREATE vs MATCH — and the no-stub rule
MERGE = "match it or create it" (upsert) — re-running is a no-op, which is why every write uses it (idempotency, L4/L9). CREATE would duplicate on replay — you'll almost never see it on the hot path. And note L26's deliberate choice to MATCH (not MERGE) the token before merging an oracle edge: MATCH requires the node to already exist, so a binding to a delisted token can't accidentally stub a phantom node. Picking MATCH vs MERGE is a correctness decision, not a style one.

Idiom 6 · Read a node's kind with coalesce (no APOC)

Memgraph has no APOC, and a node's kind can live under any of three properties, so the codebase reads it through a coalesce fallback chain (seen in chainref / rules scope):

WHERE coalesce(dep.type, dep.subcategory, dep.category) IN ['pool', 'vault', …]

coalesce(a, b, c) returns the first non-null — so "the node's kind" is "whichever of type / subcategory / category is set first." You'll also see the write-side don't-clobber idiom SET r.subcategory = coalesce(r.subcategory, $new) — "set it only if it isn't already set."

The hard rules, now that you can see them
Reading a hot-path query, check it against CLAUDE.md's non-negotiables — you now recognize each on sight: (1) scoped by graph_id; (2) anchored, never a bare MATCH (n) full scan; (3) variable-length paths bounded (*1..6, not *); (4) edges queried both directions unless the direction is genuinely known; (5) kind via coalesce (no APOC); (6) writes are MERGE + idempotent, and bulk writes go through the graph-writer (a forbidigo lint blocks raw ExecuteWrite, L9/L41).
You can now read the hot paths
Six idioms — anchored scope, UNWIND-by-id, bounded both-directions traversal, collect aggregation, MERGE/anti-join idempotent writes, coalesce-kind — cover essentially every query in the system. Each query you've seen cited across 40 lessons is now decodable, and you can audit a new one against the hard rules. The Cypher gap is, for reading purposes, closed.

Check yourself

1. Why does every query begin MATCH (n:Entity {... graph_id: $graph}) rather than MATCH (n:Entity)?
2. In (root)-[r:<spine>*1..6]-(n), what does *1..6 specify?
3. The spine walk uses -[r:…]- (no arrowhead). What does the missing direction mean?
4. The conservation query uses collect(r.quantity_raw). What does collect() do?
5. Why is nearly every write a MERGE rather than a CREATE?
6. The bridger uses WHERE NOT EXISTS { MATCH (market)-[:ORACLE_DEP]->(oracle) } before its MERGE. What's that for?
7. Why read a node's kind as coalesce(n.type, n.subcategory, n.category) instead of just n.type?
8. L26 deliberately MATCHes the focus token but MERGEs the oracle node. Why MATCH the token?
↳ Ask your teacher
Try: "Walk me through spineRelationshipFilter — which edge types are in the spine?" · "Show me a query with WITH and explain the scope boundary it creates." · "How would I write a query to find every vault holding a given token?" · "What does a variable-length path return — nodes, the path, or both?" · "Where would a bare MATCH (n) accidentally sneak in, and what catches it?"

What you can now do

Grounded in real queries: pkg/rules/node_lookup.go (anchored point + UNWIND-by-id reads), pkg/risk/graph_partial.go (MATCH (root…) OPTIONAL MATCH (root)-[r:<spine>*1..6]-(n…) bounded undirected walk), pkg/risk/verify_balance_conservation.go (collect(r.quantity_raw)), pkg/risk/node_risk_score.go (UNWIND $updates … SET), pkg/enrichment/oracle_bridger.go (WHERE NOT EXISTS{…} MERGE anti-join) + navlink_refresher.go (MATCH-token/MERGE-oracle), coalesce(type,subcategory,category) kind idiom; CLAUDE.md Cypher hard rules + glossary. Verify against source — the code is the truth.