Lesson 12 · The Rule Engine · Deeper Track

From risk numbers to alerts

The layer the whole system exists to feed: customer rules → evaluation → firing. ~12 min.

Builds on: L6 · L2 · Anchor: "alert me if…" · New: rule DSL + fields · New: firing state machine

We've built the graph, maintained it, and computed risk on it. But none of that is the product. The product is: a customer says "alert me if an admin can drain more than $5M of my token," and the system watches for exactly that, forever. The rule engine (pkg/rules) is where the risk graph finally becomes an alert. This is what it was all for.

You already want this tool
As a DeFi person you think in alerts: "tell me if this market's utilization tops 95%," "if my vault's curator changes," "if my position's admin risk crosses $X." The rule engine is precisely that — a rules layer over the risk graph. Everything in Lessons 1–11 exists to make these rules answerable in real time.

1 · A rule, anatomised

A customer rule (pkg/rules/types.go Rule) is a small declarative spec:

type: threshold | aggregate | event | balance (the evaluation strategy)

condition: a field path + operator + value (e.g. node_risk_score > 0.8)

scope: which nodes it applies to (one address, a portfolio, top-N, all a portfolio's admins…)

trigger: cooldown / re-alert timing

Read it as a sentence: "for [scope], when [field] [operator] [value], fire." The four rule types map to four questions:

Rule type | Answers | Example
threshold | Does a node's field cross a value? | utilization_pct > 95 on a market
aggregate | Does a sum over a scope cross a value? | portfolio-wide admin_risk_usd > $10M
event | Did a specific on-chain event happen? | curator changed on my vault
balance | Live wallet balance check (Redis + RPC) | balance_usd < $1000

Source: pkg/rules/types.go (Rule, Condition, Scope, RuleType), threshold.go/aggregate.go/balance.go.
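To make the "read it as a sentence" idea concrete, here is a minimal Go sketch of a rule and of the difference between threshold and aggregate evaluation. It is not the real pkg/rules code: the struct layout, the Sentence, MatchingNodes and AggregateMet helpers, and the restriction to the > and < operators are all assumptions made for illustration. Only the rule type names, the field name and the scope name come from the lesson.

```go
package main

import "fmt"

// RuleType names the evaluation strategy. The four values come from the
// lesson; the Go identifiers are illustrative.
type RuleType string

const (
	Threshold RuleType = "threshold"
	Aggregate RuleType = "aggregate"
	Event     RuleType = "event"
	Balance   RuleType = "balance"
)

// Condition is "field operator value". Only ">" and "<" are modeled here.
type Condition struct {
	Field    string
	Operator string
	Value    float64
}

// Rule is a hypothetical, trimmed-down stand-in for the real Rule type.
type Rule struct {
	Type      RuleType
	Condition Condition
	Scope     string // e.g. "portfolio_admins"
}

// Sentence renders the rule as "for [scope], when [field] [operator] [value], fire".
func (r Rule) Sentence() string {
	c := r.Condition
	return fmt.Sprintf("for %s, when %s %s %g, fire", r.Scope, c.Field, c.Operator, c.Value)
}

// compare applies the condition's operator to an observed value.
func (c Condition) compare(observed float64) bool {
	switch c.Operator {
	case ">":
		return observed > c.Value
	case "<":
		return observed < c.Value
	default:
		return false // unknown operators never match in this sketch
	}
}

// MatchingNodes evaluates threshold-style: each node's field is compared on
// its own, and the IDs that satisfy the condition are returned.
func MatchingNodes(c Condition, values map[string]float64, scope []string) []string {
	var hits []string
	for _, id := range scope {
		if v, ok := values[id]; ok && c.compare(v) {
			hits = append(hits, id)
		}
	}
	return hits
}

// AggregateMet evaluates aggregate-style: the field is summed over the whole
// scope and the sum is compared once.
func AggregateMet(c Condition, values map[string]float64, scope []string) bool {
	sum := 0.0
	for _, id := range scope {
		sum += values[id]
	}
	return c.compare(sum)
}

func main() {
	r := Rule{Type: Aggregate, Scope: "portfolio_admins",
		Condition: Condition{Field: "admin_risk_usd", Operator: ">", Value: 10_000_000}}
	values := map[string]float64{"0xa": 6_000_000, "0xb": 5_000_000}
	scope := []string{"0xa", "0xb"}

	fmt.Println(r.Sentence())
	fmt.Println(MatchingNodes(r.Condition, values, scope)) // no single node exceeds $10M
	fmt.Println(AggregateMet(r.Condition, values, scope))  // the $11M sum does
}
```

The same condition gives different answers depending on the rule type: no individual admin crosses $10M, but the portfolio-wide sum does. That is why type is part of the rule rather than something inferred from the field.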

2 · Fields: the contract with the risk engine

The field path in a condition is the hinge between this lesson and Lesson 6. The risk engine writes risk values onto nodes; the rule engine reads them by path. The catalog of evaluable fields is a real schema (docs/rule-fields.json, fields.gen.go), and every field declares where it comes from:

Field | Unit | Written by (source)
node_risk_score | 0..1 | risk.NodeRiskScoreComputer (L6)
admin_risk_usd | USD | risk.AdminRiskComputer (L6)
exit_liquidity_json.*.total_exit_usd | USD | risk (L6, wildcard per-token)
utilization_pct | % | enrichment.LendingRefresher (L5)
balance_usd | USD | live: Redis cache + RPC (L4 · L7)
Field "kind" decides when it's fresh
Each field has a kind + cadence: scalar/wildcard risk fields are recomputed periodically (the catalog shows period_ns: 7200000000000 = 2 hours); live fields like balance are evaluated on_event (e.g. balance_changed, lending_dirty). So a rule's responsiveness is bounded by its field's cadence — a 2-hourly risk field can't alert in seconds. Knowing a field's source + cadence tells you both its meaning and its latency.

Source: docs/rule-fields.json (path · kind · unit · cadence · rule_types · scopes · source). A lint (scripts/lint-schema-coverage.sh) keeps the schema endpoint in sync with the code.

3 · Scope: which nodes does the rule touch?

Before evaluating, the engine resolves the rule's scope into a concrete list of node IDs (pkg/rules/scope.go ScopeResolver.Resolve):

Scope | Resolves to
address / wallet | one node
portfolio_holders | every holder of a portfolio's tokens
portfolio_admins | every admin of a portfolio's contracts
top_n | the N riskiest nodes by some field
🔗 Lesson 2's query rules, in production
Scope resolution is graph queries — and it obeys exactly the rules you learned in Lesson 2: anchored (start from a portfolio's addresses, a label+index, a focus set), graph_id-scoped, never a full scan. "Resolve a scope" is "run a bounded, anchored Cypher traversal." This is where those rules stop being abstract.
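A rough sketch of what "anchored and graph_id-scoped" can look like when a scope is turned into a query. This is not ScopeResolver.Resolve: the BuildScopeQuery function, the node labels (:Portfolio, :Token, :Contract), the relationship names (TRACKS, HOLDS, ADMIN_OF) and the property names are all hypothetical. Only the scope names and the anchoring rules come from the lesson.

```go
package main

import (
	"errors"
	"fmt"
)

// Query is a parameterized Cypher statement plus its parameters.
type Query struct {
	Text   string
	Params map[string]any
}

// BuildScopeQuery returns an anchored, graph_id-scoped query for a scope.
// It refuses to build anything without both an anchor (the portfolio) and a
// graph_id, which is how this sketch rules out a full scan.
// All labels, relationship types and property names are hypothetical.
func BuildScopeQuery(scope, graphID, portfolioID string) (Query, error) {
	if graphID == "" || portfolioID == "" {
		return Query{}, errors.New("refusing unanchored scope query: graph_id and portfolio id are required")
	}
	params := map[string]any{"graph_id": graphID, "portfolio_id": portfolioID}

	var text string
	switch scope {
	case "portfolio_holders":
		text = `MATCH (p:Portfolio {id: $portfolio_id, graph_id: $graph_id})-[:TRACKS]->(t:Token)<-[:HOLDS]-(h)
WHERE h.graph_id = $graph_id
RETURN DISTINCT h.id AS node_id`
	case "portfolio_admins":
		text = `MATCH (p:Portfolio {id: $portfolio_id, graph_id: $graph_id})-[:TRACKS]->(c:Contract)<-[:ADMIN_OF]-(a)
WHERE a.graph_id = $graph_id
RETURN DISTINCT a.id AS node_id`
	default:
		return Query{}, fmt.Errorf("unsupported scope %q", scope)
	}
	return Query{Text: text, Params: params}, nil
}

func main() {
	q, err := BuildScopeQuery("portfolio_admins", "g1", "pf-42")
	if err != nil {
		panic(err)
	}
	fmt.Println(q.Text)
	fmt.Println(q.Params)
}
```

Every query starts from a specific portfolio node and carries graph_id as a parameter, so the traversal is bounded by that portfolio's neighbourhood rather than by the size of the graph.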

4 · The evaluation loop (two cadences)

The Engine (engine.go) runs two things concurrently: a periodic evaluation cycle and an event-driven evaluation path.

Two cadences again mirror the system's push-based philosophy (L1): periodic for the 2-hourly risk fields, event-driven for live fields (a balance_changed or lending_dirty signal evaluates the relevant rules immediately).

5 · Firing: a state machine, not a trigger ⭐

Here's the non-obvious depth. A rule does not fire an alert every time the condition is true — that would spam a customer with thousands of identical alerts while a threshold stays breached. Instead, each (rule, node) pair runs through a firing state machine (firing.go):

CLEAR → [met: emit] → ACTIVE → [cleared] → COOLDOWN → [expired + not met] → CLEAR
Why this matters (deep-systems intuition)
Alerting is stateful. The naïve "if condition: alert" produces either spam (re-alert every cycle) or missed resolutions (no "all clear"). The state machine gives debouncing/hysteresis (the cooldown absorbs threshold flicker) and lifecycle (fire → remind → resolve). Per-(rule,node) firing state is persisted (OpenSearch) so it survives restarts — the same "state is durable + recoverable" discipline you saw across the write path.

6 · Where alerts go

An emitted AlertEvent is published (publisher.go) and stored in OpenSearch (opensearch.go, opensearchstore.go) — note this is the system's one use of OpenSearch, separate from Memgraph/Redis; alerts are documents you search/filter, not graph nodes. Downstream, the alert-processor (cmd/alert-processor, pkg/alertprocessor) consumes and delivers them. The loop is finally closed:

🔁 The complete loop — what every prior lesson was for
on-chain event (L1) → decoded (L3) → filtered + written to the graph (L4) → enriched + discovered (L5) → risk fields computed (L6) → rule reads the field, evaluates over a scope, firing state machine emits an AlertEvent (L12) → published + stored → delivered. The graph was always a means; the alert is the end.

Check yourself

1. A customer rule is fundamentally…
2. The field path in a condition (e.g. node_risk_score) is the link to which lesson's subsystem?
3. A rule on admin_risk_usd (a periodic field, 2h cadence) can't alert within seconds. Why?
4. Resolving a rule's scope (e.g. portfolio_holders) means…
5. Why doesn't a rule emit an alert on every eval cycle while its condition stays true?
6. The COOLDOWN (resolve-grace) state exists to…
7. Alerts are stored in OpenSearch (not Memgraph) because…
8. The dry-run endpoint lets a customer…
↳ Ask your teacher
Try: "Show me evalCycle in engine.go," · "How does an aggregate rule sum across a scope?" · "Walk me through the full ApplyFiringState transitions," · "What does the alert-processor do with a published AlertEvent?"

What you can now do

Layer complete — the system now has a purpose
You've added the consumer side. The graph is no longer an end in itself: rules turn its risk fields into the alerts that are Forta's actual product. Combined with phase 1, you can now narrate the system from an RPC block all the way to a customer's pager.

Grounded in: pkg/rules/{engine,types,scope,firing,threshold,aggregate,balance,publisher,opensearch}.go, docs/rule-fields.json (the field catalog: path · kind · cadence · source), cmd/{rule-engine,alert-processor}/main.go, pkg/alertprocessor/. Verify against source — the code is the truth.