Lesson 33 · The Linear Promoter · Quality Internals

File each finding exactly once

Turning persistent findings into tickets without spamming duplicates. ~13 min.

Builds on: L29 · L31 · L9 · Anchor: idempotent external side-effects · New: create-then-mark · New: owned failure window

L29 said durable findings (streak ≥ threshold) "become Linear tickets." That sentence hides the whole problem: the promoter runs every cycle, and a finding stays promotable for many cycles. Naively, that's a ticket per cycle — the on-call's nightmare. So the promoter's real job isn't creating issues; it's creating each one exactly once, across retries, partial failures, and two systems that can't be transacted together.

Your anchor: the alert that paged you 50 times
Every engineer has been paged repeatedly for one unchanging problem, or seen an actuator double-file because a write half-succeeded. This is a pure backend problem — idempotent side-effects against an external API — wearing a quality-harness hat. The graph barely appears; the lesson is exactly-once-ish actuation, which you'll meet in any system that turns internal state into outside-world actions.

1 · The exactly-once primitive

A finding lives in OpenSearch with a streak (L29). The promoter's contract with that store is three methods — and the whole design hinges on the last one:

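// Argument types are abbreviated in the lesson; the ones written out here are assumed.
type FindingStore interface {
    RefreshIndex(ctx context.Context) error                                           // read-your-writes (below)
    FetchPromotable(ctx context.Context, class string, threshold, limit int) ([]Cand) // streak>=N AND LinearIssueID==""
    SetLinearIssueID(ctx context.Context, findingID, linearIssueID string) error      // THE idempotency primitive
}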

FetchPromotable only returns findings that are persistent (streak ≥ threshold) and not yet linked (LinearIssueID empty). The moment a finding gets its Linear ID stamped, it drops out of every future fetch. So "exactly once" reduces to: create the issue, then write the ID back. A linked finding is invisible forever after.
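A minimal sketch of that filter, assuming an in-memory stand-in for OpenSearch. The `memStore` type, the `Finding` fields, the simplified signatures (no context argument), and the `LIN-1` issue ID are all hypothetical and do not come from `findings.go`.

```go
package main

import (
	"fmt"
	"sort"
)

// Finding is a hypothetical, trimmed-down record for illustration only.
type Finding struct {
	ID            string
	Class         string
	Streak        int
	LinearIssueID string // empty means "not yet linked"
}

// memStore is an in-memory stand-in for the OpenSearch-backed store.
type memStore struct {
	findings map[string]*Finding
}

// FetchPromotable returns findings of the class with streak >= threshold
// and no linked Linear issue, at most limit of them.
func (s *memStore) FetchPromotable(class string, threshold, limit int) []Finding {
	var out []Finding
	for _, f := range s.findings {
		if f.Class == class && f.Streak >= threshold && f.LinearIssueID == "" {
			out = append(out, *f)
		}
	}
	// Sort by ID so the result is deterministic despite map iteration order.
	sort.Slice(out, func(i, j int) bool { return out[i].ID < out[j].ID })
	if len(out) > limit {
		out = out[:limit]
	}
	return out
}

// SetLinearIssueID stamps the issue ID; afterwards the filter above skips the finding.
func (s *memStore) SetLinearIssueID(findingID, issueID string) error {
	f, ok := s.findings[findingID]
	if !ok {
		return fmt.Errorf("finding %q not found", findingID)
	}
	f.LinearIssueID = issueID
	return nil
}

// newDemoStore builds three findings: two above a threshold of 3, one below.
func newDemoStore() *memStore {
	return &memStore{findings: map[string]*Finding{
		"f1": {ID: "f1", Class: "drift", Streak: 5},
		"f2": {ID: "f2", Class: "drift", Streak: 1},
		"f3": {ID: "f3", Class: "drift", Streak: 7},
	}}
}

func main() {
	s := newDemoStore()
	fmt.Println(len(s.FetchPromotable("drift", 3, 10))) // f1 and f3 qualify
	_ = s.SetLinearIssueID("f1", "LIN-1")
	fmt.Println(len(s.FetchPromotable("drift", 3, 10))) // only f3 remains
}
```

The idempotency lives in the query predicate, not in the creation call: stamping the ID is what removes the finding from every later fetch.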

2 · Create-then-mark, and the window it owns

Here's the crux. The promoter writes to two systems — Linear (create issue) and OpenSearch (stamp ID) — and you cannot make those two writes atomic (no distributed transaction). So you must pick an order, and each order has a failure window. The code picks create-then-mark, deliberately:

1. CreateIssue on Linear. On API error → log, count, do not mark → the finding stays promotable and retries next run. No false link, safe.
2. SetLinearIssueID write-back, before counting success. On success → the finding is linked and never re-promoted. Done, exactly once.
The owned window: if step 1 succeeds but step 2 fails, the issue exists but is unlinked — next run could file a duplicate. This is the ONLY path that can dupe.
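The two steps and the owned window can be sketched as one function with three outcomes. This is an illustration under assumptions, not the code in `linear_promoter.go`: the narrow interfaces, the `counters` struct, and the fakes are hypothetical, and the `Orphaned` field only stands in for the idea of an orphan counter.

```go
package main

import (
	"errors"
	"fmt"
)

// issueCreator and idMarker are hypothetical narrow interfaces for this sketch.
type issueCreator interface {
	CreateIssue(title string) (string, error)
}

type idMarker interface {
	SetLinearIssueID(findingID, issueID string) error
}

// counters holds one tally per outcome; the field names are illustrative.
type counters struct {
	Promoted, CreateFailed, Orphaned int
}

// promoteOne applies create-then-mark to a single finding.
func promoteOne(c issueCreator, m idMarker, findingID string, n *counters) {
	issueID, err := c.CreateIssue("finding " + findingID)
	if err != nil {
		// Step 1 failed: nothing exists on the tracker, so do not mark.
		n.CreateFailed++
		return
	}
	if err := m.SetLinearIssueID(findingID, issueID); err != nil {
		// The owned window: the issue exists but the finding is still unlinked.
		n.Orphaned++
		return
	}
	n.Promoted++ // counted only after the write-back succeeded
}

// fakeLinear is a test double with an injectable failure.
type fakeLinear struct {
	fail    bool
	created int
}

func (f *fakeLinear) CreateIssue(title string) (string, error) {
	if f.fail {
		return "", errors.New("linear: api error")
	}
	f.created++
	return fmt.Sprintf("ISSUE-%d", f.created), nil
}

// fakeStore is a test double for the write-back side.
type fakeStore struct {
	fail   bool
	linked map[string]string
}

func (s *fakeStore) SetLinearIssueID(findingID, issueID string) error {
	if s.fail {
		return errors.New("store: write-back failed")
	}
	s.linked[findingID] = issueID
	return nil
}

func main() {
	var n counters
	lin := &fakeLinear{}
	st := &fakeStore{linked: map[string]string{}}
	promoteOne(lin, st, "f1", &n) // both steps succeed
	st.fail = true
	promoteOne(lin, st, "f2", &n) // create succeeds, write-back fails
	fmt.Printf("%+v linked=%v\n", n, st.linked)
}
```

Only the second call leaves an issue behind without a link, which is why that branch gets its own counter instead of being folded into a generic error tally.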
Why create-then-mark, not mark-then-create
Both orders have a failure window; the design picks the one whose worst case is recoverable. Create-then-mark's bad case is a visible duplicate (two issues) — annoying, but caught by a dedicated LinearPromotionOrphaned counter and the finding ID embedded in the issue body, so an operator can find and merge it. Mark-then-create's bad case would be a silently dropped finding (marked done, but no issue ever created) — a real problem hidden forever. Prefer a loud duplicate over a silent miss.
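To make the trade-off concrete, here is a toy model that runs each ordering through two cycles, with the second write failing in the first cycle. The `world` type and both functions are hypothetical; they model only the two facts that matter (how many issues exist, and whether the finding is marked).

```go
package main

import "fmt"

// world is a toy model of the two systems; all names are hypothetical.
type world struct {
	issues int  // issues that exist on the tracker
	marked bool // whether the store considers the finding handled
}

// createThenMark runs one cycle; secondFails injects a failure in the second write.
func createThenMark(w *world, secondFails bool) {
	if w.marked {
		return // already linked: not promotable
	}
	w.issues++ // first write: create the issue
	if secondFails {
		return // write-back lost: the finding stays promotable
	}
	w.marked = true
}

// markThenCreate runs one cycle of the reverse order.
func markThenCreate(w *world, secondFails bool) {
	if w.marked {
		return // already marked: never looked at again
	}
	w.marked = true // first write: mark the finding
	if secondFails {
		return // create lost: marked, but no issue exists
	}
	w.issues++
}

func main() {
	a, b := &world{}, &world{}
	// Cycle 1 fails at the second write; cycle 2 is healthy.
	createThenMark(a, true)
	createThenMark(a, false)
	markThenCreate(b, true)
	markThenCreate(b, false)
	fmt.Printf("create-then-mark: issues=%d marked=%v\n", a.issues, a.marked)
	fmt.Printf("mark-then-create: issues=%d marked=%v\n", b.issues, b.marked)
}
```

In this run create-then-mark ends with two issues and a linked finding (a visible duplicate), while mark-then-create ends with zero issues and a finding marked as handled (a silent miss that no later cycle revisits).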

3 · Three layers of dedup

Belt and braces around that one window, the promoter dedups at three levels:

Layer | Guard | Catches
a | FetchPromotable excludes already-linked findings (re-checked in the loop) | the normal case: a filed finding never returns
b | a within-run seen set keyed on finding ID | the same finding appearing twice in one fetch page
c | on a Linear API error, never mark; retry naturally next run | a half-failed create becoming a false "done"

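4 · Two throttles you've seen before

The promoter is default-off: it is disabled unless StreakThreshold > 0, Client != nil, and Store != nil, so a missing setting means no tickets rather than surprise tickets. When it is on, a per-run cap (MaxPerRun, default 5) bounds how many issues a single run can file; compare the throttling in L31.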
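The two throttles this lesson names (a default-off switch that requires StreakThreshold > 0, Client != nil, and Store != nil, plus a MaxPerRun cap with a default of 5) might look roughly like this. The struct layout, the `limit` helper, the fallback to the default when MaxPerRun is unset, and the `Run` loop are assumptions made for illustration.

```go
package main

import "fmt"

// Client and Store are placeholder interfaces; the real ones carry methods.
type Client interface{}
type Store interface{}

// Promoter holds only the fields named in this lesson; the layout is assumed.
type Promoter struct {
	StreakThreshold int
	MaxPerRun       int
	Client          Client
	Store           Store
}

// defaultMaxPerRun is the default the lesson states for MaxPerRun.
const defaultMaxPerRun = 5

// Enabled is default-off: a zero-value Promoter reports false.
func (p *Promoter) Enabled() bool {
	return p.StreakThreshold > 0 && p.Client != nil && p.Store != nil
}

// limit returns the per-run cap, falling back to the default when unset (assumed behavior).
func (p *Promoter) limit() int {
	if p.MaxPerRun <= 0 {
		return defaultMaxPerRun
	}
	return p.MaxPerRun
}

// Run files at most limit() candidates and reports how many it filed.
func (p *Promoter) Run(candidates []string) int {
	if !p.Enabled() {
		return 0
	}
	filed := 0
	for range candidates {
		if filed >= p.limit() {
			break // leftover candidates wait for the next run
		}
		filed++ // stand-in for create-then-mark on one candidate
	}
	return filed
}

func main() {
	many := make([]string, 12)
	off := &Promoter{}
	on := &Promoter{StreakThreshold: 3, Client: struct{}{}, Store: struct{}{}}
	fmt.Println(off.Run(many), on.Run(many))
}
```

With twelve candidates, the unconfigured promoter files none and the configured one stops at the cap, leaving the rest promotable for later runs.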

5 · Two smaller subtleties worth keeping

Read-your-writes, and an honest concurrency caveat
Read-your-writes: the streak upserts are written with WithRefresh(false) (cheap, ~1s async), so a fetch issued immediately after would read the previous cycle's segments and fire the threshold a cycle late. So PromoteClass calls RefreshIndex once per class before fetching. The refresh is best-effort: a refresh miss just means slightly stale reads, not a correctness break.

Concurrency: the fetch→create→write-back sequence is not lease-guarded. It's safe only because quality-gate runs serially in one pod per chain. The docstring says so plainly, and notes that a multi-replica deploy would need optimistic concurrency (an OpenSearch if_seq_no guard or a Redis lease). Documenting the unhandled case is the engineering: single-writer is the stated deployment invariant (L9).
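The read-your-writes point can be seen in a toy model of an index with asynchronous refresh. The `lazyIndex` type below is a hypothetical stand-in, not the OpenSearch client: writes land in a pending buffer and become visible to queries only after `Refresh`.

```go
package main

import "fmt"

// lazyIndex is a toy model of a search index with asynchronous refresh.
type lazyIndex struct {
	searchable map[string]int // finding ID -> streak visible to queries
	pending    map[string]int // writes not yet refreshed
}

func newLazyIndex() *lazyIndex {
	return &lazyIndex{searchable: map[string]int{}, pending: map[string]int{}}
}

// UpsertStreak models a write issued without waiting for a refresh.
func (ix *lazyIndex) UpsertStreak(id string, streak int) {
	ix.pending[id] = streak
}

// Refresh makes all pending writes visible to subsequent queries.
func (ix *lazyIndex) Refresh() {
	for id, s := range ix.pending {
		ix.searchable[id] = s
	}
	ix.pending = map[string]int{}
}

// CountPromotable counts visible findings at or above the threshold.
func (ix *lazyIndex) CountPromotable(threshold int) int {
	n := 0
	for _, s := range ix.searchable {
		if s >= threshold {
			n++
		}
	}
	return n
}

func main() {
	ix := newLazyIndex()
	ix.UpsertStreak("f1", 2)
	ix.Refresh() // the previous cycle's state is visible

	ix.UpsertStreak("f1", 3)           // this cycle crosses threshold 3
	fmt.Println(ix.CountPromotable(3)) // stale read: sees streak 2
	ix.Refresh()
	fmt.Println(ix.CountPromotable(3)) // fresh read: sees streak 3
}
```

Without the explicit refresh the fetch still works, it just sees the previous cycle's streak, which matches the lesson's description of a miss as stale rather than incorrect.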
The shape of a safe actuator
Look at what makes this robust: an idempotency key (LinearIssueID), an ordering whose failure mode is recoverable, layered dedup, a default-off kill switch, a per-run cap, and a frank note about the concurrency it does not handle. None of it is about Linear or the graph specifically — it's the universal checklist for "turn internal state into an external action without making a mess." That's the transferable lesson.

Check yourself

1. The promoter runs every cycle and a finding stays promotable for many cycles. What's its core challenge?
2. What makes SetLinearIssueID "the idempotency primitive"?
3. The promoter creates the Linear issue, then writes the ID back. Why that order rather than the reverse?
4. CreateIssue succeeds but the SetLinearIssueID write-back fails. What's the consequence, and how is it handled?
5. On a Linear API error during CreateIssue, the promoter deliberately does NOT mark the finding. Why?
6. The promoter is disabled unless StreakThreshold > 0, Client != nil, and Store != nil. What design instinct is that?
7. What does the per-run cap (MaxPerRun, default 5) protect against, and how does it relate to L31?
8. The docstring states the fetch→create→write-back sequence is unguarded by a lease, safe only on a single replica. Why document that?
↳ Ask your teacher
Try: "Show me how the streak Upsert + reset works in findings.go." · "What does the issue body's finding ID = sha256(class|ref_id|kind) let an operator do?" · "How would an if_seq_no guard make this multi-replica-safe?" · "Where in quality-gate main.go is PromoteClass called?" · "How does this compare to the healer as an actuator on the same findings?"

What you can now do

Both branches of the finding fork, now seen
L29's finding either gets auto-healed (L30/L31) or auto-ticketed (here). Both are actuators on the same streak-tracked findings, and both share a posture: idempotent, throttled, default-cautious, honest about their limits. The quality subsystem measures drift and then does something about it — safely, on both branches.

Grounded in: pkg/quality/chainref/linear_promoter.go (LinearPromoter.PromoteClass, FetchPromotable streak≥threshold ∧ unlinked, create-then-mark with SetLinearIssueID idempotency primitive, the orphaned-window + LinearPromotionOrphaned counter + finding-ID-in-body, 3-layer dedup, default-off kill switches via Enabled(), MaxPerRun cap, RefreshIndex read-your-writes, single-replica concurrency caveat), findings.go (FetchPromotable/streak), linear_client.go (CreateIssue). Verify against source: the code is the truth.