feat(pr-metrics): Attribute PRs from the seer.pr_created event by vaind · Pull Request #116759 · getsentry/sentry

vaind · 2026-06-03T11:19:08Z

Part of the PR Merge Live Metrics project. Builds on the storage layer from #116586 (CORE-200, now merged).

What

When Seer creates a PR, it already emits the seer.pr_created event (SentryAppEventType.SEER_PR_CREATED). This PR hooks attribution into that existing inbound flow (process_autofix_updates, next to the activity handler; no new transport):

1. Resolve the org-scoped Repository by name and provider.
2. Find-or-create the canonical PullRequest row (keyed on PR number). This may run before the SCM opened webhook, so the row can be a shell; we never overwrite the title/body that the webhook fills in later.
3. Idempotently record a PullRequestAttribution row with signal type seer_app and source seer_data (signal_details = {run_id, group_id, pr_url}), keyed on (pull_request, signal_type, source) to match the model's unique constraint, so event redelivery refreshes the row rather than duplicating it.
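
The find-or-create and idempotent-record steps above can be sketched with an in-memory stand-in for the two tables. This is a minimal illustration, not the code in this PR: `Store`, `find_or_create_pull_request`, and `record_attribution` are hypothetical names, and keying the PR on `(repository_id, number)` is an assumption.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class PullRequest:
    repository_id: int
    number: int
    title: Optional[str] = None  # empty on a shell row; the webhook fills it in later


@dataclass
class Attribution:
    signal_type: str
    source: str
    signal_details: dict
    is_valid: bool = True


@dataclass
class Store:
    """Hypothetical in-memory stand-in for the PullRequest and attribution tables."""

    pull_requests: dict = field(default_factory=dict)
    attributions: dict = field(default_factory=dict)

    def find_or_create_pull_request(self, repository_id: int, number: int) -> PullRequest:
        key = (repository_id, number)
        # setdefault returns the existing row untouched, so webhook-written
        # fields such as the title are never overwritten.
        return self.pull_requests.setdefault(key, PullRequest(repository_id, number))

    def record_attribution(self, pr: PullRequest, signal_type: str, source: str,
                           details: dict) -> Attribution:
        # The key mirrors the (pull_request, signal_type, source) unique constraint.
        key = ((pr.repository_id, pr.number), signal_type, source)
        row = self.attributions.get(key)
        if row is None:
            row = self.attributions[key] = Attribution(signal_type, source, details)
        else:
            # Redelivery: refresh the existing row instead of adding a duplicate.
            row.signal_details = details
            row.is_valid = True
        return row
```

Delivering the same event twice leaves a single attribution row whose details reflect the latest delivery.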

Why provider-aware resolution

Repository has no (organization_id, name) unique constraint (only (organization_id, provider, external_id)), so an org can legitimately host same-named repos across providers, and resolving by name and org alone could attribute to the wrong repo. Seer normalizes its provider (process_repo_provider strips the integrations: prefix and lowercases the value) while Sentry stores the prefixed form, so we match both shapes (the idiom filter_repo_by_provider already uses) and resolve only on a single match, refusing to guess when Seer sends unknown.
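
A minimal sketch of this resolution rule, under stated assumptions: `Repo`, `resolve_repo`, and the `known` provider set are hypothetical, and whether a successful single-match resolution also reports `unrecognized_provider` is a guess. The real code queries the Repository model.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

PREFIX = "integrations:"


@dataclass(frozen=True)
class Repo:
    name: str
    provider: str  # stored form may be prefixed, e.g. "integrations:github"


def normalize_provider(provider: str) -> str:
    """Lowercase and strip the 'integrations:' prefix so both shapes compare equal."""
    provider = provider.strip().lower()
    return provider[len(PREFIX):] if provider.startswith(PREFIX) else provider


def resolve_repo(
    org_repos: List[Repo],
    name: str,
    seer_provider: str,
    known: frozenset = frozenset({"github", "gitlab"}),  # hypothetical mapping
) -> Tuple[Optional[Repo], Optional[str]]:
    """Return (repo, warning); repo is None unless exactly one candidate matches."""
    by_name = [r for r in org_repos if r.name == name]
    wanted = normalize_provider(seer_provider)
    warning = None
    if wanted in known:
        candidates = [r for r in by_name if normalize_provider(r.provider) == wanted]
    else:
        # Unmapped provider: fall back to name only, but never guess between repos.
        candidates = by_name
        warning = "unrecognized_provider"
    if not candidates:
        return None, "repo_not_found"
    if len(candidates) > 1:
        return None, "repo_ambiguous"
    return candidates[0], warning
```

With a GitHub and a GitLab repo sharing one name, a `github` provider resolves to the GitHub repo, while an unmapped provider yields `repo_ambiguous` and no attribution.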

Observability

Structured warnings so upstream issues surface: repo_not_found, repo_ambiguous, and unrecognized_provider (a provider value we don't map — flagged so it can be fixed in Seer).

Rollout / scope

- Gated behind the organizations:pr-metrics-attribution flag (FlagPole, backend-only). The flag is removed once rollout completes.
- Covers PRs from Seer's own coding pipeline (exactly what seer.pr_created reports), recorded as the seer_app signal type. Delegated-agent PRs (Cursor/Copilot/Claude) are out of scope: they flow through Seer's coding-agent state-update path, which does not emit seer.pr_created, so they never reach this handler. Attributing those is separate, later work.
- The cached PullRequest.attribution (MAX-confidence) projection is deferred: recompute_pull_request_attribution computes it as a read helper, but the cached column isn't persisted yet (by design).
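
To illustrate what a MAX-confidence projection over attribution rows might look like, here is a rough sketch. The `CONFIDENCE` ranking and the function name are hypothetical; the actual ordering used by recompute_pull_request_attribution is not stated in this PR.

```python
from typing import Iterable, Optional

# Hypothetical ranking for illustration only; higher wins.
CONFIDENCE = {"seer_app": 3, "sentry_app": 2, "referenced_issue": 1}


def max_confidence_attribution(signals: Iterable[dict]) -> Optional[str]:
    """Pick the highest-ranked valid signal type, or None if nothing qualifies."""
    valid = [
        s for s in signals
        if s.get("is_valid", True) and s["signal_type"] in CONFIDENCE
    ]
    if not valid:
        return None
    return max(valid, key=lambda s: CONFIDENCE[s["signal_type"]])["signal_type"]
```

Because the helper is a pure function of the rows, it can be computed on read until the cached column is persisted.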

Tests

New tests/sentry/pr_metrics/test_attribution.py (resolution, provider disambiguation, unknown-provider single-match vs ambiguous, idempotency, signal revival, confidence ranking, all three warnings) + operator-flow integration tests behind the feature flag.

Refs CORE-204

linear-code · 2026-06-03T11:19:13Z

CORE-204

CW-1460

giovanni-guidini · 2026-06-03T13:43:59Z

+) -> None:
+    """Attribute PRs reported by Seer's ``seer.pr_created`` event to the Seer app.
+
+    For each reported PR: resolve the org-scoped ``Repository`` by name and


Why on earth is it more than one?

same name on gitlab & github for example

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Reviewed by Cursor Bugbot for commit 0d38d14.

When Seer reports a PR it created (own or delegated coding agent) via the existing seer.pr_created event, attribute it in Sentry: resolve the org-scoped Repository by name + provider, find-or-create the canonical PullRequest row (keyed on PR number), and idempotently record a seer_app PullRequestAttribution signal (signal_details = run_id, group_id, pr_url). Hooks into the existing process_autofix_updates flow next to the activity handler; no new transport.

Repository resolution takes provider into account because an org can host same-named repos across providers; it resolves only on a single match and refuses to guess when Seer sends an unknown provider. Unresolvable repos and unrecognized provider values are logged so they can be fixed upstream.

Gated behind the organizations:pr-metrics-attribution flag for rollout. Scoped to the seer_app source only; delegated-agent classification and the cached PullRequest.attribution projection land separately.

Refs CORE-204
Co-Authored-By: Claude <noreply@anthropic.com>

giovanni-guidini · 2026-06-04T07:20:06Z

+detected signal is preserved as its own ``PullRequestAttribution`` row rather
+than collapsed into a single field.
+
+See the architecture doc ("PR Metrics — Architecture Overview", §Attribution


[nit] Docs get stale fast... I'd remove this paragraph I think

giovanni-guidini · 2026-06-04T07:24:11Z

+            )
+            record_attribution_signal(
+                pull_request=pull_request,
+                signal_type=PullRequestAttributionSignalType.SEER_APP,


Are we confident that this is indeed the app used, and not the SENTRY_APP?
Or does it not matter?

I'm even thinking that we should drop the distinction and just always put Sentry in there, instead of trying to figure out who is using Sentry and who is on Seer app

I think that's fair. For our purposes it doesn't matter much, to be honest.

Truth be told I think these days it might be just the SENTRY_APP anyway

Address PR review feedback on the attribution resolver docs:

- Use "attributions" instead of "signals" for naming consistency.
- Drop the stale architecture-doc reference paragraph.
- Document why the seer.pr_created path always records SEER_APP: Seer picks between the Sentry and Seer GitHub apps at push time, but the payload doesn't carry which one it used, so a faithful SEER_APP vs SENTRY_APP split is deferred until that app kind is threaded through.

Refs CORE-204
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Seer picks between the Sentry and Seer GitHub apps at push time, but the seer.pr_created payload doesn't say which one it used. Until Seer threads that app kind through, default to SENTRY_APP rather than SEER_APP so the attribution reflects the more common case.

Refs CORE-204
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…webhook (#116834)

Part of the [PR Merge Live Metrics](https://linear.app/getsentry/project/pr-merge-live-metrics-6a9efd473801/overview) project. Builds on the schema from #116586 (CORE-200) and the attribution write helper from #116759 (CORE-204), both now merged.

## What

Hooks a new processor into `PullRequestEventWebhook.WEBHOOK_EVENT_PROCESSORS` that runs on every GitHub `pull_request` webhook. On `action=opened` it detects two attribution signals and writes `PullRequestAttribution` rows via `record_attribution_signal()`:

1. **App ID match**: compares `pull_request.user.id` against `SEER_AUTOFIX_GITHUB_APP_USER_ID` and `SENTRY_GITHUB_APP_USER_ID` to produce `SEER_APP` / `SENTRY_APP` attributions respectively.
2. **Referenced issue**: scans the PR title and body for Sentry issue short IDs (`Fixes PROJ-123`) and sentry.io URLs (`Fixes https://....sentry.io/issues/456`) via the existing `find_referenced_groups` utility, and writes a `REFERENCED_ISSUE` attribution with matched group IDs in `signal_details`.

Both signals are independent; a Seer-opened PR that also references an issue produces two rows. All writes are idempotent and race-safe via `record_attribution_signal()` (keyed on `(pull_request, signal_type, source)`, matching the unique constraint).

The processor is gated behind the `organizations:pr-metrics-attribution` FlagPole flag (registered by #116759).

Refs CORE-216

---------

Co-authored-by: Claude <noreply@anthropic.com>
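
The two-signal detection described in this commit message could look roughly like the sketch below. Everything here is an assumption made for illustration: the user IDs are placeholders, the regexes are simplified stand-ins for `find_referenced_groups`, and the sketch records raw references where the real processor stores matched group IDs.

```python
import re
from typing import List

# Placeholder IDs; the real values come from Sentry settings.
SEER_APP_USER_ID = 1001
SENTRY_APP_USER_ID = 1002

# Simplified stand-ins for find_referenced_groups (hypothetical patterns).
SHORT_ID_RE = re.compile(r"\b([A-Z][A-Z0-9]+-[0-9A-Z]+)\b")
ISSUE_URL_RE = re.compile(r"https://[\w.-]*sentry\.io/issues/(\d+)")


def detect_signals(event: dict) -> List[dict]:
    """Return the attribution signals for a pull_request webhook payload."""
    if event.get("action") != "opened":
        return []
    pr = event["pull_request"]
    signals = []

    # Signal 1: the PR author is one of the known GitHub app users.
    app = {SEER_APP_USER_ID: "seer_app", SENTRY_APP_USER_ID: "sentry_app"}.get(
        pr["user"]["id"]
    )
    if app:
        signals.append({"signal_type": app, "signal_details": {}})

    # Signal 2: the title or body references a Sentry issue.
    text = f"{pr.get('title') or ''}\n{pr.get('body') or ''}"
    refs = sorted(set(SHORT_ID_RE.findall(text)) | set(ISSUE_URL_RE.findall(text)))
    if refs:
        signals.append(
            {"signal_type": "referenced_issue", "signal_details": {"references": refs}}
        )
    return signals
```

The two checks do not depend on each other, which is why one PR can yield two rows.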

…h) (#116842)

Part of the [PR Merge Live Metrics](https://linear.app/getsentry/project/pr-merge-live-metrics-6a9efd473801/overview) project. Builds on the schema (#116586, CORE-200), the `seer.pr_created` attribution path (#116759, CORE-204), and the webhook attribution processor (#116834, CORE-216).

## What

On a tracked PR's **close/merge**, Sentry now emits a provisional `pr_metrics.row` analytics event directly: no Seer judge, no round-trip. This is the "easy path" of the judge-gated emission design (the judge path is CORE-217).

A PR is **tracked** once it has ≥1 valid `PullRequestAttribution` row (`is_valid=true`); untracked PRs emit nothing. The row carries only what Sentry already holds (no SCM fetch, no PR text):

- lifecycle: `close_action`, `head_commit_sha`, `merge_commit_sha`, `opened_at` / `closed_at` / `merged_at`, `draft`
- payload-derived counters: `additions`, `deletions`, `files_changed`, `commits_count`, `comments_count`, `review_comments_count`, `is_assigned`
- `attributions`: the point-in-time snapshot of valid attribution signals (`{signal_type, source, signal_details}`), ordered by attribution priority (highest-confidence first)
- `verdict`: `None` for now (verdicts arrive with judges)

Emission is gated by the new `organizations:pr-metrics-emit` FlagPole flag, and is **stateless**: it does not dedupe webhook redeliveries (a DB-side, pre-fork guard keyed on the terminal event is CORE-227; the judge round-trip is CORE-217). Transport is `sentry.analytics` → the existing analytics → BigQuery pipeline. The schema is intentionally provisional/additive; the consumable schema is finalized in M5 (CORE-223).

## Architecture

Consolidates all pr_metrics GitHub webhook handling into `src/sentry/pr_metrics/webhooks.py`, following the code_review pattern. Two **independent** processors are registered on `PullRequestEventWebhook.WEBHOOK_EVENT_PROCESSORS`:

- `handle_attribution`: GH-App-author + referenced-issue signals (relocated from `integrations/github/pr_metrics_webhook_processors.py`, no behavior change)
- `handle_emission`: the close/merge metrics row

They're separate rather than one routing function so the webhook loop isolates each in its own `try/except` (a failure in one can't suppress the other), and each carries its own feature flag and action gate. Domain logic lives in `pr_metrics/attribution.py` and `pr_metrics/emit.py`; `pr_metrics/webhooks.py` is the GitHub entry point.

> [!NOTE]
> This relocates the CORE-216 attribution webhook handler (a pure move, logic unchanged) from `integrations/github/` into `pr_metrics/`.

Refs CORE-221

---------

Co-authored-by: Claude <noreply@anthropic.com>
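
A small sketch of the two ideas in this commit message: the "tracked" gate on emission and the per-processor `try/except` isolation. `build_row` and `run_processors` are hypothetical names, the `close_action` values are assumptions, and the row is trimmed to a few fields with priority ordering omitted.

```python
import logging
from typing import Callable, List, Optional

logger = logging.getLogger("pr_metrics_sketch")


def build_row(event: dict, attributions: List[dict]) -> Optional[dict]:
    """Return a metrics row for a tracked PR's close/merge, else None."""
    valid = [a for a in attributions if a.get("is_valid")]
    if event.get("action") != "closed" or not valid:
        return None  # not a terminal event, or the PR is untracked
    pr = event["pull_request"]
    return {
        # GitHub reports a merge as action="closed" with pull_request.merged=true.
        "close_action": "merged" if pr.get("merged") else "closed",
        "additions": pr.get("additions"),
        "deletions": pr.get("deletions"),
        "attributions": [
            {k: a.get(k) for k in ("signal_type", "source", "signal_details")}
            for a in valid
        ],
        "verdict": None,  # verdicts arrive with judges
    }


def run_processors(processors: List[Callable[[dict], None]], event: dict) -> List[str]:
    """Run every processor; a failure in one must not suppress the others."""
    failed = []
    for processor in processors:
        try:
            processor(event)
        except Exception:
            logger.exception("processor failed: %s", processor.__name__)
            failed.append(processor.__name__)
    return failed
```

Registering the handlers separately means an exception in attribution is logged while emission still runs.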

github-actions Bot added the Scope: Backend label (automatically applied to PRs that change backend components) Jun 3, 2026

vaind force-pushed the pr-merge-metrics/seer-pr-created-attribution branch from 7e174e1 to 988fbef Compare June 3, 2026 11:33

Base automatically changed from gio/pr-merge-metrics/extend-pr-data to master June 3, 2026 12:47

vaind force-pushed the pr-merge-metrics/seer-pr-created-attribution branch from 988fbef to 0856ca4 Compare June 3, 2026 12:53

vaind marked this pull request as ready for review June 3, 2026 12:53

vaind requested a review from a team as a code owner June 3, 2026 12:53

sentry Bot reviewed Jun 3, 2026

Comment thread src/sentry/pr_metrics/attribution.py Outdated

getsentry deleted a comment from github-actions Bot Jun 3, 2026

This comment was marked as resolved.

vaind force-pushed the pr-merge-metrics/seer-pr-created-attribution branch from 0856ca4 to 407fb82 Compare June 3, 2026 13:02

vaind requested a review from giovanni-guidini June 3, 2026 13:04

cursor Bot reviewed Jun 3, 2026

Comment thread src/sentry/pr_metrics/attribution.py

This comment was marked as outdated.

vaind force-pushed the pr-merge-metrics/seer-pr-created-attribution branch from 407fb82 to 0d38d14 Compare June 3, 2026 13:43

giovanni-guidini reviewed Jun 3, 2026

cursor Bot reviewed Jun 3, 2026

Comment thread src/sentry/pr_metrics/attribution.py

giovanni-guidini reviewed Jun 3, 2026

Comment thread src/sentry/pr_metrics/attribution.py

sentry-warden Bot reviewed Jun 3, 2026

Comment thread src/sentry/seer/entrypoints/operator.py

vaind force-pushed the pr-merge-metrics/seer-pr-created-attribution branch from 0d38d14 to f84a7d8 Compare June 3, 2026 14:03

giovanni-guidini approved these changes Jun 4, 2026

giovanni-guidini mentioned this pull request Jun 4, 2026

feat(pr-metrics): Write PullRequestAttribution on PR open via GitHub webhook #116834

Merged

vaind and others added 2 commits June 4, 2026 10:26

vaind enabled auto-merge (squash) June 4, 2026 08:38

vaind merged commit 552fb5c into master Jun 4, 2026
85 checks passed

vaind deleted the pr-merge-metrics/seer-pr-created-attribution branch June 4, 2026 08:46

vaind mentioned this pull request Jun 4, 2026

feat(pr-metrics): Emit a BigQuery row on PR close/merge (no-judge path) #116842

Merged

sentry-release-bot Bot mentioned this pull request Jun 15, 2026

publish: getsentry/sentry@26.6.0 getsentry/publish#8571

Closed

3 tasks
