ci(e2e): harden staging e2e against deterministic and flaky reds #8756
Changeset detected. Latest commit: 7b59e11. The changes in this PR will be included in the next version bump. This PR includes changesets to release 0 packages.
Walkthrough
Staging E2E reliability: change the workflow concurrency key to the commit SHA, increase the integration test timeout, upload Playwright JSON reports on non-cancelled runs, add a CI reporter, and add env-gated skips and lint disables in Playwright tests.
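The env-gated skips can be pictured as a small lookup from environment to skip reason. This is a minimal sketch, not the code in this PR: `skipReason`, `SkipRule`, and the `E2E_STAGING` flag are hypothetical names chosen for illustration; only `E2E_SDK_SOURCE=latest` and the two test names come from the PR text.

```typescript
// Hypothetical sketch of env-gated skipping; names other than
// E2E_SDK_SOURCE are assumptions, not identifiers from the repository.
type Env = Record<string, string | undefined>;

interface SkipRule {
  testId: string;
  reason: string;
  applies: (env: Env) => boolean;
}

// Assumed flag name for "this run targets the staging instance".
const isStaging = (env: Env): boolean => env.E2E_STAGING === '1';

// From the PR text: the staging leg installs the published @latest SDK.
const usesPublishedSdk = (env: Env): boolean => env.E2E_SDK_SOURCE === 'latest';

const skipRules: SkipRule[] = [
  {
    testId: 'whatsapp-phone-code',
    reason: 'WhatsApp channel is not enabled on the staging instance',
    applies: isStaging,
  },
  {
    testId: 'custom-pages: survives a parent rerender',
    reason: 'validates an unreleased @clerk/react fix (#8604)',
    applies: usesPublishedSdk,
  },
];

// Returns the skip reason for a test in this environment,
// or undefined when the test should run.
function skipReason(testId: string, env: Env): string | undefined {
  return skipRules.find(rule => rule.testId === testId && rule.applies(env))?.reason;
}
```

In a Playwright spec the result could feed the conditional form `test.skip(condition, description)`, with `process.env` passed as `env`, so the test still runs wherever the condition is false (for example PR CI against branch builds).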
The staging e2e "generic" leg was red on ~100% of runs because a few independent failures sat behind an all-or-nothing gate:
- whatsapp-phone-code: the WhatsApp channel is not enabled on the staging instance, so the button never renders and the suite times out every run. It also bypasses the isStagingReady graceful-skip, so skip it explicitly on staging until the channel is provisioned.
- custom-pages "survives a parent rerender": validates an unreleased @clerk/react fix (#8604), but the staging leg installs published @latest, so it is deterministically red until release. Skip when E2E_SDK_SOURCE=latest; PR CI (ref builds) still covers it.
- concurrency was keyed on ref (effectively always "main") with cancel-in-progress, so each new staging deploy cancelled the in-flight run and no commit could report a status. Key on the clerk_go commit instead.
- raise the job timeout above the 25-minute test step so the job cap no longer kills runs mid-suite.
- emit and upload a JSON Playwright report in CI so the report job can classify failures (flaky vs failed, infra vs regression) later.
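The flaky-versus-failed split mentioned in the last bullet could later be derived from the uploaded report along these lines. This is a hedged sketch, not part of this PR: `classify` is a hypothetical helper, and the report shape (nested `suites`, `specs`, and per-test `status` values of `expected`, `unexpected`, `flaky`, `skipped`) is an assumption based on Playwright's JSON reporter output that should be checked against a real report. The infra-versus-regression split is not attempted here.

```typescript
// Minimal local types for the parts of the report this sketch reads.
// Field names are an assumption about Playwright's JSON reporter output.
type TestStatus = 'expected' | 'unexpected' | 'flaky' | 'skipped';

interface ReportTest { status: TestStatus }
interface ReportSpec { title: string; tests: ReportTest[] }
interface ReportSuite { title: string; specs: ReportSpec[]; suites?: ReportSuite[] }
interface Report { suites: ReportSuite[] }

interface Classification { failed: string[]; flaky: string[] }

// Walks the suite tree and buckets each spec by its worst test outcome:
// any 'unexpected' test marks the spec failed, otherwise any 'flaky' test
// (failed first, passed on retry) marks it flaky.
function classify(report: Report): Classification {
  const result: Classification = { failed: [], flaky: [] };

  const visit = (suite: ReportSuite, path: string[]): void => {
    const here = [...path, suite.title];
    for (const spec of suite.specs) {
      const name = [...here, spec.title].join(' > ');
      if (spec.tests.some(t => t.status === 'unexpected')) {
        result.failed.push(name);
      } else if (spec.tests.some(t => t.status === 'flaky')) {
        result.flaky.push(name);
      }
    }
    for (const child of suite.suites ?? []) {
      visit(child, here);
    }
  };

  for (const suite of report.suites) {
    visit(suite, []);
  }
  return result;
}
```

A report job could call `classify(JSON.parse(reportText))` and treat only the `failed` bucket as gating, reporting `flaky` as informational.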
fc18bdf to 7b59e11
Fixing the deterministic failures in staging e2e.
Two always-failing tests simply cannot pass on staging today, and no retry helps. whatsapp-phone-code waits on a WhatsApp button the staging instance never renders (the channel isn't enabled there), and custom-pages "survives a parent rerender" validates an unreleased @clerk/react fix (#8604) while the staging leg installs published @latest. Both now skip only where they can't pass (staging, and E2E_SDK_SOURCE=latest respectively); PR CI still runs them against branch builds, so coverage there is unchanged.
The change that most deserves a look is concurrency. It was keyed on ref (effectively always main) with cancel-in-progress, so every new staging deploy cancelled the in-flight run and no commit could report its own result. It now keys on the clerk_go commit SHA, falling back to run_id for manual dispatches. The rest is a timeout bump above the 25-minute test step, plus a JSON Playwright report upload for later failure classification.
Deliberately out of scope and coming next: making validate-staging-instances actually gate, a small @smoke gating leg with the full suite informational, and reducing FAPI load on the shared with-email-codes instance.
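To make the concurrency change concrete, here is a small simulation of why a ref-keyed group with cancel-in-progress starves every commit of a result, while a SHA-keyed group does not. It is a sketch under stated assumptions: `concurrencyGroup`, `survivingRuns`, the `clerkGoCommitSha` field, and the `e2e-staging-` prefix are hypothetical names, and the model assumes the worst case where each run is still in flight when the next one is queued.

```typescript
// Hypothetical dispatch context: the SHA is absent for manual dispatches.
interface DispatchContext {
  clerkGoCommitSha?: string;
  runId: string;
}

// Key on the clerk_go commit SHA, falling back to the run id so that
// manual dispatches never share (and never cancel) each other's group.
function concurrencyGroup(ctx: DispatchContext): string {
  const key = ctx.clerkGoCommitSha?.trim() || ctx.runId;
  return `e2e-staging-${key}`;
}

// Models cancel-in-progress in the worst case: a run is cancelled whenever
// a later run arrives in the same group. Returns the indices of the runs
// that are allowed to finish.
function survivingRuns(groups: string[]): number[] {
  const lastIndexByGroup = new Map<string, number>();
  groups.forEach((group, index) => lastIndexByGroup.set(group, index));
  return groups.flatMap((group, index) =>
    lastIndexByGroup.get(group) === index ? [index] : [],
  );
}
```

With three back-to-back deploys, keying every run on `main` leaves only the last run alive, whereas keying on three distinct SHAs lets all three report a status.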