FROST/ROAST readiness branch #3866

Draft
mswilkison wants to merge 490 commits into main from feat/frost-schnorr-migration-scaffold

@mswilkison mswilkison commented Feb 19, 2026

Contributor

Current State (as of 2026-05-17)

This draft PR is the umbrella readiness branch for feat/frost-schnorr-migration-scaffold.
It is being kept current with main so it can become a direct merge target if the FROST/ROAST stack is approved for activation.

It remains in draft until the remaining phase-gate, governance, and cross-repository readiness items are closed.

Canonical Status Sources

  • Cross-repo migration tracker: docs/frost-migration/external-repository-tracking.md (in tlabs-xyz/tbtc)
  • Companion tBTC umbrella draft: https://github.com/tlabs-xyz/tbtc/pull/10
  • Latest readiness audit: docs/reviews/frost-roast-production-readiness-2026-05-16.md (in tlabs-xyz/tbtc)

Latest Refresh

  • Merged current main into this branch.
  • Local verification passed for the FROST signing package and tBTC signer backend paths, with and without frost_native.
  • Local verification also passed the native tBTC signer-path tests covering the FFI signing primitive and signing executor.

Remaining Cross-Repo Closure Items

  • Wait for CI from the latest refresh to complete.
  • Capture the first post-fix funded nightly live run artifact for Phase 4.
  • Record final approver signoff in the Phase 4 decision/packet docs.
  • Execute external org archive/redirect mapping and record results.

Notes

  • Keep this PR in draft until the activation decision is explicit.
  • Treat it as the readiness branch for the integrated keep-core side of the stack, not only a historical index.

@mswilkison mswilkison changed the title from "Draft: Add Schnorr/FROST migration scaffold package and RFC" to "Draft: Add Schnorr/FROST scaffold and tBTC runtime signing adapter slice" Feb 20, 2026
mswilkison added a commit that referenced this pull request Feb 26, 2026
## Summary
- cut over the `frost_tbtc_signer` bootstrap path to return coarse
tbtc-signer signature output on successful `RunDKG -> StartSignRound ->
FinalizeSignRound`
- keep legacy signing fallback only for verified coarse-path failures
(bridge errors, decode failures, or structural divergence)
- wire `BuildTaprootTx` through the transitional native tbtc-signer
orchestration path
- gate `BuildTaprootTx` signing substitution on strict native-vs-legacy
transaction input/output equivalence checks
- add coarse success/fallback telemetry and observer-registration guards
- expand unit and integration coverage for coarse cutover,
retry/attempt-variation behavior, and `BuildTaprootTx` substitution
safety

## Stack Context
- base branch: `feat/frost-schnorr-migration-scaffold` (`#3866`)
- recommended review order:
  1. review `#3866` for scaffold/runtime seams
  2. review this PR as the cutover + hardening delta

## Review Guide (hot paths)
- coarse cutover + fallback semantics:
  - `pkg/frost/signing/native_ffi_primitive_transitional_frost_native.go`
  - `pkg/frost/signing/native_frost_engine_tbtc_signer_registration_frost_native.go`
- `BuildTaprootTx` wiring and substitution gating:
  - `pkg/tbtc/wallet.go`
  - `pkg/tbtc/native_tbtc_signer_build_taproot_tx_frost_native_tbtc_signer.go`
  - `pkg/bitcoin/transaction_builder.go`
- coverage for tx assembly/substitution and bridge safety:
  - `pkg/tbtc/wallet_sign_transaction_build_taproot_tx_test.go`
  - `pkg/bitcoin/transaction_builder_test.go`
  - `pkg/frost/signing/native_frost_engine_tbtc_signer_registration_frost_native_test.go`

## Scope Boundaries
- in scope: bootstrap/coarse-path cutover hardening and safe
`BuildTaprootTx` integration
- out of scope: full production signer-runtime replacement and later
migration phase gates
@mswilkison mswilkison changed the title from "Draft: Add Schnorr/FROST scaffold and tBTC runtime signing adapter slice" to "Draft (Umbrella): keep-core FROST/ROAST migration scaffold tracker (not for direct merge)" Mar 1, 2026
@mswilkison mswilkison changed the title from "Draft (Umbrella): keep-core FROST/ROAST migration scaffold tracker (not for direct merge)" to "Draft: keep-core FROST/ROAST readiness branch" May 17, 2026
@mswilkison

Contributor Author

Readiness evidence update for the tBTC Schnorr FROST/ROAST migration stack, 2026-05-20.

From a clean worktree at PR head 37b2ce78348c4ab1c4a98eda8adcf99fa3d9aa1e, the focused integration-tag package lane passed:

go test -timeout 20m -tags 'integration frost_native frost_tbtc_signer' ./pkg/frost/... ./pkg/tbtc

Observed package coverage:

  • pkg/frost
  • pkg/frost/retry
  • pkg/frost/roast
  • pkg/frost/signing
  • pkg/tbtc (221.475s)

This narrows the keep-core evidence gap for FROST/tBTC focused package behavior, but it is not a production-readiness substitute for full keep-core integration/testnet coverage. The following remain open blockers for the tBTC FROST/ROAST readiness gate:

  • full go test -tags=integration ./... or equivalent full-stack current integration evidence
  • client-integration-test
  • deployment/testnet lanes
  • funded production-like wallet/sign/deposit/redemption/fraud/rollback run
  • operator rehearsal and signoff
  • final maintainer/security/runtime/governance acceptance

The corresponding tBTC evidence docs were pushed in tlabs-xyz/tbtc#402.

@mswilkison mswilkison marked this pull request as ready for review May 22, 2026 20:07
@mswilkison mswilkison changed the title from "Draft: keep-core FROST/ROAST readiness branch" to "FROST/ROAST readiness branch" May 22, 2026
@mswilkison mswilkison marked this pull request as draft May 22, 2026 20:32
mswilkison added a commit that referenced this pull request May 22, 2026
… at init (#3958)

## Summary

Addresses three FFI-safety findings from an independent review of #3866:

- **H3 (init-time panic)**:
`RegisterNativeExecutionFFISigningPrimitiveForBuild` and
`registerNativeExecutionAdapterForBuild` (frost_native) panic on
registration failure. Both are invoked from
`pkg/frost/signing/native_adapter_registration.go`'s package `init()`,
so a transient registration failure crashes the binary at startup.
Downstream code (`pkg/frost/signing/backend.go`) already returns
`ErrNativeCryptographyUnavailable` when no native adapter is registered,
so the legacy execution backend remains the safe-by-default path —
panicking at init turned a recoverable degradation into an outage.

Replace panics with structured `logger.Warnf` plus a package-level
`lastRegistrationError` and `LastNativeRegistrationError()` accessor.
Callers that want to fail startup on a registration error can opt in by
checking that accessor after `RegisterNativeExecutionAdapterForBuild`;
default callers continue booting with the legacy backend, exactly as if
`frost_native` was never enabled. The existing
`TestRegisterNativeExecutionFFISigningPrimitiveForBuild_ProviderErrorPanics`
becomes `..._ProviderErrorIsRecordedNotPanicked` and asserts the new
behavior.
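
A minimal sketch of the record-instead-of-panic pattern described above. Only `lastRegistrationError` and `LastNativeRegistrationError()` are named in this commit message; `registerNativeAdapter`, the mutex, and `errProviderUnavailable` are hypothetical stand-ins, and `fmt.Printf` stands in for `logger.Warnf`.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var (
	// registrationMu guards lastRegistrationError (assumed; the real
	// package may synchronize differently).
	registrationMu        sync.Mutex
	lastRegistrationError error

	// errProviderUnavailable is a hypothetical failure used for the demo.
	errProviderUnavailable = errors.New("provider unavailable")
)

// registerNativeAdapter is a hypothetical stand-in for the init-time
// registration call. It records a provider failure instead of panicking.
func registerNativeAdapter(provider func() error) {
	err := provider()
	registrationMu.Lock()
	lastRegistrationError = err
	registrationMu.Unlock()
	if err != nil {
		fmt.Printf("warn: native registration failed, legacy backend stays active: %v\n", err)
	}
}

// LastNativeRegistrationError returns the most recent registration error, or nil.
func LastNativeRegistrationError() error {
	registrationMu.Lock()
	defer registrationMu.Unlock()
	return lastRegistrationError
}

func main() {
	registerNativeAdapter(func() error { return errProviderUnavailable })
	// Strict callers opt in to failing startup; default callers keep booting.
	if err := LastNativeRegistrationError(); err != nil {
		fmt.Println("strict caller would abort startup:", err)
	}
}
```

The accessor keeps the decision with the caller: a deployment that prefers to fail fast can check it once after registration, while the default path degrades to the legacy backend.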

- **M1 (nil ptr free)**: `parseBuildTaggedTBTCSignerResult`
unconditionally deferred `C.tbtc_signer_free_buffer(result.buffer.ptr,
result.buffer.len)` even when the C wrapper's status-code -1 path
returned `result.buffer.ptr == NULL`. The C wrapper checks the
`frost_tbtc_free_buffer` symbol for NULL but does not check the buffer
pointer, so a future Rust-side change that dereferenced its ptr argument
without a NULL guard would crash. Skip the defer when `result.buffer.ptr
== nil`.

- **M6 (unbounded length)**: `unmarshalSignerMaterialFromPersistence`
accepted any uvarint length within the data buffer. A corrupted state
file or hostile peer carrying a multi-hundred-MiB envelope would
allocate that many bytes before the existing bounds check ran. Cap the
format length at 256 bytes and the payload length at 256 KiB —
comfortably above any real signer material envelope — and reject earlier
with a clear error. New regression tests
`TestUnmarshalSignerMaterialFromPersistence_RejectsOversizedFormatLength`
and `..._RejectsOversizedPayloadLength`.
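
The length cap can be illustrated with a small decoder sketch. The 256-byte and 256 KiB limits come from the text above; `readBoundedField`, the constant names, and the error values are hypothetical and do not mirror the actual `unmarshalSignerMaterialFromPersistence` layout.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

const (
	maxFormatLength  = 256        // bytes, cap on the format field
	maxPayloadLength = 256 * 1024 // 256 KiB, cap on the payload field
)

var (
	errLengthTooLarge = errors.New("length prefix exceeds cap")
	errTruncated      = errors.New("truncated input")
)

// readBoundedField reads a uvarint length prefix followed by that many
// bytes. The declared length is checked against limit before anything is
// allocated, so an oversized prefix is rejected early.
func readBoundedField(data []byte, limit uint64) (field, rest []byte, err error) {
	length, n := binary.Uvarint(data)
	if n <= 0 {
		return nil, nil, errTruncated
	}
	if length > limit {
		return nil, nil, fmt.Errorf("%w: %d > %d", errLengthTooLarge, length, limit)
	}
	data = data[n:]
	if uint64(len(data)) < length {
		return nil, nil, errTruncated
	}
	field = make([]byte, length)
	copy(field, data[:length])
	return field, data[length:], nil
}

func main() {
	// A prefix claiming 1 GiB is refused without allocating 1 GiB.
	hostile := binary.AppendUvarint(nil, 1<<30)
	_, _, err := readBoundedField(hostile, maxPayloadLength)
	fmt.Println(err)
}
```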

## Out of scope (deferred)

The remaining placeholder-fencing findings from the same review (H1:
`KeyGroupSource == "legacy-wallet-pubkey"` fallback; H2: DKG
placeholder participant pubkeys; H4: silent key-group substitution when
source is legacy) require maintainer policy alignment on whether to gate
the `frost_tbtc_signer` build behind an opt-in flag or
refuse-by-default. Not included here.

Several MED findings around Bitcoin witness preservation, FROST message
channel back-pressure, and replay-error string matching also require
behavior decisions and are not included in this safety-hygiene slice.

## Verification

Local (GOCACHE under `/private/tmp`):

- `go test ./pkg/frost/...`: PASS
- `go test -tags 'frost_native frost_tbtc_signer' ./pkg/frost/...`: PASS
- `go test ./pkg/tbtc -run 'TestUnmarshalSignerMaterial|TestMarshalSigner|TestSignerMarshalling|TestFuzzDecodeNativeSignerMaterial'`: PASS
- `go test -tags 'frost_native frost_tbtc_signer' ./pkg/tbtc -run 'TestConfigureFrostSigningBackend|TestNewNode_ConfiguresFrostSigningBackend|TestSigningExecutor_Sign|TestRegisterSignerMaterialResolverForBuild'`: PASS
- `go vet ./pkg/frost/... ./pkg/tbtc`: clean
mswilkison added a commit that referenced this pull request May 22, 2026
… message hygiene (#3959)

## Summary

Bundles four findings from the independent PR #3866 review that all sit
in the same code seam (frost_native scaffold path + receive loops).
Stacked on #3958.

### H1+H4 — scaffold key-group must be opt-in (was silently accepted)

`signer_material_resolver_build_frost_native_tbtc_signer.go` built
signer material with `KeyGroupSource: "legacy-wallet-pubkey"` (a
sha256 placeholder, not a DKG output) and the FFI primitive in
`native_ffi_primitive_transitional_frost_native.go` silently
substituted the Rust signer's RunDKG key group when the source was that
placeholder. Production deployments with placeholder material would have
signed through whatever key group the Rust side returned without
operator-facing signal.

Add a refuse-by-default opt-in:
`KEEP_CORE_FROST_TBTC_SIGNER_ACCEPT_SCAFFOLD_KEY_GROUP=1`. The new
`signing.AcceptScaffoldKeyGroupEnabled` helper is per-call (not
cached), so unsetting the env var restores fail-closed behavior without
a restart. Both the resolver and the FFI primitive check the flag; both
refuse with a clear error that names the env var and the placeholder
source. New regression test pins the refuse-by-default path; existing
scaffold-using tests opt in via `t.Setenv`.
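
A minimal sketch of a per-call, refuse-by-default gate of this kind. The env var name, the helper name `AcceptScaffoldKeyGroupEnabled`, and the placeholder source string come from the text above; the exact match on `"1"`, `checkKeyGroupSource`, and `setScaffoldOptIn` are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"os"
)

const (
	acceptScaffoldKeyGroupEnv = "KEEP_CORE_FROST_TBTC_SIGNER_ACCEPT_SCAFFOLD_KEY_GROUP"
	scaffoldKeyGroupSource    = "legacy-wallet-pubkey"
)

// AcceptScaffoldKeyGroupEnabled reads the environment on every call, so
// unsetting the variable restores fail-closed behavior without a restart.
// Matching exactly "1" is an assumption of this sketch.
func AcceptScaffoldKeyGroupEnabled() bool {
	return os.Getenv(acceptScaffoldKeyGroupEnv) == "1"
}

// checkKeyGroupSource is a hypothetical guard shared by the resolver and
// the FFI primitive: placeholder material is refused unless opted in.
func checkKeyGroupSource(source string) error {
	if source != scaffoldKeyGroupSource || AcceptScaffoldKeyGroupEnabled() {
		return nil
	}
	return fmt.Errorf("refusing placeholder key group source %q: set %s=1 to opt in",
		source, acceptScaffoldKeyGroupEnv)
}

// setScaffoldOptIn is a demo helper that toggles the env var.
func setScaffoldOptIn(on bool) {
	if on {
		os.Setenv(acceptScaffoldKeyGroupEnv, "1")
		return
	}
	os.Unsetenv(acceptScaffoldKeyGroupEnv)
}

func main() {
	setScaffoldOptIn(false)
	fmt.Println("default:", checkKeyGroupSource(scaffoldKeyGroupSource))
	setScaffoldOptIn(true)
	fmt.Println("opted in:", checkKeyGroupSource(scaffoldKeyGroupSource))
}
```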

### M2+M3 — Bitcoin witness restoration refuses unsupported shapes

`ReplaceUnsignedTransaction`'s restoration path handled only
single-element previous witnesses (P2WSH redeem script). Multi-element
witnesses (P2TR script-path) were silently dropped. Replace with an
explicit switch: 0 elements → leave empty, 1 → restore as before, ≥2 →
fail loudly. Removes the tautological inner
`len(replacedInput.X) == 0` checks that the outer refusals already
guarantee. New regression test
`TestTransactionBuilder_ReplaceUnsignedTransaction_RejectsMultiElementPreviousWitness`.
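
The three-way switch can be sketched as follows. `restorePreviousWitness` and its `[][]byte` witness representation are hypothetical; the actual builder operates on its own transaction types.

```go
package main

import "fmt"

// restorePreviousWitness decides what to carry over from the witness that
// an input held before the unsigned transaction was replaced.
func restorePreviousWitness(previous [][]byte) ([][]byte, error) {
	switch len(previous) {
	case 0:
		// Nothing to restore; leave the witness empty.
		return nil, nil
	case 1:
		// Single element (e.g. a redeem script): restore a copy.
		restored := append([]byte(nil), previous[0]...)
		return [][]byte{restored}, nil
	default:
		// Multi-element witnesses are not supported; fail instead of
		// silently dropping data.
		return nil, fmt.Errorf("unsupported previous witness with %d elements", len(previous))
	}
}

func main() {
	_, err := restorePreviousWitness([][]byte{{0x01}, {0x02}})
	fmt.Println(err)
}
```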

### M5 — first-write-wins on peer messages

Three round-message receive loops (tbtc-signer contribution, FROST round
one, FROST round two) did last-write-wins, letting a peer mutate its own
contribution after first send. Switch to first-write-wins with
byte-equal retransmissions idempotent and conflicting retransmissions
logged via a new `protocolLogger` channel. Three message-equality
helpers cover the three message types.
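
A minimal sketch of first-write-wins assembly, assuming messages can be compared as raw bytes. The `roundMessages` map, the `uint32` sender ID, and the result names are hypothetical; the commit uses per-message-type equality helpers instead of `bytes.Equal` on a single payload.

```go
package main

import (
	"bytes"
	"fmt"
)

type acceptResult int

const (
	accepted  acceptResult = iota // first message from this sender
	duplicate                     // byte-equal retransmission, idempotent
	conflict                      // differing retransmission, original kept
)

// roundMessages stores the first payload seen from each sender.
type roundMessages map[uint32][]byte

// accept applies first-write-wins: a later message never overwrites the
// stored one, and a differing retransmission is reported as a conflict.
func (m roundMessages) accept(sender uint32, payload []byte) acceptResult {
	existing, seen := m[sender]
	if !seen {
		m[sender] = append([]byte(nil), payload...)
		return accepted
	}
	if bytes.Equal(existing, payload) {
		return duplicate
	}
	return conflict
}

func main() {
	msgs := roundMessages{}
	fmt.Println(msgs.accept(1, []byte("commitment-a")) == accepted)
	fmt.Println(msgs.accept(1, []byte("commitment-b")) == conflict)
}
```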

## Out of scope (deferred to separate PRs)

- **H2**: DKG placeholder participant pubkeys
(`buildTaggedTBTCSignerDKGPlaceholderPublicKeyHex`) needs either
wiring real `MembershipValidator` pubkeys through or fencing under the
same env flag.
- **M4**: ROAST-compliant bounded transition evidence for the
non-blocking message channel. Multi-PR effort.
- **M7**: Real ROAST-aware retry replacing the byte-identical tECDSA
shuffle in `pkg/frost/retry/retry.go`. Multi-PR effort.
- **L5**: FFI status-code semantics for replay detection. Paired with a
tbtc-signer follow-up.

## Verification

Local (GOCACHE under `/private/tmp`):

- `go test ./pkg/frost/... ./pkg/bitcoin`: PASS
- `go test -tags 'frost_native frost_tbtc_signer' ./pkg/frost/... ./pkg/bitcoin`: PASS
- `go test -tags 'frost_native frost_tbtc_signer' ./pkg/tbtc -run 'TestConfigureFrostSigningBackend|TestNewNode_ConfiguresFrostSigningBackend|TestSigningExecutor_Sign|TestRegisterSignerMaterialResolverForBuild|TestBuildTaggedTBTCSignerRoundKeyGroup|TestBuildTaggedLegacyCompatibleNativeExecutionFFISigningPrimitive|TestTransactionBuilder_ReplaceUnsignedTransaction'`: PASS
mswilkison added a commit that referenced this pull request May 22, 2026
…3962)

## Summary

Adds **RFC-21** as the design doc that scopes the M4 (transition
evidence) and M7 (ROAST-aware retry) findings from the independent
review of #3866 into a single layered design and a phased,
PR-sized implementation plan.

This PR is **doc-only**. It introduces no behaviour change. Subsequent
implementation PRs reference RFC-21 in their descriptions.

Stacked on #3961.

## Why one design, not two

M4 and M7 share the same notion of *attempt context* and *transition
evidence*:

- Fixing M4 alone produces evidence that no consumer reads.
- Fixing M7 alone gives the consumer nothing to drive retry decisions
on.

The RFC treats them as one design split into linear phases.

## Phasing

- **Phase 0** -- this RFC.
- **Phase 1** -- `AttemptContext` type + canonical hash; protocol
  messages carry attempt-context binding (optional during migration).
- **Phase 2** -- receiver overflow tracking (M4 layer A) plumbed
  through the three `select { default }` drop sites, default no-op.
- **Phase 3** -- coordinator state machine: `BeginAttempt`,
  `RecordEvidence`, `NextAttempt`. Deterministic
  `(AttemptContext, TransitionEvidence) -> AttemptContext` map.
- **Phase 4** -- wire receiver to coordinator behind
  `frost_roast_retry` build tag.
- **Phase 5** -- retry adapter +
  `EvaluateRoastRetryForSigning`; migrate first call site behind
  the build tag with readiness-gate guard.
- **Phase 6** -- migrate remaining call sites; delete the
  byte-identical-to-tECDSA shuffle once unused.
- **Phase 7** -- flip the readiness manifest to `present` once Phase
  6 ships and integration tests run against a real testnet (only
  then; no early flip).
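
The Phase 2 idea (overflow tracking at a `select { default }` drop site, with a no-op default) might look roughly like the sketch below. `overflowRecorder`, `RecordOverflow`, `noOpRecorder`, `countingRecorder`, `message`, and `enqueue` are all hypothetical names chosen for illustration; the RFC's actual types are not shown in this summary.

```go
package main

import "fmt"

// overflowRecorder receives a notification whenever a message is dropped
// because the receive channel is full.
type overflowRecorder interface {
	RecordOverflow(sender uint32)
}

// noOpRecorder is the default: dropping behaves exactly as before.
type noOpRecorder struct{}

func (noOpRecorder) RecordOverflow(uint32) {}

// countingRecorder tallies drops per sender for later evidence.
type countingRecorder struct{ counts map[uint32]int }

func (r *countingRecorder) RecordOverflow(sender uint32) { r.counts[sender]++ }

type message struct {
	sender  uint32
	payload []byte
}

// enqueue is a non-blocking send: when the buffer is full the message is
// dropped, and the drop is reported to the recorder.
func enqueue(ch chan<- message, msg message, rec overflowRecorder) bool {
	select {
	case ch <- msg:
		return true
	default:
		rec.RecordOverflow(msg.sender)
		return false
	}
}

func main() {
	ch := make(chan message, 1)
	rec := &countingRecorder{counts: map[uint32]int{}}
	enqueue(ch, message{sender: 7}, rec)
	enqueue(ch, message{sender: 7}, rec) // buffer full, dropped
	fmt.Println("drops from sender 7:", rec.counts[7])
}
```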

## Open questions called out explicitly

The RFC lists four open design questions that need cross-team
review before Phase 3 lands:

1. Cross-process coordinator agreement -- gossip topic choice.
2. Persistence across signer restart.
3. FFI surface (Rust signer error-code style; follows the L5
   pattern from #425 / #3961).
4. Backward-compat horizon for the `AttemptContextHash` field.

## Out of scope

- DKG retry (separate RFC).
- Bitcoin transaction-builder changes.
- Operator UX changes (CLI, dashboards) -- land alongside Phase 5/6.
- Cross-domain ROAST between keep-core and tbtc-signer.

## Test plan

- [ ] Reviewer reads RFC end-to-end.
- [ ] Reviewer flags any phase that should be split further or
  reordered before Phase 1 begins.
- [ ] Reviewer answers the four open questions or marks them
  defer-to-Phase-3.

No code change in this PR, so no CI test run is meaningful beyond
asciidoc rendering.
mswilkison added a commit that referenced this pull request May 22, 2026
…ild tag (#3965)

## Summary

Forward-fix for #3866 CI: the Phase 1B binding file and test
referenced message types defined in `//go:build frost_native`
files but were themselves untagged. Untagged staticcheck on
the integration branch (#3866) then reported
`undefined: nativeFROSTRoundOneCommitmentMessage` and the
client-lint job failed.

Adds `//go:build frost_native` to:

- `pkg/frost/signing/attempt_context_binding.go`
- `pkg/frost/signing/attempt_context_binding_test.go`

The helpers and tests are only exercised by gated code paths
(the three message-type methods all live behind `frost_native`),
so the build tag is the right locus.

## Why now

PRs #3963 (Phase 1A) and #3964 (Phase 1B) were merged into the
`feat/frost-schnorr-migration-scaffold` branch before #3866's
integration CI ran. Once the merges landed, #3866's
`client-lint` job rebuilt under the untagged staticcheck pass
and exposed the missing tag. This PR is the smallest possible
fix.

## Verification

Locally with module-pinned staticcheck 2025.1.1:

```
go build ./...
go build -tags 'frost_native frost_tbtc_signer' ./pkg/frost/...
go test  -tags 'frost_native frost_tbtc_signer' ./pkg/frost/signing/
staticcheck -checks "-SA1019" ./... # whole repo, silent
staticcheck -checks "-SA1019" ./pkg/frost/signing  # silent
```

## Test plan

- [ ] CI green: client-lint, client-vet, client-scan,
  client-build-test-publish all pass.
- [ ] #3866 lint job recovers once this merges into
  `feat/frost-schnorr-migration-scaffold`.
mswilkison added a commit that referenced this pull request May 23, 2026
…3988)

## Summary

Closes the **M4 gap** from the original PR #3866 review by adding
the two evidence categories the RFC-21 Phase-2 work left as future
work: **validation-rejection evidence** and **first-write-wins-conflict
evidence**.

With this PR, the `NextAttempt` policy can permanently exclude
misbehaving peers on all four ROAST blame channels --
transport-overflow, validation-reject, equivocation-conflict, and
silence -- instead of just overflow + silence.

## Why this matters

A peer that only sends **malformed messages** (validation rejects,
never overflows the channel) was previously indistinguishable from
a silent peer. The transient silence-parking policy would
bench-and-reinstate them indefinitely, never permanently excluding
the malicious behaviour. Same for a peer **equivocating mid-attempt**:
the existing first-write-wins assembly correctly dropped the
conflicting retransmission but only logged the event -- the bundle
carried no structured evidence the coordinator's policy could act
on.

## What lands

### Recorder API

| Surface | Notes |
|---|---|
| `RecordReject(sender, reason)` | reason captured verbatim; per-reason quota counter |
| `RecordConflict(sender)` | saturates at conflict quota |
| `RejectQuotaDefault = 8`, `ConflictQuotaDefault = 4` | matches RFC-21 Layer A categoryQuota |
| Per-reason quotas independent | peer cannot saturate one reason to mask another |
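
A minimal sketch of a recorder with saturating, per-reason quotas. The method names and the two quota constants come from the table above; the `recorder` struct, its map layout, and the `uint32` sender type are assumptions.

```go
package main

import "fmt"

const (
	RejectQuotaDefault   = 8
	ConflictQuotaDefault = 4
)

// rejectKey gives every (sender, reason) pair its own counter, so
// saturating one reason cannot hide another.
type rejectKey struct {
	sender uint32
	reason string
}

type recorder struct {
	rejects   map[rejectKey]int
	conflicts map[uint32]int
}

func newRecorder() *recorder {
	return &recorder{rejects: map[rejectKey]int{}, conflicts: map[uint32]int{}}
}

// RecordReject counts a validation reject, saturating at the quota.
func (r *recorder) RecordReject(sender uint32, reason string) {
	k := rejectKey{sender, reason}
	if r.rejects[k] < RejectQuotaDefault {
		r.rejects[k]++
	}
}

// RecordConflict counts a first-write-wins conflict, saturating at the quota.
func (r *recorder) RecordConflict(sender uint32) {
	if r.conflicts[sender] < ConflictQuotaDefault {
		r.conflicts[sender]++
	}
}

func main() {
	rec := newRecorder()
	for i := 0; i < 20; i++ {
		rec.RecordReject(3, "validation_gate_rejected")
	}
	rec.RecordReject(3, "other_reason")
	fmt.Println(rec.rejects[rejectKey{3, "validation_gate_rejected"}], rec.rejects[rejectKey{3, "other_reason"}])
}
```

Saturating counters bound the evidence size per attempt regardless of how many messages a peer sends.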

### Wire types

| Type | Sort order | Cap |
|---|---|---|
| `RejectEntry{Sender, Reason, Count}` | asc by Sender, then asc by Reason | per-attempt evidence size bounded by Σ quotas |
| `ConflictEntry{Sender, Count}` | asc by Sender | per-attempt evidence size bounded by Σ quotas |

Both fields use `omitempty` so pre-PR snapshots round-trip without
the new fields. `Validate()` enforces sorted-ascending invariants.
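
The sorted-ascending invariant for reject entries could be checked along these lines. The field names follow the table above, but the field types, the `validateRejects` helper, and the choice to treat duplicate `(Sender, Reason)` pairs as invalid are assumptions of this sketch.

```go
package main

import (
	"errors"
	"fmt"
)

type RejectEntry struct {
	Sender uint32
	Reason string
	Count  uint32
}

var errUnsortedRejects = errors.New("reject entries not strictly ascending by (Sender, Reason)")

// validateRejects verifies ascending order by Sender, then by Reason.
// Strict ordering (no duplicate keys) is an assumption of this sketch.
func validateRejects(entries []RejectEntry) error {
	for i := 1; i < len(entries); i++ {
		prev, cur := entries[i-1], entries[i]
		inOrder := prev.Sender < cur.Sender ||
			(prev.Sender == cur.Sender && prev.Reason < cur.Reason)
		if !inOrder {
			return errUnsortedRejects
		}
	}
	return nil
}

func main() {
	entries := []RejectEntry{
		{Sender: 2, Reason: "a", Count: 1},
		{Sender: 1, Reason: "a", Count: 1},
	}
	fmt.Println(validateRejects(entries))
}
```

A canonical order matters because the evidence feeds a deterministic policy: two nodes holding the same entries should serialize and hash them identically.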

### NextAttempt policy

| Threshold | Value | Source |
|---|---|---|
| `RejectExclusionThreshold` | 1 | RFC-21 Layer B ("any non-transport reject is sufficient cause") |
| `ConflictExclusionThreshold` | 1 | A single conflict is byzantine evidence |

`computeNextAttempt` merges `overflowBlamed`, `rejectBlamed`,
`conflictBlamed` into the permanent ExcludedSet. The
`blamedSenders` helper is factored out so all three categories
share the deterministic sort + threshold-comparison logic.
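
A sketch of what a shared threshold-and-sort helper might look like. Only the name `blamedSenders` and the threshold value of 1 appear in the text above; the signature and the map-based input are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// blamedSenders returns, in ascending order, every sender whose count
// reaches the threshold. Sorting removes Go's random map iteration order
// from the result, so all nodes derive the same exclusion list.
func blamedSenders(counts map[uint32]int, threshold int) []uint32 {
	var blamed []uint32
	for sender, count := range counts {
		if count >= threshold {
			blamed = append(blamed, sender)
		}
	}
	sort.Slice(blamed, func(i, j int) bool { return blamed[i] < blamed[j] })
	return blamed
}

func main() {
	rejects := map[uint32]int{5: 1, 2: 3, 9: 0}
	fmt.Println(blamedSenders(rejects, 1))
}
```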

### Receive-loop wiring

Three reject sites and three conflict sites updated across the two
files that house the three FROST/tbtc-signer receive loops:

| Site | Was | Now |
|---|---|---|
| `shouldAcceptNativeFROSTMessage` returns false | silent drop | `evidence.RecordReject(senderID, "validation_gate_rejected")` + drop |
| First-write-wins conflict in assembly loop | warn log only | `evidence.RecordConflict(senderID)` + warn log |

## Test coverage (15 new cases)

- 7 recorder tests: accumulation, per-reason quota saturation,
per-reason independence, conflict saturation, all-categories-present,
NoOp-inert, RFC-constant assertions
- 5 policy tests: single reject excludes, single conflict excludes,
reject+conflict on different senders, empty evidence (sanity),
threshold-constant assertions
- Receive-loop wiring is covered indirectly by the recorder unit tests;
the NoOp default keeps pre-RFC-21 receive semantics observably unchanged
so no integration-level test is required.

## Verification

| Command | Result |
|---|---|
| `go build ./...` + `go build -tags 'frost_native frost_tbtc_signer frost_roast_retry' ./...` | both clean |
| `go test ./pkg/frost/...` + race | pass |
| `go test -tags 'frost_native frost_tbtc_signer frost_roast_retry' ./pkg/frost/...` | pass (5 packages) |
| `staticcheck -checks '-SA1019' ./pkg/frost/...` | silent |
| `go vet ./pkg/frost/...` + `gofmt -l ./pkg/frost/` | clean |

## RFC-21 status

With this PR, all four ROAST evidence categories are operational.
M4 from the original PR #3866 review is **fully closed**. The
keep-core code arc for RFC-21 is now feature-complete; remaining
work is operations-side (integration testnet, manifest flip).

## Test plan

- [ ] CI green.
- [ ] Reviewer confirms the per-reason quota independence is the right
semantics (alternative: single per-sender reject counter).
- [ ] Reviewer confirms threshold = 1 for both reject and conflict
(alternative: higher to absorb noise; trade-off is faster vs slower
exclusion of misbehaving peers).
mswilkison added a commit that referenced this pull request May 24, 2026
#3993)

## Why

The RFC-21 Phase 6 review decided which orchestration errors are
fallback-eligible (static config errors → safe to fall back to legacy
retry path) and which must hard-fail (runtime per-attempt errors → no
fallback, since per-participant divergence creates split-brain group
fracture). The rationale lived in commit messages, the RFC text, and
inline comments on individual sentinels — distributed enough that a
future maintainer reading just `roast_retry_orchestration.go` could
miss the load-bearing constraint.

This PR adds a top-of-file design-rationale block that centralises the
decision in the place that enforces it.

## What changed

- One file changed: `pkg/frost/signing/roast_retry_orchestration.go`
- Pure documentation: no behavior change, no test changes, no API change
- 49 lines added (one comment block)

## What it captures

1. **STATIC vs RUNTIME classification**: explicit definitions, with the
sentinel (`ErrNoRoastRetryCoordinatorRegistered`) and detection
mechanism (`errors.Is` in `signing_loop_roast_dispatcher.go`) named.
2. **Why static-error fallback is safe**: every honest signer observes
the same node-local config at startup, so the fallback decision is
deterministic across the group.
3. **Why runtime-error fallback is unsafe**: per-attempt protocol state
errors can be observed by some participants and not others within the
same attempt; fallback would put some operators on new code and others
on legacy for the same attempt.
4. **Enforcement rule**: any error surfaced from this package that is
intended to permit fallback MUST be the sentinel; wrapping ANY runtime
error in the sentinel is a safety regression that PR reviewers should
reject.
5. **Historical redirect**: the earlier design had `BeginAttempt`
failures fall back, on the assumption that BeginAttempt was cheap
idempotent setup. Review identified that BeginAttempt mutates
per-attempt state and can fail from races with concurrent receives; the
taxonomy was tightened so only true configuration errors are
fallback-eligible.
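
The enforcement rule can be sketched as a single `errors.Is` check on the sentinel. `ErrNoRoastRetryCoordinatorRegistered` is named above; its message text, `errAttemptRace`, and `fallbackEligible` are hypothetical and only illustrate the static-vs-runtime split.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNoRoastRetryCoordinatorRegistered is the static-configuration
// sentinel (message text assumed for this sketch).
var ErrNoRoastRetryCoordinatorRegistered = errors.New("no ROAST retry coordinator registered")

// errAttemptRace is a hypothetical runtime, per-attempt error.
var errAttemptRace = errors.New("begin attempt raced with a concurrent receive")

// fallbackEligible permits the legacy retry path only for the static
// sentinel. Runtime errors must hard-fail, because different participants
// may observe them differently within the same attempt.
func fallbackEligible(err error) bool {
	return errors.Is(err, ErrNoRoastRetryCoordinatorRegistered)
}

func main() {
	static := fmt.Errorf("dispatch: %w", ErrNoRoastRetryCoordinatorRegistered)
	runtime := fmt.Errorf("dispatch: %w", errAttemptRace)
	fmt.Println(fallbackEligible(static), fallbackEligible(runtime))
}
```

Because `errors.Is` follows `%w` chains, wrapping a runtime error together with the sentinel would make it fallback-eligible, which is exactly the regression the enforcement rule asks reviewers to reject.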

## Lineage

Surfaced in the cross-PR review re-evaluation following PR #3866
follow-up landings. Originally tracked as "Document static-vs-runtime
classification canonically" — initially flagged as "available if you
want," now elevated because the rationale was the most important
architectural decision in the RFC-21 stack and is currently the easiest
piece of design context to lose.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
mswilkison added a commit that referenced this pull request May 25, 2026
## Summary
- add FROST WalletRegistry and FrostDkgValidator bindings plus config
and chain attachment
- implement v4 FROST DKG result digest assembly with full vs active
member types and fixture-backed parity tests
- add the native FROST DKG engine boundary, P2P round protocol, result
signing, coordinator lifecycle, challenge monitoring, and wallet ID
handling for x-only output keys

## Notes
- Stacked on #3866 / `feat/frost-schnorr-migration-scaffold`.
- Runtime DKG still requires the concrete native DKG engine registration
from the frost-uniffi-sdk UDL/Rust export work.
- The digest fixture now records the tBTC TypeScript generator source
and regeneration command. A paired tBTC PR should still commit the
mirror fixture at `docs/test-vectors/frost-dkg-result-digest-v1.json`
and add the TS-side emitter/test; until then, the keep-core test
verifies the pinned bytes and metadata but does not compare against a
checked-in tBTC mirror file.

## Validation
- `go test ./pkg/frost/registry ./pkg/chain/ethereum
./pkg/chain/ethereum/frost/gen/...`
- `go test ./pkg/tbtc -run
"TestFrostDKGSignatureThreshold|TestBoundedFrostDKGRecoveryStartBlock|TestFrostDKGRecoveryLookBackBlocks"
-count=1`
- `go test -tags "frost_native frost_tbtc_signer" ./pkg/tbtc -run
"TestLowestLocalActiveMemberIndex|TestFrostMisbehavedMemberIndices|TestFrostDKGSignatureThreshold|TestBoundedFrostDKGRecoveryStartBlock|TestFrostDKGRecoveryLookBackBlocks"
-count=1`
- CI `client-build-test-publish` passes on the prior pushed commit;
rerunning for the latest follow-up commit after push.

## Local Note
- Full local `go test ./pkg/tbtc` currently fails in standalone
`TestWatchCoordinationWindows`; this reproduces when run by itself and
appears unrelated to the FROST DKG coordinator changes.
@coderabbitai

coderabbitai Bot commented May 27, 2026


Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 95f63242-c79d-4995-897f-6476faec66c5

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


mswilkison added a commit that referenced this pull request Jun 2, 2026
## Summary

Stacked on #3866.

This PR implements Taproot-native key-path wallet signing for the FROST
migration path. It adds P2TR script handling, BIP341 SIGHASH_DEFAULT
computation, BIP340 Schnorr signature verification, and single-element
Taproot witness application in the Bitcoin transaction builder.

The wallet transaction executor now routes all-P2TR transactions through
the Schnorr/Taproot witness path. Mixed Taproot plus legacy inputs are
rejected before signing, so this does not introduce a dual-signing
model.

## Details

- Add P2TR script helpers and x-only output key extraction.
- Add Taproot key-path sighash generation without a repo-wide btcd
upgrade.
- Add `AddTaprootKeyPathSignatures` for 64-byte BIP340 signatures.
- Preserve canonical 32-byte FROST signing messages when `big.Int`
strips leading zero bytes.
- Add builder and wallet tests covering all-P2TR signing and mixed-input
rejection.
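
The leading-zero issue mentioned above arises because `big.Int.Bytes()` returns the minimal big-endian encoding. A minimal sketch of restoring a fixed 32-byte message with the standard `FillBytes` method follows; `canonicalMessage32` and `messageFromBytes` are hypothetical helpers, not the PR's actual functions.

```go
package main

import (
	"fmt"
	"math/big"
)

// messageFromBytes mimics a legacy path that carries the message as a
// big.Int, which discards leading zero bytes.
func messageFromBytes(b []byte) *big.Int {
	return new(big.Int).SetBytes(b)
}

// canonicalMessage32 re-encodes the value as exactly 32 big-endian bytes,
// left-padded with zeros. Values that do not fit are rejected up front,
// since FillBytes panics when the buffer is too small.
func canonicalMessage32(m *big.Int) ([32]byte, error) {
	var out [32]byte
	if m.Sign() < 0 || m.BitLen() > 256 {
		return out, fmt.Errorf("message does not fit in 32 bytes")
	}
	m.FillBytes(out[:])
	return out, nil
}

func main() {
	msg := messageFromBytes([]byte{0x00, 0x00, 0x2a})
	canonical, err := canonicalMessage32(msg)
	fmt.Println(len(msg.Bytes()), len(canonical), err)
}
```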

## Validation

- `go test ./pkg/bitcoin ./pkg/tbtc`
- `go test -tags=frost_native ./pkg/frost/signing`
…d-set acceptance

When a seat enters lost-sync from a transition bundle for an attempt it never
observed, the triggering bundle is operator-authenticated -- log the sending seat
and the claimed attempt-context hash once per lost-sync episode, so the
operational runbook can attribute and remove/slash a member spamming
bogus-attempt bundles. (This is an authenticated-insider liveness halt the blame
bridge does NOT close, because a never-observed attempt yields no evidence to
attribute it.)

markLostSync now uses CompareAndSwap and returns whether it transitioned, so the
attribution is logged exactly once even while the listener keeps receiving such
bundles. No change to the fail-closed semantics.
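
A minimal sketch of the log-once transition, assuming the lost-sync flag is an `atomic.Bool`. Only `markLostSync` and its CompareAndSwap behavior are stated above; the `seat` type, `onUnobservedAttemptBundle`, and the log format are hypothetical.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

type seat struct {
	lostSync atomic.Bool
}

// markLostSync flips the flag and reports whether this call performed the
// transition. Only the first caller per episode gets true.
func (s *seat) markLostSync() bool {
	return s.lostSync.CompareAndSwap(false, true)
}

// onUnobservedAttemptBundle stays fail-closed on every bundle, but logs
// the attribution only on the transition into lost-sync.
func (s *seat) onUnobservedAttemptBundle(sender uint32, attemptHash string) bool {
	if s.markLostSync() {
		fmt.Printf("lost-sync: sender seat %d claimed attempt-context hash %s\n", sender, attemptHash)
		return true
	}
	return false
}

func main() {
	s := &seat{}
	s.onUnobservedAttemptBundle(4, "abc123")
	s.onUnobservedAttemptBundle(4, "def456") // already lost-sync, not logged again
}
```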

Also document that this residual is accepted under the PERMISSIONED operator set
(attributable + liveness-only + governance-removable) and MUST be revisited
before any move to a permissionless set, where the f+1 snapshot-corroboration /
resync fix would be warranted instead of accepting it.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
During the ECDSA->FROST migration an operator is a member of BOTH sortition
pools at once: existing ECDSA wallets keep draining via redemptions while FROST
is live. The seven sortition.Chain methods on TbtcChain switch on
hasFrostAuthorization() (true whenever FROST is configured), and a single
MonitorPool loop consumed them -- so the loop labeled "legacy ECDSA sortition
pool monitoring" actually maintained the FROST pool during overlap, and
post-cutover (DisableLegacyECDSA=true) the only loop stopped, leaving the FROST
pool -- the one new FROST wallet DKG selects from -- unmonitored.

Bind monitoring explicitly per pool. Two sortition.Chain views
(ecdsaSortitionChain, frostSortitionChain) route directly to their own
registry/pool with no hasFrostAuthorization() switch, and the node runs one
MonitorPool loop per pool:
- ECDSA loop: existing flags + policy, now correctly ECDSA-bound.
- FROST loop: new DisableFrostSortitionPoolMonitoring flag (default-on), beta
  policy only (the ECDSA pre-params gate does not apply to FROST DKG), gated on
  FROST being configured and INDEPENDENT of DisableLegacyECDSA so the FROST pool
  stays monitored during the drain and after the legacy pool is retired.

The operator is not necessarily registered in both pools, so the FROST loop
treats sortition.ErrOperatorUnknown (now exported) as non-fatal: it warns and
leaves FROST monitoring inactive rather than aborting node startup. The legacy
loop keeps its existing fail-fast (the operator is ECDSA-registered during the
drain). TbtcChain's own sortition.Chain methods are left unchanged, so heartbeat
and other callers are unaffected; GetOperatorID stays ECDSA-bound. Both loops'
join/update txs already share tc.transactionMutex + tc.nonceManager, so they
cannot race on the operator account nonce.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
mswilkison and others added 19 commits June 27, 2026 10:44
…uted DKG (#4110)

Stacks on the FROST/ROAST readiness branch (#3866). Adds two
separate-process ("shape B") real-crypto e2es over **real libp2p**,
complementing the in-process shape-A multinode test. Each re-execs the
test binary as N worker processes that each link `libfrost_tbtc`.

## What's added

- **shape-B** (`roast_shapeb_libp2p_multiproc_e2e_…`): dealer DKG run
once in a bootstrap subprocess, the encrypted key group copied into each
worker's own state dir; every worker drives the **ROAST
interactive-signing runner** over real libp2p and independently
aggregates the same BIP-340 signature (n winners, vs the shared-engine
shape-A's one). Covers per-node engine/state isolation + the libp2p
outer framing for the runner + ROAST + transport seam.
- **distributed-DKG** (`roast_distributed_dkg_libp2p_multiproc_e2e_…`):
every worker runs the **real distributed FROST DKG** (`part1/2/3`) over
libp2p, with round-2 per-recipient secret shares **sealed via secp256k1
ECDH + AES-256-GCM** (cleartext round-2 over a broadcast bus would let
any node sum `f_i(j)` and reconstruct a peer's share), so each node
holds **only its own key package**; then threshold-signs via the
stateless low-level path. Closes the dealer-DKG key-custody gap end to
end.

## Verification

Both green standalone (stable across repeated runs) and under the full
cgo gate, linking `libfrost_tbtc` built from the pinned signer ref
(`ci/frost-signer-pin.env` = `6e3718ba0`, unchanged):

```
CGO_ENABLED=1 KEEP_CORE_FROST_REQUIRE_CGO=true \
  go test -tags "frost_native frost_tbtc_signer" -run TestRealCgoInteractiveSigning ./pkg/frost/signing/
```

All 6 real-crypto tests pass together on this branch (4 shape-A +
shape-B + distributed-DKG).

## CI

The `frost-cgo-integration` gate already exercises both new tests — its
`-run 'TestRealCgoInteractiveSigning'` matches them by prefix, and the
pinned crate already exports the `dkg_part1/2/3` + low-level sign FFI
they use. No workflow change needed. (Heads-up: these are multi-process
+ gossipsub tests; robust locally via retransmission + warmup, but
mesh-convergence time varies — if shape-B ever flakes on a slow runner,
move it to a non-blocking job.)

## Note

Building the distributed-DKG test confirmed `dkg_part1/2/3` interoperate
over the transport — but the node's DKG path
(`executeTBTCSignerFROSTDKG`) still uses the dealer `RunDKGWithSeed`.
Wiring distributed DKG into the node remains the readiness-gate-blocking
work.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
…sient parking

After the transient-silence feasibility fix, NextAttempt's infeasibility test is
on the non-excluded (feasible) set, not the post-parking IncludedSet: a next
IncludedSet can fall below threshold due to transient parking without returning
ErrAttemptInfeasible (parked members reinstate next attempt). Update the
ErrAttemptInfeasible doc + the NextAttempt step-6 contract (and the now-misleading
'included set below threshold' error string) to match.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…lags

#4120 added Config.Tbtc.DisableFrostSortitionPoolMonitoring (default-on FROST
monitoring) but initTbtcFlags did not bind a CLI flag for it, unlike the legacy
opt-out -- so operators starting the node via CLI flags could not exercise the
advertised opt-out. Bind --tbtc.disableFrostSortitionPoolMonitoring and cover it
in the flags test.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ror type

GetWallet derived a legacy wallet ID on ANY error from the canonical walletID
accessor. For a FROST wallet on a canonical Bridge, a transient call failure
would silently yield the left-padded legacy ID, and callers use
WalletChainData.WalletID to choose P2TR (FROST) vs P2WPKH (legacy) scripts, so it
would build or search the wrong wallet script.

The error type cannot reliably tell a legacy on-chain Bridge -- where the walletID
eth_call returns a normal RPC/ABI error even with the current binding -- from a
transient failure, so distinguishing by error is fragile and breaks legacy
deployments (Codex P1 on the first revision of this PR). Route the fallback by
SCHEME instead, using the wallet's ECDSA wallet ID (zero => FROST, non-zero =>
legacy ECDSA), which GetWallet already has:

  - Legacy ECDSA wallet: its canonical wallet ID equals the legacy derivation, so
    fall back on any accessor error (and it is the only option on a legacy Bridge
    lacking the accessor).
  - FROST wallet: requires the canonical ID; surface the error rather than return
    a wrong legacy ID. A FROST wallet only exists on a canonical-ID Bridge, so the
    error is genuinely transient.

Extracted into resolveWalletID with TestResolveWalletID covering all four cases.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
## Summary

Follow-up to #3866 (Go FROST/ROAST coordinator). Three defensive fixes
at the cgo Go↔Rust signer FFI boundary in `pkg/frost/signing`, so a
malformed/oversized response — or a panic — from the native signer fails
a single attempt instead of crashing the node or leaking secrets.

## Fixes

1. **Bounds-check the response length before `C.GoBytes`**
(`parseBuildTaggedTBTCSignerResult`). The Rust-supplied `buffer.len`
(`C.size_t`) was narrowed to `C.int` with no check: a length `≥ 2³¹`
overflows to a negative value → `C.GoBytes` panics (`length out of
range`) at the cgo boundary, or silently truncates to a wrong length.
Now rejected with a clear error (the buffer is still freed by the
deferred free).

2. **Zeroize the secret request bytes on the C heap before `C.free`**
(the request-call helper). `C.CBytes(requestPayload)` copies the request
(which can carry signing-share / nonce material) to the C heap; plain
`C.free` does not overwrite, so the secret lingered in freed memory. Now
scrubbed via the existing `zeroBytes`, mirroring the Go-side hygiene
already applied to the caller's own copy.

3. **`recover()` at the FFI boundary**
(`nativeExecutionFFIExecutorAdapter.Execute`). A panic anywhere along
the cgo signing path (e.g. fix #1's overflow panic, or a nil-deref
decoding a malformed engine response) previously took down the whole
signing process. It's now converted to a failed attempt the outer tBTC
`signingRetryLoop` handles cleanly.

## Tests / verification

- `TestNativeExecutionFFIExecutorAdapter_Execute_RecoversCgoBoundaryPanic`:
a panicking primitive; verified to **crash the process without the
recover** and pass with it. Full untagged `pkg/frost/signing` suite
stays green.
- The cgo-tagged changes (#1, #2) are **compile-verified** under `-tags
'frost_native frost_tbtc_signer cgo'` (build + `go vet`). They still
need a **runtime pass in the cgo + linked-`libfrost_tbtc` environment**
— I can't exercise the real FFI here.

Addresses the cgo-boundary cluster from the review of #3866
(length-narrowing crash, secret-in-C-heap, missing recover). _Found
during review of #3866._

🤖 Generated with [Claude Code](https://claude.com/claude-code)
…usions only (#4116)

## Summary

Follow-up to #3866 (Go FROST/ROAST coordinator). Fixes a robustness bug
in the ROAST `NextAttempt` policy where a **transient** event could
**permanently** kill a signing session.

`computeNextAttempt`'s infeasibility check used the post-parking
`IncludedSet` — which subtracts transiently-parked *and silenced*
members — to decide permanent session failure (`ErrAttemptInfeasible`).
Silence-parking deliberately has **no** accuser-quorum gate (it's meant
to be strictly transient), so a single transient mass-silence event — or
one byzantine member that is the elected coordinator for one attempt and
omits snapshots — could drop the `IncludedSet` below threshold and
permanently fail the session, even though the original signer set could
still complete it.

This contradicts the file's own design contract and hands a *single*
byzantine member (elected coordinator for one attempt) the power to
permanently kill a session — exactly the grind-to-`ErrAttemptInfeasible`
outcome the accuser-quorum machinery exists to prevent, routed around
via the ungated silence path:

- **step 4:** "Silence parking (strictly transient)… the attempt after
that automatically reinstates them, so a falsely-silenced honest peer
recovers without intervention."
- **`ErrAttemptInfeasible` doc:** returned when "the session can no
longer make progress **with the original signer set**."

## Fix

Evaluate feasibility against the **permanently-available set**
(`original \ ExcludedSet`): only permanent exclusions (established
reject / conflict / coordinator-equivocation) can render a session
infeasible. The next attempt's `IncludedSet` is still `feasible \
parkSet` — it may fall below threshold for one attempt, but the parked
members are reinstated next attempt (burning one attempt instead of
failing the session). When parking would *empty* the `IncludedSet`, the
parked members are reinstated now rather than producing an
unconstructable empty attempt.

Pathological grinding under *sustained* malicious silence stays bounded
by the outer tBTC `signingAttemptsLimit` — the inner chain no longer
needs a permanent-fail to stop grinding.

## Tests

The two existing tests asserted the buggy behavior (silence below
threshold → permanent fail, using a degenerate n-of-n config).
Retargeted to genuine infeasibility and added recovery coverage:

- `TestNextAttempt_InfeasibilityWhenPermanentExclusionsBelowThreshold` —
permanent exclusion below threshold still fails (correctly).
- `TestSoak_TransientSilenceBelowThresholdRecovers` — full
multi-coordinator harness: silence does **not** fail; silenced members
are parked, not excluded.
- `TestNextAttempt_TransientSilenceBelowThresholdDoesNotPermanentlyFail`
— unit: silence parks (transient), never excludes (permanent).
- `TestNextAttempt_TransientSilenceRecoversAcrossTwoAttempts` —
end-to-end: silence → park → reinstate, included set returns to full.

The "original signer set preserved" invariant (`|Inc|+|Exc|+|Park|` =
original size) holds in both branches. `gofmt`, `go vet`, and the full
`pkg/frost/roast` suite pass; builds clean untagged and under `-tags
'frost_native frost_tbtc_signer cgo frost_roast_retry'`.

_Found during review of #3866._

🤖 Generated with [Claude Code](https://claude.com/claude-code)
…proot (#4117)

## Summary

Follow-up to #3866 (Go FROST/ROAST coordinator). Gates the native (Rust
cgo) `BuildTaprootTx` parity/substitution path so it runs only for
Taproot transactions.

`walletTransactionExecutor.signTransaction` ran the native
`BuildTaprootTx` path **unconditionally**, at the very top, for
**every** transaction a wallet signs — including legacy ECDSA
redemptions / sweeps / moving-funds — before any scheme/Taproot check.
The native builder is a Taproot builder, so running it for a legacy
(non-Taproot) transaction is meaningless, and a hard error from it
(other than `ErrNativeCryptographyUnavailable`) would fail the signing
of that legacy transaction. In the default build the native builder is a
no-op (`("", nil)`), so this only bit the `frost_native+cgo` build — but
it's a latent regression on the legacy signing path.

## Fix

Gate the native-build/substitution path on
`unsignedTx.HasOnlyTaprootKeyPathInputs()` — the **same predicate that
already governs FROST signing** later in the same function (`"cannot
apply FROST signatures to non-taproot transaction inputs"`). The native
Taproot builder now runs only for all-Taproot transactions; legacy
transactions skip it. The path is extracted into
`maybeSubstituteNativeBuildTaprootTx`.

> Gate signal chosen per a Codex second opinion: the tx shape
(`HasOnlyTaprootKeyPathInputs`), not the wallet scheme — "the native
builder's applicability is about the unsigned tx it is asked to build."
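
The gate can be illustrated with a simplified predicate. This is a sketch, not the real `HasOnlyTaprootKeyPathInputs`: it inspects only the shape of each spent output script (a P2TR script is `OP_1` followed by a 32-byte push), and `isP2TR`, `hasOnlyTaprootInputs`, and `maybeSubstitute` are hypothetical names.

```go
package main

import "fmt"

// isP2TR reports whether script has the pay-to-taproot shape:
// OP_1 (0x51), a 32-byte push opcode (0x20), then the 32-byte x-only key.
func isP2TR(script []byte) bool {
	return len(script) == 34 && script[0] == 0x51 && script[1] == 0x20
}

// hasOnlyTaprootInputs is true when every spent output is P2TR.
// An empty input list is treated as not Taproot.
func hasOnlyTaprootInputs(prevOutScripts [][]byte) bool {
	if len(prevOutScripts) == 0 {
		return false
	}
	for _, script := range prevOutScripts {
		if !isP2TR(script) {
			return false
		}
	}
	return true
}

// maybeSubstitute invokes the native builder only for all-Taproot
// transactions; legacy transactions skip it and never see its errors.
func maybeSubstitute(prevOutScripts [][]byte, nativeBuild func() (string, error)) (tx string, ran bool, err error) {
	if !hasOnlyTaprootInputs(prevOutScripts) {
		return "", false, nil
	}
	tx, err = nativeBuild()
	return tx, true, err
}

func main() {
	p2tr := append([]byte{0x51, 0x20}, make([]byte, 32)...)
	p2pkh := append([]byte{0x76, 0xa9, 0x14}, make([]byte, 22)...)
	build := func() (string, error) { return "native-tx", nil }

	_, ran, _ := maybeSubstitute([][]byte{p2pkh}, build)
	fmt.Println("legacy tx ran native build:", ran)

	_, ran, _ = maybeSubstitute([][]byte{p2tr, p2tr}, build)
	fmt.Println("taproot tx ran native build:", ran)
}
```

Keying the gate on the transaction shape keeps a native-builder failure from ever reaching a legacy signing path, whatever scheme the wallet uses.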

## Tests

The four legacy-P2PKH substitution-through-`signTransaction` integration
tests asserted the now-removed behaviour (substitution for a legacy tx).
They are replaced with:

- `SkipsNativeBuildForLegacyTransaction` — a legacy (P2PKH) tx: the
native build is **not** invoked even with substitution enabled, and the
tx signs via the Go path. (Verified to fail if the gate is removed.)
- `SubstitutesNativeBuildForTaprootTransaction` — an all-Taproot tx: the
native build **is** invoked, and a matching native tx is substituted
(`ReplaceUnsignedTransaction` + the substitution info log) and signed
with a Taproot witness.

The two native-build error-propagation tests are retargeted to a Taproot
tx (the only path on which the native build now runs), via a shared
`buildTaprootKeyPathUnsignedTxForTest` helper. The substitution
**logic** remains covered directly by the
`TestEvaluateNativeUnsignedTransactionForSigning_*` tests.

gofmt + `go vet` clean; full untagged `pkg/tbtc` suite passes; builds
clean under `-tags 'frost_native frost_tbtc_signer cgo'`.

_Found during review of #3866._

🤖 Generated with [Claude Code](https://claude.com/claude-code)
…package (#4118)

## Summary

Follow-up to #3866. Eliminates a byte-for-byte duplicate of the retry
participant-selection algorithm by moving it into a shared,
scheme-neutral package.

`pkg/frost/retry/retry.go` was a **byte-for-byte copy** of
`pkg/tecdsa/retry/retry.go` — two hand-synchronized copies of the
security-critical retry participant-selection algorithm
(`EvaluateRetryParticipantsForSigning` / `…ForKeyGeneration`). A fix
applied to one and not the other would silently make ECDSA and FROST
select **different** qualified-operator sets for the same seed.

## Why a shared package (vs. one importing the other)

This selection is **structurally shared** across schemes: it's the
scheme-agnostic "pick the base signing group from the ready members"
used for the **initial attempt of every signing** (and for DKG). FROST's
per-attempt robustness diverges only on **retries**, via ROAST
(`NextAttempt`) — the initial selection does not. ROAST is a
transition-from-previous-attempt function and *cannot* produce attempt
0, so `roastSigningParticipantSelector` falls back to this selection for
the initial attempt (`ConsumeRoastTransitionForSelection` returns
`ErrRoastSelectionFallBackToLegacy` at `roastAttemptNumber == 0`). So
the initial selection is **permanently shared** — neither `tecdsa` nor
`frost` should own it.

## Change

- Move the algorithm **byte-for-byte** (verified `diff`-identical, no
logic change) into a new neutral package `pkg/protocol/retry` (package
`retry`, alongside `pkg/protocol/group`).
- Repoint the **only two** callers — `pkg/tbtc/dkg_loop.go` (DKG) and
`pkg/tbtc/signing_loop_legacy_selector.go` (FROST initial selection) —
at it (import-path change only; usage `retry.X` unchanged).
- Delete `pkg/tecdsa/retry` and `pkg/frost/retry`.
- Keep the **superset** test (the `tecdsa` copy had a
`TestExcludeOperatorTripletsCountsRightOperatorSeats` case the `frost`
copy lacked).
- Update three doc comments that referenced the old path.

## Verification

- `diff` confirms the moved algorithm is byte-identical to the original
(pure move).
- `go build ./...` clean (no dangling references to the deleted
packages); builds clean under `-tags 'frost_native frost_tbtc_signer cgo
frost_roast_retry'`.
- gofmt + `go vet` clean; the moved `pkg/protocol/retry` test passes
(incl. the superset case); the full `pkg/tbtc` suite passes.

No behavior change — a single source eliminates the silent-drift hazard.

_Found during review of #3866._

🤖 Generated with [Claude Code](https://claude.com/claude-code)
…d-set acceptance (#4119)

## Summary

Follow-up to #3866. A small, behavior-preserving hardening + decision
record for the ROAST transition-exchange lost-sync path (review finding
#2).

**Background.** In `onBundle`, a seat that receives a transition bundle
for an attempt it **never observed** trips `markLostSync()`, which fails
the wallet's signing retry loop closed, *before* any verification (a
seat that is behind lacks the observe handle that full `VerifyBundle`
needs). So an
with a bogus, never-committed attempt hash and halt every honest seat's
signing. The blame bridge (PR2b-2) does **not** close this: a
never-committed attempt produces no evidence to attribute the sender.

**Decision (after Codex review + threat analysis).** Under the
**permissioned** operator set this residual is **accepted**: it's
liveness-only (fail-closed — never an unsafe/divergent signature), the
triggering seat is operator-authenticated (so attribution is immediate),
and a misbehaving operator is governance-removable + economically
deterred. The proper fix (f+1 snapshot-corroboration, or a resync state)
is a real ROAST protocol change whose simple form can fracture the group
on legitimate *sparse-failure* bundles — disproportionate while the set
stays permissioned.

## Change

- **Attribution logging.** When a seat enters lost-sync from an
unobserved-attempt bundle, log the **sending seat** and the **claimed
attempt-context hash** — once per lost-sync episode — so the operational
runbook can identify and remove/slash a member spamming bogus-attempt
bundles.
- `markLostSync` now uses `CompareAndSwap` and returns whether it
transitioned, so the attribution is logged exactly once even while the
listener keeps receiving such bundles. **No change to the fail-closed
semantics.**
- **Decision record in-code:** documents the permissioned-set acceptance
and marks it as a **hard item to revisit before any move to a
permissionless operator set**, where an
anonymous/costless/non-attributable DoS would warrant the
corroboration/resync fix.

## Verification

`gofmt` + `go vet` clean; builds under `-tags 'frost_native
frost_roast_retry cgo frost_tbtc_signer'`; the transition-exchange /
lost-sync / bundle tests pass. The only caller of `markLostSync` is the
updated site.

_Found during review of #3866._

🤖 Generated with [Claude Code](https://claude.com/claude-code)
…4120)

## Summary

Follow-up to #3866 (review finding #3). Fixes sortition-pool monitoring
for the dual-pool window of the ECDSA→FROST migration.

During the migration an operator is a member of **both** sortition pools
at once: existing ECDSA wallets keep draining via redemptions while
FROST is live. The seven `sortition.Chain` methods on `TbtcChain` switch
on `hasFrostAuthorization()` (true whenever FROST is configured), and a
**single** `MonitorPool` loop consumed them — so:

- **During overlap** (`DisableLegacyECDSA=false`): the loop labeled
*"legacy ECDSA sortition pool monitoring"* actually maintained the
**FROST** pool; the ECDSA pool got no maintenance.
- **Post-cutover** (`DisableLegacyECDSA=true`): the only loop stopped
entirely, leaving the **FROST** pool — the one new FROST wallet DKG
selects from — **unmonitored**.

> The ECDSA drain itself is unaffected either way: signing an existing
wallet uses the locally-held key share + the wallet's fixed roster,
never sortition-pool state. This is a
pool-membership/selection-eligibility fix, not a fund-availability one.

## Change

Bind monitoring **explicitly per pool**. Two `sortition.Chain` views
(`ecdsaSortitionChain`, `frostSortitionChain`) route directly to their
own registry/pool with **no** `hasFrostAuthorization()` switch, and the
node runs **one `MonitorPool` loop per pool**:

- **ECDSA loop** — existing flags + policy, now correctly ECDSA-bound
(data path *and* beta policy read the ECDSA pool).
- **FROST loop** — new `DisableFrostSortitionPoolMonitoring` flag
(**default-on**), beta policy only (the ECDSA pre-params gate doesn't
apply to FROST DKG), gated on FROST being configured and **independent
of `DisableLegacyECDSA`** so the FROST pool stays monitored during the
drain *and* after the legacy pool is retired.

The operator isn't necessarily registered in both pools, so the FROST
loop treats `sortition.ErrOperatorUnknown` (now exported) as
**non-fatal** — it warns and leaves FROST monitoring inactive rather
than aborting startup. The legacy loop keeps its existing fail-fast (the
operator is ECDSA-registered during the drain).

`TbtcChain`'s own `sortition.Chain` methods are **left unchanged**, so
heartbeat and other callers are unaffected; `GetOperatorID` stays
ECDSA-bound. Both loops' join/update txs already share
`tc.transactionMutex` + `tc.nonceManager`, so they cannot race on the
operator account nonce.

### Design input baked in (per maintainer + Codex)
- FROST loop join policy = beta-operator only (no FROST equivalent of
the ECDSA pre-params gate; FROST DKG readiness is announced separately).
- Operator compensation is out-of-band, so ECDSA-pool staleness during
the drain is benign; the FROST loop is the load-bearing one.
- `GetOperatorID` asymmetry preserved (separate `GetFrostOperatorID`
exists).

## Known limitation
`MonitorPool` (shared with the beacon) hard-returns at startup, without
starting its ticker, when the operator is not yet registered in the
pool, so an operator that registers for FROST *after* node start needs a
restart to begin FROST monitoring. This is logged clearly; the shared
`MonitorPool` is deliberately left unchanged here.

## Tests / verification
- New `tbtc_sortition_chain_views_test.go` pins each view to its
intended pool (legacy→ECDSA, FROST→FROST, unconfigured→nil) —
**negative-checked** (mis-binding the legacy view to the FROST pool
makes it fail).
- gofmt + `go vet` clean; untagged `go build ./...` clean; builds under
`-tags 'frost_native frost_tbtc_signer cgo frost_roast_retry'`;
`pkg/sortition` + `pkg/chain/ethereum` tests pass; full `pkg/tbtc` suite
passes.
- Adversarial review (3 angles → verify): 0 confirmed findings; adapter
fidelity confirmed line-for-line across all 13 methods; cross-loop nonce
safety confirmed.

_Found during review of #3866._

🤖 Generated with [Claude Code](https://claude.com/claude-code)
…ror type (#4122)

## Summary

Addresses a Codex finding (relayed via the #4119 review):
`TbtcChain.GetWallet` derived a **legacy** wallet ID on **any** error
from the canonical `walletID` accessor. For a FROST wallet on a
canonical Bridge, a transient call failure would silently yield the
left-padded legacy ID — and callers use `WalletChainData.WalletID` to
choose **P2TR (FROST)** vs **P2WPKH (legacy)** scripts, so the node
would build or search the **wrong wallet script**.

## Why route by scheme (revised after Codex P1)

The first revision distinguished by error type (a sentinel for the
missing accessor, surface everything else). Codex correctly flagged a
**P1 regression**: a *legacy on-chain Bridge* built with the *current*
generated binding still satisfies the accessor interface, so its missing
`walletID` function returns a normal RPC/ABI error — not the sentinel —
and that revision would surface it and **break `GetWallet` on exactly
the legacy deployments the fallback exists for**. Error type cannot
reliably separate "function absent on-chain" from "transient."

So this routes by **scheme**, using the wallet's `EcdsaWalletID` (which
`GetWallet` already reads, and which the codebase already uses to infer
scheme — zero ⇒ FROST):

- **Legacy ECDSA wallet** (`EcdsaWalletID != 0`): its canonical wallet
ID *equals* its legacy derivation, so fall back on **any** accessor
error — and it's the only option on a legacy Bridge lacking the
accessor.
- **FROST wallet** (`EcdsaWalletID == 0`): requires the canonical ID;
**surface** the error rather than return a wrong legacy ID. A FROST
wallet only exists on a canonical-ID Bridge, so such an error is
genuinely transient.

Logic is extracted into `resolveWalletID(bridge, walletPublicKeyHash,
ecdsaWalletID)`.

## Tests

`TestResolveWalletID` covers all four cases: accessor success →
canonical; FROST + accessor error → surfaced; **legacy + accessor error
→ legacy fallback** (the P1 regression guard — verified to fail if the
routing surfaces errors for legacy wallets); legacy + missing-accessor
binding → legacy fallback. gofmt + `go vet` clean; full
`pkg/chain/ethereum` suite passes.

_Found during the Codex review batch on #4115–#4120; revised per the
Codex P1 re-review._

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

🤖 Generated with [Claude Code](https://claude.com/claude-code)
The native tbtc-signer Sign path scrubs its Go-heap FFI transport buffers
that carry secret material (via `defer zeroBytes(...)` on the request/
response/nonces slices), but the DKG path did not, leaving long-term
share and DKG secret material resident in the Go heap after use. This
closes that DKG<->Sign zeroization inconsistency.

The DKG engine methods build a Go-heap request payload (JSON), hand a C
copy to the Rust FFI via C.CBytes, and receive the response as a fresh
Go slice via C.GoBytes. callBuildTaggedTBTCSignerOperation already scrubs
and frees the C-heap request copy, and the Rust side frees the C response
buffer, but the Go-side request/response slices were never wiped. Mirror
the Sign path exactly by deferring zeroBytes on the secret-bearing Go
buffers, so a mid-function or error return still wipes:

- Part1: response (round-1 secret package / private polynomial coeffs).
- Part2: request (round-1 secret package) and response (round-2 secret
  package + per-recipient round-2 secret shares).
- Part3: request (round-2 secret package + received secret shares) and
  response (final key package / long-term signing share).
- RunDKGWithSeed: request (DKG seed that deterministically reconstructs
  the group secret); its response is public metadata only.

Public-only buffers are left untouched (RunDKG request/response, Part1
request). The defers run after the decoders evaluate the return value,
and the decoders return freshly hex-decoded copies, so wiping the
transport buffers never corrupts the returned secrets. cgo-safe: the
Go slices are independent of the C copies, so zeroing them after the
call returns neither races the C side nor risks a double-free.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The interactive FROST + ROAST retry coordinator flow (liveness +
evidence/blame) lives behind the `frost_roast_retry` build tag, but no CI
job ever set it: client.yml and release.yml run untagged `go build/test`,
and frost-cgo-integration.yml built only `frost_native frost_tbtc_signer`
and `-run`-filtered to the `TestRealCgoInteractiveSigning*` family. So the
entire ROAST retry state machine and ~30 `frost_native` unit tests never
compiled or ran in CI, and `make build` (the release/Docker path) shipped
the `!frost_roast_retry` no-op stubs. This closes that activation gap.

- client.yml: add a `client-frost-roast-retry` job that builds the
  coordinator path with cgo disabled (`go build -tags "frost_roast_retry"`
  and `-tags "frost_native frost_roast_retry"` over `./...`) and runs the
  tagged unit tests under the three non-cgo tag sets that cover the whole
  matrix (`frost_native`, `frost_roast_retry`,
  `frost_native frost_roast_retry`) over ./pkg/frost/... and ./pkg/tbtc/...
  against the mock FFI (no Rust lib, no Docker).

- frost-cgo-integration.yml: add `frost_roast_retry` to the real-crypto
  cgo tag set and drop the narrow `-run` filter so the whole tagged
  pkg/frost/signing suite runs against the linked libfrost_tbtc (skips
  still forbidden); the heavy multiproc e2e tests already ran and
  self-constrain their worker subprocesses with anchored `-test.run`, so
  dropping the outer filter only adds lighter tagged unit tests. Add a
  step that smoke-builds the activation artifact via `make build-frost`.

- Makefile: add a `build-frost` target that produces the ROAST-retry
  activation binary (tags `frost_native frost_tbtc_signer
  frost_roast_retry`, cgo-linked to libfrost_tbtc with the same
  CGO_LDFLAGS as the cgo workflow).

- frost-roast-retry-rollout.adoc: replace the false claim that CI already
  exercised the tag with an accurate description of the new coverage.

Locally validated (system Go, cgo off): `go build -tags "frost_roast_retry"
./...` and `-tags "frost_native frost_roast_retry" ./...` compile clean;
all three non-cgo tag sets pass on ./pkg/frost/... and ./pkg/tbtc/...
The cgo-linked full build is deferred to CI (requires building the Rust
libfrost_tbtc, which the cgo workflow does from the pinned signer source).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
mswilkison and others added 6 commits July 2, 2026 11:07
…c is linked

Dropping the narrow `-run` filter in frost-cgo-integration.yml made
`TestRegisterBuildTaggedTBTCSignerEngine` run for the first time under the
cgo gate, where it failed: it asserts every engine operation returns
`ErrNativeCryptographyUnavailable`, a fail-closed contract that only holds
when libfrost_tbtc is NOT linked (the cgo bridge is compiled but the
frost_tbtc_* symbols are unresolvable via dlsym). Under the gate the lib IS
linked, so `StartSignRound` instead reached the real signer and its
provenance gate, producing a different error.

Probe the linked lib with `assertTBTCSignerABICompatible()` - the same
check the ABI preflight uses, which keeps `ErrNativeCryptographyUnavailable`
in the chain iff the lib is absent - and skip the fail-closed assertions
with a reason when the lib is present. The registration-wiring assertions
still run under both builds, and the linked-lib crypto path is covered by
`TestBuildTaggedTBTCSignerInteractiveFROSTBridge_WithLinkedSigner` and the
`TestRealCgoInteractiveSigning*` suite. No assertion was weakened and no
production code was touched; this matches the skip-when-unavailable pattern
already used by the neighbouring cgo tests.

Validated locally by building libfrost_tbtc from the pinned signer mirror
and running the whole tagged pkg/frost/signing suite with the lib linked
and KEEP_CORE_FROST_REQUIRE_CGO=true: 402 pass, 1 skip (this test), 0 fail;
the real-crypto DKG/multiproc e2e tests ran and passed. Without the lib
linked the test still runs its fail-closed assertions and passes.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…th) (#4128)

## Finding

The native tbtc-signer **Sign** path scrubs its Go-heap FFI transport
buffers that carry secret material — `defer zeroBytes(...)` on the
request payload, response payload, and nonces slices
(`native_frost_engine_tbtc_signer_registration_frost_native.go`). The
**DKG** path in the same file did **not**, leaving long-term share / DKG
secret material resident in the Go heap after use. This is a
memory-hygiene inconsistency: DKG secrets (private polynomial
coefficients, per-recipient secret shares, and the final long-term
signing share) linger in reclaimable-but-unwiped Go heap buffers,
whereas the equivalent Sign secrets are wiped.

## Sites zeroed

Each DKG engine method marshals a Go-heap JSON request, hands a **copy**
to Rust via `C.CBytes`, and receives the response as a **fresh** Go
slice via `C.GoBytes`. Only the genuinely secret-bearing Go-side buffers
are wiped (public-only buffers are left untouched):

| Method | Buffer | Secret it carries |
| --- | --- | --- |
| `Part1` (`~L718`) | response | round-1 secret package (private polynomial coefficients, "must never be broadcast") |
| `Part2` (`~L737`) | request | round-1 secret package |
| `Part2` (`~L746`) | response | round-2 secret package + per-recipient round-2 secret shares |
| `Part3` (`~L768`) | request | round-2 secret package + received round-2 secret shares |
| `Part3` (`~L777`) | response | final key package (long-term signing share) |
| `RunDKGWithSeed` (`~L686`) | request | DKG seed that deterministically reconstructs the group secret |

Left untouched because they carry no secret: `RunDKG` request/response
(participant public keys + metadata), `RunDKGWithSeed` response (public
metadata only), `Part1` request (participant id + signer counts).

## How it mirrors the Sign path

The Sign path uses the package-local `zeroBytes(data []byte)` helper
(`native_frost_engine_frost_native.go:59`) via `defer`:
- `GenerateNoncesAndCommitments`: `defer zeroBytes(responsePayload)`
(response carries one-time nonces).
- `Sign`: `defer zeroBytes(noncesData)` and `defer
zeroBytes(requestPayload)`.

This change reuses the same helper and the same `defer` placement (right
after a secret request is built / a secret response is received), so a
mid-function or error return still wipes. No new/divergent mechanism is
introduced.

## cgo-safety reasoning

- `callBuildTaggedTBTCSignerOperation` already `C.CBytes`-copies the
request to the C heap and, on defer, `zeroBytes`+`C.free`s that C copy.
The Go-side request slice is a **separate** `json.Marshal` allocation,
so zeroing it after the call returns neither races the C side nor risks
a double-free.
- The response is a `C.GoBytes` copy; the C-side response buffer is
freed separately by `tbtc_signer_free_buffer`. Wiping the Go copy is
independent and safe.
- The deferred `zeroBytes` runs **after** the decoder evaluates the
return value, and the decoders return freshly hex-decoded copies
(independent of the transport buffer), so wiping never corrupts the
returned secret. Identical ordering to the existing
`GenerateNoncesAndCommitments`.
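
The ordering argument in the last bullet can be checked in isolation. The sketch below assumes a JSON transport with a hex-encoded secret field; the `response` shape, `decodeSecret`, and `fetchSecret` are hypothetical and only illustrate why wiping the transport buffer leaves the returned copy intact:

```go
package main

import (
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// zeroBytes overwrites data in place (assumed helper behaviour).
func zeroBytes(data []byte) {
	for i := range data {
		data[i] = 0
	}
}

// response is a hypothetical wire shape: the secret travels hex-encoded.
type response struct {
	SecretHex string `json:"secret_hex"`
}

// decodeSecret returns a freshly allocated slice. json.Unmarshal copies
// the field into a new string and hex.DecodeString allocates its result,
// so the return value does not alias the transport buffer.
func decodeSecret(transport []byte) ([]byte, error) {
	var r response
	if err := json.Unmarshal(transport, &r); err != nil {
		return nil, err
	}
	return hex.DecodeString(r.SecretHex)
}

// fetchSecret evaluates decodeSecret first; the deferred wipe of the
// transport buffer runs afterwards and cannot touch the decoded copy.
func fetchSecret(transport []byte) ([]byte, error) {
	defer zeroBytes(transport)
	return decodeSecret(transport)
}

func main() {
	transport := []byte(`{"secret_hex":"deadbeef"}`)
	secret, err := fetchSecret(transport)
	fmt.Println(hex.EncodeToString(secret), err, transport[0])
}
```

Had the decoder returned a sub-slice of `transport` instead of a decoded copy, the deferred wipe would have zeroed the caller's secret as well, which is why the independence of the copy matters.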

## Validation

- `gofmt -l` clean on the touched file.
- `go vet -tags "frost_native frost_tbtc_signer" ./pkg/frost/signing/`
clean.
- `go build -tags "frost_native frost_tbtc_signer" ./pkg/frost/...`
succeeds (cgo path uses runtime `dlopen`, so it compiles the touched
file without the Rust lib present).
- `go build -tags "frost_roast_retry" ./pkg/frost/...` succeeds (non-cgo
compile check).
- DKG/RunDKG/Sign/Nonces unit tests pass.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
…#4130)

## Why

This closes the **sole production-activation blocker** found in the deep
production-readiness review of the ROAST retry work (stacked on #3866).

The interactive FROST + ROAST retry coordinator flow — `BeginAttempt` /
`RecordEvidence` / `AggregateBundle` / `VerifyBundle` / `NextAttempt`,
i.e. liveness plus slashing/blame — lives behind the `frost_roast_retry`
Go build tag (~50 files). **No CI job ever set that tag:**

- `client.yml` (~line 138) and `release.yml` (~line 56) run untagged `go
build/test ./...`, which compiles only the `!frost_roast_retry` no-op
stubs.
- `frost-cgo-integration.yml` (~line 111) built only `-tags
"frost_native frost_tbtc_signer"` and `-run`-filtered to the
`TestRealCgoInteractiveSigning*` family.

Net effect: the entire ROAST retry state machine and ~30 `frost_native`
unit tests never compiled or ran anywhere in CI, and `make build` (the
release/Docker path) shipped the ROAST-retry-noop default build. The
rollout doc also **falsely** claimed CI already exercised the tag.
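
For readers less familiar with the mechanism: Go compiles a file only when its build constraint is satisfied, so an untagged `go build ./...` sees just the `!frost_roast_retry` side of each pair. A minimal sketch of such a stub, where `roastRetryCompiledIn` is a hypothetical name and not a function from the repository:

```go
//go:build !frost_roast_retry

package main

import "fmt"

// roastRetryCompiledIn is the stub side of a hypothetical file pair. A
// sibling file guarded by `//go:build frost_roast_retry` would define the
// same function returning true; exactly one of the two is compiled.
func roastRetryCompiledIn() bool {
	return false
}

func main() {
	// With no -tags flag this file is selected, so the tagged sibling is
	// never type-checked, built, or tested.
	fmt.Println("ROAST retry compiled in:", roastRetryCompiledIn())
}
```

Because the tagged sibling is excluded before type-checking, a compile error in it stays invisible until some job passes `-tags frost_roast_retry`.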

## What this changes

**`.github/workflows/client.yml` — new `client-frost-roast-retry` job**
(plain Go, cgo off, no Docker; runs on every PR touching Go):
- `go build -tags "frost_roast_retry" ./...` and `go build -tags
"frost_native frost_roast_retry" ./...` (mock-FFI, no Rust lib).
- `go test` under the **three non-cgo tag sets that cover the whole
matrix** — `frost_native`, `frost_roast_retry`, `frost_native
frost_roast_retry` — over `./pkg/frost/...` and `./pkg/tbtc/...`.

**`.github/workflows/frost-cgo-integration.yml`:**
- Adds `frost_roast_retry` to the real-crypto cgo tag set (`frost_native
frost_tbtc_signer frost_roast_retry`).
- **Drops the narrow `-run` filter** so the whole tagged
`./pkg/frost/signing/` suite runs against the linked `libfrost_tbtc`,
with skips still forbidden (`KEEP_CORE_FROST_REQUIRE_CGO=true`). Safe by
construction: the heavy multiproc e2e tests already ran (matched by the
old substring regex) and spawn their worker subprocesses with anchored
`-test.run`, so dropping the outer filter only *adds* lighter tagged
unit tests.
- New step smoke-builds the activation artifact via `make build-frost`
using the lib built earlier in the job.
- Adds `Makefile` to the path triggers.

**`Makefile` — new `build-frost` target:** produces the ROAST-retry
activation binary (tags `frost_native frost_tbtc_signer
frost_roast_retry`, cgo-linked to `libfrost_tbtc` with the same
`CGO_LDFLAGS` as the cgo workflow). The default `make build` still ships
the `!frost_roast_retry` stubs; adopting the tagged artifact in the
release/Docker path is gated on the readiness-manifest flip and is
intentionally left to that decision (the Rust lib currently lives on a
separate branch — see `ci/frost-signer-pin.env`), so this PR makes the
artifact *producible + CI-validated* rather than silently flipping the
default release image.

**`docs/development/frost-roast-retry-rollout.adoc`:** replaces the
false "CI already exercises the tag" claim with an accurate description
of the coverage above.

## Validated locally (system Go, cgo disabled)

| Check | Result |
| --- | --- |
| `go build -tags "frost_roast_retry" ./...` | compiles clean |
| `go build -tags "frost_native frost_roast_retry" ./...` | compiles clean |
| `go test -tags "frost_native" ./pkg/frost/... ./pkg/tbtc/...` | pass |
| `go test -tags "frost_roast_retry" ./pkg/frost/... ./pkg/tbtc/...` | pass |
| `go test -tags "frost_native frost_roast_retry" ./pkg/frost/... ./pkg/tbtc/...` | pass |
| `make -n build-frost` | expands correctly |

The tagged builds compiled clean and every newly-run non-cgo tagged test
**passed** — no failures were surfaced, and no assertion was weakened.

**Deferred to CI:** the cgo-linked full build/tests and the `make
build-frost` smoke — these require building the Rust `libfrost_tbtc`,
which cannot be done locally without the pinned signer source. The cgo
job already builds that lib, so those steps are correct by construction
(they reuse the same lib + `CGO_LDFLAGS`).

## Follow-ups / known gaps

- **cgo path is CI-only-validated.** The `frost_native frost_tbtc_signer
frost_roast_retry` real-crypto suite and `make build-frost` link
`libfrost_tbtc`; they were not run on this machine. First green run of
`frost-cgo-integration.yml` on this branch is the confirmation.
- **Release/Docker still ship the stub build by design.** `make build`
(Dockerfile `build-docker` stage) is unchanged; wiring `build-frost`
into the release image is deferred to the readiness-manifest flip and to
the branch merge that brings the signer crate in-tree (per
`ci/frost-signer-pin.env`).
- **pkg/tbtc cgo-tagged tests** (the 1–2 `frost_native frost_tbtc_signer
cgo` files, e.g. real taproot-tx build) are not yet in the cgo gate; the
cgo job keeps its `pkg/frost/signing` scope. Adding `./pkg/tbtc/` to the
cgo run is a reasonable next step but pulls the heavy tbtc suite under
real-crypto linking, so it is left as a follow-up.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Chain the full FROST wallet-creation coordinator↔chain flow into ONE
in-process run: a local FrostDKGChain emits FrostDKGStarted, the
coordinator's OnFrostDKGStarted subscription handles it (dedup, block
confirmation, DKG-state check, past-event lookup, group-membership
resolution), executeFrostDKGIfPossible announces readiness and runs the
REAL cgo tbtc-signer DKG, and the assembled result is submitted back
through SubmitFrostDKGResult, after which the wallet is verified
registered on the chain.

Previously the coordinator↔chain wiring and the real cgo DKG execution
were covered separately (frost_dkg_coordinator_test.go with stub results;
frost_dkg_execution_frost_native_test.go in isolation) and never in one
flow.

The DKG output is real: a thin recording wrapper delegates to the cgo
engine and captures the x-only group key, which the test asserts equals
the key submitted on-chain byte-for-byte (no injected/fake result can
pass), is a valid secp256k1 point, and backs a registered wallet. The
group is reduced to 3 seats held by one node because the cgo engine is a
process-global OnceLock<Mutex> and its development dealer DKG holds all
key packages in one engine (n>=2; the library rejects n==1). Gated
frost_native && frost_tbtc_signer && cgo, so plain frost_native builds
are unaffected.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
P1: probe the real linked libfrost_tbtc up front. The build-tagged
tbtc-signer engine registers whether or not the lib is linked, so the
prior availability check passed even with an absent/stale lib and the
missing ABI only surfaced inside the coordinator goroutine's
RunDKGWithSeed, making the test hang until the 90s deadline instead of
skipping. Now exercise the once-per-process ABI preflight up front via a
raw seeded RunDKGWithSeed and route the result through a helper mirroring
the reference skipFrostUnavailable: ErrNativeCryptographyUnavailable SKIPS
(or FATALs under KEEP_CORE_FROST_REQUIRE_CGO), any other error fails. The
probe runs on the raw engine so it never pollutes the recording wrapper's
captured key.

P2: remove the recovery-goroutine race. initializeFrostDKGCoordinator also
launches recoverFrostDKGCoordinatorState; if it observed AwaitingResult
before the OnFrostDKGStarted subscription, it could drive the DKG via the
waitForConfirmation=false bypass and the deduplicator would suppress the
subscription path, passing the test without exercising the confirmation
flow. The chain now signals when recovery has completed its initial IDLE
scan (the first GetFrostDKGState reader), and the test waits for that
signal before flipping to AwaitingResult and emitting, so only the
subscription's block-confirmation path can run. A deterministic assertion
(submit block >= emit block + dkgStartedConfirmationBlocks) proves the
confirmation waitForBlockHeight path was exercised and guards against
silent regression.
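
The gate and the assertion described in P2 can be sketched as follows. This is an illustration under assumptions, not the test's code: `localChain`, `firstRead`, `startDKG`, `confirmationPathExercised`, and the state strings are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
)

// localChain is a hypothetical test double for the chain.
type localChain struct {
	mu            sync.Mutex
	awaiting      bool
	firstReadOnce sync.Once
	firstRead     chan struct{}
}

func newLocalChain() *localChain {
	return &localChain{firstRead: make(chan struct{})}
}

// GetFrostDKGState reads the state and then, on the first call only,
// closes firstRead to signal that the initial scan has happened.
func (c *localChain) GetFrostDKGState() string {
	c.mu.Lock()
	state := "Idle"
	if c.awaiting {
		state = "AwaitingResult"
	}
	c.mu.Unlock()
	c.firstReadOnce.Do(func() { close(c.firstRead) })
	return state
}

// startDKG flips the chain into the awaiting-result state.
func (c *localChain) startDKG() {
	c.mu.Lock()
	c.awaiting = true
	c.mu.Unlock()
}

// confirmationPathExercised is the deterministic check: a submission can
// only land that late if the confirmation wait actually ran.
func confirmationPathExercised(emitBlock, submitBlock, confirmations uint64) bool {
	return submitBlock >= emitBlock+confirmations
}

func main() {
	chain := newLocalChain()
	observed := make(chan string, 1)

	// Stands in for the recovery goroutine's initial scan.
	go func() { observed <- chain.GetFrostDKGState() }()

	<-chain.firstRead // the test blocks here before changing state
	chain.startDKG()

	fmt.Println(<-observed, chain.GetFrostDKGState())
	fmt.Println(confirmationPathExercised(100, 120, 20))
}
```

Reading the state before closing the channel is the detail that makes the gate sound: the recovery read is guaranteed to have seen the idle state by the time the test is released.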

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
## What this adds

An in-process integration test that chains the **full FROST
wallet-creation coordinator↔chain flow into ONE run**:

```
local chain emits FrostDKGStarted
  → initializeFrostDKGCoordinator's OnFrostDKGStarted subscription fires
    → handleFrostDKGStarted (dedup → confirm block → GetFrostDKGState →
       PastFrostDKGStartedEvents → resolve group membership)
      → executeFrostDKGIfPossible (readiness announcement → REAL cgo DKG via
         executeFrostDKG/RunDKGWithSeed → signer registration → result assembly
         → DKG-result operator-signature collection)
        → SubmitFrostDKGResult (through the FrostDKGChain interface)
          → wallet verified registered on the local chain
```

New file:
`pkg/tbtc/frost_dkg_coordinator_chain_e2e_frost_native_test.go`
(build-gated `frost_native && frost_tbtc_signer && cgo`).

## Why

Today the coordinator↔chain wiring and the real cgo DKG execution are
tested **separately** and never in one flow:

- `frost_dkg_coordinator_test.go` drives the chain plumbing with stub
  results.
- `frost_dkg_execution_frost_native_test.go` drives the real DKG in
  isolation.

This closes that gap: event delivery, confirmation, membership
resolution, readiness announcement, result assembly/signature
collection, on-chain submission, and wallet registration all run
together against a real cgo DKG output.

## What is REAL vs REDUCED

**Real**
- **The DKG output.** `executeFrostDKG` calls the process-global cgo
  tbtc-signer engine (`buildTaggedTBTCSignerEngine`, registered via
  `RegisterNativeExecutionFFISigningPrimitiveForBuild`). The x-only
  group key that lands on-chain is the exact key the engine produced.
- **The submission path.** The result is submitted through the
  `FrostDKGChain` interface (`SubmitFrostDKGResult`), not injected into
  chain state directly.
- **The coordinator wiring.** The event is delivered through the
  `OnFrostDKGStarted` subscription registered by
  `initializeFrostDKGCoordinator`; confirmation, state check, past-event
  lookup, membership resolution, readiness announcement, DKG-result
  operator-signature collection, and delayed submission all run as in
  production.

**Reduced (documented)**
- **Group size / custody.** The group has 3 seats, all held by ONE
  operator/node. The cgo engine is a process-global `OnceLock<Mutex>`,
  so N independent real-custody participants cannot run concurrently in
  one OS process. The tbtc-signer **development dealer DKG** by design
  has a single engine hold every participant's key package, which is
  exactly this shape. The cgo library rejects `n == 1`
  (`participants must contain at least 2 entries`), so the minimum
  honest reduction is `n >= 2`; the test uses
  `GroupSize=3, GroupQuorum=2, HonestThreshold=2`.
- **Signer profile.** The test sets `TBTC_SIGNER_PROFILE=development`
  (plus a hermetic state-encryption key + per-process state path,
  mirroring the existing real-cgo reference harness). Bootstrap/dealer
  DKG is disabled under the production profile, which requires
  distributed DKG wiring across processes.

Nothing about the crypto or the submission is faked: the DKG output
comes from the real engine and the submission goes through the chain
interface.

## Load-bearing assertion

A thin recording wrapper delegates every engine method to the real cgo
engine and captures the exact x-only key returned by `RunDKGWithSeed`.
The test then asserts that the on-chain submitted
`Result.XOnlyOutputKey`:

1. **equals the captured real engine output byte-for-byte** (so no
   injected / fake result can pass),
2. is non-zero and lifts to a valid secp256k1 curve point, and
3. backs a wallet that is registered on the local chain
   (`IsFrostWalletRegistered`).
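
A rough sketch of checks 1 and 2, assuming a one-method view of the engine. `dkgEngine`, `recordingEngine`, `fixedEngine`, `liftsToCurve`, and `submittedKeyIsReal` are hypothetical names; the stand-in engine returns the constant x = 1 rather than a real DKG output, and the curve check uses only `math/big`:

```go
package main

import (
	"bytes"
	"fmt"
	"math/big"
)

// dkgEngine is a hypothetical one-method view of the engine interface.
type dkgEngine interface {
	RunDKGWithSeed(seed []byte) ([]byte, error)
}

// recordingEngine delegates to inner and keeps a copy of the returned key.
type recordingEngine struct {
	inner    dkgEngine
	captured []byte
}

func (r *recordingEngine) RunDKGWithSeed(seed []byte) ([]byte, error) {
	key, err := r.inner.RunDKGWithSeed(seed)
	if err == nil {
		r.captured = append([]byte(nil), key...)
	}
	return key, err
}

// fixedEngine stands in for the real engine; it returns x = 1, which is
// the x-coordinate of a secp256k1 point, not an actual DKG output.
type fixedEngine struct{}

func (fixedEngine) RunDKGWithSeed(_ []byte) ([]byte, error) {
	key := make([]byte, 32)
	key[31] = 1
	return key, nil
}

// liftsToCurve reports whether xOnly is a non-zero field element x for
// which x^3 + 7 has a square root modulo the secp256k1 field prime.
func liftsToCurve(xOnly []byte) bool {
	if len(xOnly) != 32 {
		return false
	}
	// p = 2^256 - 2^32 - 977
	p := new(big.Int).Lsh(big.NewInt(1), 256)
	p.Sub(p, new(big.Int).Lsh(big.NewInt(1), 32))
	p.Sub(p, big.NewInt(977))

	x := new(big.Int).SetBytes(xOnly)
	if x.Sign() == 0 || x.Cmp(p) >= 0 {
		return false
	}
	rhs := new(big.Int).Exp(x, big.NewInt(3), p)
	rhs.Add(rhs, big.NewInt(7))
	rhs.Mod(rhs, p)
	// ModSqrt returns nil when rhs is not a square modulo p.
	return new(big.Int).ModSqrt(rhs, p) != nil
}

// submittedKeyIsReal combines the byte-for-byte and curve checks.
func submittedKeyIsReal(submitted, captured []byte) bool {
	return len(captured) > 0 && bytes.Equal(submitted, captured) && liftsToCurve(submitted)
}

func main() {
	engine := &recordingEngine{inner: fixedEngine{}}
	key, err := engine.RunDKGWithSeed([]byte("seed"))
	fmt.Println(err, submittedKeyIsReal(key, engine.captured))
}
```

The wrapper copies the key instead of retaining the returned slice, so later mutation of the engine's buffer cannot silently change what the assertion compares against.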

## How to run

```
export FROST_LIB_DIR=<path to libfrost_tbtc dir>
export CGO_ENABLED=1
export CGO_LDFLAGS="-L${FROST_LIB_DIR} -Wl,-rpath,${FROST_LIB_DIR} -lfrost_tbtc"
export KEEP_CORE_FROST_REQUIRE_CGO=true
go test -tags "frost_native frost_tbtc_signer cgo" -count=1 -v \
  -run 'TestFrostDKGCoordinatorChainEndToEnd_RealCgo' ./pkg/tbtc/
```

Proof lines from a passing run:

```
STEP 2: emitting FrostDKGStarted seed=0x42ef0705...
STEP 3+4: chain received SubmitFrostDKGResult x-only=3971f8481d56...d03d wallet=00...453a
STEP 4: SubmitFrostDKGResult observed on-chain
LOAD-BEARING: real cgo DKG x-only key 3971f8481d56...d03d landed on-chain via the coordinator
STEP 5: wallet 00...453a registered on-chain
--- PASS: TestFrostDKGCoordinatorChainEndToEnd_RealCgo
```

Without the cgo lib linked the test skips (or fails when
`KEEP_CORE_FROST_REQUIRE_CGO=true`), so it stays inert where real crypto
is unavailable. It is excluded from the plain `frost_native` build, so
the existing coordinator tests are unaffected
(`go test -tags frost_native ./pkg/tbtc/` stays green).
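
The skip-or-fail decision can be sketched as a small classifier. This is an assumption-laden illustration: the sentinel is a local stand-in for `ErrNativeCryptographyUnavailable`, and `classifyProbe`, `probeOutcome`, and `errABIMismatch` are hypothetical names rather than the test's actual helper.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// errNativeCryptographyUnavailable is a local stand-in for the sentinel
// the real engine reports when the library is not linked.
var errNativeCryptographyUnavailable = errors.New("native cryptography unavailable")

// errWrappedUnavailable shows that wrapping keeps the sentinel detectable.
var errWrappedUnavailable = fmt.Errorf("abi preflight: %w", errNativeCryptographyUnavailable)

// errABIMismatch is a hypothetical unrelated failure.
var errABIMismatch = errors.New("unexpected abi version")

type probeOutcome int

const (
	proceed probeOutcome = iota // probe succeeded, run the test
	skip                        // library absent and not required
	fatal                       // library absent but required
	fail                        // any other probe error
)

// classifyProbe maps the preflight result onto the test's next action.
func classifyProbe(err error, requireCgo bool) probeOutcome {
	switch {
	case err == nil:
		return proceed
	case errors.Is(err, errNativeCryptographyUnavailable):
		if requireCgo {
			return fatal
		}
		return skip
	default:
		return fail
	}
}

func main() {
	requireCgo := os.Getenv("KEEP_CORE_FROST_REQUIRE_CGO") == "true"
	fmt.Println(classifyProbe(errWrappedUnavailable, requireCgo))
}
```

Matching with `errors.Is` rather than `==` keeps the classification stable if the engine wraps the sentinel with extra context.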

## Not covered / follow-up

This is an **in-process** rehearsal with one node holding all seats. It
does NOT cover the fully-live path:

- **N distinct node processes** each holding real per-seat custody,
  running the **production distributed DKG** (not the development
  dealer DKG), which is what `TBTC_SIGNER_PROFILE=production` requires.
- A **real chain** (hardhat / Ethereum) with **staked operators and
  sortition** emitting `DkgStarted`, real `SelectFrostGroup`, and
  on-chain result validation / challenge / approve, instead of the
  test-local `FrostDKGChain`.
- **DKG-result signature collection across multiple operators** over a
  live network (here a single operator's signature satisfies the
  reduced group's threshold with no network round-trip).

Those belong in a multi-process / system-test rehearsal and are out of
scope for this in-process integration test.

🤖 Generated with [Claude Code](https://claude.com/claude-code)