Skip to content

Feature: per-skill output contracts — standardize what each stage returns/persists/surfaces (don't leave it to model discretion) #204

Description

@joshft

Problem

Across the workflow skills (/cspec, /creview-spec, /ctdd, /cverify, /cupdate-arch, /cdocs, /cauto, /cprune, …) what gets surfaced back to the user at the end of each stage is largely left to model judgment — "surface whatever it thinks is useful." The result is an inconsistent, unpredictable experience: the same stage returns a terse line in one run and a wall of detail in another; important items (decisions made, things deferred, artifacts written, what to do next) are sometimes surfaced and sometimes silently omitted; and the structure differs skill-to-skill, so the user never builds a stable mental model of "what I get at this stage."

This is the shared root of several already-tracked failures — e.g. findings not persisted (#94), calibration telemetry silently dropped (#189), forked skills stalling after presenting proposals (#90). Each is an instance of the same gap: there is no declared contract for what a stage MUST return, persist, and surface, so an omission is a model judgment call rather than a detectable violation.

Proposal

Define an explicit output contract per skill/stage: a declared, ideally machine-checkable specification of what each stage must return to the user and must persist, with a uniform top-level structure shared across all skills. The model fills the content; the contract fixes the shape and the required-to-surface set.

A candidate uniform contract — every user-facing stage ends with these, in this order:

  1. Result — one line: the stage outcome (e.g. pass / fail / blocked / N findings / advanced to <phase>).
  2. What changed / produced — the concrete deliverables of this stage.
  3. Needs your attention — decisions made autonomously, escalations, deferred items, anything requiring a human call. Always rendered (even if "none") — never omitted at model discretion. This is the load-bearing section: it's exactly what gets silently dropped today.
  4. Artifacts — paths to everything written/updated (report, findings, manifest, …).
  5. Next step — the single recommended next action.

Pair the human-readable rendering with a small machine-readable companion (structured fields / a fenced block) so a contract violation is mechanically detectable — the same way /cauto already emits an AUTONOMOUS_DECISIONS_START/END block and the verification report follows a fixed section layout. Generalize that one-off discipline into a framework-level contract that every skill declares and is checked against.

Why

Scope / rollout

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions