Skip to content

Validate its own generated schemas#1017

Open
cds-amal wants to merge 2 commits into
codama-idl:mainfrom
cds-io:fix/validators-nested-defined-type-link
Open

Validate its own generated schemas#1017
cds-amal wants to merge 2 commits into
codama-idl:mainfrom
cds-io:fix/validators-nested-defined-type-link

Conversation

@cds-amal

@cds-amal cds-amal commented Jul 1, 2026

Copy link
Copy Markdown

Summary and Context

I spent most of my day chasing a validation issue for my own project and decided to look at how codama validation works.

getValidationItemsVisitor throws CodamaError: Expected node of kind [definedTypeLinkNode], got [...] on any IDL where a definedTypeLinkNode sits below the top of the tree: under a struct field, an instruction argument, an array element, or a PDA seed. That is most real programs, and six of this repo's own committed test IDLs hit it, including example-idl.json, system-program-idl.json, token-idl.json, and token-2022-idl.json.

The root cause is a single ordering issue in how the visitor threads its NodeStack, and the same issue silently corrupted every recorded ValidationItem.path. The fix restores the invariant the other eleven recordNodeStackVisitor consumers already follow.

Disclosure: I traced the issue, hacked the fix and used LLM to polish.

The invariant

When a visitor threads a context through the tree (here, the NodeStack of ancestors), it has to commit the current node into that context as it enters the node, before anything reads it. Two readers run on arrival: the node's own logic, and, further down, its children. So the node must be on the stack for its own resolution, as well as its children's.

What went wrong

This visitor was the only one to run its per-node logic before committing the node to the stack, so the validation ran in the gap between "arrived at the node" and "recorded the node." (In pipe terms it applied extendVisitor after recordNodeStackVisitor; getByteSizeVisitor, setInstructionAccountDefaultValuesVisitor, and the nine others do the reverse.)

That one gap produced two symptoms:

  1. The crash. visitDefinedTypeLink calls linkables.has(stack.getPath(node.kind)). getPath(kind) asserts the top of the stack is kind; the link node wasn't committed yet, so the top was its parent, giving Expected node of kind [definedTypeLinkNode], got [structFieldTypeNode] (or arrayTypeNode, or variablePdaSeedNode: always exactly the link's parent).

  2. Off-by-one paths. Every validationItem(..., stack) snapshotted the stack at that same pre-commit moment, so each ValidationItem.path omitted the node it points at, contradicting the documented contract ("the path of nodes that led to the node above, including the node itself").

Regression

The crash isn't original to the file. Tracing the link check:

commit check crashes?
#7 (Add validators package) linkables.has(node, stack), passes node explicitly no
#270 (Use NodePath in LinkableDictionary) linkables.has(stack.getPath()), no assertion no
#275 (Refactor NodeStack and NodePath) linkables.has(stack.getPath(node.kind)), asserts stack top yes

#275 made getPath(kind) assert the stack's top kind and switched this call to pass node.kind. That assertion holds under the standard composition (node already committed), which this visitor happened not to have, so the call began throwing. The off-by-one paths are older, dating to the original composition, but stayed invisible until #275's assertion surfaced the underlying staleness.

The fix

Commit the node before the per-node logic runs, matching every other visitor: recordNodeStackVisitor and recordLinkablesOnFirstVisitVisitor now wrap the validation logic instead of the reverse. stack.getPath(node.kind) resolves, and recorded paths satisfy the contract. The change is a two-line reordering of the pipe.

Tests

  • Two regression tests build the smallest crashing shape from the node constructors (a defined type whose struct field links to another). Both throw before the fix. One asserts a resolvable link validates cleanly; the other asserts a missing link is reported as an error (Pointing to a missing defined type named "missing"), so the fix restores the validator's ability to catch a broken link, not just stop the crash.
  • The two existing tests asserted the off-by-one paths ([] for the program node, [node] for the nested items); they are corrected to the documented behavior ([node], [node, tupleNode], [node, structNode]).

Verification against the repo's own corpus

Running the validator over every committed rootNode IDL in packages/dynamic-client/test/programs/idls/: before the fix, 6 of 12 throw; after, all 12 validate.

Suggestion: guard our own output in CI

Since six committed IDLs crashed the validator, a small guard would catch this class of regression early: a test that runs getValidationItemsVisitor over the repo's own fixture IDLs and asserts it doesn't throw. Happy to add it in this PR or a follow-up, wherever you'd prefer it to live.

cds-amal added 2 commits July 1, 2026 19:30
getValidationItemsVisitor throws `Expected node of kind [definedTypeLinkNode]`
whenever it walks a definedTypeLinkNode that sits below the top of the tree: a
struct field, an argument, an array element, or a PDA seed. That is most real
programs, and six of this repo's own test IDLs (example-idl, system, token,
token-2022, mpl-token-metadata, pmp) hit it.

Add two tests that build the smallest such shape from the node constructors,
both of which throw today: one asserts a resolvable link validates cleanly, the
other asserts a missing link is reported as an error. So the fix must restore
the visitor's ability to catch a broken link, not merely stop the crash.
…rules

getValidationItemsVisitor was the only visitor to run its per-node logic before
committing the current node to the stack, so validation ran in the gap between
arriving at a node and recording it. Two bugs fell out of that single ordering:

  - visitDefinedTypeLink calls stack.getPath(node.kind), which asserts the top
    of the stack is the link node; the link node was not committed yet, so the
    top was its parent, and any nested definedTypeLink (struct field, argument,
    array, PDA seed) threw `Expected node of kind [definedTypeLinkNode]`. Six of
    this repo's own test IDLs hit it, including example-idl.json, System, Token,
    and Token-2022.
  - every validationItem recorded an ancestors-only path, missing the node
    itself, which contradicts the documented ValidationItem.path contract
    ("the path of nodes that led to the node above, including the node itself").

Commit the node before the per-node logic runs, matching the eleven other
recordNodeStackVisitor consumers (getByteSizeVisitor, and the rest): the node
is on the stack when the visit runs, so getPath(node.kind) resolves, missing
links are reported instead of throwing, and recorded paths satisfy the
contract. The two existing tests asserted the off-by-one paths, so they are
corrected to the documented behaviour.
@changeset-bot

changeset-bot Bot commented Jul 1, 2026

Copy link
Copy Markdown

🦋 Changeset detected

Latest commit: d88b551

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 16 packages
Name Type
@codama/validators Patch
codama Patch
@codama/dynamic-address-resolution Patch
@codama/dynamic-client Patch
@codama/dynamic-instructions Patch
@codama/errors Patch
@codama/node-types Patch
@codama/nodes Patch
@codama/visitors-core Patch
@codama/visitors Patch
@codama/dynamic-codecs Patch
@codama/dynamic-parsers Patch
@codama/fragments Patch
@codama/nodes-from-anchor Patch
@codama/renderers-core Patch
@codama/cli Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@cds-amal

cds-amal commented Jul 3, 2026

Copy link
Copy Markdown
Author

@lorisleiva , @mikhd -- Dispatch from the Heart of Gold...

Don’t panic! The validator isn't broken in any profound metaphysical sense; it merely wandered into one of those delightfully improbable corners of the universe where swapping two visitors changes whether six perfectly respectable IDLs exist. There's a PR containing the appropriate improbability correction, should anyone be passing through.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant