Trius-AI/Spectre

Spectre

Agentic software formation framework. Instead of asking a coding agent to write a system, Spectre copies a minimal self-modifying agent and has it transform itself into the target system — then seals the result by removing its own coding ability.

How it works

Formation Spec (JSON)          Verifier Agent
       │                            │
       ▼                            │
┌─────────────┐                     │
│  Seed Agent │── modifies itself ──│
│  (copy of   │── per formation     │
│  template)  │── spec & criteria   │
└─────────────┘                     │
       │                            │
       ▼                            ▼
  Each step:              Check compilation,
  read source,            run acceptance criteria,
  modify files,           detect regressions ─── revert if failed
  run tests                          │
       │                             ▼
       │                      All criteria met?
       │                         │           │
       │                        Yes          No
       │                         │           │
       ▼                         ▼           └── revert, retry
┌─────────────┐            ┌──────────┐
│   Seal      │            │  Repeat  │
│  strip self-│            │  steps   │
│  mod tools  │            └──────────┘
└─────────────┘
       │
       ▼
┌─────────────┐
│   Final     │  No modifySelf, readSelf, or runTest.
│  Artifact   │  Self-modification code completely removed.
│  (sealed)   │  Domain tools + LLM reasoning remain.
└─────────────┘

The formation loop

  1. Copy template — the seed agent template is copied to a workspace
  2. Each step — the seed agent runs as a subprocess:
    • Reads its own source via readSelf
    • Decides modifications via LLM reasoning
    • Writes changes via modifySelf
    • Self-tests via runTest
  3. Verify — after each step, the verifier:
    • Checks compilation (tsc --noEmit)
    • Runs acceptance criteria (structural or behavioral)
    • Re-checks previously-passed criteria (regression detection)
    • Reverts the step on any failure
  4. Seal — once all criteria pass:
    • Strips all code between @seal:remove-start / @seal:remove-end markers
    • Deletes files marked @seal:remove-file
    • Cleans dead imports
    • Verifies the sealed artifact still compiles
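
The control flow of the four steps above can be sketched as a small driver. This is a minimal sketch, not Spectre's actual formation.ts: the names StepResult, LoopHooks, and runFormation are hypothetical, and it assumes that a "failure" means either a compilation error or a regression.

```typescript
// Hypothetical types for illustration; none of these names come from Spectre's source.
interface StepResult {
  compiles: boolean;
  passed: string[]; // ids of acceptance criteria passing after this step
}

interface LoopHooks {
  runStep(step: number): void; // seed agent modifies its own workspace
  verify(): StepResult;        // compilation + acceptance criteria
  revert(): void;              // undo the last step
  seal(): void;                // strip self-modification code
}

function runFormation(
  criteriaIds: string[],
  maxSteps: number,
  hooks: LoopHooks,
): { sealed: boolean; steps: number } {
  const passedSoFar = new Set<string>();
  for (let step = 1; step <= maxSteps; step++) {
    hooks.runStep(step);
    const result = hooks.verify();
    const nowPassing = new Set(result.passed);
    // A regression is a criterion that passed earlier but fails now.
    const regressed = Array.from(passedSoFar).some((id) => !nowPassing.has(id));
    if (!result.compiles || regressed) {
      hooks.revert(); // assumption: any failure discards the whole step
      continue;
    }
    result.passed.forEach((id) => passedSoFar.add(id));
    if (criteriaIds.every((id) => passedSoFar.has(id))) {
      hooks.seal();
      return { sealed: true, steps: step };
    }
  }
  return { sealed: false, steps: maxSteps };
}
```

Passing the side effects in as hooks keeps the loop itself easy to test without an LLM, a compiler, or git.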

Quick start

# Prerequisites: Node.js 22+, Ollama running with glm-5.1:cloud

npm install

# Run formation
npx tsx src/index.ts form \
  --spec ./specs/example-task-planner.json \
  --model glm-5.1:cloud \
  --max-steps 10 \
  --output ./artifacts

# Verify an artifact
npx tsx src/index.ts verify \
  --workspace ./artifacts/task-planner \
  --spec ./specs/example-task-planner.json

# Seal an artifact manually
npx tsx src/index.ts seal \
  --workspace ./artifacts/task-planner

# Run a sealed artifact
npx tsx src/index.ts run \
  --workspace ./artifacts/task-planner \
  "Create a Node.js project with package.json and index.ts"

# Evaluate a sealed artifact against its spec's test scenarios
npx tsx src/index.ts evaluate \
  --workspace ./artifacts/task-planner \
  --spec ./specs/example-task-planner.json

Continuing a formation

If a formation is interrupted (Ctrl+C, crash, max steps reached), resume it with --resume:

npx tsx src/index.ts form \
  --spec ./specs/example-task-planner.json \
  --resume ./artifacts/task-planner \
  --model glm-5.1:cloud \
  --max-steps 20

When resuming, Spectre:

  • Reuses the existing workspace — no template copy, no overwrite
  • Skips npm install if node_modules/ already exists
  • Resumes step numbering from the git commit history
  • Restores previously passed criteria by running a quick verification on the current state
  • Appends to the existing formation-log.jsonl
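
As an illustration of resuming step numbering from the commit history, the sketch below derives the next step number from a list of commit subjects. The "step N: ..." subject format and the nextStepNumber helper are assumptions made for this example; the commit format Spectre actually uses may differ.

```typescript
// Assumption: each formation step is committed with a subject like "step 7: ...".
// Other commits (initial commit, reseed commits) are ignored.
function nextStepNumber(commitSubjects: string[]): number {
  let highest = 0;
  for (const subject of commitSubjects) {
    const match = /^step (\d+)\b/.exec(subject);
    if (match) {
      highest = Math.max(highest, Number(match[1]));
    }
  }
  return highest + 1;
}
```

With no matching commits the function returns 1, so a fresh workspace and a resumed one can share the same code path.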

Improving a completed project

When you resume a project that has already met all criteria and is sealed, Spectre detects this and asks what you want to do:

This project has already met all acceptance criteria and is sealed.

What would you like to do?
  1. Leave as-is (no changes)
  2. Auto-improve (LLM suggests and implements one improvement)
  3. Directed improvement (you specify what to improve)

Enter choice [1/2/3]:

The options behave as follows:

  • Leave as-is — exits immediately, no changes
  • Auto-improve — re-seeds the project (restores self-modification tools + @seal:remove markers), has the LLM propose and implement one meaningful improvement, then re-seals
  • Directed improvement — same as auto-improve, but you provide the improvement direction

You can skip the prompt with --improve:

# Auto-improve with LLM-chosen direction
npx tsx src/index.ts form \
  --spec ./specs/example-task-planner.json \
  --resume ./artifacts/task-planner \
  --improve ""

# Directed improvement
npx tsx src/index.ts form \
  --spec ./specs/example-task-planner.json \
  --resume ./artifacts/task-planner \
  --improve "Add retry logic with exponential backoff to all tool calls"

How re-seeding works

Re-seeding restores the self-modification scaffolding into a sealed artifact:

  1. Copies self-modify.ts back into src/tools/
  2. Patches agent.ts — adds the registerSelfModifyTools import and self-modify tool routing, wrapped in @seal:remove markers
  3. Patches main.ts — adds the --formation flag handling, wrapped in @seal:remove markers
  4. Patches system-prompt.ts — adds a formation mode section before the domain prompt, wrapped in @seal:remove markers
  5. Commits as reseed: restored self-modification capability
  6. Runs formation with the improvement direction as the spec description
  7. Seals again when done
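
The patch steps above all wrap restored code in @seal:remove markers so that the next seal strips it again. A minimal sketch of that wrapping follows; wrapInSealMarkers and prependSealable are hypothetical helpers, and real patching would likely insert the snippet at a specific location rather than at the top of the file.

```typescript
const RESEED_START = "// @seal:remove-start";
const RESEED_END = "// @seal:remove-end";

// Wraps a snippet so that a later seal pass removes it again.
function wrapInSealMarkers(snippet: string): string {
  return [RESEED_START, snippet.trim(), RESEED_END].join("\n");
}

// Hypothetical patch helper: prepends the wrapped snippet to a source file.
function prependSealable(source: string, snippet: string): string {
  return wrapInSealMarkers(snippet) + "\n" + source;
}
```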

Formation spec

A JSON file describing the target system:

{
  "name": "my-agent",
  "description": "What this agent should do",
  "domain": {
    "tools": [
      {
        "name": "myTool",
        "description": "What the tool does",
        "inputSchema": { "type": "object", "properties": { ... } },
        "outputSchema": { "type": "object", "properties": { ... } }
      }
    ],
    "behaviors": ["List of behavioral requirements"],
    "llmUsage": [
      {
        "trigger": "When LLM should be called",
        "purpose": "What the LLM decides"
      }
    ]
  },
  "acceptanceCriteria": [
    {
      "id": "criterion-id",
      "description": "What must be true",
      "testScenario": {
        "input": "Test input",
        "expectedBehavior": "Expected output"
      }
    }
  ]
}

Acceptance criteria come in two types:

  • Behavioral ("input": "some test input"): the verifier generates a Vitest test from the scenario and runs it
  • Structural ("input": "N/A - structural check"): the verifier uses the LLM to analyze whether the source code satisfies the criterion
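
For readers working with specs in code, the JSON shape above can be mirrored as TypeScript types. This is a sketch derived only from the example in this README, not a copy of the project's types.ts. The criterionKind helper is hypothetical and assumes that an input beginning with "N/A" marks a structural check.

```typescript
interface ToolSpec {
  name: string;
  description: string;
  inputSchema: Record<string, unknown>;
  outputSchema: Record<string, unknown>;
}

interface AcceptanceCriterion {
  id: string;
  description: string;
  testScenario: { input: string; expectedBehavior: string };
}

interface FormationSpec {
  name: string;
  description: string;
  domain: {
    tools: ToolSpec[];
    behaviors: string[];
    llmUsage: { trigger: string; purpose: string }[];
  };
  acceptanceCriteria: AcceptanceCriterion[];
}

type CriterionKind = "behavioral" | "structural";

// Assumption: an input that starts with "N/A" marks a structural check;
// anything else is treated as a behavioral test scenario.
function criterionKind(criterion: AcceptanceCriterion): CriterionKind {
  return criterion.testScenario.input.trim().startsWith("N/A")
    ? "structural"
    : "behavioral";
}
```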

Regression detection

The verifier remembers which criteria passed in previous steps. If a previously-passing criterion fails after a new modification, it's flagged as a regression and the step is reverted — even if new criteria pass.
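
The rule can be expressed in a few lines. The sketch below is illustrative only: CriterionResult, findRegressions, and shouldRevert are hypothetical names, not the verifier's actual API.

```typescript
interface CriterionResult {
  id: string;
  passed: boolean;
}

// Returns ids that passed in an earlier step but fail now.
function findRegressions(
  previouslyPassed: ReadonlySet<string>,
  current: CriterionResult[],
): string[] {
  return current
    .filter((result) => !result.passed && previouslyPassed.has(result.id))
    .map((result) => result.id);
}

// A step is reverted on any regression, regardless of newly passing criteria.
function shouldRevert(
  previouslyPassed: ReadonlySet<string>,
  current: CriterionResult[],
): boolean {
  return findRegressions(previouslyPassed, current).length > 0;
}
```

A criterion that has never passed is not a regression when it fails, which is why the check consults the set of previously passed ids rather than the full criteria list.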

Architecture

spectre/
├── src/
│   ├── index.ts           # CLI (form, verify, seal, run, evaluate)
│   ├── formation.ts       # Orchestrates the formation loop
│   ├── verifier.ts        # Checks compilation, criteria, regression
│   ├── seal.ts            # Strips @seal:remove markers, deletes files
│   ├── llm.ts             # Ollama client with retry/backoff
│   ├── git.ts             # Git init, commit, revert operations
│   ├── types.ts           # Shared type definitions
│   └── template/          # Seed agent template (copied to workspace)
│       ├── package.json
│       ├── tsconfig.json
│       └── src/
│           ├── main.ts           # Entry point (formation mode + runtime)
│           ├── agent.ts          # Agent loop (LLM + tool calling)
│           ├── llm.ts            # Per-artifact LLM client
│           ├── compaction.ts     # Context window management
│           ├── logger.ts         # Structured + visual logging
│           ├── system-prompt.ts  # System prompt (formation section sealable)
│           ├── types.ts
│           └── tools/
│               ├── index.ts      # Tool registry
│               ├── domain.ts     # Domain tools (filled during formation)
│               └── self-modify.ts # @seal:remove-file — stripped during sealing
├── specs/
│   └── example-task-planner.json
└── tests/
    ├── e2e.test.ts         # Full lifecycle test
    ├── seal.test.ts        # Sealing logic
    ├── git.test.ts         # Git operations
    ├── markers.test.ts     # @seal:remove marker constants
    └── spec.test.ts        # Formation spec schema

Sealing markers

Code that should be removed during sealing is annotated:

// @seal:remove-start
import { registerSelfModifyTools } from "./tools/self-modify.js";
// ...all formation/self-modification code...
// @seal:remove-end

// This code remains after sealing:
export async function runAgent(userMessage: string, llm: LLMClient) {
  // ...
}

Files that should be entirely deleted:

// @seal:remove-file
// entire file is deleted during sealing
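
A line-based pass is enough to illustrate how these markers could be applied to a single file. This is a minimal sketch rather than the logic in seal.ts: sealSource is a hypothetical helper, and it does not cover dead-import cleanup or the post-seal compilation check.

```typescript
const REMOVE_START = "@seal:remove-start";
const REMOVE_END = "@seal:remove-end";
const REMOVE_FILE = "@seal:remove-file";

// Returns the sealed source, or null when the whole file should be deleted.
function sealSource(source: string): string | null {
  if (source.includes(REMOVE_FILE)) {
    return null;
  }
  const kept: string[] = [];
  let removing = false;
  for (const line of source.split("\n")) {
    if (line.includes(REMOVE_START)) {
      removing = true;
      continue; // drop the marker line itself
    }
    if (line.includes(REMOVE_END)) {
      removing = false;
      continue;
    }
    if (!removing) {
      kept.push(line);
    }
  }
  if (removing) {
    throw new Error("unterminated @seal:remove-start block");
  }
  return kept.join("\n");
}
```

Throwing on an unterminated block is a design choice of this sketch: silently stripping to the end of the file would hide a marker typo.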

Context compaction

The agent manages its context window with three strategies:

  1. Tool result truncation: results longer than 4000 characters are truncated before entering the conversation
  2. Threshold compaction: when the conversation exceeds 40 messages or roughly 80K characters, the LLM summarizes the middle of the conversation, keeping the system prompt and recent context
  3. Emergency compaction: on context-length errors, an aggressive compaction keeps only the system prompt, a summary, and the last 4 messages

Emergency compaction never triggers on a 503 response, which is ambiguous and could be a server issue. It triggers only on explicit context-length error messages from the LLM.
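
The thresholds above translate into a few small helpers. The sketch below uses the numbers from this section, but everything else is an assumption for illustration: the Message shape, the function names, the "[truncated]" suffix, and the choice to store the summary as an assistant message. It also assumes the first message is the system prompt.

```typescript
interface Message {
  role: "system" | "user" | "assistant" | "tool";
  content: string;
}

const MAX_TOOL_RESULT_CHARS = 4000;
const MAX_MESSAGES = 40;
const MAX_TOTAL_CHARS = 80000;
const EMERGENCY_KEEP = 4;

// Strategy 1: cap tool results before they enter the conversation.
function truncateToolResult(result: string): string {
  if (result.length <= MAX_TOOL_RESULT_CHARS) {
    return result;
  }
  return result.slice(0, MAX_TOOL_RESULT_CHARS) + "\n[truncated]";
}

// Strategy 2: decide whether threshold compaction should run.
function needsCompaction(messages: Message[]): boolean {
  const totalChars = messages.reduce((sum, m) => sum + m.content.length, 0);
  return messages.length > MAX_MESSAGES || totalChars > MAX_TOTAL_CHARS;
}

// Strategy 3: keep the system prompt, a summary, and the last 4 messages.
// Assumes messages[0] is the system prompt.
function emergencyCompact(messages: Message[], summary: string): Message[] {
  const [system, ...rest] = messages;
  return [
    system,
    { role: "assistant", content: summary },
    ...rest.slice(-EMERGENCY_KEEP),
  ];
}
```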

Convergence: checkCriteria + step timeout

The seed agent can call checkCriteria() at any time during a formation step. This contacts a criteria check server running alongside the formation orchestrator, which runs the actual verifier and returns real-time pass/fail results for each acceptance criterion. When allMet: true is returned, the agent should stop modifying files.

The criteria check server:

  • Starts on a random port at the beginning of each formation step
  • URL is passed to the agent via SPECTRE_CRITERIA_URL environment variable
  • Serves GET /check-criteria — returns { allMet, compilationOk, criteria: [{ id, passed, reason }] }
  • Stops when the formation step ends

Step timeout: each formation step has a 5-minute timeout. If the agent doesn't converge within that time, it's terminated (SIGTERM, then SIGKILL after 5s), and the step is treated as a failure.
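
A client for the criteria check endpoint might look like the sketch below. The response fields come from the description above; the type names, the helper functions, the string type of reason, and the assumption that SPECTRE_CRITERIA_URL holds a base URL to which /check-criteria is appended are all hypothetical.

```typescript
interface CriterionStatus {
  id: string;
  passed: boolean;
  reason: string; // assumption: a human-readable explanation
}

interface CriteriaResponse {
  allMet: boolean;
  compilationOk: boolean;
  criteria: CriterionStatus[];
}

// Minimal fetch shape so the sketch does not depend on DOM or Node typings.
type FetchLike = (
  url: string,
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

async function checkCriteria(
  baseUrl: string,
  fetchFn: FetchLike,
): Promise<CriteriaResponse> {
  const res = await fetchFn(baseUrl.replace(/\/$/, "") + "/check-criteria");
  if (!res.ok) {
    throw new Error("criteria server returned " + res.status);
  }
  return (await res.json()) as CriteriaResponse;
}

// The agent should stop modifying files once every criterion is met.
function shouldStopModifying(response: CriteriaResponse): boolean {
  return response.allMet;
}

// Ids of criteria that still fail, useful for deciding what to work on next.
function failingCriteria(response: CriteriaResponse): string[] {
  return response.criteria.filter((c) => !c.passed).map((c) => c.id);
}
```

In a Node.js 22 process the global fetch can be passed as fetchFn, with the base URL read from the SPECTRE_CRITERIA_URL environment variable.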

Error handling

  • LLM errors: up to 3 retries with exponential backoff on 429, 5xx, and connection errors; on persistent failure the agent loop degrades gracefully
  • Compilation failures: the step is reverted via git; formation aborts after 3 consecutive failures
  • Verification failures: the step is reverted; regression detection keeps previously passing criteria from breaking
  • Git revert safety: handles single-commit repos and falls back to git checkout if reset fails
  • Protected files: modifySelf rejects writes to critical files (self-modify.ts, package.json, tsconfig.json, etc.)
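
The LLM retry policy can be illustrated with two pure helpers. The retry count and the retryable status codes come from the list above; the 1-second base delay, the doubling factor, and the function names are assumptions, since this README only says the backoff is exponential.

```typescript
const MAX_RETRIES = 3;

// 429 (rate limit) and any 5xx response are treated as transient.
function isRetryableStatus(status: number): boolean {
  return status === 429 || (status >= 500 && status <= 599);
}

// Assumption: delays double from a 1s base (attempt 0 -> 1s, 1 -> 2s, 2 -> 4s).
function backoffDelayMs(attempt: number, baseMs: number = 1000): number {
  return baseMs * Math.pow(2, attempt);
}

// The full wait schedule for one request, one entry per retry.
function retrySchedule(baseMs: number = 1000): number[] {
  const delays: number[] = [];
  for (let attempt = 0; attempt < MAX_RETRIES; attempt++) {
    delays.push(backoffDelayMs(attempt, baseMs));
  }
  return delays;
}
```

Connection errors carry no status code, so a real client would need a separate check for them alongside isRetryableStatus.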

Development

npm install
npm test          # run all tests
npm run lint      # typecheck
npm run dev       # run CLI via tsx

License

MIT

About

LLM-driven agentic software development framework
