Sub-agent model: override is silently dropped in BYOK / custom-provider mode #3891

Description

@tsm-harmoney

Describe the bug

Summary

When the CLI is configured with a custom model provider (BYOK) and a custom agent declares
a model: different from the primary session model, the sub-agent does not run on its declared
model. Instead it silently falls back to the session model, and no error is raised.

Root cause: in BYOK mode the CLI never requests the provider's /models catalog. The per-agent
executor therefore receives an empty availableModels list, cannot validate the requested
model, and falls back to sessionModel.

The same scenario works correctly under standard GitHub authentication, where the CLI fetches
the catalog, populates availableModels, and honors the per-agent override. The defect is specific
to the BYOK code path.


Expected vs. actual behavior

| | Expected | Actual (BYOK) |
| --- | --- | --- |
| Sub-agent model | the model: declared in the agent definition | the primary session model |
| If declared model unavailable | explicit "model not available" error | silent fallback, no error |
| Provider /models | fetched and used to validate models | never requested |

Root cause

  1. The BYOK code path never fetches the provider's /models catalog. The fake provider received zero
    GET /models requests, and the "Listed models" log line contains only the session model, listed
    twice. Under standard auth the catalog is fetched and availableModels is populated.
  2. An empty availableModels triggers a silent fallback to sessionModel. The executor logs
    'using first candidate "claude-opus-4.6" (CAPI will validate)' but then resolves to the session
    model instead of sending the requested one.

Suggested fixes (either or both)

  1. In BYOK mode, fetch the custom provider's /models and use it to populate availableModels,
    so per-agent overrides can be validated and honored (the provider already exposes the endpoint).
  2. When availableModels is empty, trust definitionModel — send it upstream (as the
    "CAPI will validate" log already implies) rather than substituting sessionModel. If the
    provider rejects the model, surface an explicit error instead of falling back silently.
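
The two suggested fixes can be sketched as follows. This is only an illustration of the intended behavior, not the CLI's actual code: fetchAvailableModels and resolveAgentModel are hypothetical names, and the sketch assumes the provider serves an OpenAI-style catalog ({ data: [{ id }] }) at <base URL>/models, as the fake provider in the repro does.

```javascript
// Hypothetical sketch; none of these names come from the CLI source.

// Fix 1: ask an OpenAI-compatible provider for its model catalog.
// fetchImpl is injectable so the function can be exercised without a network.
async function fetchAvailableModels(baseUrl, apiKey, fetchImpl = fetch) {
  const url = baseUrl.replace(/\/+$/, '') + '/models';
  const res = await fetchImpl(url, { headers: { Authorization: 'Bearer ' + apiKey } });
  if (!res.ok) return [];                 // catalog unknown, not "no models"
  const body = await res.json();
  return (body.data || []).map(m => m.id);
}

// Fix 2: never substitute sessionModel for a declared model.
function resolveAgentModel({ definitionModel, sessionModel, availableModels }) {
  if (!definitionModel) return sessionModel;                    // agent declares no override
  if (availableModels.length === 0) return definitionModel;     // trust it; upstream validates
  if (availableModels.includes(definitionModel)) return definitionModel;
  throw new Error('model not available: "' + definitionModel + '"');
}
```

With the values from this report (definitionModel "claude-opus-4.6", sessionModel "claude-sonnet-4.6", availableModels []), resolveAgentModel returns "claude-opus-4.6" rather than the session model, which is the behavior the "CAPI will validate" log line already implies.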

Affected version

GitHub Copilot CLI 1.0.63

Steps to reproduce the behavior

Steps to reproduce

The repro is fully self-contained: a ~40-line fake OpenAI-compatible provider (so no real model
account or quota is needed) plus one custom agent. The fake provider logs the model field of
every inference request — this is the ground truth for which model is actually used on the wire.

1. Create a working directory and a custom sub-agent

mkdir -p repro/.github/agents && cd repro

cat > .github/agents/model-probe.agent.md <<'EOF'
---
model: claude-opus-4.6
description: probe subagent that reports the model it runs on
---
PROBE_SUBAGENT_MARKER. Output exactly one line and nothing else: probe done.
EOF

The primary session will run on claude-sonnet-4.6; this sub-agent declares claude-opus-4.6.
The two must differ so the override is observable.

2. Create the fake OpenAI-compatible provider

fake-provider.js:

const http = require('http');
const fs = require('fs');
const PORT = 8799;
const LOG = __dirname + '/server.log';
const MODELS = ['claude-sonnet-4.6','claude-opus-4.6','claude-opus-4.8','gpt-5.5','gpt-5.4'];
const log = s => fs.appendFileSync(LOG, s + '\n');
fs.writeFileSync(LOG, '');

// Exact shape the real CLI emits for a Task-tool dispatch (function name "task").
const TASK_ARGS = JSON.stringify({
  name: 'model-probe', prompt: 'go', agent_type: 'model-probe',
  description: 'probe', mode: 'sync'
});

const json = (res, o) => { res.writeHead(200, {'Content-Type':'application/json'}); res.end(JSON.stringify(o)); };
const sse  = (res, chunks) => {
  res.writeHead(200, {'Content-Type':'text/event-stream'});
  for (const c of chunks) res.write('data: ' + JSON.stringify(c) + '\n\n');
  res.write('data: [DONE]\n\n'); res.end();
};

http.createServer((req, res) => {
  let body = ''; req.on('data', d => body += d); req.on('end', () => {
    const url = req.url || '';
    if (req.method === 'GET' && url.includes('models')) {
      log('[GET ' + url + '] -> serving catalog of ' + MODELS.length + ' models');
      return json(res, { object:'list', data: MODELS.map(id => ({ id, object:'model', owned_by:'fake' })) });
    }
    if (req.method === 'POST' && url.includes('chat/completions')) {
      let p = {}; try { p = JSON.parse(body); } catch (_) {}
      const model = p.model, stream = !!p.stream;
      const msgs = JSON.stringify(p.messages || []);
      const isSub = msgs.includes('PROBE_SUBAGENT_MARKER');
      const dispatched = msgs.includes('Agent completed') || msgs.includes('agent_id');
      log('[POST chat/completions] wire_model=' + model + '  stream=' + stream +
          '  isSubAgent=' + isSub + '  dispatchedAlready=' + dispatched);
      const id = 'chatcmpl-fake';
      if (isSub || dispatched) {                     // terminate this turn with plain text
        const content = isSub ? 'probe done' : 'done';
        if (stream) return sse(res, [
          { id, object:'chat.completion.chunk', created:0, model, choices:[{ index:0, delta:{ role:'assistant', content }, finish_reason:null }] },
          { id, object:'chat.completion.chunk', created:0, model, choices:[{ index:0, delta:{}, finish_reason:'stop' }] },
        ]);
        return json(res, { id, object:'chat.completion', created:0, model,
          choices:[{ index:0, message:{ role:'assistant', content }, finish_reason:'stop' }],
          usage:{ prompt_tokens:1, completion_tokens:1, total_tokens:2 } });
      }
      const tc = { id:'call_task', type:'function', function:{ name:'task', arguments: TASK_ARGS } };
      if (stream) return sse(res, [
        { id, object:'chat.completion.chunk', created:0, model, choices:[{ index:0, delta:{ role:'assistant', content:null, tool_calls:[{ index:0, ...tc }] }, finish_reason:null }] },
        { id, object:'chat.completion.chunk', created:0, model, choices:[{ index:0, delta:{}, finish_reason:'tool_calls' }] },
      ]);
      return json(res, { id, object:'chat.completion', created:0, model,
        choices:[{ index:0, message:{ role:'assistant', content:null, tool_calls:[tc] }, finish_reason:'tool_calls' }],
        usage:{ prompt_tokens:1, completion_tokens:1, total_tokens:2 } });
    }
    log('[other ' + req.method + ' ' + url + ']');
    json(res, { ok:true });
  });
}).listen(PORT, '127.0.0.1', () => log('fake provider listening on ' + PORT));

Note: the fake provider re-emits the task tool-call every primary turn, so after the sub-agent
returns, the primary will keep re-dispatching in a loop. That is harmless for the repro (the
evidence is captured on the first sub-agent turn), but you should stop the CLI after a few seconds
and kill the server, or add a stronger termination condition.
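
One possible stronger termination condition is a server-side gate that allows the task tool-call to be emitted only once per server run. The sketch below is an assumption about how that could be wired in, not part of the original repro: makeDispatchGate is a hypothetical helper, and it assumes the first primary request the CLI sends is the one that should dispatch the sub-agent.

```javascript
// Hypothetical helper: allow the "task" tool-call at most once per server run.
function makeDispatchGate() {
  let dispatched = false;
  return function shouldDispatch(messagesJson) {
    // Sub-agent turns never dispatch; they are answered with plain text.
    if (messagesJson.includes('PROBE_SUBAGENT_MARKER')) return false;
    if (dispatched) return false;   // primary already dispatched once
    dispatched = true;
    return true;
  };
}
```

In fake-provider.js this would mean creating the gate once at module level (const shouldDispatch = makeDispatchGate();) and taking the plain-text branch whenever shouldDispatch(msgs) returns false, in place of the isSub || dispatched check.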

3. Run the CLI in BYOK mode and dispatch the sub-agent

node fake-provider.js &        # start the fake provider

export COPILOT_PROVIDER_BASE_URL=http://127.0.0.1:8799
export COPILOT_PROVIDER_TYPE=openai
export COPILOT_PROVIDER_WIRE_API=completions
export COPILOT_PROVIDER_API_KEY=dummy
export COPILOT_MODEL=claude-sonnet-4.6

copilot -p "Use the Task tool to invoke the custom agent named 'model-probe' exactly once with the prompt 'go'. Then reply done." \
  --model claude-sonnet-4.6 --allow-all-tools --no-ask-user \
  --log-level all --log-dir ./logs --no-color -s

# stop the CLI after it dispatches once (Ctrl-C), then:
pkill -f fake-provider.js

4. Inspect the evidence

# Which model did each request actually use on the wire?
grep "chat/completions" server.log | sort | uniq -c

# How the CLI resolved the sub-agent's model:
grep -E "definitionModel|availableModels|Using model|Listed models" logs/process-*.log | head
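
As an alternative to reading the grep output by eye, the wire log can be summarized programmatically. The following is a minimal sketch that assumes the exact line format written by fake-provider.js above; summarizeWireLog is a hypothetical helper name.

```javascript
// Hypothetical helper: count GET /models hits and per-model request counts,
// split by primary vs. sub-agent, from the text of server.log.
function summarizeWireLog(text) {
  const summary = { modelsRequests: 0, primary: {}, subAgent: {} };
  for (const line of text.split('\n')) {
    if (line.startsWith('[GET ') && line.includes('models')) {
      summary.modelsRequests += 1;
      continue;
    }
    const m = line.match(/wire_model=(\S+).*isSubAgent=(true|false)/);
    if (!m) continue;               // startup line, other requests, blanks
    const bucket = m[2] === 'true' ? summary.subAgent : summary.primary;
    bucket[m[1]] = (bucket[m[1]] || 0) + 1;
  }
  return summary;
}
```

Calling summarizeWireLog(require('fs').readFileSync('server.log', 'utf8')) after a run gives one object to inspect: a fixed CLI should show modelsRequests greater than 0 (if the fix fetches the catalog) and claude-opus-4.6 under subAgent.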

Observed results (BYOK — BROKEN)

Fake provider wire log (server.log): the sub-agent's request goes out as the session model:

[POST chat/completions] wire_model=claude-sonnet-4.6  stream=true  isSubAgent=false  ...   ← primary
[POST chat/completions] wire_model=claude-sonnet-4.6  stream=true  isSubAgent=true   ...   ← SUB-AGENT
  • GET /models requests received: 0 — the CLI never asked the provider for its catalog.
  • Every sub-agent request used claude-sonnet-4.6; zero claude-opus-4.6 requests were sent.

CLI session log (logs/process-*.log):

Agent "model-probe": definitionModel="claude-opus-4.6", sessionModel="claude-sonnet-4.6", availableModels=[]
Agent "model-probe": no available models list, using first candidate "claude-opus-4.6" (CAPI will validate)
Using model: claude-sonnet-4.6
Listed models: [claude-sonnet-4.6,claude-sonnet-4.6]
  • definitionModel="claude-opus-4.6" is read correctly from the agent definition.
  • availableModels=[] — empty.
  • Despite logging that it will use "claude-opus-4.6" and let CAPI validate it, the executor
    resolves to claude-sonnet-4.6 (the session model) and sends that on the wire.

Control: validate without the custom provider (standard GitHub auth — WORKS)

This is the same scenario with no custom provider, used to confirm the override mechanism
itself is fine and the defect is BYOK-specific. It uses normal GitHub Copilot authentication and
real models, so it requires a Copilot account that has both claude-sonnet-4.6 and an
opus-family model available (and available request quota).

Steps to validate

  1. Reuse the same working directory and the same custom agent from the main repro:
    # .github/agents/model-probe.agent.md  (unchanged)
    ---
    model: claude-opus-4.6
    description: probe subagent that reports the model it runs on
    ---
    PROBE_SUBAGENT_MARKER. Output exactly one line and nothing else: probe done.
    
  2. Ensure you are authenticated and not in BYOK mode — clear any custom-provider env vars so
    the CLI uses GitHub's model routing:
    copilot login                      # if not already authenticated
    unset COPILOT_PROVIDER_BASE_URL COPILOT_PROVIDER_TYPE COPILOT_PROVIDER_WIRE_API \
          COPILOT_PROVIDER_API_KEY COPILOT_MODEL
    
  3. Run the CLI with the session model set to claude-sonnet-4.6 and dispatch the sub-agent via the
    Task tool (the sub-agent declares claude-opus-4.6):
    rm -rf logs
    copilot -p "Use the Task tool to invoke the custom agent named 'model-probe' exactly once with the prompt 'go'. Then reply done." \
      --model claude-sonnet-4.6 --allow-all-tools --no-ask-user \
      --log-level all --log-dir ./logs --no-color -s
    
  4. Inspect how the CLI resolved the sub-agent's model:
    grep -E "definitionModel|availableModels|Using model|Listed models" logs/process-*.log | head
    

Expected behavior

Expected output

Agent "model-probe": definitionModel="claude-opus-4.6", sessionModel="claude-sonnet-4.6", availableModels=[claude-sonnet-4.6, claude-opus-4.8, claude-opus-4.7, claude-opus-4.6, gpt-5.5, gpt-5.4, gpt-5.3-codex]
Using model: claude-opus-4.6
Listed models: [claude-opus-4.6,Claude Opus 4.6], [claude-opus-4.7,Claude Opus 4.7], [claude-opus-4.8,Claude Opus 4.8], [claude-sonnet-4.6,Claude Sonnet 4.6], [gpt-5.3-codex,…], …

The CLI fetches the full catalog, availableModels has 7 entries, and the sub-agent runs on
claude-opus-4.6 while the primary stays claude-sonnet-4.6. The override is honored.

The only difference between the broken (BYOK) and working (standard auth) runs is the presence of
the custom provider — confirming the defect lives in the BYOK code path, not the override
mechanism. The same availableModels=[] → fallback signature would be the assertion to watch when
verifying a fix in BYOK mode.
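
A fix could be checked mechanically by parsing the two log lines quoted above. This is a sketch under assumptions: parseResolution is a hypothetical helper, it relies on the log wording shown in this report, and it assumes the first "Using model:" line after the agent line belongs to the sub-agent.

```javascript
// Hypothetical helper: extract the sub-agent model resolution from CLI log text.
function parseResolution(logText) {
  const decl = logText.match(
    /definitionModel="([^"]*)", sessionModel="([^"]*)", availableModels=\[([^\]]*)\]/);
  if (!decl) return null;
  // Only look at "Using model:" lines that follow the agent line.
  const used = logText.slice(decl.index).match(/Using model: (\S+)/);
  if (!used) return null;
  const availableModels = decl[3].split(',').map(s => s.trim()).filter(Boolean);
  return {
    definitionModel: decl[1],
    sessionModel: decl[2],
    availableModels,
    usedModel: used[1],
    overrideHonored: used[1] === decl[1],
  };
}
```

On the BYOK log excerpt in this report the result has availableModels of length 0 and overrideHonored false; on the standard-auth excerpt it has 7 available models and overrideHonored true.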

Additional context

Environment

  • GitHub Copilot CLI v1.0.63 (copilot --version).
  • OS: reproduced on Windows 11 (git-bash); not OS-specific.
  • Node.js available (only to run the tiny fake provider).
  • BYOK env:
    • COPILOT_PROVIDER_BASE_URL=http://127.0.0.1:8799
    • COPILOT_PROVIDER_TYPE=openai
    • COPILOT_PROVIDER_WIRE_API=completions
    • COPILOT_PROVIDER_API_KEY=dummy
    • COPILOT_MODEL=claude-sonnet-4.6
