feat(sandbox,providers): add aws-bedrock as a recognized inference provider#1704
feat(sandbox,providers): add aws-bedrock as a recognized inference provider#1704st-gr wants to merge 8 commits into
Conversation
|
All contributors have signed the DCO ✍️ ✅ |
|
I have read the DCO document and I hereby sign the DCO. |
|
recheck |
| # substitute the `{region}` placeholder. Operators in other regions | ||
| # override via the `BEDROCK_BASE_URL` config-key the same way the | ||
| # `anthropic` provider accepts `ANTHROPIC_BASE_URL`. | ||
| - host: bedrock-runtime.us-east-1.amazonaws.com |
There was a problem hiding this comment.
You can also use wildcards/glob patterns here unless you feel that is too broad of access
There was a problem hiding this comment.
Good flag — this turned out tighter than I expected. crates/openshell-sandbox/src/l7/mod.rs:356-389 (validate_host_wildcard) enforces:
- Wildcards may only appear in the first DNS label (line 372: "host wildcard may only appear in the first DNS label") — so
bedrock-runtime.*.amazonaws.comis rejected. - TLD/single-label wildcards (
*,*.com) are rejected.
The only wildcard form that passes is *.amazonaws.com, which is the "too broad" case you flagged (matches S3, DynamoDB, IAM, every AWS service). Validator and your concern agree.
Two ways to land it:
a) Keep bedrock-runtime.us-east-1.amazonaws.com:443 (current state). Operators in other regions override via BEDROCK_BASE_URL, mirroring ANTHROPIC_BASE_URL for anthropic. Tight, no false declarative promise.
b) File a follow-up to allow bedrock-runtime.{region}.amazonaws.com placeholder substitution at the YAML loader (or relax the first-label rule specifically for YAML-defaulted endpoints), so one entry covers all regions. Out of scope for this PR; happy to open the issue.
My read is (a) for this PR + a follow-up issue for (b). Push back if you'd prefer otherwise.
There was a problem hiding this comment.
Yeah (a) makes sense here!
PR Review StatusValidation: this PR remains project-valid as a concentrated Bedrock-compatible provider/profile and sandbox L7 routing update. The prior legacy Head SHA: Review findings:
Docs: missing for direct provider and inference-routing behavior. Checks: DCO is passing. Branch Checks and Helm Lint are still waiting for Next state: |
Addresses johntmyers's review on NVIDIA#1704: net-new providers should land via the v2 YAML profile only and should NOT require changes to the legacy `ProviderDiscoverySpec` registry. - Delete `crates/openshell-providers/src/providers/aws_bedrock.rs` (the legacy SPEC + `test_discovers_env_credential!` invocation). - Drop `pub mod aws_bedrock;` from `crates/openshell-providers/src/providers/mod.rs`. - Drop `registry.register(providers::aws_bedrock::SPEC)` from `crates/openshell-providers/src/lib.rs`. Kept: - `providers/aws-bedrock.yaml` and the `include_str!` in `BUILT_IN_PROFILE_YAMLS` (`profiles.rs`) — the v2 path. `discover_from_profile()` (`crates/openshell-providers/src/discovery.rs`) picks up AWS_* env vars via `discovery.credentials` in the YAML. - L7 router patterns in `crates/openshell-sandbox/src/l7/inference.rs` — orthogonal to the provider registry. The discovery test in the deleted file goes with it; v2 doesn't have an established per-provider env-var-pickup unit test pattern, and other YAML-only registrations (none today, but this is the new direction) won't carry one either. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: st-gr <38470677+st-gr@users.noreply.github.com>
Re-check After Reviewer UpdateI re-evaluated latest head Disposition: partially resolved. The legacy Remaining items:
Checks: DCO is passing. Branch Checks and Helm Lint are still waiting for Next state: |
…le assertion Two fixes from johntmyers's gator-agent re-check on NVIDIA#1704: 1. `providers/aws-bedrock.yaml`: add `aws_session_token` to `discovery.credentials`. The credential is declared in the profile but was missing from the discovery scan list, so Providers v2 `--from-existing` would silently drop temporary AWS credentials (STS / IRSA scenarios). 2. `crates/openshell-server/src/grpc/provider.rs`: update the static `list_provider_profiles_returns_built_in_profile_categories` assertion to include `aws-bedrock` at alphabetical position 0. Adding `providers/aws-bedrock.yaml` to BUILT_IN_PROFILE_YAMLS made the prior `["claude-code", "github", "nvidia"]` expectation stale. Remaining blockers from the same review (deferred to follow-up commits): `inference::profile_for` registration for aws-bedrock, user-facing provider + inference-routing docs, and an `upsert_cluster_inference_route` integration test. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> Signed-off-by: st-gr <38470677+st-gr@users.noreply.github.com>
Addresses johntmyers's blocking review feedback on PR NVIDIA#1704: "aws-bedrock still is not wired into the managed inference.local route registry. profile_for only registers openai, anthropic, and nvidia, so inference set --provider <aws-bedrock-provider> will reject this provider before the new sandbox L7 patterns can be used." Approach: register aws-bedrock as a *bridge-fronted* upstream — the router does not inject any auth header on outbound requests; the configured BEDROCK_BASE_URL is expected to point at a translating bridge / Bedrock-compatible proxy that handles auth in its own pod. This is the shape the L7 patterns commit (8b30211) and the YAML profile (6b51e1a) were designed for. SigV4 signing for direct AWS Bedrock is a separate follow-up; see PR thread. Changes: - core::inference::AuthHeader: add `None` variant for upstreams that authenticate themselves. - core::inference: add AWS_BEDROCK_PROFILE static + register in profile_for. Default base URL is bedrock-runtime.us-east-1, override via BEDROCK_BASE_URL config-key (mirrors ANTHROPIC_BASE_URL pattern). Empty credential_key_names + auth: None means no router-side credential lookup at route time. - router::backend: handle AuthHeader::None as a no-op (skip auth injection). - server::inference::resolve_provider_route: gate find_provider_api_key on auth != None. aws-bedrock providers with empty credentials now resolve cleanly. Updated the unsupported-type error message to include aws-bedrock in the supported list. - server::inference tests: add positive upsert_cluster_route_succeeds_for_aws_bedrock_without_api_key test covering the new code path end-to-end (provider with empty creds + BEDROCK_BASE_URL config → upsert succeeds → resolved route has empty api_key + provider_type aws-bedrock + bridge URL). - core::inference tests: profile_for_known_types covers aws-bedrock, case-insensitive lookup, plus three new aws-bedrock-specific tests (auth: None, no credential keys, bedrock-specific protocols). - docs/sandboxes/inference-routing.mdx: header forwarding row mentions aws-bedrock has no passthrough headers; new tabs in Supported API Patterns (InvokeModel + InvokeModelWithResponseStream) and Create a Provider (with the bridge-fronted shape note + SigV4 deferral). - docs/sandboxes/manage-providers.mdx: new row in Supported Provider Types table; new row in Supported Inference Providers table. Verification (in dev container): - cargo check -p openshell-core -p openshell-router -p openshell-server: clean - cargo test -p openshell-core --lib inference: 14/14 pass (incl. 3 new) - cargo test -p openshell-server --lib inference::tests::upsert: 6/6 pass (incl. new aws-bedrock test) - cargo fmt --check: clean - cargo clippy --all-targets -D warnings: clean Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> Signed-off-by: st-gr <38470677+st-gr@users.noreply.github.com>
|
@johntmyers — pushed This push:
Deferred to follow-up PRs (will file issues for tracking):
Verification (run locally in the rust-1.95 dev container): |
Re-check After Author UpdateI re-evaluated latest head Disposition: partially resolved. The previous Remaining items:
Checks: DCO is passing. Branch Checks and Helm Lint are still waiting for Next state: |
BlockedGator is blocked by merge conflicts on the PR branch. GitHub currently reports Next action: @st-gr, please rebase or merge |
Resolves merge conflicts after main fast-forwarded 554 commits past the branch base. Conflicting files and resolutions: - crates/openshell-core/src/inference.rs: kept aws-bedrock additions (AWS_BEDROCK_PROTOCOLS const, AWS_BEDROCK_PROFILE static) alongside the new VERTEX_AI_PROTOCOLS const and normalize_inference_provider_type function. Added "aws-bedrock" to that normalize function so profile_for resolves it through the same canonicalization path as the other inference profiles. - crates/openshell-server/src/grpc/provider.rs: merged the static built-in profile assertion to include both the upstream additions (codex, copilot, cursor, google-vertex-ai, pypi) and aws-bedrock, alphabetically. - crates/openshell-server/src/inference.rs: kept the auth: None gating for aws-bedrock api-key lookup, applied around the new CredentialLookup-based call signature; preserved the vertex-ai dispatch added upstream. Updated the unsupported-type error message to list both google-vertex-ai and aws-bedrock. - docs/sandboxes/inference-routing.mdx: combined the upstream Vertex Claude rawPredict header note with the aws-bedrock no-passthrough note in the header forwarding row. Drive-by fix for one finding from gator-agent's NVIDIA#1704 re-check on 4ab587f: AWS_BEDROCK_PROFILE.default_base_url is now an empty string rather than the real AWS Bedrock URL. Without BEDROCK_BASE_URL config, resolve_provider_route's existing empty-base_url check now rejects the provider rather than silently forwarding prompts to AWS with auth: None. Once SigV4 lands, the default can revert. Verification (in dev container): - cargo check -p openshell-core -p openshell-router -p openshell-server: clean Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> Signed-off-by: st-gr <38470677+st-gr@users.noreply.github.com>
Addresses four findings from gator-agent's NVIDIA#1704 re-check on 4ab587f: - **Item 5** (YAML collects unused AWS creds): mark all four AWS credentials `required: false` and clear `discovery.credentials`. Bridge-fronted routing intentionally does not consume AWS credentials, so `--from-existing` no longer scans for them. The credentials remain in the schema (not deleted) so the SigV4 follow-up can flip them back without a schema migration. Added a multi-line description that names the bridge-fronted shape and the SigV4 deferral so readers don't have to cross-reference the PR thread. - **Item 3** (docs show command that the CLI rejects): rewrite the Create-a-Provider example for AWS Bedrock to use the actual required shape — placeholder `--credential AWS_ACCESS_KEY_ID= unused-bridge-fronted-shape` plus the `--config BEDROCK_BASE_URL`. The placeholder satisfies the gRPC handler's `provider.credentials.is_empty()` rejection without expanding server-side validation; the router ignores it on the outbound path because `auth: AuthHeader::None` skips header injection. Operators see a clearly-labeled placeholder in `provider get` output. - **Item 1** (validator probe): document `--no-verify` as required for `openshell inference set --provider <aws-bedrock>` since the default validation probe doesn't recognize the `aws_bedrock_invoke` / `aws_bedrock_invoke_stream` protocols. Doc now shows the full `provider create` + `inference set --no-verify` flow with rationale for both decisions inline. - **Item 6** (docs polish): `inference-routing.mdx` summary row now lists AWS Bedrock alongside NVIDIA, Anthropic, Vertex AI, and OpenAI-compatible providers, with the bridge-fronted caveat inline. Test additions in `crates/openshell-server/src/inference.rs`: - Renamed the existing aws-bedrock test from `..._without_api_key` to `..._with_bridge_url` and updated it to use a placeholder credential (mirroring the doc-recommended pattern operators will copy-paste). The `auth: None` path still produces an empty `api_key` on the resolved route — the test now documents that the credential is *stored* but not *used*. - Added `upsert_cluster_route_rejects_aws_bedrock_without_bedrock_base_url`: the negative half of johntmyers' "successfully used by upsert_cluster_inference_route or intentionally rejected with a clear documented error" ask. With `default_base_url: ""` and no `BEDROCK_BASE_URL` config, route resolution returns `InvalidArgument` naming the missing base_url rather than silently forwarding prompts to AWS Bedrock with no usable auth. Verification (in dev container): - cargo test -p openshell-core --lib inference: 18/18 (incl. 3 new) - cargo test -p openshell-server --lib inference::tests::upsert: 8/8 (incl. 2 new aws-bedrock cases — positive + negative) - cargo fmt --check: clean - cargo clippy --all-targets -D warnings: clean Item 2 (router-side enforcement of operator-configured Bedrock model path, replacing the current verbatim path forwarding + body-only model rewrite) is the remaining blocker and is genuinely separable — it touches the L7 router with streaming-aware test coverage. Deferring to its own commit so the security-critical change gets the review attention it deserves. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> Signed-off-by: st-gr <38470677+st-gr@users.noreply.github.com>
Re-check After Author UpdateI re-evaluated latest head Disposition: partially resolved. The merge-conflict blocker is resolved ( Remaining items:
Checks: DCO is passing. Branch Checks and Helm Lint are still waiting for Next state: |
Closes the security-blocking item from gator-agent's NVIDIA#1704 re-check on 4ab587f: "Bedrock carries the model id in /model/{modelId}/invoke, but the router currently forwards the caller's original path and only rewrites JSON body model. That lets sandbox code choose a different upstream model than the operator-configured route model, and may also mutate native Bedrock request bodies incorrectly." Two changes in `prepare_backend_request`: 1. **Path rewrite for Bedrock routes.** Before computing the upstream URL, parse the inbound path's `/model/<id>/invoke[-with-response-stream]` shape and substitute the operator-configured `route.model` for the caller-supplied model segment. Sandbox code that hardcodes a different model still works (we don't reject on mismatch), but the operator's configured model is what reaches the upstream / bridge. If the inbound path is somehow not a recognized Bedrock shape on a Bedrock route (the L7 pattern detector upstream of the router should never produce this combination), reject with RouterError::Internal naming the offending path rather than forwarding verbatim. 2. **Skip body-model injection for Bedrock routes.** The existing body rewriter unconditionally inserts `route.model` into the JSON body for non-Vertex routes. AWS Bedrock InvokeModel encodes the model in the URL path; the body is the raw provider-specific payload (Anthropic Messages for Claude, Mistral payload for Mistral, etc.) and must not be mutated. The branch ordering is now: needs_vertex_anthropic_version → strip body model + inject anthropic_version; route_is_bedrock → leave body alone; else → inject route.model (existing default). New helpers, all in `crates/openshell-router/src/backend.rs`: - `route_is_bedrock(route)` — true when route.protocols contains aws_bedrock_invoke or aws_bedrock_invoke_stream. - `parse_bedrock_invocation_path(path)` — returns Some((model_id, "/invoke" | "/invoke-with-response-stream")) for paths matching the recognized Bedrock shapes. Strips query strings. Rejects empty model ids and multi-segment ids (defense-in-depth matching the L7 pattern detector's existing guards). - `rewrite_bedrock_path(route, path)` — returns the path with the caller's model segment replaced by route.model. Test coverage in the same file (9 new tests): - parse_bedrock_invocation_path: positive cases for both invoke variants, query-string stripping; negative cases for empty model id, multi-segment id, unknown action, wrong prefix, missing slash. - route_is_bedrock: matches both protocol variants singly and combined; rejects openai_chat_completions. - rewrite_bedrock_path: substitutes operator model on both invoke variants; returns None for non-Bedrock paths. - bedrock_route_rewrites_model_in_path_and_preserves_body (wiremock end-to-end): caller sends /model/some-other-model/invoke with a body containing model: "caller-supplied-model-name". Mock asserts the upstream receives /model/<operator-model>/invoke and the body's model field is the caller's value (NOT route.model) — proves both the path rewrite and the body preservation. - bedrock_route_streaming_rewrites_model_in_path: same contract for invoke-with-response-stream. - bedrock_route_rejects_non_bedrock_path: defense-in-depth coverage of the Internal-error path when a Bedrock route receives a path that doesn't match Bedrock shape. Verification (in dev container): - cargo test -p openshell-router --lib: 53/53 (incl. 9 new) - cargo fmt --check: clean - cargo clippy -p openshell-core -p openshell-router -p openshell-server --all-targets -- -D warnings: clean Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> Signed-off-by: st-gr <38470677+st-gr@users.noreply.github.com>
Adds two patterns to `default_patterns()` so the supervisor's L7
inference router recognizes the Bedrock InvokeModel URL shape and
forwards matched requests to the registered upstream:
- `POST /model/{modelId}/invoke` → aws_bedrock_invoke
- `POST /model/{modelId}/invoke-with-response-stream` → aws_bedrock_invoke_stream
The `{modelId}` segment is wildcarded by extending `detect_inference_pattern`
to handle one middle `/*/` segment in addition to the existing trailing
`/*`. The wildcard is constrained to a single non-empty path segment to
avoid path-traversal liabilities — `/model//invoke` and `/model/a/b/invoke`
both no-match.
Without this, sandboxes running Claude Code in its native Bedrock mode
(`CLAUDE_CODE_USE_BEDROCK=1`, `ANTHROPIC_BEDROCK_BASE_URL`, AWS-style
auth) hit the supervisor with `403 connection not allowed by policy`
because their URL doesn't match `/v1/*` shapes. The fix unblocks
operators wanting to register direct AWS Bedrock, an in-cluster
Bedrock-compatible bridge, or a Bedrock-emulating LiteLLM as
`--type aws-bedrock` providers.
Tests cover: positive matches for invoke + invoke-with-response-stream,
query-string handling, GET rejection, empty-segment rejection,
multi-segment rejection, and unknown-action rejection.
Companion changes (provider discovery spec + YAML profile) follow in
the next commit.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: st-gr <38470677+st-gr@users.noreply.github.com>
Adds `aws-bedrock` to the built-in provider catalog so operators can run `openshell provider create --type aws-bedrock --credential ...` and have the gateway treat it as a first-class inference provider alongside `anthropic`, `openai`, etc. - `providers/aws-bedrock.yaml`: YAML profile declaring four credentials (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN, AWS_REGION). Default endpoint is `bedrock-runtime.us-east-1.amazonaws.com:443`; operators in other regions or running against a Bedrock-compatible proxy override via the operator-supplied `BEDROCK_BASE_URL` config-key (mirrors `ANTHROPIC_BASE_URL` for the `anthropic` provider). - `crates/openshell-providers/src/providers/aws_bedrock.rs`: the `ProviderDiscoverySpec` so `openshell provider create --auto-providers` picks up AWS_* env vars from local credentials. - `crates/openshell-providers/src/providers/mod.rs`: register the module. - `crates/openshell-providers/src/lib.rs`: register the SPEC in the default registry alongside the other providers. - `crates/openshell-providers/src/profiles.rs`: include the new YAML in `BUILT_IN_PROFILE_YAMLS`. What this PR explicitly does NOT add (intentionally separated for review-size reasons; will follow up): - A SigV4 signer in `openshell-router`. The current change simply declares the protocol; a follow-up PR adds outbound SigV4 signing using the `aws-sigv4` crate and a new `auth_style: sigv4` validator branch in profiles.rs. Operators who don't need SigV4 (e.g. an in-cluster bridge that ignores it and authenticates separately to the upstream) can use this PR today. - Body translation between Bedrock InvokeModel shape and other inference shapes. The router treats Bedrock requests as opaque pass-through; if the operator's upstream is real AWS Bedrock it speaks Bedrock natively, if it's a translating bridge the bridge does any conversion server-side. - `BEDROCK_BASE_URL` placeholder substitution in the YAML loader. Today the YAML's `host` is a literal default; operators override with the config-key the same way `ANTHROPIC_BASE_URL` works. Tested: `cargo test -p openshell-providers` (35 tests green) and `cargo test -p openshell-sandbox --lib l7::inference` (40 tests green including the seven new aws_bedrock cases from the previous commit). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: st-gr <38470677+st-gr@users.noreply.github.com>
Addresses johntmyers's review on NVIDIA#1704: net-new providers should land via the v2 YAML profile only and should NOT require changes to the legacy `ProviderDiscoverySpec` registry. - Delete `crates/openshell-providers/src/providers/aws_bedrock.rs` (the legacy SPEC + `test_discovers_env_credential!` invocation). - Drop `pub mod aws_bedrock;` from `crates/openshell-providers/src/providers/mod.rs`. - Drop `registry.register(providers::aws_bedrock::SPEC)` from `crates/openshell-providers/src/lib.rs`. Kept: - `providers/aws-bedrock.yaml` and the `include_str!` in `BUILT_IN_PROFILE_YAMLS` (`profiles.rs`) — the v2 path. `discover_from_profile()` (`crates/openshell-providers/src/discovery.rs`) picks up AWS_* env vars via `discovery.credentials` in the YAML. - L7 router patterns in `crates/openshell-sandbox/src/l7/inference.rs` — orthogonal to the provider registry. The discovery test in the deleted file goes with it; v2 doesn't have an established per-provider env-var-pickup unit test pattern, and other YAML-only registrations (none today, but this is the new direction) won't carry one either. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: st-gr <38470677+st-gr@users.noreply.github.com>
…le assertion Two fixes from johntmyers's gator-agent re-check on NVIDIA#1704: 1. `providers/aws-bedrock.yaml`: add `aws_session_token` to `discovery.credentials`. The credential is declared in the profile but was missing from the discovery scan list, so Providers v2 `--from-existing` would silently drop temporary AWS credentials (STS / IRSA scenarios). 2. `crates/openshell-server/src/grpc/provider.rs`: update the static `list_provider_profiles_returns_built_in_profile_categories` assertion to include `aws-bedrock` at alphabetical position 0. Adding `providers/aws-bedrock.yaml` to BUILT_IN_PROFILE_YAMLS made the prior `["claude-code", "github", "nvidia"]` expectation stale. Remaining blockers from the same review (deferred to follow-up commits): `inference::profile_for` registration for aws-bedrock, user-facing provider + inference-routing docs, and an `upsert_cluster_inference_route` integration test. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> Signed-off-by: st-gr <38470677+st-gr@users.noreply.github.com>
Addresses johntmyers's blocking review feedback on PR NVIDIA#1704: "aws-bedrock still is not wired into the managed inference.local route registry. profile_for only registers openai, anthropic, and nvidia, so inference set --provider <aws-bedrock-provider> will reject this provider before the new sandbox L7 patterns can be used." Approach: register aws-bedrock as a *bridge-fronted* upstream — the router does not inject any auth header on outbound requests; the configured BEDROCK_BASE_URL is expected to point at a translating bridge / Bedrock-compatible proxy that handles auth in its own pod. This is the shape the L7 patterns commit (8b30211) and the YAML profile (6b51e1a) were designed for. SigV4 signing for direct AWS Bedrock is a separate follow-up; see PR thread. Changes: - core::inference::AuthHeader: add `None` variant for upstreams that authenticate themselves. - core::inference: add AWS_BEDROCK_PROFILE static + register in profile_for. Default base URL is bedrock-runtime.us-east-1, override via BEDROCK_BASE_URL config-key (mirrors ANTHROPIC_BASE_URL pattern). Empty credential_key_names + auth: None means no router-side credential lookup at route time. - router::backend: handle AuthHeader::None as a no-op (skip auth injection). - server::inference::resolve_provider_route: gate find_provider_api_key on auth != None. aws-bedrock providers with empty credentials now resolve cleanly. Updated the unsupported-type error message to include aws-bedrock in the supported list. - server::inference tests: add positive upsert_cluster_route_succeeds_for_aws_bedrock_without_api_key test covering the new code path end-to-end (provider with empty creds + BEDROCK_BASE_URL config → upsert succeeds → resolved route has empty api_key + provider_type aws-bedrock + bridge URL). - core::inference tests: profile_for_known_types covers aws-bedrock, case-insensitive lookup, plus three new aws-bedrock-specific tests (auth: None, no credential keys, bedrock-specific protocols). - docs/sandboxes/inference-routing.mdx: header forwarding row mentions aws-bedrock has no passthrough headers; new tabs in Supported API Patterns (InvokeModel + InvokeModelWithResponseStream) and Create a Provider (with the bridge-fronted shape note + SigV4 deferral). - docs/sandboxes/manage-providers.mdx: new row in Supported Provider Types table; new row in Supported Inference Providers table. Verification (in dev container): - cargo check -p openshell-core -p openshell-router -p openshell-server: clean - cargo test -p openshell-core --lib inference: 14/14 pass (incl. 3 new) - cargo test -p openshell-server --lib inference::tests::upsert: 6/6 pass (incl. new aws-bedrock test) - cargo fmt --check: clean - cargo clippy --all-targets -D warnings: clean Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> Signed-off-by: st-gr <38470677+st-gr@users.noreply.github.com>
Addresses four findings from gator-agent's NVIDIA#1704 re-check on 4ab587f: - **Item 5** (YAML collects unused AWS creds): mark all four AWS credentials `required: false` and clear `discovery.credentials`. Bridge-fronted routing intentionally does not consume AWS credentials, so `--from-existing` no longer scans for them. The credentials remain in the schema (not deleted) so the SigV4 follow-up can flip them back without a schema migration. Added a multi-line description that names the bridge-fronted shape and the SigV4 deferral so readers don't have to cross-reference the PR thread. - **Item 3** (docs show command that the CLI rejects): rewrite the Create-a-Provider example for AWS Bedrock to use the actual required shape — placeholder `--credential AWS_ACCESS_KEY_ID= unused-bridge-fronted-shape` plus the `--config BEDROCK_BASE_URL`. The placeholder satisfies the gRPC handler's `provider.credentials.is_empty()` rejection without expanding server-side validation; the router ignores it on the outbound path because `auth: AuthHeader::None` skips header injection. Operators see a clearly-labeled placeholder in `provider get` output. - **Item 1** (validator probe): document `--no-verify` as required for `openshell inference set --provider <aws-bedrock>` since the default validation probe doesn't recognize the `aws_bedrock_invoke` / `aws_bedrock_invoke_stream` protocols. Doc now shows the full `provider create` + `inference set --no-verify` flow with rationale for both decisions inline. - **Item 6** (docs polish): `inference-routing.mdx` summary row now lists AWS Bedrock alongside NVIDIA, Anthropic, Vertex AI, and OpenAI-compatible providers, with the bridge-fronted caveat inline. Test additions in `crates/openshell-server/src/inference.rs`: - Renamed the existing aws-bedrock test from `..._without_api_key` to `..._with_bridge_url` and updated it to use a placeholder credential (mirroring the doc-recommended pattern operators will copy-paste). The `auth: None` path still produces an empty `api_key` on the resolved route — the test now documents that the credential is *stored* but not *used*. - Added `upsert_cluster_route_rejects_aws_bedrock_without_bedrock_base_url`: the negative half of johntmyers' "successfully used by upsert_cluster_inference_route or intentionally rejected with a clear documented error" ask. With `default_base_url: ""` and no `BEDROCK_BASE_URL` config, route resolution returns `InvalidArgument` naming the missing base_url rather than silently forwarding prompts to AWS Bedrock with no usable auth. Verification (in dev container): - cargo test -p openshell-core --lib inference: 18/18 (incl. 3 new) - cargo test -p openshell-server --lib inference::tests::upsert: 8/8 (incl. 2 new aws-bedrock cases — positive + negative) - cargo fmt --check: clean - cargo clippy --all-targets -D warnings: clean Item 2 (router-side enforcement of operator-configured Bedrock model path, replacing the current verbatim path forwarding + body-only model rewrite) is the remaining blocker and is genuinely separable — it touches the L7 router with streaming-aware test coverage. Deferring to its own commit so the security-critical change gets the review attention it deserves. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> Signed-off-by: st-gr <38470677+st-gr@users.noreply.github.com>
Closes the security-blocking item from gator-agent's NVIDIA#1704 re-check on 4ab587f: "Bedrock carries the model id in /model/{modelId}/invoke, but the router currently forwards the caller's original path and only rewrites JSON body model. That lets sandbox code choose a different upstream model than the operator-configured route model, and may also mutate native Bedrock request bodies incorrectly." Two changes in `prepare_backend_request`: 1. **Path rewrite for Bedrock routes.** Before computing the upstream URL, parse the inbound path's `/model/<id>/invoke[-with-response-stream]` shape and substitute the operator-configured `route.model` for the caller-supplied model segment. Sandbox code that hardcodes a different model still works (we don't reject on mismatch), but the operator's configured model is what reaches the upstream / bridge. If the inbound path is somehow not a recognized Bedrock shape on a Bedrock route (the L7 pattern detector upstream of the router should never produce this combination), reject with RouterError::Internal naming the offending path rather than forwarding verbatim. 2. **Skip body-model injection for Bedrock routes.** The existing body rewriter unconditionally inserts `route.model` into the JSON body for non-Vertex routes. AWS Bedrock InvokeModel encodes the model in the URL path; the body is the raw provider-specific payload (Anthropic Messages for Claude, Mistral payload for Mistral, etc.) and must not be mutated. The branch ordering is now: needs_vertex_anthropic_version → strip body model + inject anthropic_version; route_is_bedrock → leave body alone; else → inject route.model (existing default). New helpers, all in `crates/openshell-router/src/backend.rs`: - `route_is_bedrock(route)` — true when route.protocols contains aws_bedrock_invoke or aws_bedrock_invoke_stream. - `parse_bedrock_invocation_path(path)` — returns Some((model_id, "/invoke" | "/invoke-with-response-stream")) for paths matching the recognized Bedrock shapes. Strips query strings. Rejects empty model ids and multi-segment ids (defense-in-depth matching the L7 pattern detector's existing guards). - `rewrite_bedrock_path(route, path)` — returns the path with the caller's model segment replaced by route.model. Test coverage in the same file (9 new tests): - parse_bedrock_invocation_path: positive cases for both invoke variants, query-string stripping; negative cases for empty model id, multi-segment id, unknown action, wrong prefix, missing slash. - route_is_bedrock: matches both protocol variants singly and combined; rejects openai_chat_completions. - rewrite_bedrock_path: substitutes operator model on both invoke variants; returns None for non-Bedrock paths. - bedrock_route_rewrites_model_in_path_and_preserves_body (wiremock end-to-end): caller sends /model/some-other-model/invoke with a body containing model: "caller-supplied-model-name". Mock asserts the upstream receives /model/<operator-model>/invoke and the body's model field is the caller's value (NOT route.model) — proves both the path rewrite and the body preservation. - bedrock_route_streaming_rewrites_model_in_path: same contract for invoke-with-response-stream. - bedrock_route_rejects_non_bedrock_path: defense-in-depth coverage of the Internal-error path when a Bedrock route receives a path that doesn't match Bedrock shape. Verification (in dev container): - cargo test -p openshell-router --lib: 53/53 (incl. 9 new) - cargo fmt --check: clean - cargo clippy -p openshell-core -p openshell-router -p openshell-server --all-targets -- -D warnings: clean Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> Signed-off-by: st-gr <38470677+st-gr@users.noreply.github.com>
2a8e23b to
a1055e2
Compare
Re-check After Author UpdateI re-evaluated latest head Disposition: partially resolved. The unrelated fork infrastructure and external compute driver changes are removed, the Bedrock route now rewrites the caller path to the operator-configured model, and the docs now describe the bridge-only/no-SigV4 setup. Blocking review feedback remains. Remaining items:
Checks: DCO was passing in the latest cached state. Branch Checks and Helm Lint still need to run after review feedback is resolved; I am not posting Next state: |
Closes the buffered-vs-streaming framing warning from gator-agent's re-check on 4ab587f: "Bedrock InvokeModel should be buffered while InvokeModelWithResponseStream is streaming. Please add framing/coverage so /model/{id}/invoke cannot be corrupted by the streaming proxy's truncation/error-frame behavior." The InferenceApiPattern struct gained a `framing: ResponseFraming` field upstream after the original Bedrock-patterns commit (#22b78cff) landed; the cherry-pick onto current upstream/main left the two Bedrock entries without the new field. Fixed here: - aws_bedrock_invoke (POST /model/{id}/invoke): framing = ResponseFraming::Buffered InvokeModel returns one JSON object the caller decodes whole. Sending it through the streaming proxy would risk a mid-body size-cap truncation or idle-timeout failure appending an SSE error event onto bytes the caller decodes as one JSON body — the same corruption mode that drove the existing embeddings + model-discovery to Buffered. - aws_bedrock_invoke_stream (POST /model/{id}/invoke-with-response-stream): framing = ResponseFraming::Streaming InvokeModelWithResponseStream returns an AWS event-stream of binary chunks; the caller wants chunks incrementally, so the streaming proxy path is correct. Two new tests in `crates/openshell-sandbox/src/l7/inference.rs` pin down the contract: - aws_bedrock_invoke_is_buffered — detect_inference_pattern returns a Buffered pattern for /model/<id>/invoke, with explanatory message naming the corruption mode being prevented. - aws_bedrock_invoke_stream_is_streaming — same shape, asserting Streaming for /model/<id>/invoke-with-response-stream. Verification (in dev container): - cargo check -p openshell-sandbox: clean (was failing on missing `framing` field before this commit) - cargo test -p openshell-sandbox --lib l7::inference::tests::aws_bedrock: 7/7 (incl. 2 new framing tests) - cargo fmt --check: clean - cargo clippy --all-targets -- -D warnings: clean Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> Signed-off-by: st-gr <38470677+st-gr@users.noreply.github.com>
|
@johntmyers — force-rebased the branch onto Why the force-rebase. Your scope concern was structurally correct — the previous branch was based off my fork's Addressing the remaining findings:
Status
Verification (run locally in the rust-1.95 dev container)
Branch Checks / Helm Lint stand by for |
Re-check After Author UpdateI re-evaluated latest head Disposition: partially resolved. The scope contamination is resolved, the route now rewrites caller-supplied Bedrock model paths to the operator-configured model, Remaining items:
Checks: DCO is passing. Branch Checks and Helm Lint are still waiting for Next state: |
Summary
Adds
aws-bedrockas a recognized inference protocol in the supervisor's L7 router and the providers catalog so operators can register a Bedrock-shaped upstream as--type aws-bedrockand route Claude Code Bedrock-mode traffic (POST /model/{id}/invoke[-with-response-stream]) throughinference.local. Without this, sandboxes hit403 "connection not allowed by policy"because no L7 pattern matches Bedrock URLs. The canonical no-SigV4 use case is SAP AI Core deployed Bedrock models (Anthropic models behind a Bedrock-shape API with XSUAA bearer auth instead of SigV4); operators wanting real AWS Bedrock additionally need #1630's proxy-side SigV4 signing.Related Issue
Complementary to #1630 ("Sigv4 credential signing"). #1630 adds proxy-side SigV4 re-signing as a
credential_signing: sigv4policy field. This PR is the URL-pattern half: the supervisor's L7 router needs to recognize Bedrock InvokeModel paths before anything can be signed, regardless of whether the upstream needs SigV4. The two patches don't touch the same files; they're complementary, not overlapping.No upstream tracking issue filed — happy to file one if reviewers prefer.
Changes
crates/openshell-sandbox/src/l7/inference.rs:default_patterns():POST /model/*/invoke(aws_bedrock_invoke) andPOST /model/*/invoke-with-response-stream(aws_bedrock_invoke_stream).detect_inference_patternto support a single middle/*/glob in addition to the existing trailing/*. The middle wildcard matches exactly one non-empty path segment containing no/—/model//invokeand/model/a/b/invokeboth no-match.providers/aws-bedrock.yaml: new YAML profile declaring four credentials (AWS_ACCESS_KEY_ID,AWS_SECRET_ACCESS_KEY,AWS_SESSION_TOKEN,AWS_REGION) and a default endpoint ofbedrock-runtime.us-east-1.amazonaws.com:443. Operators in other regions or pointing at non-AWS Bedrock-compatible upstreams override per-deployment via the operator-suppliedBEDROCK_BASE_URLconfig-key (mirroring how theanthropicprovider acceptsANTHROPIC_BASE_URL).crates/openshell-providers/src/providers/aws_bedrock.rs: theProviderDiscoverySpecso--auto-providerspicks upAWS_*env vars from local credentials.crates/openshell-providers/src/{providers/mod.rs,lib.rs,profiles.rs}: register the new module + SPEC + YAML.Testing
mise run pre-commitpasses — not run end-to-end (misenot on author's dev environment), but the equivalent rust pieces verified independently:cargo fmt --all -- --checkclean;cargo clippy --no-deps -p openshell-providers -p openshell-sandbox --all-targets -- -D warningsclean.crates/openshell-sandbox/src/l7/inference.rs::testscover positive path, query-string handling, GET rejection, empty-segment rejection, multi-segment rejection, unknown-action rejection. Provider-discovery test follows the existingtest_discovers_env_credential!macro convention.cargo test -p openshell-sandbox --lib l7::inference: 40 passed;cargo test -p openshell-providers: 35 passed.aws-bedrockprovider end-to-end requires either real AWS Bedrock with SigV4 (covered by Sigv4 credential signing #1630) or a Bedrock-compatible stub backend. Deferring the E2E test to whichever PR lands second so it can exercise the full URL-pattern + auth path together.Checklist
feat(sandbox):,feat(providers):)default_patterns(); the new YAML profile follows the same shape asclaude-code.yaml/nvidia.yaml. Happy to add a paragraph todocs/sandboxes/manage-providers.mdx(or another spot reviewers prefer) in this PR rather than a doc-only follow-up.Notes for reviewers
Use cases (which PRs you need for which upstream):
credential_signing: sigv4policy field. The two are complementary; this PR is the prerequisite that makes #1630's signing applicable to Bedrock paths.In all three cases,
provider create --type aws-bedrockrequires--no-verifyuntil a Bedrock-aware arm is added tovalidation_probe()incrates/openshell-router/src/backend.rs. That extension is left for a follow-up PR to keep this one focused on the URL-pattern + provider-registration changes.Out of scope (intentional):
{region}placeholder substitution in the YAML loader. Operators override per-deployment viaBEDROCK_BASE_URLconfig-key the same wayANTHROPIC_BASE_URLworks for theanthropicprovider.openshell provider create --type aws-bedrockbecause the registry recognizes the new id; surfacing it in the TUI's provider-type picker is a follow-up.Operator context:
The
st-gr/openshell-driver-kymaHelm chart currently registers its SAP AI Core ↔ Bedrock translation bridge as--type anthropicwith/v1/messageson the inside, becauseaws-bedrockisn't a recognized provider type. The chart therefore carries a server-side Anthropic→Bedrock body translator and a denylist for Anthropic-API-only fields the SAP gateway rejects. After this PR, the bridge becomes a path-translating + auth-substituting pass-through with no body work — the chart's translator code goes away.