feat: add Amazon Bedrock Mantle as a model provider #479

Merged
philmerrell merged 3 commits into develop from feature/mantle-provider on Jun 13, 2026
@philmerrell

Overview

Adds Amazon Bedrock Mantle as a first-class model provider. Mantle is AWS's OpenAI-compatible inference surface for Bedrock-hosted open-weight models (Qwen, GPT-OSS, Gemma, DeepSeek, …). It's distinct from the existing bedrock provider because it speaks the OpenAI wire protocol with a short-term bearer token, not the Converse API with SigV4.

Admins can now browse Mantle models and add them via curated quick-add cards (with a manual escape-hatch form), and users can chat with them like any other managed model.

What's included

Backend (feat(models))

  • apis/shared/bedrock/bearer_token.py — SigV4-presigned CallWithBearerToken bearer token, ported inline from aws-bedrock-token-generator (no new dependency), plus a region+path base-URL helper.
  • GET /admin/mantle/models — browse the live regional Mantle roster via the OpenAI SDK.
  • ModelProvider.MANTLE threaded end to end: model_config translation → agent_factory builds a Strands OpenAIModel against the Mantle base URL → BaseAgent + inference chat pipeline.
  • mantle_endpoint_path on the managed-model shape, persisted and resolved server-authoritatively in _resolve_model_settings, and carried through the paused-turn snapshot for resume correctness.
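A minimal sketch of how the region+path base-URL helper, the server-authoritative resolution, and the paused-turn snapshot could fit together. This is illustrative only and is not the code in this PR: the host template, the `ManagedModel` shape, and the function names `mantle_base_url`, `resolve_model_settings`, `snapshot_turn`, and `resume_base_url` are all assumptions, and the model id is a placeholder.

```python
import json
from dataclasses import dataclass
from typing import Optional

# Assumption: regional Mantle host pattern. Confirm against the AWS docs.
MANTLE_HOST_TEMPLATE = "bedrock-mantle.{region}.api.aws"
ALLOWED_PATHS = ("/v1", "/openai/v1")


def mantle_base_url(region: str, endpoint_path: str) -> str:
    """Join a region and a recorded endpoint path into an OpenAI-style base URL."""
    if endpoint_path not in ALLOWED_PATHS:
        raise ValueError(f"unsupported Mantle endpoint path: {endpoint_path!r}")
    return f"https://{MANTLE_HOST_TEMPLATE.format(region=region)}{endpoint_path}"


@dataclass(frozen=True)
class ManagedModel:
    """Hypothetical persisted model record."""
    model_id: str
    provider: str
    mantle_endpoint_path: Optional[str] = None


def resolve_model_settings(stored: ManagedModel, client_request: dict) -> dict:
    """Build settings from the stored record only.

    client_request is deliberately ignored for the endpoint path, so a
    client cannot steer a model onto the wrong path.
    """
    settings = {"model_id": stored.model_id, "provider": stored.provider}
    if stored.provider == "mantle":
        settings["mantle_endpoint_path"] = stored.mantle_endpoint_path
    return settings


def snapshot_turn(settings: dict) -> str:
    """Serialize the resolved settings when a turn pauses."""
    return json.dumps(settings)


def resume_base_url(snapshot: str, region: str) -> str:
    """Rebuild the same base URL from the snapshot on resume."""
    settings = json.loads(snapshot)
    return mantle_base_url(region, settings["mantle_endpoint_path"])
```

Because the path travels inside the snapshot, a resumed turn does not need to look the model up again to reach the same base URL.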

Infra (feat(infra))

  • Grants the bedrock-mantle:* IAM namespace (mirrors AWS-managed AmazonBedrockMantleInferenceAccess): CreateInference + Get*/List* + CallWithBearerToken on the runtime role; read-only + CallWithBearerToken on the app-API task role.
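The two grants can be summarized as action lists, with a check in the spirit of the integration test. This is a sketch, not the infra code: the constant names and the `allows` helper are hypothetical, resources are omitted, and the matching is simplified (real IAM action matching is case-insensitive).

```python
from fnmatch import fnmatchcase

# Action lists as described in this PR; resource scoping is left out.
RUNTIME_ROLE_ACTIONS = [
    "bedrock-mantle:CreateInference",
    "bedrock-mantle:Get*",
    "bedrock-mantle:List*",
    "bedrock-mantle:CallWithBearerToken",
]
APP_API_ROLE_ACTIONS = [
    "bedrock-mantle:Get*",
    "bedrock-mantle:List*",
    "bedrock-mantle:CallWithBearerToken",
]


def allows(granted: list, action: str) -> bool:
    """Return True if any granted pattern matches the action (simplified)."""
    return any(fnmatchcase(action, pattern) for pattern in granted)


# Both roles can mint the bearer token; only the runtime role can run inference.
for role_actions in (RUNTIME_ROLE_ACTIONS, APP_API_ROLE_ACTIONS):
    assert allows(role_actions, "bedrock-mantle:CallWithBearerToken")
assert allows(RUNTIME_ROLE_ACTIONS, "bedrock-mantle:CreateInference")
assert not allows(APP_API_ROLE_ACTIONS, "bedrock-mantle:CreateInference")
```

Note that a `bedrock:*` grant would not satisfy any of these checks, since the actions live in the separate `bedrock-mantle` namespace.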

Frontend (feat(spa))

  • A Bedrock Mantle catalog tab with curated cards (Qwen3 Coder 30B on /v1, Gemma 4 31B on /openai/v1), and a Mantle-aware model form (endpoint-path selector, caching hidden, model-id datalist seeded from the live roster).

Key design decisions

  • The endpoint path is recorded, not discovered. Mantle serves different models on different OpenAI-compatible paths (/v1 vs /openai/v1) and exposes no API to tell you which — verified against both the data plane (/v1/models returns only {id, created, object, owned_by}) and the absence of a bedrock-mantle boto3 control-plane client. We rejected both a maintained code-level mapping and runtime probing in favor of storing the path per model (sourced from the AWS model card, baked into curated cards, admin-settable in the escape hatch). The wrong path returns a misleading 401 "not enabled for this account" at chat time, so this matters.
  • Caching is intentionally off for Mantle. Prompt caching on Bedrock is model-bound to Anthropic Claude + a few Amazon Nova models, none of which run through the Mantle provider. (Claude/Nova continue to use the bedrock provider where CacheConfig caching already works.)
  • Curated-card data is verified: pricing against the AWS Bedrock pricing page (2026-06); modalities/capabilities/context/path against each model card.
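To illustrate "recorded, not discovered" and "caching off", here is a rough Python sketch. The real curated cards live in the SPA; the `CuratedMantleCard` shape and the helpers `endpoint_path_for` and `caching_enabled` are hypothetical, and only the two display names and paths come from this PR.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CuratedMantleCard:
    """Hypothetical card shape: the endpoint path is stored with the model."""
    display_name: str
    endpoint_path: str


CURATED_MANTLE_MODELS = (
    CuratedMantleCard("Qwen3 Coder 30B", "/v1"),
    CuratedMantleCard("Gemma 4 31B", "/openai/v1"),
)


def endpoint_path_for(display_name: str, manual_path: Optional[str] = None) -> str:
    """Read the recorded path; fall back to the admin-supplied escape-hatch value.

    There is no probing step: an off-catalog model without a manual path is an error.
    """
    for card in CURATED_MANTLE_MODELS:
        if card.display_name == display_name:
            return card.endpoint_path
    if manual_path is None:
        raise LookupError(f"no recorded endpoint path for {display_name!r}")
    return manual_path


def caching_enabled(provider: str, admin_requested: bool) -> bool:
    """Prompt caching is never enabled for Mantle, whatever the form submits."""
    return provider != "mantle" and admin_requested
```

Failing loudly on a missing path is a deliberate choice in this sketch, since a guessed path would surface later as the misleading 401 described above.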

Reviewer notes / deploy

  • Two preconditions for cloud: (1) Bedrock Mantle must be enabled for the account in the target region (an AWS-console action, not a deploy); (2) platform.yml must deploy the bedrock-mantle:* IAM permissions before inference works in the cloud. Verified end-to-end locally against us-west-2 (token mint → /v1/models → chat completions on both /v1 and /openai/v1).
  • Known Gemma-4 behavior (Chat Completions path): reasoning happens but its trace isn't returned (Responses-API only), and parallel tool calls aren't supported (one per turn).
  • Follow-up (not in this PR): expose reasoning_effort as a tunable for Gemma — needs a small shared-catalog fix (numberselect).

Testing

  • Backend: 3861 passed, 3 skipped
  • Frontend: 1216 passed
  • Infra: tsc clean, integration 24 passed

🤖 Generated with Claude Code

philmerrell and others added 3 commits June 13, 2026 14:16
Mantle is AWS's OpenAI-compatible inference surface for Bedrock-hosted
open-weight models (qwen, gpt-oss, gemma, deepseek, ...). It is a distinct
provider from `bedrock` because it speaks the OpenAI wire protocol with a
short-term bearer token rather than the Converse API with SigV4.

What this adds:
- apis/shared/bedrock/bearer_token.py: SigV4-presigned `CallWithBearerToken`
  bearer token (ported inline from aws-bedrock-token-generator, no new dep)
  plus region+path base-URL helper.
- GET /admin/mantle/models: browse the live regional Mantle roster via the
  OpenAI SDK (seeds the escape-hatch form's model-id suggestions).
- ModelProvider.MANTLE end to end: model_config translation, agent_factory
  builds a Strands OpenAIModel against the Mantle base URL, BaseAgent +
  inference chat pipeline thread the value through.
- mantle_endpoint_path on the managed-model shape, persisted and resolved
  server-authoritatively in _resolve_model_settings. Mantle serves different
  models on different paths (/v1 vs /openai/v1) and exposes NO API to discover
  which, so the path is recorded per model (from the model card) rather than
  probed or mapped. Carried through the paused-turn snapshot so a resumed
  Mantle turn rebuilds the same base URL.

Caching is intentionally left off for Mantle: prompt caching on Bedrock is
model-bound to Anthropic Claude + a few Amazon Nova models, none of which run
through the Mantle provider.

Requires `bedrock-mantle:*` IAM (separate infra commit) and Mantle being
enabled for the account/region.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Bedrock Mantle has its own IAM service namespace: `bedrock-mantle:*`, NOT
`bedrock:*`. Mirror the AWS-managed AmazonBedrockMantleInferenceAccess policy:

- AgentCore runtime role: `bedrock-mantle:CreateInference` + `Get*`/`List*`
  on `project/*`, plus `bedrock-mantle:CallWithBearerToken` (mantle-provider
  inference).
- App-API task role: read-only `Get*`/`List*` + `CallWithBearerToken` for the
  GET /admin/mantle/models browse endpoint.

The token signer is authorized against this namespace; without it inference
returns an IAM denial even when Mantle is enabled for the account. Integration
test asserts both roles carry CallWithBearerToken and the runtime carries
CreateInference.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Add a "Bedrock Mantle" tab to the admin model catalog, mirroring the Bedrock
tab: curated cards with vetted, human-reviewed settings (pricing, modalities,
context, and the all-important endpoint path baked in). No probing, no magic.

- CURATED_MANTLE_MODELS seeded with Qwen3 Coder 30B (/v1) and Gemma 4 31B
  (/openai/v1). Pricing verified against the AWS Bedrock pricing page (2026-06);
  modalities/capabilities/context/path verified against each model card
  (Gemma 4: text+image+video in, text out, reasoning + tool use + vision).
- Model shape gains `mantleEndpointPath`; the model form becomes Mantle-aware:
  a /v1 vs /openai/v1 selector (with model-card guidance), caching controls
  hidden (Mantle open-weight models don't cache), and a model-id datalist
  seeded from the live GET /admin/mantle/models roster for off-catalog adds.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@philmerrell philmerrell merged commit 02b81a8 into develop Jun 13, 2026
1 check passed
@philmerrell philmerrell deleted the feature/mantle-provider branch June 13, 2026 20:18