feat: add Amazon Bedrock Mantle as a model provider#479
Merged
Conversation
Mantle is AWS's OpenAI-compatible inference surface for Bedrock-hosted open-weight models (qwen, gpt-oss, gemma, deepseek, ...). It is a distinct provider from `bedrock` because it speaks the OpenAI wire protocol with a short-term bearer token rather than the Converse API with SigV4. What this adds: - apis/shared/bedrock/bearer_token.py: SigV4-presigned `CallWithBearerToken` bearer token (ported inline from aws-bedrock-token-generator, no new dep) plus region+path base-URL helper. - GET /admin/mantle/models: browse the live regional Mantle roster via the OpenAI SDK (seeds the escape-hatch form's model-id suggestions). - ModelProvider.MANTLE end to end: model_config translation, agent_factory builds a Strands OpenAIModel against the Mantle base URL, BaseAgent + inference chat pipeline thread the value through. - mantle_endpoint_path on the managed-model shape, persisted and resolved server-authoritatively in _resolve_model_settings. Mantle serves different models on different paths (/v1 vs /openai/v1) and exposes NO API to discover which, so the path is recorded per model (from the model card) rather than probed or mapped. Carried through the paused-turn snapshot so a resumed Mantle turn rebuilds the same base URL. Caching is intentionally left off for Mantle: prompt caching on Bedrock is model-bound to Anthropic Claude + a few Amazon Nova models, none of which run through the Mantle provider. Requires `bedrock-mantle:*` IAM (separate infra commit) and Mantle being enabled for the account/region. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Bedrock Mantle has its own IAM service namespace — `bedrock-mantle:*`, NOT `bedrock:*`. Mirror the AWS-managed AmazonBedrockMantleInferenceAccess policy: - AgentCore runtime role: `bedrock-mantle:CreateInference` + `Get*`/`List*` on `project/*`, plus `bedrock-mantle:CallWithBearerToken` (mantle-provider inference). - App-API task role: read-only `Get*`/`List*` + `CallWithBearerToken` for the GET /admin/mantle/models browse endpoint. The token signer is authorized against this namespace; without it inference returns an IAM denial even when Mantle is enabled for the account. Integration test asserts both roles carry CallWithBearerToken and the runtime carries CreateInference. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add a "Bedrock Mantle" tab to the admin model catalog, mirroring the Bedrock tab: curated cards with vetted, human-reviewed settings (pricing, modalities, context, and the all-important endpoint path baked in). No probing, no magic. - CURATED_MANTLE_MODELS seeded with Qwen3 Coder 30B (/v1) and Gemma 4 31B (/openai/v1). Pricing verified against the AWS Bedrock pricing page (2026-06); modalities/capabilities/context/path verified against each model card (Gemma 4: text+image+video in, text out, reasoning + tool use + vision). - Model shape gains `mantleEndpointPath`; the model form becomes Mantle-aware: a /v1 vs /openai/v1 selector (with model-card guidance), caching controls hidden (Mantle open-weight models don't cache), and a model-id datalist seeded from the live GET /admin/mantle/models roster for off-catalog adds. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Overview
Adds Amazon Bedrock Mantle as a first-class model provider. Mantle is AWS's OpenAI-compatible inference surface for Bedrock-hosted open-weight models (Qwen, GPT-OSS, Gemma, DeepSeek, …). It's distinct from the existing
bedrockprovider because it speaks the OpenAI wire protocol with a short-term bearer token, not the Converse API with SigV4.Admins can now browse Mantle models and add them via curated quick-add cards (with a manual escape-hatch form), and users can chat with them like any other managed model.
What's included
Backend (
feat(models))apis/shared/bedrock/bearer_token.py— SigV4-presignedCallWithBearerTokenbearer token, ported inline fromaws-bedrock-token-generator(no new dependency), plus a region+path base-URL helper.GET /admin/mantle/models— browse the live regional Mantle roster via the OpenAI SDK.ModelProvider.MANTLEthreaded end to end:model_configtranslation →agent_factorybuilds a StrandsOpenAIModelagainst the Mantle base URL →BaseAgent+ inference chat pipeline.mantle_endpoint_pathon the managed-model shape, persisted and resolved server-authoritatively in_resolve_model_settings, and carried through the paused-turn snapshot for resume correctness.Infra (
feat(infra))bedrock-mantle:*IAM namespace (mirrors AWS-managedAmazonBedrockMantleInferenceAccess):CreateInference+Get*/List*+CallWithBearerTokenon the runtime role; read-only +CallWithBearerTokenon the app-API task role.Frontend (
feat(spa))/v1, Gemma 4 31B on/openai/v1), a Mantle-aware model form (endpoint-path selector, caching hidden, model-id datalist seeded from the live roster).Key design decisions
/v1vs/openai/v1) and exposes no API to tell you which — verified against both the data plane (/v1/modelsreturns only{id, created, object, owned_by}) and the absence of abedrock-mantleboto3 control-plane client. We rejected both a maintained code-level mapping and runtime probing in favor of storing the path per model (sourced from the AWS model card, baked into curated cards, admin-settable in the escape hatch). The wrong path returns a misleading401 "not enabled for this account"at chat time, so this matters.bedrockprovider whereCacheConfigcaching already works.)Reviewer notes / deploy
platform.ymlmust deploy thebedrock-mantle:*IAM before inference works in cloud. Verified end-to-end locally againstus-west-2(token mint →/v1/models→ chat completions on both/v1and/openai/v1).reasoning_effortas a tunable for Gemma — needs a small shared-catalog fix (number→select).Testing
tscclean, integration 24 passed🤖 Generated with Claude Code