feat: add Amazon Bedrock Mantle as a model provider by philmerrell · Pull Request #479 · Boise-State-Development/agentcore-public-stack

philmerrell · 2026-06-13T20:17:00Z

Overview

Adds Amazon Bedrock Mantle as a first-class model provider. Mantle is AWS's OpenAI-compatible inference surface for Bedrock-hosted open-weight models (Qwen, GPT-OSS, Gemma, DeepSeek, …). It's distinct from the existing bedrock provider because it speaks the OpenAI wire protocol with a short-term bearer token, not the Converse API with SigV4.

Admins can now browse Mantle models and add them via curated quick-add cards (with a manual escape-hatch form), and users can chat with them like any other managed model.

What's included

Backend (feat(models))

apis/shared/bedrock/bearer_token.py — SigV4-presigned CallWithBearerToken bearer token, ported inline from aws-bedrock-token-generator (no new dependency), plus a region+path base-URL helper.
GET /admin/mantle/models — browse the live regional Mantle roster via the OpenAI SDK.
ModelProvider.MANTLE threaded end to end: model_config translation → agent_factory builds a Strands OpenAIModel against the Mantle base URL → BaseAgent + inference chat pipeline.
mantle_endpoint_path on the managed-model shape, persisted and resolved server-authoritatively in _resolve_model_settings, and carried through the paused-turn snapshot for resume correctness.
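
For illustration, the region+path base-URL helper could look roughly like the sketch below. This is not the code from bearer_token.py: the function name `mantle_base_url`, the constant names, and the regional host pattern `bedrock-mantle.{region}.api.aws` are all assumptions made for this example and should be checked against the AWS documentation. Only the two endpoint paths (/v1 and /openai/v1) come from this PR.

```python
from urllib.parse import urlunsplit

# Assumption: regional host pattern for Mantle; verify against the AWS docs.
MANTLE_HOST_TEMPLATE = "bedrock-mantle.{region}.api.aws"

# The two OpenAI-compatible paths named in this PR.
ALLOWED_ENDPOINT_PATHS = ("/v1", "/openai/v1")


def mantle_base_url(region: str, endpoint_path: str = "/v1") -> str:
    """Build the base URL an OpenAI-compatible client would be pointed at.

    Hypothetical helper: rejects unknown paths up front instead of letting
    a wrong path surface later as a confusing error at chat time.
    """
    if not region:
        raise ValueError("region is required")
    if endpoint_path not in ALLOWED_ENDPOINT_PATHS:
        raise ValueError(f"unsupported Mantle endpoint path: {endpoint_path!r}")
    host = MANTLE_HOST_TEMPLATE.format(region=region)
    return urlunsplit(("https", host, endpoint_path, "", ""))
```

Validating the path against a fixed allow-list keeps the stored per-model value from drifting into something the service never serves.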

Infra (feat(infra))

Grants the bedrock-mantle:* IAM namespace (mirrors AWS-managed AmazonBedrockMantleInferenceAccess): CreateInference + Get*/List* + CallWithBearerToken on the runtime role; read-only + CallWithBearerToken on the app-API task role.
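
As a rough sketch of the split between the two roles, the grants can be written out as plain IAM statement dictionaries. The real infra code is TypeScript and is not reproduced here; the function names, the `project_resource` parameter, and the use of `"*"` as the resource for CallWithBearerToken are assumptions for illustration. The action names are the ones listed in this PR.

```python
BEARER_ACTION = "bedrock-mantle:CallWithBearerToken"
READ_ONLY_ACTIONS = ["bedrock-mantle:Get*", "bedrock-mantle:List*"]


def runtime_role_statements(project_resource: str) -> list:
    """Runtime role: inference plus read access, and the bearer-token action."""
    return [
        {
            "Effect": "Allow",
            "Action": ["bedrock-mantle:CreateInference", *READ_ONLY_ACTIONS],
            "Resource": project_resource,  # caller supplies the project ARN pattern
        },
        # Assumption: the bearer-token action is granted on "*".
        {"Effect": "Allow", "Action": [BEARER_ACTION], "Resource": "*"},
    ]


def app_api_role_statements(project_resource: str) -> list:
    """App-API task role: read-only browsing plus the bearer-token action."""
    return [
        {"Effect": "Allow", "Action": list(READ_ONLY_ACTIONS), "Resource": project_resource},
        {"Effect": "Allow", "Action": [BEARER_ACTION], "Resource": "*"},
    ]


def granted_actions(statements: list) -> set:
    """Flatten the Allow statements into a set of action strings."""
    return {
        action
        for stmt in statements
        if stmt["Effect"] == "Allow"
        for action in stmt["Action"]
    }
```

The `granted_actions` helper mirrors the kind of check the integration test describes: both roles carry CallWithBearerToken, and only the runtime role carries CreateInference.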

Frontend (feat(spa))

A Bedrock Mantle catalog tab with curated cards (Qwen3 Coder 30B on /v1, Gemma 4 31B on /openai/v1), and a Mantle-aware model form (endpoint-path selector, caching controls hidden, model-id datalist seeded from the live roster).

Key design decisions

The endpoint path is recorded, not discovered. Mantle serves different models on different OpenAI-compatible paths (/v1 vs /openai/v1) and exposes no API to tell you which — verified against both the data plane (/v1/models returns only {id, created, object, owned_by}) and the absence of a bedrock-mantle boto3 control-plane client. We rejected both a maintained code-level mapping and runtime probing in favor of storing the path per model (sourced from the AWS model card, baked into curated cards, admin-settable in the escape hatch). The wrong path returns a misleading 401 "not enabled for this account" at chat time, so this matters.
Caching is intentionally off for Mantle. Prompt caching on Bedrock is model-bound to Anthropic Claude + a few Amazon Nova models, none of which run through the Mantle provider. (Claude/Nova continue to use the bedrock provider where CacheConfig caching already works.)
Curated-card data is verified: pricing against the AWS Bedrock pricing page (2026-06); modalities/capabilities/context/path against each model card.
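
The first two decisions above can be sketched as a small resolver: the stored endpoint path always wins over anything the client sends, and caching is forced off for Mantle. This is a minimal sketch, not the actual `_resolve_model_settings`; the dataclasses, the `"mantle"` provider string, and the parameter names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

ALLOWED_ENDPOINT_PATHS = ("/v1", "/openai/v1")


@dataclass(frozen=True)
class ManagedModel:
    """Hypothetical stored record for an admin-managed model."""
    model_id: str
    provider: str  # assumption: "mantle" stands in for ModelProvider.MANTLE
    mantle_endpoint_path: Optional[str] = None


@dataclass(frozen=True)
class ResolvedSettings:
    model_id: str
    provider: str
    mantle_endpoint_path: Optional[str]
    caching_enabled: bool


def resolve_model_settings(
    stored: ManagedModel,
    requested_path: Optional[str] = None,
    requested_caching: bool = False,
) -> ResolvedSettings:
    """Resolve settings from the stored record, ignoring client hints for Mantle."""
    if stored.provider != "mantle":
        # Other providers keep whatever caching behavior was requested.
        return ResolvedSettings(stored.model_id, stored.provider, None, requested_caching)
    # requested_path is deliberately unused: the recorded path is authoritative.
    path = stored.mantle_endpoint_path
    if path not in ALLOWED_ENDPOINT_PATHS:
        raise ValueError(f"model {stored.model_id!r} has no valid Mantle endpoint path")
    # Caching is always off for Mantle, whatever the request asked for.
    return ResolvedSettings(stored.model_id, "mantle", path, False)
```

Because the resolved settings are a frozen value, the same object could be copied into a paused-turn snapshot so that a resumed turn rebuilds the same base URL.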

Reviewer notes / deploy

Two preconditions for cloud: (1) Bedrock Mantle must be enabled for the account in the target region (an AWS-console action, not a deploy); (2) platform.yml must deploy the bedrock-mantle:* IAM before inference works in cloud. Verified end-to-end locally against us-west-2 (token mint → /v1/models → chat completions on both /v1 and /openai/v1).
Known Gemma-4 behavior (Chat Completions path): reasoning happens but its trace isn't returned (Responses-API only), and parallel tool calls aren't supported (one per turn).
Follow-up (not in this PR): expose reasoning_effort as a tunable for Gemma — needs a small shared-catalog fix (number → select).

Testing

Backend: 3861 passed, 3 skipped
Frontend: 1216 passed
Infra: tsc clean, integration 24 passed

🤖 Generated with Claude Code

Mantle is AWS's OpenAI-compatible inference surface for Bedrock-hosted open-weight models (qwen, gpt-oss, gemma, deepseek, ...). It is a distinct provider from `bedrock` because it speaks the OpenAI wire protocol with a short-term bearer token rather than the Converse API with SigV4.

What this adds:

- apis/shared/bedrock/bearer_token.py: SigV4-presigned `CallWithBearerToken` bearer token (ported inline from aws-bedrock-token-generator, no new dep) plus region+path base-URL helper.
- GET /admin/mantle/models: browse the live regional Mantle roster via the OpenAI SDK (seeds the escape-hatch form's model-id suggestions).
- ModelProvider.MANTLE end to end: model_config translation, agent_factory builds a Strands OpenAIModel against the Mantle base URL, BaseAgent + inference chat pipeline thread the value through.
- mantle_endpoint_path on the managed-model shape, persisted and resolved server-authoritatively in _resolve_model_settings. Mantle serves different models on different paths (/v1 vs /openai/v1) and exposes NO API to discover which, so the path is recorded per model (from the model card) rather than probed or mapped. Carried through the paused-turn snapshot so a resumed Mantle turn rebuilds the same base URL.

Caching is intentionally left off for Mantle: prompt caching on Bedrock is model-bound to Anthropic Claude + a few Amazon Nova models, none of which run through the Mantle provider.

Requires `bedrock-mantle:*` IAM (separate infra commit) and Mantle being enabled for the account/region.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Bedrock Mantle has its own IAM service namespace: `bedrock-mantle:*`, NOT `bedrock:*`. Mirror the AWS-managed AmazonBedrockMantleInferenceAccess policy:

- AgentCore runtime role: `bedrock-mantle:CreateInference` + `Get*`/`List*` on `project/*`, plus `bedrock-mantle:CallWithBearerToken` (mantle-provider inference).
- App-API task role: read-only `Get*`/`List*` + `CallWithBearerToken` for the GET /admin/mantle/models browse endpoint.

The token signer is authorized against this namespace; without it inference returns an IAM denial even when Mantle is enabled for the account.

Integration test asserts both roles carry CallWithBearerToken and the runtime carries CreateInference.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Add a "Bedrock Mantle" tab to the admin model catalog, mirroring the Bedrock tab: curated cards with vetted, human-reviewed settings (pricing, modalities, context, and the all-important endpoint path baked in). No probing, no magic.

- CURATED_MANTLE_MODELS seeded with Qwen3 Coder 30B (/v1) and Gemma 4 31B (/openai/v1). Pricing verified against the AWS Bedrock pricing page (2026-06); modalities/capabilities/context/path verified against each model card (Gemma 4: text+image+video in, text out, reasoning + tool use + vision).
- Model shape gains `mantleEndpointPath`; the model form becomes Mantle-aware: a /v1 vs /openai/v1 selector (with model-card guidance), caching controls hidden (Mantle open-weight models don't cache), and a model-id datalist seeded from the live GET /admin/mantle/models roster for off-catalog adds.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

philmerrell and others added 3 commits June 13, 2026 14:16

philmerrell merged commit 02b81a8 into develop Jun 13, 2026
1 check passed

philmerrell deleted the feature/mantle-provider branch June 13, 2026 20:18
