ADR 0034 · Bedrock + Vertex providers
- Status: accepted
- Date: 2026-05-24
- Author: john.ford2002@gmail.com
- Spec: docs/superpowers/specs/2026-05-24-bedrock-vertex-providers-design.md
Context
caliban-provider-anthropic already contains feature-gated
BedrockTransport and VertexTransport implementations (the bedrock and
vertex Cargo features), and the workspace already declares
aws-config, aws-sdk-bedrockruntime, aws-smithy-types, and
gcp_auth as dependencies in anticipation of this work. What's
missing are the top-level Provider-implementing crates that expose
these transports as first-class providers, each with its own name(),
its own list_models (which requires control-plane APIs the
Anthropic crate has no business knowing about), and its own auth
refresh policy. Parity with Claude Code's --bedrock / --vertex
flags requires both crates.
Decision
Two new crates, both thin wrappers around the existing transports
caliban-provider-bedrock and caliban-provider-vertex each contain
~300 lines of glue:
- A Provider-implementing struct wrapping AnthropicProvider<BedrockTransport> or AnthropicProvider<VertexTransport>.
- A *Config struct + from_env / from_config constructors.
- An AuthRefresh background task.
- A list_models that hits the relevant control-plane API (bedrock:ListInferenceProfiles / publishers/anthropic/models), caches the result for the session, and falls back to a vendored list on failure.
- A name() returning "bedrock" / "vertex" so the model router and telemetry attribute these correctly.
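The wrapper shape listed above can be sketched roughly as follows. This is a minimal illustration, not the crates' actual code: the Provider and Transport traits, the complete method, and all struct bodies are simplified hypothetical stand-ins. Only the wrapping relationship and the name() override come from this ADR.

```rust
// Hypothetical stand-ins for workspace types; the real Provider trait and
// AnthropicProvider signatures are not shown in this ADR and may differ.
trait Transport {
    fn platform(&self) -> &'static str;
}

struct BedrockTransport;

impl Transport for BedrockTransport {
    fn platform(&self) -> &'static str {
        "bedrock"
    }
}

trait Provider {
    fn name(&self) -> &'static str;
    fn complete(&self, prompt: &str) -> String;
}

// Generic Anthropic provider: owns message shaping, delegates I/O to T.
struct AnthropicProvider<T: Transport> {
    transport: T,
}

impl<T: Transport> Provider for AnthropicProvider<T> {
    fn name(&self) -> &'static str {
        "anthropic"
    }

    fn complete(&self, prompt: &str) -> String {
        // Placeholder for the real request path through the transport.
        format!("[{}] {}", self.transport.platform(), prompt)
    }
}

// The thin wrapper: reports its own name and forwards everything else.
struct BedrockProvider {
    inner: AnthropicProvider<BedrockTransport>,
}

impl BedrockProvider {
    fn new() -> Self {
        BedrockProvider {
            inner: AnthropicProvider {
                transport: BedrockTransport,
            },
        }
    }
}

impl Provider for BedrockProvider {
    fn name(&self) -> &'static str {
        "bedrock"
    }

    fn complete(&self, prompt: &str) -> String {
        self.inner.complete(prompt)
    }
}
```

With this shape, the router and telemetry see "bedrock" from name(), while request handling remains entirely inside the wrapped Anthropic provider.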
We do not extend caliban-provider-anthropic to expose Bedrock /
Vertex as alternate constructors because (a) it would force the
Anthropic crate to depend on aws-sdk-bedrock (control plane) and
gain its own non-trivial auth code, and (b) operators have a real
mental-model expectation that provider = "bedrock" and
provider = "anthropic" are separate provider entries.
Auth refresh is a per-provider tokio task with a 5-minute default
Both crates spawn one background task on construction that calls
provider.get_token() (via aws-config's ProvideCredentials or
gcp_auth's TokenProvider) on a configurable interval. Settings
fields aws_auth_refresh and gcp_auth_refresh (and env
CALIBAN_AWS_AUTH_REFRESH / CALIBAN_GCP_AUTH_REFRESH) control the
interval; default 5m; 0 disables proactive refresh and relies on
inline 401 recovery only. Refresh failures back off exponentially up
to the configured interval and surface as tracing::warn! until they
succeed; the cached token continues to be served until it expires.
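The interval and backoff rules above might be expressed as two small pure functions. This is a sketch under stated assumptions: the accepted value syntax (a whole number followed by s or m, or a bare 0), the function names, and the one-second backoff base used in the usage note are all hypothetical. The ADR only fixes the 5m default, the meaning of 0, and the cap at the configured interval.

```rust
use std::time::Duration;

// Default proactive refresh interval from this ADR: 5 minutes.
const DEFAULT_REFRESH: Duration = Duration::from_secs(5 * 60);

/// Parses a refresh setting. `None` (unset) yields the default.
/// Returns `Ok(None)` when proactive refresh is disabled with a zero value.
/// Assumed syntax: "<n>s", "<n>m", or "0".
fn parse_refresh(raw: Option<&str>) -> Result<Option<Duration>, String> {
    let raw = match raw {
        None => return Ok(Some(DEFAULT_REFRESH)),
        Some(s) => s.trim(),
    };
    if raw == "0" {
        return Ok(None);
    }
    let (digits, factor) = if let Some(d) = raw.strip_suffix('m') {
        (d, 60u64)
    } else if let Some(d) = raw.strip_suffix('s') {
        (d, 1u64)
    } else {
        return Err(format!("unsupported refresh interval: {}", raw));
    };
    let n: u64 = digits
        .parse()
        .map_err(|_| format!("unsupported refresh interval: {}", raw))?;
    if n == 0 {
        return Ok(None);
    }
    Ok(Some(Duration::from_secs(n.saturating_mul(factor))))
}

/// Delay before the next attempt after `failures` consecutive failures:
/// doubles from `base` on each failure and is capped at `interval`.
fn retry_delay(failures: u32, base: Duration, interval: Duration) -> Duration {
    let shift = failures.saturating_sub(1).min(16);
    base.saturating_mul(1u32 << shift).min(interval)
}
```

For example, with a one-second base and the default interval, the fourth consecutive failure waits 8 seconds, and the delay stops growing once it reaches 5 minutes.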
Model-id canonicalization stays in caliban-provider-anthropic
Transport::wire_model_id already lives in the Anthropic crate. The
new provider crates expose a small per-base-model release-date table
(e.g. ("claude-opus-4-7", "20260423")) consumed by the transport's
wire_model_id. The caliban canonical model name (claude-opus-4-7)
remains the same across Anthropic / Bedrock / Vertex — only the wire
form differs.
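To make the canonical-to-wire split concrete, here is a hedged sketch of how a release-date table could feed the mapping. The table entry repeats the example from the text. The Platform enum, the function names, and especially the wire-id formats are assumptions for illustration; the real formats are whatever Transport::wire_model_id produces.

```rust
/// Per-base-model release dates; the single entry mirrors the example above.
const RELEASE_DATES: &[(&str, &str)] = &[("claude-opus-4-7", "20260423")];

/// Hypothetical platform selector for this sketch.
enum Platform {
    Anthropic,
    Bedrock,
    Vertex,
}

/// Looks up the release date for a canonical model name.
fn release_date(canonical: &str) -> Option<&'static str> {
    RELEASE_DATES
        .iter()
        .find(|(name, _)| *name == canonical)
        .map(|(_, date)| *date)
}

/// Maps a canonical name to a platform wire id.
/// The format strings below are assumed shapes, not confirmed by this ADR.
fn wire_model_id(platform: Platform, canonical: &str) -> Option<String> {
    match platform {
        // Assumption: the direct API accepts the canonical name as-is.
        Platform::Anthropic => Some(canonical.to_string()),
        Platform::Bedrock => release_date(canonical)
            .map(|date| format!("anthropic.{}-{}-v1:0", canonical, date)),
        Platform::Vertex => release_date(canonical)
            .map(|date| format!("{}@{}", canonical, date)),
    }
}
```

Returning None for a model missing from the table keeps a stale table visible as an error at the call site rather than producing a malformed wire id.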
Capabilities mirror direct Anthropic per base model
The hyperscalers serve the same Anthropic models with the same context
windows, vision support, and tool-use semantics. Until a real
discrepancy emerges (e.g. some regions lacking prompt caching), both
crates' capabilities() strip the platform suffix and delegate to
caliban_provider_anthropic::models::capabilities_for. Any future
regional / platform restriction is added as a small subtraction layer
on top — not by forking the capabilities table.
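The "delegate, then subtract" idea can be illustrated with a small sketch. Everything here is hypothetical scaffolding: the Capabilities fields, the placeholder values, the local capabilities_for stand-in (the real one lives in caliban_provider_anthropic::models), the Restrictions struct, and the assumption that the platform suffix is whatever follows an '@'.

```rust
#[derive(Debug, Clone, PartialEq)]
struct Capabilities {
    context_window: u32,
    vision: bool,
    tool_use: bool,
    prompt_caching: bool,
}

// Stand-in for caliban_provider_anthropic::models::capabilities_for.
// The values are placeholders, not real model data.
fn capabilities_for(base_model: &str) -> Option<Capabilities> {
    match base_model {
        "claude-opus-4-7" => Some(Capabilities {
            context_window: 200_000,
            vision: true,
            tool_use: true,
            prompt_caching: true,
        }),
        _ => None,
    }
}

/// Hypothetical subtraction layer: features a platform or region lacks.
struct Restrictions {
    no_prompt_caching: bool,
}

/// Strips the (assumed) '@' platform suffix, delegates to the shared
/// table, then removes whatever the platform cannot offer.
fn platform_capabilities(model_id: &str, restrictions: &Restrictions) -> Option<Capabilities> {
    let base = model_id.split('@').next().unwrap_or(model_id);
    let mut caps = capabilities_for(base)?;
    if restrictions.no_prompt_caching {
        caps.prompt_caching = false;
    }
    Some(caps)
}
```

Because restrictions only ever turn features off, the shared table stays the single source of truth and a platform can never advertise more than the base model supports.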
list_models is on-demand + per-session-cached, with fallback
We resist the temptation to call list_inference_profiles at provider
startup because (a) startup latency is precious and (b) operators with
read-restricted IAM principals shouldn't fail startup just because
they can't introspect. Both crates call the control-plane API the
first time list_models is invoked, cache the result in a
tokio::sync::OnceCell, and fall back to a vendored list of
well-known models if the API call fails.
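The on-demand, cache-once, fall-back behavior can be sketched without any cloud SDK. Note the substitutions: this uses the synchronous std::sync::OnceLock in place of tokio::sync::OnceCell so the example stays dependency-free, and the ModelLister type, the fetch closure, and the vendored list contents are hypothetical. It also caches the fallback for the session, which is one possible reading of the text rather than a confirmed choice.

```rust
use std::sync::OnceLock;

// Hypothetical vendored list of well-known models.
const VENDORED_MODELS: &[&str] = &["claude-opus-4-7"];

/// Wraps a control-plane call so it runs at most once per session.
struct ModelLister<F: Fn() -> Result<Vec<String>, String>> {
    fetch: F,
    cache: OnceLock<Vec<String>>,
}

impl<F: Fn() -> Result<Vec<String>, String>> ModelLister<F> {
    /// Construction does no I/O, so startup never depends on IAM access.
    fn new(fetch: F) -> Self {
        ModelLister {
            fetch: fetch,
            cache: OnceLock::new(),
        }
    }

    /// The first call invokes `fetch`; later calls return the cached list.
    /// A failed fetch falls back to the vendored list.
    fn list_models(&self) -> &[String] {
        self.cache.get_or_init(|| {
            (self.fetch)().unwrap_or_else(|_| {
                VENDORED_MODELS.iter().map(|m| m.to_string()).collect()
            })
        })
    }
}
```

A principal that cannot introspect therefore still gets a usable model list, and pays for the failed control-plane call only once.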
Request metadata flows through unchanged
RequestMetadata.purpose, user_id, and any future fields pass
through both crates untouched, into the transport and from there into
the wire body. The provider crates own auth + endpoint + list_models,
not request shape.
Consequences
- Positive: Closes two 🔴 rows under I. Model router & providers (Bedrock, Vertex). Enables operators in regulated industries (financial services, healthcare, gov) to use caliban with their contractual cloud provider. Composes cleanly with caliban-model-router so the same operator can route Sonnet via Bedrock for compliance and Haiku via direct Anthropic for cost. Reuses the Anthropic IR adapter so the message-shape correctness surface stays single-sourced.
- Negative: Adds two new crates to the workspace; the aws-* dependency tree is heavy (~30 transitive crates, mostly hyper/tower stack). Bedrock model-id rotation (Anthropic occasionally re-dates Bedrock models without changing direct-API names) requires per-base-model date-table maintenance. Two new mock-based test surfaces to maintain.
- Revisit if: AWS or GCP changes the canonical wire format significantly (e.g. Bedrock unifies under inference-profile ARNs exclusively), in which case the canonical→wire mapping simplifies. If caliban-provider-anthropic's embedded bedrock / vertex features turn out to be confusing duplicate paths, deprecate those feature flags in favor of the new crates and route all hyperscaler-served Anthropic through here.