Model and Provider Configuration (models.yml)
This document describes how the coding-agent currently loads models, applies overrides, resolves credentials, and chooses models at runtime.
What controls model behavior
Primary implementation files:
- `src/config/model-registry.ts` — loads built-in + custom models, provider overrides, runtime discovery, auth integration
- `src/config/model-resolver.ts` — parses model patterns and selects initial/smol/slow models
- `src/config/settings-schema.ts` — model-related settings (modelRoles, provider transport preferences)
- `src/session/auth-storage.ts` — API key + OAuth resolution order
- `packages/ai/src/models.ts` and `packages/ai/src/types.ts` — built-in providers/models and `Model`/`compat` types
Config file location and legacy behavior
Default config path:
~/.pisces/agent/models.yml
Legacy behavior still present:
- If `models.yml` is missing and `models.json` exists at the same location, it is migrated to `models.yml`.
- Explicit `.json`/`.jsonc` config paths are still supported when passed programmatically to `ModelRegistry`.
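The migration step can be sketched as follows. The function name and the direct file copy are illustrative assumptions, not the actual ModelRegistry code (JSON is a subset of YAML, so copying the contents verbatim already yields a valid `models.yml`):

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical sketch of the legacy migration: if models.yml is missing
// but models.json exists beside it, promote the JSON file to models.yml.
export function migrateLegacyConfig(configDir: string): string {
  const yml = path.join(configDir, "models.yml");
  const json = path.join(configDir, "models.json");
  if (!fs.existsSync(yml) && fs.existsSync(json)) {
    // JSON is valid YAML, so a verbatim copy is a legal models.yml.
    fs.copyFileSync(json, yml);
  }
  return yml;
}
```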
models.yml shape
```yaml
providers:
  <provider-id>:
    # provider-level config
```
`provider-id` is the canonical provider key used across selection and auth lookup.
Provider-level fields
```yaml
providers:
  my-provider:
    baseUrl: https://api.example.com/v1
    apiKey: MY_PROVIDER_API_KEY
    api: openai-completions
    headers:
      X-Team: platform
    authHeader: true
    auth: apiKey
    discovery:
      type: ollama
    modelOverrides:
      some-model-id:
        name: Renamed model
    models:
      - id: some-model-id
        name: Some Model
        api: openai-completions
        reasoning: false
        input: [text]
        cost:
          input: 0
          output: 0
          cacheRead: 0
          cacheWrite: 0
        contextWindow: 128000
        maxTokens: 16384
        headers:
          X-Model: value
        compat:
          supportsStore: true
          supportsDeveloperRole: true
          supportsReasoningEffort: true
          maxTokensField: max_completion_tokens
          openRouterRouting:
            only: [anthropic]
          vercelGatewayRouting:
            order: [anthropic, openai]
          extraBody:
            gateway: m1-01
        controller: mlx
```
Allowed provider/model api values
- `openai-completions`
- `openai-responses`
- `openai-codex-responses`
- `azure-openai-responses`
- `anthropic-messages`
- `google-generative-ai`
- `google-vertex`
Allowed auth/discovery values
- `auth`: `apiKey` (default) or `none`
- `discovery.type`: `ollama`
Validation rules (current)
Full custom provider (models is non-empty)
Required:
- `baseUrl`
- `apiKey` unless `auth: none`
- `api` at provider level or on each model
Override-only provider (models missing or empty)
Must define at least one of:
- `baseUrl`
- `modelOverrides`
- `discovery`
Discovery
`discovery` requires a provider-level `api`.
Model value checks
- `id` is required
- `contextWindow` and `maxTokens` must be positive if provided
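The validation rules above might be sketched like this. The function and the `ProviderConfig` shape are illustrative assumptions, not the actual `model-registry.ts` types:

```typescript
// Hedged sketch of the validation rules; names are illustrative.
interface ProviderConfig {
  baseUrl?: string;
  apiKey?: string;
  api?: string;
  auth?: "apiKey" | "none";
  discovery?: { type: "ollama" };
  modelOverrides?: Record<string, unknown>;
  models?: Array<{ id?: string; api?: string; contextWindow?: number; maxTokens?: number }>;
}

function validateProvider(id: string, p: ProviderConfig): string[] {
  const errors: string[] = [];
  const models = p.models ?? [];
  if (models.length > 0) {
    // Full custom provider: models is non-empty.
    if (!p.baseUrl) errors.push(`${id}: baseUrl is required`);
    if (!p.apiKey && p.auth !== "none") errors.push(`${id}: apiKey is required unless auth: none`);
    if (!p.api && models.some((m) => !m.api)) errors.push(`${id}: api required at provider level or on each model`);
  } else if (!p.baseUrl && !p.modelOverrides && !p.discovery) {
    // Override-only provider: must define at least one of these.
    errors.push(`${id}: must define baseUrl, modelOverrides, or discovery`);
  }
  if (p.discovery && !p.api) errors.push(`${id}: discovery requires provider-level api`);
  for (const m of models) {
    if (!m.id) errors.push(`${id}: model id is required`);
    if (m.contextWindow !== undefined && m.contextWindow <= 0) errors.push(`${id}/${m.id}: contextWindow must be positive`);
    if (m.maxTokens !== undefined && m.maxTokens <= 0) errors.push(`${id}/${m.id}: maxTokens must be positive`);
  }
  return errors;
}
```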
Merge and override order
ModelRegistry pipeline (on refresh):
1. Load built-in providers/models from `@oh-my-pi/pi-ai`.
2. Load `models.yml` custom config.
3. Apply provider overrides (`baseUrl`, `headers`) to built-in models.
4. Apply `modelOverrides` (per provider + model id).
5. Merge custom `models:`
   - same `provider + id` replaces the existing entry
   - otherwise append
6. Apply runtime-discovered models (currently Ollama, llama.cpp, and LM Studio), then re-apply model overrides.
Provider defaults vs per-model overrides:
- Provider `headers` are the baseline.
- Model `headers` override provider header keys.
- `modelOverrides` can override model metadata (`name`, `reasoning`, `input`, `cost`, `contextWindow`, `maxTokens`, `headers`, `compat`, `contextPromotionTarget`).
- `compat` is deep-merged for nested routing blocks (`openRouterRouting`, `vercelGatewayRouting`, `extraBody`).
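A minimal sketch of these merge semantics, assuming simplified types (the real logic lives in src/config/model-registry.ts):

```typescript
type HeaderMap = Record<string, string>;
interface Compat {
  openRouterRouting?: Record<string, unknown>;
  vercelGatewayRouting?: Record<string, unknown>;
  extraBody?: Record<string, unknown>;
  [k: string]: unknown;
}

// Provider headers are the baseline; model headers win per key.
function mergeHeaders(provider: HeaderMap = {}, model: HeaderMap = {}): HeaderMap {
  return { ...provider, ...model };
}

// Top-level compat fields are shallow-overridden, but the nested routing
// blocks are deep-merged one level down.
function mergeCompat(base: Compat = {}, override: Compat = {}): Compat {
  const out: Compat = { ...base, ...override };
  for (const key of ["openRouterRouting", "vercelGatewayRouting", "extraBody"] as const) {
    const b = base[key];
    const o = override[key];
    if (b && o) out[key] = { ...b, ...o };
  }
  return out;
}
```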
Runtime discovery integration
Implicit Ollama discovery
If ollama is not explicitly configured, the registry adds an implicit discoverable provider:
- provider: `ollama`
- api: `openai-completions`
- base URL: `OLLAMA_BASE_URL` or `http://127.0.0.1:11434`
- auth mode: keyless (`auth: none` behavior)
Runtime discovery calls `GET /api/tags` on Ollama and synthesizes model entries with local defaults.
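A hedged sketch of how discovered tags might be turned into registry entries. The response shape beyond `models[].name` and the local-default numbers below are assumptions, not the registry's exact values:

```typescript
// Minimal shape of the Ollama /api/tags response we rely on.
interface OllamaTagsResponse {
  models: Array<{ name: string }>;
}

// Synthesize registry-style entries from a tags payload (illustrative).
function synthesizeOllamaModels(tags: OllamaTagsResponse, baseUrl: string) {
  return tags.models.map((m) => ({
    provider: "ollama",
    id: m.name,
    api: "openai-completions",
    baseUrl,
    // Local-default placeholders; the real registry picks its own values.
    contextWindow: 8192,
    maxTokens: 4096,
    cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
  }));
}
```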
Implicit llama.cpp discovery
If llama.cpp is not explicitly configured, the registry adds an implicit discoverable provider. Note: it uses the `openai-responses` API instead of `openai-completions`.
- provider: `llama.cpp`
- api: `openai-responses`
- base URL: `LLAMA_CPP_BASE_URL` or `http://127.0.0.1:8080`
- auth mode: keyless (`auth: none` behavior)
Runtime discovery calls `GET /models` on llama.cpp and synthesizes model entries with local defaults.
Implicit LM Studio discovery
If lm-studio is not explicitly configured, the registry adds an implicit discoverable provider:
- provider: `lm-studio`
- api: `openai-completions`
- base URL: `LM_STUDIO_BASE_URL` or `http://127.0.0.1:1234/v1`
- auth mode: keyless (`auth: none` behavior)
Runtime discovery fetches models (GET /models) and synthesizes model entries with local defaults.
Explicit provider discovery
You can configure discovery yourself:
```yaml
providers:
  ollama:
    baseUrl: http://127.0.0.1:11434
    api: openai-completions
    auth: none
    discovery:
      type: ollama
  llama.cpp:
    baseUrl: http://127.0.0.1:8080
    api: openai-responses
    auth: none
    discovery:
      type: llama.cpp
```
Extension provider registration
Extensions can register providers at runtime (pi.registerProvider(...)), including:
- model replacement/append for a provider
- custom stream handler registration for new API IDs
- custom OAuth provider registration
Auth and API key resolution order
When requesting a key for a provider, effective order is:
1. Runtime override (CLI `--api-key`)
2. Stored API key credential in `agent.db`
3. Stored OAuth credential in `agent.db` (with refresh)
4. Environment variable mapping (`OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, etc.)
5. ModelRegistry fallback resolver (provider `apiKey` from `models.yml`, env-name-or-literal semantics)
models.yml apiKey behavior:
- Value is first treated as an environment variable name.
- If no env var exists, the literal string is used as the token.
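A minimal sketch of this env-name-or-literal resolution (the function name is illustrative):

```typescript
// Resolve a models.yml apiKey value: treat it as an env var name first,
// and fall back to using the literal string as the token.
function resolveApiKeyValue(
  value: string,
  env: Record<string, string | undefined> = process.env,
): string {
  return env[value] ?? value;
}
```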
If `authHeader: true` and a provider `apiKey` is set, models get an `Authorization: Bearer <resolved-key>` header injected.
Keyless providers:
- Providers marked `auth: none` are treated as available without credentials.
- `getApiKey*` returns `kNoAuth` for them.
Model availability vs all models
- `getAll()` returns the loaded model registry (built-in + merged custom + discovered).
- `getAvailable()` filters to models that are keyless or have resolvable auth.
So a model can exist in registry but not be selectable until auth is available.
Runtime model resolution
CLI and pattern parsing
model-resolver.ts supports:
- exact `provider/modelId`
- exact model id (provider inferred)
- fuzzy/substring matching
- glob scope patterns in `--models` (e.g. `openai/*`, `*sonnet*`)
- an optional `:thinkingLevel` suffix (`off|minimal|low|medium|high|xhigh`)
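The pattern grammar above can be sketched as follows; helper names are hypothetical and the glob handling is simplified to `*` wildcards:

```typescript
const THINKING_LEVELS = ["off", "minimal", "low", "medium", "high", "xhigh"] as const;
type ThinkingLevel = (typeof THINKING_LEVELS)[number];

// Split an optional :thinkingLevel suffix off a model pattern. A trailing
// ":..." that is not a known level (e.g. an ollama tag like ":7b") is kept.
function parseModelPattern(raw: string): { pattern: string; thinking?: ThinkingLevel } {
  const idx = raw.lastIndexOf(":");
  if (idx > 0) {
    const suffix = raw.slice(idx + 1);
    if ((THINKING_LEVELS as readonly string[]).includes(suffix)) {
      return { pattern: raw.slice(0, idx), thinking: suffix as ThinkingLevel };
    }
  }
  return { pattern: raw };
}

// Glob scope match: "*" matches any run of characters.
function matchesScope(pattern: string, qualifiedId: string): boolean {
  const body = pattern
    .split("*")
    .map((s) => s.replace(/[.+?^${}()|[\]\\]/g, "\\$&"))
    .join(".*");
  return new RegExp("^" + body + "$").test(qualifiedId);
}
```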
--provider is legacy; --model is preferred.
Initial model selection priority
findInitialModel(...) uses this order:
- explicit CLI provider+model
- first scoped model (if not resuming)
- saved default provider/model
- known provider defaults (e.g. OpenAI/Anthropic/etc.) among available models
- first available model
Role aliases and settings
Supported model roles:
`default`, `smol`, `slow`, `plan`, `commit`
Role aliases like pi/smol expand through settings.modelRoles. Each role value can also append a thinking selector such as :minimal, :low, :medium, or :high.
If a role points at another role, the target model still inherits normally and any explicit suffix on the referring role wins for that role-specific use.
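A hedged sketch of this role expansion, assuming a simple recursive lookup (the split on `:` is simplified and would mis-handle model ids that themselves contain a colon):

```typescript
// Expand a role alias through a modelRoles map. Role values may chain to
// other roles and carry a :thinking suffix; the explicit suffix on the
// referring (outer) role wins for that role-specific use.
function expandRole(
  role: string,
  modelRoles: Record<string, string>,
  depth = 0,
): { model: string; thinking?: string } {
  if (depth > 8) throw new Error(`role alias cycle at ${role}`);
  const value = modelRoles[role];
  if (!value) return { model: role }; // not an alias: treat as a model reference
  const [target, thinking] = value.split(":");
  const inner = expandRole(target, modelRoles, depth + 1);
  return { model: inner.model, thinking: thinking ?? inner.thinking };
}
```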
Related settings:
- `modelRoles` (record)
- `enabledModels` (scoped pattern list)
- `providers.kimiApiFormat` (`openai` or `anthropic` request format)
- `providers.openaiWebsockets` (`auto|off|on` websocket preference for the OpenAI Codex transport)
Context promotion (model-level fallback chains)
Context promotion is an overflow recovery mechanism for small-context variants (for example *-spark) that automatically promotes to a larger-context sibling when the API rejects a request with a context length error.
Trigger and order
When a turn fails with a context overflow error (e.g. context_length_exceeded), AgentSession attempts promotion before falling back to compaction:
1. If `contextPromotion.enabled` is true, resolve a promotion target (see below).
2. If a target is found, switch to it and retry the request (no compaction needed).
3. If no target is available, fall through to auto-compaction on the current model.
Target selection
Selection is model-driven, not role-driven:
- `currentModel.contextPromotionTarget` (if configured)
- otherwise, the smallest larger-context model on the same provider + API
Candidates are ignored unless credentials resolve (ModelRegistry.getApiKey(...)).
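Target selection might look like this sketch; the `ModelInfo` shape is assumed and the credential filtering via `getApiKey` is elided:

```typescript
interface ModelInfo {
  provider: string;
  id: string;
  api: string;
  contextWindow: number;
  contextPromotionTarget?: string;
}

// Model-driven promotion target selection (illustrative sketch).
function findPromotionTarget(current: ModelInfo, all: ModelInfo[]): ModelInfo | undefined {
  // 1. An explicit per-model target wins; it may be "provider/model-id"
  //    or a bare model id resolved within the current provider.
  if (current.contextPromotionTarget) {
    const [prov, id] = current.contextPromotionTarget.includes("/")
      ? current.contextPromotionTarget.split("/", 2)
      : [current.provider, current.contextPromotionTarget];
    const explicit = all.find((m) => m.provider === prov && m.id === id);
    if (explicit) return explicit;
  }
  // 2. Otherwise: the smallest larger-context model on the same provider + API.
  return all
    .filter((m) => m.provider === current.provider && m.api === current.api && m.contextWindow > current.contextWindow)
    .sort((a, b) => a.contextWindow - b.contextWindow)[0];
}
```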
OpenAI Codex websocket handoff
If switching from or to `openai-codex-responses`, the session provider state under the key `openai-codex-responses` is closed before the model switch. This drops websocket transport state so the next turn starts clean on the promoted model.
Persistence behavior
Promotion uses temporary switching (setModelTemporary):
- recorded as a temporary `model_change` in session history
- does not rewrite the saved role mapping
Configuring explicit fallback chains
Configure fallback directly in model metadata via contextPromotionTarget.
contextPromotionTarget accepts either:
- `provider/model-id` (explicit)
- `model-id` (resolved within the current provider)
Example (models.yml) for Spark -> non-Spark on the same provider:
```yaml
providers:
  openai-codex:
    modelOverrides:
      gpt-5.3-codex-spark:
        contextPromotionTarget: openai-codex/gpt-5.3-codex
```
The built-in model generator also assigns this automatically for `*-spark` models when a same-provider base model exists.
Compatibility and routing fields
models.yml supports this compat subset:
- `supportsStore`
- `supportsDeveloperRole`
- `supportsReasoningEffort`
- `maxTokensField` (`max_completion_tokens` or `max_tokens`)
- `openRouterRouting.only` / `openRouterRouting.order`
- `vercelGatewayRouting.only` / `vercelGatewayRouting.order`
These are consumed by the OpenAI-completions transport logic and combined with URL-based auto-detection.
Practical examples
Local OpenAI-compatible endpoint (no auth)
```yaml
providers:
  local-openai:
    baseUrl: http://127.0.0.1:8000/v1
    auth: none
    api: openai-completions
    models:
      - id: Qwen/Qwen2.5-Coder-32B-Instruct
        name: Qwen 2.5 Coder 32B (local)
```
Hosted proxy with env-based key
```yaml
providers:
  anthropic-proxy:
    baseUrl: https://proxy.example.com/anthropic
    apiKey: ANTHROPIC_PROXY_API_KEY
    api: anthropic-messages
    authHeader: true
    models:
      - id: claude-sonnet-4-20250514
        name: Claude Sonnet 4 (Proxy)
        reasoning: true
        input: [text, image]
```
Override built-in provider route + model metadata
```yaml
providers:
  openrouter:
    baseUrl: https://my-proxy.example.com/v1
    headers:
      X-Team: platform
    modelOverrides:
      anthropic/claude-sonnet-4:
        name: Sonnet 4 (Corp)
        compat:
          openRouterRouting:
            only: [anthropic]
```
Legacy consumer caveat
Most model configuration now flows through models.yml via ModelRegistry.
One notable legacy path remains: web-search Anthropic auth resolution still reads ~/.pisces/agent/models.json directly in src/web/search/auth.ts.
If you rely on that specific path, keep JSON compatibility in mind until that module is migrated.
Failure mode
If models.yml fails schema or validation checks:
- the registry keeps operating with built-in models
- the error is exposed via `ModelRegistry.getError()` and surfaced in UI/notifications