ADR-001: Anthropic-only LLM stack with OpenAI carve-outs
Status: CURRENT Date: 2026-05-16
Context
A multi-provider LLM gateway (Portkey / LiteLLM / OpenAI / Anthropic) was over-abstracted: 1816 LOC of routing for a single real provider. Two surfaces still required OpenAI specifically, embeddings (Qdrant compatibility) and voice realtime, but everything else funnelled through dead routing code that hid native Anthropic features (1h tool-cache TTL, MessageStream, per-call Sentry tags).
Decision
All chat-completion and agent text generation routes through packages/agents/src/llm/ → @anthropic-ai/sdk directly. OpenAI is carved out for ONE surface only post-STRATEGIC-VOICE-001 (2026-05-23):
- Embeddings: text-embedding-3-large (load-bearing; Qdrant requires a consistent embedding model).
The voice realtime carve-out (gpt-realtime-2) was DROPPED 2026-05-23. Realtime voice now runs entirely on the Anthropic chain (packages/voice/src/anthropic-voice-pipeline.ts): Deepgram STT → Claude streaming → Cartesia Sonic TTS. Reasoning: Anthropic partnership commit, single-stack operational simplicity, and Claude reasoning depth as the differentiator vs gpt-realtime-2.
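The decision above can be read as a small surface-to-provider policy. The following is a minimal sketch under stated assumptions, not the code in packages/agents/src/llm/: the names Surface, Provider, SURFACE_PROVIDER, VOICE_STAGES, providerFor, and assertChatModelAllowed are all hypothetical and chosen only for illustration. It also encodes the rule from the Consequences section that gpt-* must not be a chat default.

```typescript
// Hypothetical sketch: none of these identifiers are taken from the codebase.
type Surface = "chat" | "agent" | "voice" | "embeddings";
type Provider = "anthropic" | "openai";

// Embeddings are the single remaining OpenAI carve-out; every other
// surface resolves to Anthropic.
const SURFACE_PROVIDER: Record<Surface, Provider> = {
  chat: "anthropic",
  agent: "anthropic",
  voice: "anthropic",
  embeddings: "openai",
};

// Stage order of the realtime voice chain as described in this ADR.
const VOICE_STAGES: readonly string[] = [
  "Deepgram STT",
  "Claude streaming",
  "Cartesia Sonic TTS",
];

// Looks up which provider serves a given surface.
function providerFor(surface: Surface): Provider {
  return SURFACE_PROVIDER[surface];
}

// Throws if a gpt-* model is proposed as a chat default (assumed check:
// a case-insensitive "gpt-" prefix match on the model name).
function assertChatModelAllowed(model: string): void {
  if (model.toLowerCase().startsWith("gpt-")) {
    throw new Error(`gpt-* is forbidden as a chat default: ${model}`);
  }
}
```

Keeping the policy as a plain lookup table means a new carve-out, or the removal of one, is a one-line change that is easy to review against this ADR.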
Reason
Anthropic partnership commit plus product positioning (AI-native, claude-haiku/sonnet/opus everywhere). Deleting the gateway recovered 1816 LOC and reinstated native Anthropic features the abstraction had erased.
Consequences
- No multi-provider failover (acceptable — Anthropic SLA).
- Per-tenant circuit breaker keyed (tenantId, provider) only (invariant #7).
- gpt-* as chat default is forbidden.
- Future lint rule should forbid @anthropic-ai/sdk imports outside packages/agents/src/llm/ (invariant #63).
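To illustrate what keying a circuit breaker on (tenantId, provider) only might look like, here is a minimal sketch. It is an assumption-laden illustration, not the project's breaker: the class name TenantProviderBreaker, its method names, the failure threshold, and the cooldown are all hypothetical, and the breaker simply closes again after the cooldown instead of modelling a separate half-open state.

```typescript
// Hypothetical sketch: names, thresholds, and cooldown are assumptions.
type BreakerState = { failures: number; openedAt: number | null };

class TenantProviderBreaker {
  private readonly states = new Map<string, BreakerState>();
  private readonly failureThreshold: number;
  private readonly cooldownMs: number;
  private readonly now: () => number;

  constructor(
    failureThreshold: number = 3,
    cooldownMs: number = 30_000,
    now: () => number = () => Date.now(),
  ) {
    this.failureThreshold = failureThreshold;
    this.cooldownMs = cooldownMs;
    this.now = now;
  }

  // The key is built from tenantId and provider and nothing else.
  private key(tenantId: string, provider: string): string {
    return JSON.stringify([tenantId, provider]);
  }

  // Returns the state for a key, creating a closed breaker on first use.
  private state(tenantId: string, provider: string): BreakerState {
    const k = this.key(tenantId, provider);
    let s = this.states.get(k);
    if (!s) {
      s = { failures: 0, openedAt: null };
      this.states.set(k, s);
    }
    return s;
  }

  // True when a call may proceed: the breaker is closed, or the cooldown
  // has elapsed (in which case it closes and the failure count restarts).
  allow(tenantId: string, provider: string): boolean {
    const s = this.state(tenantId, provider);
    if (s.openedAt === null) return true;
    if (this.now() - s.openedAt >= this.cooldownMs) {
      s.openedAt = null;
      s.failures = 0;
      return true;
    }
    return false;
  }

  // Counts a failure and opens the breaker once the threshold is reached.
  recordFailure(tenantId: string, provider: string): void {
    const s = this.state(tenantId, provider);
    s.failures += 1;
    if (s.failures >= this.failureThreshold) s.openedAt = this.now();
  }

  // A success closes the breaker and clears the failure count.
  recordSuccess(tenantId: string, provider: string): void {
    const s = this.state(tenantId, provider);
    s.failures = 0;
    s.openedAt = null;
  }
}
```

Because the key carries only the tenant and the provider, one tenant tripping the breaker for Anthropic does not affect other tenants, nor that same tenant's OpenAI embeddings calls. The injectable clock is there only to make the behaviour testable without waiting out a real cooldown.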
Source memos
- [[architecture_anthropic_sdk_direct_2026_05_13]]
- [[wave_m3_anthropic_embeddings_carveout_2026_05_16]]
- [[provider_anthropic]]
- [[provider_openai]]
invariants.md#63