# v1.70.1-stable - Gemini Realtime API Support

May 17, 2025
## New Models / Updated Models

- **Gemini (VertexAI + Google AI Studio)**
    - `/chat/completion`
        - Handle audio input - PR
        - Fixes maximum recursion depth issue when using deeply nested response schemas with Vertex AI, by increasing DEFAULT_MAX_RECURSE_DEPTH from 10 to 100 in constants. PR
        - Capture reasoning tokens in streaming mode - PR
- **Google AI Studio**
    - `/realtime`
        - Gemini Multimodal Live API support
            - Audio input/output support, optional param mapping, accurate usage calculation - PR
- **VertexAI**
    - `/chat/completion`
        - Fix llama streaming error - where model response was nested in returned streaming chunk - PR
- **Ollama**
    - `/chat/completion`
        - Structured responses fix - PR
- **Bedrock**
    - `/chat/completion`
        - Handle thinking_blocks when assistant.content is None - PR
        - Fixes to only allow accepted fields for tool json schema - PR
        - Add bedrock sonnet prompt caching cost information
        - Mistral Pixtral support - PR
        - Tool caching support - PR
    - `/messages`
        - Allow using dynamic AWS Params - PR
- **Nvidia NIM**
    - `/chat/completion`
        - Add tools, tool_choice, parallel_tool_calls support - PR [NEED DOCS ON SUPPORTED PARAMS]
- **Novita AI**
    - New provider added for `/chat/completion` routes - PR
- **Azure Cohere**
    - `/embeddings`
        - Migrate embedding to use `/v2/embed` - adds support for output_dimensions param - PR
- **Anthropic**
- **VLLM**
    - `/chat/completion`
        - Support embedding input as list of integers - PR [NEEDS DOCS]
- **OpenAI**
    - `/chat/completion`
        - Fix b64 file data input handling - PR
        - Add `supports_pdf_input` to all vision models - PR

## LLM API Endpoints

- **Responses API**
    - Fix delete API support - PR
- **Rerank API**
    - `/v2/rerank` now registered as `llm_api_route` - enabling non-admins to call it - PR
- **Realtime API**
    - Gemini Multimodal Live API support - PR

## Spend Tracking Improvements

- `/chat/completion`, `/messages`
    - Anthropic - web search tool cost tracking - PR
    - Groq - update model max tokens + cost information - PR
- `/audio/transcription`
    - Azure - Add gpt-4o-mini-tts pricing - PR
    - Proxy - Fix tracking spend by tag - PR
- `/embeddings`
    - Azure AI - Add cohere embed v4 pricing - PR

## Management Endpoints / UI

- **Models**
    - Ollama - adds api base param to UI
- **Logs**
- **Teams**
- **Guardrails**
- **Test Key**
    - Select guardrails to test on UI

## Logging / Alerting Integrations

- **StandardLoggingPayload**
    - Log any `x-` headers in requester metadata - PR [NEEDS DOCS]
    - Guardrail tracing now in standard logging payload - PR [NEEDS DOCS]
- **Generic API Logger**
    - Support passing application/json header
- **Arize Phoenix**
    - Fix: URL encode OTEL_EXPORTER_OTLP_TRACES_HEADERS for Phoenix integration - PR
    - Add guardrail tracing to OTEL, Arize Phoenix - PR
- **PagerDuty**
    - PagerDuty is now a free feature - PR
- **Alerting**
    - Sending slack alerts on virtual key/user/team updates is now free - PR

## Guardrails

- **Guardrails**
    - New `/apply_guardrail` endpoint for directly testing a guardrail - PR [NEEDS DOCS]
- **Lakera**
    - `/v2` endpoints support - PR
- **Presidio**
    - Fixes handling of message content on presidio guardrail integration - PR
    - Allow specifying PII Entities Config - PR
- **Aim Security**
    - Support for anonymization in AIM Guardrails - PR

## General Proxy Improvements

- **Authentication**
    - Handle `Bearer $LITELLM_API_KEY` in `x-litellm-api-key` custom header - PR
- **New Enterprise pip package** - `litellm-enterprise` - fixes issue where the `enterprise` folder was not found when using the pip package
- **Proxy CLI**
    - Add `models import` command - PR
- **OpenWebUI**
    - Configure LiteLLM to parse user headers from Open WebUI
- **LiteLLM Proxy w/ LiteLLM SDK**
    - Option to force/always use the LiteLLM proxy when calling via the LiteLLM SDK

## New Contributors
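To illustrate the Vertex AI recursion-depth fix under New Models / Updated Models: deeply nested response schemas previously exceeded the traversal limit of 10, which the release raises to 100. The helper below is an illustrative sketch of why a deeper limit matters, not LiteLLM's actual traversal code.

```python
DEFAULT_MAX_RECURSE_DEPTH = 100  # value after the fix (previously 10)

def schema_depth(schema: dict, depth: int = 1) -> int:
    """Max nesting depth of a JSON-schema-like dict (illustrative helper)."""
    if depth > DEFAULT_MAX_RECURSE_DEPTH:
        raise RecursionError("schema nested too deeply")
    children = list(schema.get("properties", {}).values())
    if "items" in schema:
        children.append(schema["items"])
    return max([schema_depth(c, depth + 1) for c in children], default=depth)

# A 12-level nested schema: fine under the new limit of 100,
# but deep enough to have approached the old limit of 10.
s = {"type": "string"}
for _ in range(11):
    s = {"type": "object", "properties": {"child": s}}
assert schema_depth(s) == 12
```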
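A hedged sketch of the VLLM change above (embedding input as a list of integers): the request body follows the OpenAI-style embeddings schema that LiteLLM exposes, with `input` carrying pre-tokenized token IDs instead of strings. The model name is a placeholder; no network call is made here.

```python
import json

payload = {
    "model": "hosted_vllm/my-embedding-model",  # hypothetical model name
    "input": [[101, 2023, 2003, 102]],          # pre-tokenized token IDs
}
body = json.dumps(payload)
assert json.loads(body)["input"] == [[101, 2023, 2003, 102]]
```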
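The Authentication change under General Proxy Improvements means the proxy now accepts a `Bearer `-prefixed value in the custom `x-litellm-api-key` header. The normalization below is an illustrative approximation of that behavior, not LiteLLM's actual implementation.

```python
def extract_virtual_key(headers: dict) -> "str | None":
    """Return the virtual key, tolerating an optional Bearer prefix."""
    raw = headers.get("x-litellm-api-key")
    if raw is None:
        return None
    # Accept both "sk-1234" and "Bearer sk-1234" (placeholder key values).
    prefix = "Bearer "
    return raw[len(prefix):] if raw.startswith(prefix) else raw

# Both header shapes resolve to the same key:
assert extract_virtual_key({"x-litellm-api-key": "Bearer sk-1234"}) == "sk-1234"
assert extract_virtual_key({"x-litellm-api-key": "sk-1234"}) == "sk-1234"
```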
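For the "LiteLLM Proxy w/ LiteLLM SDK" item above: the SDK can route a call through a running proxy via the `litellm_proxy/` model prefix. The sketch below only builds the arguments such a call would use (model name, URL, and key are placeholders); whether this is the exact mechanism the new force/always option uses is an assumption.

```python
proxy_kwargs = {
    "model": "litellm_proxy/gpt-4o",      # `litellm_proxy/` routes via the proxy
    "api_base": "http://localhost:4000",  # assumed proxy address
    "api_key": "sk-1234",                 # a proxy virtual key (placeholder)
    "messages": [{"role": "user", "content": "hello"}],
}
assert proxy_kwargs["model"].startswith("litellm_proxy/")
```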