v1.70.1-stable - Gemini Realtime API Support

Krrish Dholakia
Ishaan Jaffer

New Models / Updated Models

  • Gemini (VertexAI + Google AI Studio)
    • /chat/completion
      • Handle audio input - PR (audio example after this list)
      • Fix maximum recursion depth issue when using deeply nested response schemas with Vertex AI by increasing DEFAULT_MAX_RECURSE_DEPTH from 10 to 100 in constants - PR
      • Capture reasoning tokens in streaming mode - PR
  • Google AI Studio
    • /realtime
      • Gemini Multimodal Live API support
      • Audio input/output support, optional param mapping, accurate usage calculation - PR
  • VertexAI
    • /chat/completion
      • Fix Llama streaming error where the model response was nested in the returned streaming chunk - PR
  • Ollama
    • /chat/completion
      • Fix structured responses - PR
  • Bedrock
    • /chat/completion
      • Handle thinking_blocks when assistant.content is None - PR
      • Fix to only allow accepted fields in the tool JSON schema - PR
      • Add Bedrock Sonnet prompt caching cost information
      • Mistral Pixtral support - PR
      • Tool caching support - PR
    • /messages
      • Allow using dynamic AWS params - PR (credentials example after this list)
  • Nvidia NIM
    • /chat/completion
      • Add tools, tool_choice, parallel_tool_calls support - PR (tool-calling example after this list)
  • Novita AI
    • New provider added for /chat/completion routes - PR
  • Azure
  • Cohere
    • /embeddings
      • Migrate embeddings to use /v2/embed - adds support for the output_dimensions param - PR (embedding example after this list)
  • Anthropic
  • VLLM
    • /chat/completion
      • Support embedding input as a list of integers - PR
  • OpenAI
    • /chat/completion
      • Fix base64 file data input handling - PR
      • Add `supports_pdf_input` to all vision models - PR
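
To illustrate the Gemini audio input support above, here is a minimal sketch, assuming litellm accepts the OpenAI-style `input_audio` content part for Gemini models (file path and model name are placeholders):

```python
import base64

import litellm

# Read a local audio clip and base64-encode it (path is a placeholder).
with open("question.wav", "rb") as f:
    audio_b64 = base64.b64encode(f.read()).decode("utf-8")

response = litellm.completion(
    model="gemini/gemini-2.0-flash",  # any audio-capable Gemini model
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Transcribe this clip."},
                # OpenAI-style audio content part, translated for Gemini.
                {
                    "type": "input_audio",
                    "input_audio": {"data": audio_b64, "format": "wav"},
                },
            ],
        }
    ],
)
print(response.choices[0].message.content)
```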
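
The dynamic AWS params item above targets the Anthropic-format /messages route; here is the same per-request credential pattern through `litellm.completion`, as a rough sketch (model name and all credential values are placeholders):

```python
import litellm

# Per-request (dynamic) AWS credentials instead of environment variables.
response = litellm.completion(
    model="bedrock/anthropic.claude-3-5-sonnet-20240620-v1:0",
    messages=[{"role": "user", "content": "Hello from Bedrock"}],
    aws_access_key_id="AKIA...",        # placeholder
    aws_secret_access_key="...",        # placeholder
    aws_region_name="us-west-2",
)
print(response.choices[0].message.content)
```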
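
For the Nvidia NIM tool-calling support, a sketch using the standard OpenAI tools format (the model name is an example; any tool-capable NIM model should work):

```python
import litellm

# A standard OpenAI-format tool definition.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

response = litellm.completion(
    model="nvidia_nim/meta/llama-3.1-8b-instruct",  # example model
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
    tool_choice="auto",
)
print(response.choices[0].message.tool_calls)
```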
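
And a sketch of the Cohere /v2/embed migration, assuming `output_dimensions` is passed through as an optional param (model name and dimension value are examples):

```python
import litellm

# Cohere embeddings now route through /v2/embed.
response = litellm.embedding(
    model="cohere/embed-v4.0",
    input=["hello world"],
    output_dimensions=256,  # assumption: forwarded to Cohere's v2 embed API
)
print(len(response.data[0]["embedding"]))
```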

LLM API Endpoints

  • Responses API
    • Fix delete API support - PR
  • Rerank API
    • /v2/rerank is now registered as an `llm_api_route`, enabling non-admins to call it - PR (example after this list)
  • Realtime API
    • Gemini Multimodal Live API support - PR
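
Since /v2/rerank is now an `llm_api_route`, a non-admin virtual key can call it directly on the proxy; a rough sketch (base URL, key, and model are placeholders):

```python
import requests

resp = requests.post(
    "http://localhost:4000/v2/rerank",
    headers={"Authorization": "Bearer sk-my-virtual-key"},
    json={
        "model": "cohere/rerank-english-v3.0",
        "query": "What is the capital of France?",
        "documents": [
            "Paris is the capital of France.",
            "Berlin is in Germany.",
        ],
    },
)
print(resp.json())
```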

Spend Tracking Improvements

  • /chat/completion, /messages
    • Anthropic - Web search tool cost tracking - PR
    • Groq - Update model max tokens + cost information - PR
  • /audio/speech
    • Azure - Add gpt-4o-mini-tts pricing - PR
  • /audio/transcription
    • Proxy - Fix tracking spend by tag - PR (example after this list)
  • /embeddings
    • Azure AI - Add cohere embed v4 pricing - PR
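
A sketch of the spend-by-tag fix in action, assuming tags are still passed in the request's `metadata` as in earlier releases (URL, key, model, and tag names are placeholders):

```python
from openai import OpenAI

# Point the OpenAI SDK at the LiteLLM proxy.
client = OpenAI(base_url="http://localhost:4000", api_key="sk-my-virtual-key")

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "ping"}],
    # Tags ride along in metadata; the proxy attributes spend to each tag.
    extra_body={"metadata": {"tags": ["jobs:nightly-batch"]}},
)
print(response.choices[0].message.content)
```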

Management Endpoints / UI

Logging / Alerting Integrations

  • StandardLoggingPayload
    • Log any `x-` headers in requester metadata - PR (example after this list)
    • Guardrail tracing is now in the standard logging payload - PR
  • Generic API Logger
    • Support passing the `application/json` header
  • Arize Phoenix
    • Fix: URL-encode OTEL_EXPORTER_OTLP_TRACES_HEADERS for the Phoenix integration - PR
    • Add guardrail tracing to OTEL and Arize Phoenix - PR
  • PagerDuty
    • PagerDuty alerting is now a free feature - PR
  • Alerting
    • Sending Slack alerts on virtual key/user/team updates is now free - PR
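
To illustrate the `x-` header logging change above, a sketch of a proxy request carrying custom `x-` headers that should then surface in the StandardLoggingPayload requester metadata (header names and values are examples):

```python
import requests

resp = requests.post(
    "http://localhost:4000/v1/chat/completions",
    headers={
        "Authorization": "Bearer sk-my-virtual-key",
        # Custom x- headers, captured in requester metadata.
        "x-team": "search-infra",
        "x-request-source": "cron",
    },
    json={
        "model": "gpt-4o-mini",
        "messages": [{"role": "user", "content": "ping"}],
    },
)
print(resp.status_code)
```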

Guardrails

  • Guardrails
    • New /apply_guardrail endpoint for directly testing a guardrail - PR (example after this list)
  • Lakera
    • /v2 endpoints support - PR
  • Presidio
    • Fix handling of message content in the Presidio guardrail integration - PR
    • Allow specifying PII Entities Config - PR
  • Aim Security
    • Support anonymization in Aim Guardrails - PR
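
A sketch of the new /apply_guardrail endpoint; the request body below is an assumption based on the endpoint's purpose, so check the docs for the exact schema (guardrail name and text are placeholders):

```python
import requests

# Exercise a configured guardrail directly, without a full LLM call.
resp = requests.post(
    "http://localhost:4000/apply_guardrail",
    headers={"Authorization": "Bearer sk-my-virtual-key"},
    json={
        "guardrail_name": "presidio-pii",      # placeholder guardrail
        "text": "My phone number is 555-0100", # text to run through it
    },
)
print(resp.json())
```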

Performance / Load Balancing / Reliability Improvements

General Proxy Improvements

  • Authentication
    • Handle `Bearer $LITELLM_API_KEY` in the `x-litellm-api-key` custom header - PR (example after this list)
  • New enterprise pip package - `litellm-enterprise` - fixes an issue where the enterprise folder was not found when using the pip package
  • Proxy CLI
    • Add `models import` command - PR
  • OpenWebUI
    • Configure LiteLLM to parse user headers from Open WebUI
  • LiteLLM Proxy w/ LiteLLM SDK
    • Option to force/always use the LiteLLM proxy when calling via the LiteLLM SDK (sketch after this list)
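
To illustrate the authentication change above, a sketch of sending `Bearer <key>` in the `x-litellm-api-key` custom header (URL and key are placeholders):

```python
import requests

# The proxy now accepts "Bearer <key>" (not just the bare key) in the
# x-litellm-api-key custom header.
resp = requests.post(
    "http://localhost:4000/v1/chat/completions",
    headers={"x-litellm-api-key": "Bearer sk-my-virtual-key"},
    json={
        "model": "gpt-4o-mini",
        "messages": [{"role": "user", "content": "ping"}],
    },
)
print(resp.status_code)
```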
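
And a sketch of the force-proxy option for the SDK; the flag name below is an assumption for illustration, so check the docs for the exact setting:

```python
import litellm

# Assumption: a global toggle that routes all SDK calls through the proxy;
# the actual flag/env var name may differ.
litellm.use_litellm_proxy = True  # hypothetical setting

response = litellm.completion(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "ping"}],
    api_base="http://localhost:4000",  # proxy URL (placeholder)
    api_key="sk-my-virtual-key",       # proxy virtual key (placeholder)
)
print(response.choices[0].message.content)
```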

New Contributors