Releases: tensorzero/tensorzero
2026.3.4
Warning
Planned Deprecations
- The configuration for inference evaluations should be nested under the relevant functions moving forward [docs]. You can run evaluations by providing a function name and a list of evaluators. The legacy format will be removed in a future release.

  ```toml
  [functions.write_haiku.evaluators.exact_match]
  type = "exact_match"
  ```

- The legacy implementation of GEPA (`launch_optimization` with `GEPAConfig`) will be removed in a future release. Please use `t0.optimization.gepa.launch` instead. [docs]
Bug Fixes
- Fixed a UI bug where a custom gateway `base_path` was not handled correctly in certain routes. (thanks @wangfenjin!)
New Features
- Started including embedding requests in the Prometheus metrics `tensorzero_requests_total` and `tensorzero_inferences_total`.
- Added the configuration field `observability.batch_writes.write_queue_capacity` to enable backpressure for observability data in the gateway.
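  For example, a minimal configuration sketch (the field path comes from this release note; the capacity value is illustrative, not a recommended default):

  ```toml
  # tensorzero.toml (sketch): bound the observability write queue so the
  # gateway applies backpressure instead of buffering without limit.
  # The value below is illustrative only.
  [observability.batch_writes]
  write_queue_capacity = 10000
  ```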
& multiple under-the-hood and UI improvements (thanks @majiayu000)!
2026.3.3
Bug Fixes
- Fixed two edge cases affecting batch inference.
- Fixed a UI bug affecting "Try with..." with inputs that include base64 files.
- Removed assistant message prefill for JSON functions + Anthropic (deprecated by Anthropic).
New Features
- Added an implementation of GEPA (automated prompt engineering) based on durable workflows.
- Allow users to specify duplicate tool calls in `all_of` tool evaluators to evaluate parallel tool calling.
- Allow users to specify an expiration date for API keys in the UI. (thanks @eibrahim95)
- Allow users to specify `object_storage.endpoint = "env::MY_ENV_VAR"` in addition to static values. (thanks @Meredith2328)
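  As a sketch, the `env::` syntax can stand in for a hard-coded endpoint (the field name and prefix come from this release note; `MY_ENV_VAR` is the placeholder used above):

  ```toml
  # tensorzero.toml (sketch): resolve the object storage endpoint from an
  # environment variable at startup instead of hard-coding it.
  [object_storage]
  endpoint = "env::MY_ENV_VAR"
  ```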
& multiple under-the-hood and UI improvements (thanks @majiayu000)!
2026.3.2
Bug Fixes
- Fixed a UI issue that prevented certain pages from rendering when they depended on historical configuration.
New Features
- Added Postgres as an alternative observability backend to ClickHouse. Postgres is the simplest way to get started; we recommend ClickHouse if you're handling >100 RPS.
- Added the `openrouter::xxx` shorthand for embedding models.
- Added support for per-session API keys in the browser (instead of a global environment variable) when auth is enabled.
& multiple under-the-hood and UI improvements!
2026.3.1
Warning
Completed Deprecations
- Removed the deprecated `model_provider_name` filter for `extra_body` and `extra_headers`. Please use `model_name` and `provider_name` instead.
- Removed the legacy experimental `list_inferences` endpoint and method. Please use the new endpoint instead. [docs]
- Removed several long-deprecated types and methods from the TensorZero Python SDK.
Warning
Planned Deprecations
- The embedded gateway in the TensorZero Python SDK will be removed in a future release (2026.6+). `patch_openai_client` and `build_embedded` are deprecated. Please deploy a standalone TensorZero Gateway instead (usage: `base_url` for the OpenAI SDK; `build_http` for the TensorZero SDK).
- The variant configuration field `weight` will be removed in a future release (2026.6+). Please use the new experimentation configuration semantics. [docs]
Bug Fixes
- Fixed a compatibility bug with Valkey-based caching that only affected Redis.
New Features
- Added support for launching optimization workflows with `dataset_name` (instead of an inference query) in `launch_optimization_workflow`.
& multiple under-the-hood and UI improvements!
2026.3.0
Warning
Completed Deprecations
- The deprecated Prometheus metric `tensorzero_inference_latency_overhead_seconds_histogram` was removed. Use `tensorzero_inference_latency_overhead_seconds` instead.
Warning
Planned Deprecations
- The configuration for experimentation (e.g. `static_weights`, `track_and_stop`) was simplified. The old notation will be removed in a future release. See Run adaptive A/B tests and Run static A/B tests for more information.
- The evaluator configuration field `cutoff` will be removed in a future release. Instead, provide `--cutoffs evaluator=value,...` in the CLI.
- The gateway route `/variant_sampling_probabilities` will be removed in a future release.
- The configuration field `postgres.enabled` will be removed in a future release. Instead, the gateway will consider whether the environment variable `TENSORZERO_POSTGRES_URL` is set.
New Features
- Add `regex` and `tool_use` evaluators. [docs]
- Add `experimental_launch_optimization_workflow` to the TensorZero Python SDK.
& multiple under-the-hood and UI improvements!
2026.2.2
Caution
Breaking Changes
- The `--config-file` globbing behavior has changed: single-level wildcards (`*`) no longer match files across directory boundaries. To match files across directory boundaries, use recursive wildcards (`**`). This aligns the behavior with standard glob semantics. For example: `--config-file *.toml` matches `tensorzero.toml`, but not `subdir/tensorzero.toml`; `--config-file **/*.toml` matches both `tensorzero.toml` and `subdir/tensorzero.toml`.
Warning
Completed Deprecations
- Removed deprecated legacy endpoints for dataset management. The functionality is fully covered by the new endpoints.
New Features
- Add cost tracking and cost-based rate limiting.
- Add namespaces: the ability to set up multiple granular experiments (A/B tests) for the same TensorZero function.
- Improve reasoning support for Anthropic (including adaptive thinking), Fireworks AI, SGLang, and Together AI.
- Allow users to whitelist automatic tool approvals for TensorZero Autopilot.
- Report provider errors when `include_raw_response` is enabled.
- Add `include_aggregated_response` to streaming inferences. When enabled, the final chunk includes an aggregated output `aggregated_response` that combines previous chunks.
- Allow users to kill ongoing evaluation runs from the UI.
- Allow custom gateway bind addresses with the environment variable `TENSORZERO_GATEWAY_BIND_ADDRESS`.
& multiple under-the-hood and UI improvements (thanks @Nfemz @greg80303)!
2026.2.1
Caution
Breaking Changes
- The default value for `cache_options.enabled` changed from `write_only` to `off`.
New Features
- Support reasoning models from Groq, Mistral, and vLLM.
- Support multi-turn reasoning with Gemini and OpenAI-compatible models.
- Support embedding models from Together AI.
- Add a configurable `total_ms` timeout to streaming inferences.
- Display charts with top-k evaluation results in the TensorZero Autopilot UI.
- Add "Ask Autopilot" buttons throughout the UI.
- Allow TensorZero Autopilot to edit your local configuration files.
- Return `thought` and `unknown` content blocks in the OpenAI-compatible endpoint (`tensorzero_extra_content`).
& multiple under-the-hood and UI improvements!
2026.2.0
Warning
Planned Deprecations
- Anthropic's structured output feature is out of beta, so the TensorZero configuration field `beta_structured_outputs` is now ignored and deprecated. It'll be removed in a future release.
Bug Fixes
- Fix a regression in the `aws_bedrock` provider that affected long-term bearer API keys.
- Fix a horizontal overflow issue for tool calls and results on the inference detail UI page.
New Features
- Add YOLO Mode for TensorZero Autopilot.
- Add interruption feature for TensorZero Autopilot sessions.
- Add summary to the TensorZero Autopilot session table in the UI.
& multiple under-the-hood and UI improvements (thanks @pratikbuilds)!
2026.1.8
Bug Fixes
- Fix a race condition in the TensorZero Autopilot UI that could disable the chat input.
- Increase timeouts for slow tool calls triggered by TensorZero Autopilot (e.g. evaluations).
& multiple under-the-hood and UI improvements!
2026.1.7
New Features
- [Preview] TensorZero Autopilot — an automated AI engineer that analyzes LLM observability data, optimizes prompts and models, sets up evals, and runs A/B tests. Learn more → Join the waitlist →
- Support multi-turn reasoning for xAI (`reasoning_content` only).
& multiple under-the-hood and UI improvements!
