Releases · Arize-ai/phoenix
01.31.2026: Tool Selection and Tool Invocation Evaluators
Available in arize-phoenix-evals 0.16.0+ (Python) and @arizeai/phoenix-evals 1.3.0+ (TypeScript)

Phoenix now provides two specialized evaluators for assessing AI agent tool usage. The Tool Selection Evaluator judges whether an agent correctly chose the most appropriate tool from its available toolkit to answer a user’s question, without evaluating the parameters passed. The Tool Invocation Evaluator assesses whether the agent correctly invoked a tool with proper parameters, JSON formatting, and safe values.

These evaluators help developers:
- Identify tool selection errors where agents choose suboptimal or incorrect tools
- Debug parameter issues including hallucinated fields, malformed JSON, and incorrect values
- Improve tool descriptions and agent prompts based on systematic evaluation
- Validate multi-tool and multi-turn interactions across complex agent workflows
The evaluators are exposed as ToolSelectionEvaluator and ToolInvocationEvaluator in Python’s phoenix.evals.metrics module, and as createToolSelectionEvaluator and createToolInvocationEvaluator in TypeScript.
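As a toy illustration of what the two checks grade (the helper functions and tool-schema shape below are hypothetical sketches, not the phoenix.evals.metrics API, which uses an LLM judge): the selection check asks whether the chosen tool is actually in the toolkit, and the invocation check asks whether the arguments parse as JSON and match the tool's declared parameters.

```python
import json

# Toy, rule-based stand-ins for the two evaluator concepts.
# Function names and the tool-schema shape are illustrative assumptions.

def check_tool_selection(chosen_tool, available_tools):
    """Tool selection: was the chosen tool one the agent actually had?"""
    return chosen_tool in {t["name"] for t in available_tools}

def check_tool_invocation(raw_arguments, tool_schema):
    """Tool invocation: do the arguments parse as JSON and avoid
    hallucinated fields not in the tool's declared parameters?"""
    try:
        args = json.loads(raw_arguments)
    except json.JSONDecodeError:
        return False  # malformed JSON
    return set(args) <= set(tool_schema["parameters"])  # no hallucinated fields

tools = [{"name": "get_weather", "parameters": ["city"]}]
print(check_tool_selection("get_weather", tools))            # True
print(check_tool_invocation('{"city": "Paris"}', tools[0]))  # True
print(check_tool_invocation('{"zip": "75001"}', tools[0]))   # False: hallucinated field
```

The real evaluators make these judgments with an LLM, so they also catch softer failures (a valid but suboptimal tool choice, unsafe parameter values) that simple rules cannot.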
01.28.2026: Configurable Email Extraction for OAuth2 Providers
Available in Phoenix 12.33.1+

Phoenix now supports custom email extraction from OAuth2 identity providers through the PHOENIX_OAUTH2_{IDP}_EMAIL_ATTRIBUTE_PATH environment variable. This solves authentication issues with providers like Azure AD/Entra ID, where the standard email claim may be null but alternative claims like preferred_username contain the user’s identity.

Email extraction is configured using JMESPath expressions; Phoenix falls back to the standard email claim when no custom path is specified. JMESPath expressions are validated at startup for immediate feedback on configuration errors.
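For example, to read the identity from the preferred_username claim (the AZURE_AD provider key below is illustrative; the {IDP} segment must match your configured provider's identifier):

```shell
# Hypothetical IdP key; substitute the identifier of your configured provider.
# The value is a JMESPath expression evaluated against the provider's claims.
export PHOENIX_OAUTH2_AZURE_AD_EMAIL_ATTRIBUTE_PATH='preferred_username'
```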
01.22.2026: CLI Commands for Prompts, Datasets, and Experiments
Available in @arizeai/phoenix-cli 0.4.0+

The Phoenix CLI now provides comprehensive commands for managing prompts, datasets, and experiments directly from your terminal. Access version-controlled prompts, create evaluation datasets, and run experiments—all without leaving your development environment.

Prompt Management:
- List and view prompts with px prompts and px prompt <name>
- Pipe prompts to AI assistants for optimization and analysis
- Text format output with XML-style role tags for LLM consumption

Dataset Management:
- Create and manage datasets with px datasets and px dataset <name>
- Add examples and query dataset contents
- Export datasets for offline analysis

Experiment Management:
- Run experiments and compare results across configurations
- View experiment details and performance metrics
- Track changes across prompt and model variations
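A typical session might look like the following (the prompt and dataset names are hypothetical placeholders):

```shell
# List available prompts, then view one in LLM-friendly text format.
px prompts
px prompt summarizer --format text   # "summarizer" is a hypothetical prompt name

# Inspect datasets and a specific dataset's contents.
px datasets
px dataset my-eval-set               # "my-eval-set" is a hypothetical dataset name
```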
01.23.2026: CLI Authentication Configuration
Available in @arizeai/phoenix-cli 0.4.0+

The Phoenix CLI now includes enhanced authentication configuration commands, resolving database race conditions and improving connection reliability. Users can configure authentication settings directly through the CLI for more predictable and stable connections to Phoenix servers.
01.21.2026: Create Datasets from Traces with Span Associations
Available in arize-phoenix-client 1.28.0+ (Python) and @arizeai/phoenix-client 2.0.0+ (TypeScript)

Phoenix now enables converting production traces into curated datasets while preserving bidirectional links back to source spans. Use the new span_id_key parameter to maintain traceability from evaluation examples to their original production executions.

Key features:
- Batch resolution of span IDs for optimal performance
- Graceful fallback when span IDs are missing or invalid
- Backwards compatible with existing dataset creation workflows
- Bidirectional navigation between evaluation results and production traces
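The span-association idea can be sketched in plain Python (the record shape and helper below are illustrative assumptions, not the documented client API): each dataset example carries its source span ID under the configured key, and examples without a span ID simply omit the link.

```python
# Illustrative sketch only: dataset examples keep a back-reference
# to the span they came from. The dict shapes are assumptions, not
# the phoenix-client schema.

def examples_from_spans(spans, span_id_key="span_id"):
    """Convert raw span dicts into dataset examples that preserve a
    link back to the originating span when one is available."""
    examples = []
    for span in spans:
        span_id = span.get(span_id_key)
        # Graceful fallback: examples without a span ID get no link.
        metadata = {span_id_key: span_id} if span_id is not None else {}
        examples.append(
            {
                "input": span.get("input"),
                "output": span.get("output"),
                "metadata": metadata,
            }
        )
    return examples

spans = [
    {"span_id": "abc123", "input": "What is 2+2?", "output": "4"},
    {"input": "hand-written example with no source span"},
]
examples = examples_from_spans(spans)
print(len(examples))            # 2
print(examples[0]["metadata"])  # {'span_id': 'abc123'}
print(examples[1]["metadata"])  # {}
```

With the link stored on each example, evaluation results can be navigated back to the exact production execution that produced them.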
01.19.2026: Export Annotations with Traces
Available in @arizeai/phoenix-cli 0.3.0+

The Phoenix CLI now supports exporting annotations alongside traces using the --include-annotations flag. Annotations—including manual labels, LLM evaluation scores, and programmatic feedback—are now preserved when exporting traces for offline analysis, backup, or migration workflows.
01.22.2026: CLI Prompt Commands: Pipe Prompts to AI Assistants 📝
Available in @arizeai/phoenix-cli 0.4.0+

The Phoenix CLI now supports prompt introspection with px prompts and px prompt. List prompts, view their content, and pipe them directly to AI assistants like Claude Code for optimization suggestions. The --format text option outputs prompts with XML-style role tags, ideal for analysis workflows.
01.21.2026: Create Datasets from Traces with Span Associations 🔗
Available in arize-phoenix-client 1.28.0+ (Python) and @arizeai/phoenix-client 2.0.0+ (TypeScript)

The Phoenix client now enables converting production traces into curated datasets while preserving associations back to source spans. Query spans using client methods, then create datasets with span associations to maintain bidirectional links. Use this to build golden datasets from validated interactions, curate edge cases from failed traces, or create regression test suites from critical user flows.
01.21.2026: Phoenix CLI: Datasets, Experiments & Annotations 🧪
Available in @arizeai/phoenix-cli 0.2.0+

The Phoenix CLI now supports datasets, experiments, and annotations. Pull evaluation data, export experiment results, and access human feedback directly from the terminal. Works well with AI coding assistants for analyzing test cases and reviewing results.
01.17.2026: Phoenix CLI: Terminal Access for AI Coding Assistants 🖥️
Available in @arizeai/phoenix-cli 0.1.0+

AI coding assistants operate through terminals and files—they run shell commands, read output, and process data. The new Phoenix CLI makes trace data accessible through these interfaces, enabling tools like Claude Code, Cursor, and Windsurf to query your Phoenix instance directly. Export traces to JSON, pipe to jq, or save to disk for analysis.
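For instance, exported JSON can be piped straight into jq (the subcommand and flag shown are hypothetical; consult px --help for the exact syntax):

```shell
# Hypothetical invocation: export traces as JSON and count spans with jq.
px traces export --project my-project | jq '.spans | length'
```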
12.20.2025: Improved User Preferences ⚙️
Available in Phoenix 12.27+

Phoenix now offers enhanced user preference settings, giving you more control over your experience. This update adds theme selection and a programming language preference to viewer preferences.
12.12.2025: Support for Gemini Tool Calls 🤖
Available in Phoenix 12.25+

Phoenix now supports Gemini tool calls, enabling enhanced integration capabilities with Google’s Gemini models. This update allows for more robust and feature-complete interactions with Gemini, including improved request/response translation and advanced conversation handling with tool calls.
12.09.2025: Span Notes API 📝
Available in Phoenix 12.21+

New dedicated endpoints for span notes enable open coding and seamless annotation integrations. Add notes to spans programmatically using the Phoenix client in both Python and TypeScript—perfect for debugging sessions, human feedback, and building custom annotation pipelines.
12.06.2025: LDAP Authentication Support 🔐
Available in Phoenix 12.20+

Phoenix now supports authentication against LDAP directories, enabling integration with enterprise identity infrastructure including Microsoft Active Directory, OpenLDAP, and any LDAP v3 compliant directory. Key features include group-based role mapping, multi-server failover, TLS encryption, and automatic user provisioning.
12.04.2025: Evaluator Message Formats 💬
Available in phoenix-evals 0.22+ (Python) and @arizeai/phoenix-evals 2.0+ (TypeScript)

Phoenix evaluators now support flexible prompt formats including simple string templates and OpenAI-style message arrays for multi-turn prompts. Python supports both f-string and mustache syntax, while TypeScript uses mustache syntax. Adapters handle provider-specific transformations automatically.
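The two prompt shapes can be sketched in plain Python (an illustration of the formats themselves, not the evaluator API):

```python
# A simple string template with f-string-style placeholders...
string_prompt = "Is the answer '{output}' correct for the question '{input}'?"
print(string_prompt.format(input="2+2", output="4"))
# -> Is the answer '4' correct for the question '2+2'?

# ...versus an OpenAI-style message array for multi-turn prompts.
# Mustache-style {{variable}} placeholders appear as literal text here;
# the evaluator's adapter performs the substitution.
message_prompt = [
    {"role": "system", "content": "You are a strict grader."},
    {"role": "user", "content": "Question: {{input}}"},
    {"role": "assistant", "content": "Answer: {{output}}"},
    {"role": "user", "content": "Is the answer correct?"},
]
print(len(message_prompt))  # 4
```

The message-array form is what makes multi-turn evaluation prompts possible: each turn is its own role-tagged entry rather than one flattened string.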
12.03.2025: TypeScript createEvaluator 🧪
Available in @arizeai/phoenix-evals 2.0+

The createEvaluator utility provides a type-safe way to build custom code evaluators for experiments in TypeScript. Define evaluators with full type inference, access input, output, expected, and metadata parameters, and integrate seamlessly with runExperiment.
12.01.2025: Splits on Experiments Table 📊
Available in Phoenix 12.20+

You can now view and filter experiment results by data splits directly in the experiments table. This enhancement makes it easier to analyze performance across different data subsets (such as train, validation, and test) and compare how your models perform on each split.
11.29.2025: Add support for Claude Opus 4-5 🤖
Available in Phoenix 12.18+
11.27.2025: Show Server Credential Setup in Playground API Keys 🔐
Available in Phoenix 12.18+
11.25.2025: Split Assignments When Uploading a Dataset 🗂️
Available in Phoenix 12.18+
11.23.2025: Repetitions for Manual Playground Invocations 🛝
Available in Phoenix 12.17+
11.14.2025: Expanded Provider Support with OpenAI 5.1 + Gemini 3 🔧
Available in Phoenix 12.15+

This update enhances LLM provider support by adding OpenAI v5.1 compatibility (including reasoning capabilities), expanding support for Google DeepMind/Gemini models, and introducing the gemini-3 model variant.
11.12.2025: Updated Anthropic Model List 🧠
Available in Phoenix 12.15+

This update enhances the Anthropic model registrations in Arize Phoenix by adding support for the 4.5 Sonnet/Haiku variants and removing several legacy 3.x Sonnet/Opus entries.
11.09.2025: OpenInference TypeScript 2.0 💻
- Added easy manual instrumentation with the same decorators, wrappers, and attribute helpers found in the Python openinference-instrumentation package.
- Introduced function tracing utilities that automatically create spans for sync/async function execution, including specialized wrappers for chains, agents, and tools.
- Added decorator-based method tracing, enabling automatic span creation on class methods via the @observe decorator.
- Expanded attribute helper utilities for standardized OpenTelemetry metadata creation, including helpers for inputs/outputs, LLM operations, embeddings, retrievers, and tool definitions.
- Overall, tracing workflows, agent behavior, and external tool calls is now significantly simpler and more consistent across languages.
11.07.2025: Timezone Preference 🌍
Available in Phoenix 12.11+
11.05.2025: Metadata for Prompts 🗂️
Available in Phoenix 12.10+
Prompts now support a metadata field.
11.03.2025: Playground Dataset Label Display 🏷️
Available in Phoenix 12.10+
11.01.2025: Resume Experiments and Evaluations 🔄
Available in Phoenix 12.10+

This release allows you to resume your experiments and evaluations at your convenience. If certain examples fail, there is no need to repeat an entire task you already completed. This feature provides you with new management capabilities across servers and clients. It’s designed to save effort, making your experimentation workflow more flexible.
10.30.2025: Metadata Support for Experiment Run Annotations 🧩
Available in Phoenix 12.9+
10.28.2025: Enable AWS IAM Auth for DB Configuration 🔐
Available in Phoenix 12.9+

Added support for AWS IAM–based authentication for PostgreSQL connections to AWS Aurora and RDS. This enhancement enables the use of short-lived IAM tokens instead of static passwords, improving security and compliance for database access.
10.26.2025: Add Split Edit Menu to Examples ䷖
Available in Phoenix 12.8+
10.24.2025: Filter Prompts Page by Label 🏷️
Available in Phoenix 12.7+
10.20.2025: Splits ䷖
Available in Phoenix 12.7+

In Arize Phoenix, splits let you categorize your dataset into distinct subsets—such as train, validation, or test—enabling structured workflows for experiments and evaluations. This capability offers more flexibility in how you organize, filter, and compare your data across different stages or experimental conditions.
10.18.2025: Filter Annotations in Compare Experiments Slideover ✍️
Available in Phoenix 12.7+
10.15.2025: Enhanced Filtering for Examples Table 🔍
Available in Phoenix 12.5+
10.13.2025: View Traces in Compare Experiments 🧪
Available in Phoenix 12.5+
10.10.2025: Viewer Role 👀
Available in Phoenix 12.5+

Introduced a new VIEWER role with enforced read-only permissions across both GraphQL and REST APIs, improving access control and security.
10.08.2025: Dataset Labels 🏷️
Available in Phoenix 12.3+
10.06.2025: Paginate Compare Experiments 📃
Available in Phoenix 12.3+
10.05.2025: Load Prompt by Tag into Playground 🛝
Available in Phoenix 12.2+
10.03.2025: Prompt Version Editing in Playground 🛝
Available in Phoenix 12.2+
09.29.2025: Day 0 support for Claude Sonnet 4.5 ⚡
Available in Phoenix 12.1+
09.27.2025: Dataset Splits 📊
Available in Phoenix 12.0+

Add support for custom dataset splits to organize examples by category.
09.26.2025: Session Annotations 🗂️
Available in Phoenix 12.0+
09.25.2025: Repetitions 🔁
Available in Phoenix 11.38+
09.24.2025: Custom HTTP headers for requests in Playground 🛠️
Available in Phoenix 11.36+
09.23.2025: Repetitions in experiment compare slideover 🔄
Available in Phoenix 11.36+
09.22.2025: Helm configurable image registry & IPv6 support 🌐
Available in Phoenix 11.35+
09.17.2025: Experiment compare details slideover in list view 🔍
Available in Phoenix 11.34+
09.15.2025: Prompt Labels 🏷️
Available in Phoenix 11.33+
09.12.2025: Enable Paging in Experiment Compare Details 📄
Available in Phoenix 11.33+

Page through runs in the experiment compare details view, with keyboard navigation (J / K).
09.08.2025: Experiment Annotation Popover in Detail View 🔍
Available in Phoenix 11.33+
09.04.2025: Experiment Lists Page Frontend Enhancements 💻
Available in Phoenix 11.32+
09.03.2025: Add Methods to Log Document Annotations 📜
Available in Phoenix 11.31+

Added client-side support for logging document annotations with a new log_document_annotations(...) method, supporting both sync and async API calls.
08.28.2025: New arize-phoenix-client Package 📦
arize-phoenix-client is a lightweight, fully-featured package for interacting with Phoenix. It lets you manage datasets, experiments, prompts, spans, annotations, and projects - without needing a local Phoenix installation.
08.22.2025: New Trace Timeline View 🔭
Available in Phoenix 11.26+
08.20.2025: New Experiment and Annotation Quick Filters 🏎️
Available in Phoenix 11.25+
08.14.2025: Trace Transfer for Long-Term Storage 📦
Available in Phoenix 11.23+
08.12.2025: UI Design Overhauls 🎨
Available in Phoenix 11.22+
08.07.2025: Improved Error Handling in Prompt Playground ⚠️
Available in Phoenix 11.20+
08.06.2025: Expanded Search Capabilities 🔍
Available in Phoenix 11.19+
08.05.2025: Claude Opus 4-1 Support 🤖
Available in Phoenix 11.19+
08.04.2025: Manual Project Creation & Trace Duplication 📂
Available in Phoenix 11.19+
08.03.2025: Delete Spans via REST API 🧹
Available in Phoenix 11.18+

You can now delete spans using the REST API, enabling efficient data redaction and giving teams greater control over trace data.