07.09.2025

Baseline for Experiment Comparisons

You can now set a baseline run when comparing multiple experiments. This is especially useful when one run represents a known-good output (e.g. a previous model version or a CI-approved run) and you want to evaluate changes relative to it.

For example, in an evaluation like accuracy, you can easily see where a value flipped from correct → incorrect or incorrect → correct between your baseline and the current comparison, helping you quickly spot regressions or improvements.

This feature makes it easier to isolate the impact of changes such as a new prompt, model, or dataset.
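The flip detection described above can be sketched in a few lines. The snippet below is an illustrative example, not the Phoenix API: given per-example correctness for a baseline run and a comparison run, it classifies each example as a regression (correct → incorrect), an improvement (incorrect → correct), or unchanged. The function name `classify_flips` and the dict-based inputs are assumptions made for illustration.

```python
def classify_flips(
    baseline: dict[str, bool], comparison: dict[str, bool]
) -> dict[str, str]:
    """Label each example by how its correctness changed vs. the baseline."""
    flips = {}
    for example_id, base_correct in baseline.items():
        comp_correct = comparison.get(example_id)
        if comp_correct is None:
            continue  # example not present in the comparison run
        if base_correct and not comp_correct:
            flips[example_id] = "regression"   # correct -> incorrect
        elif not base_correct and comp_correct:
            flips[example_id] = "improvement"  # incorrect -> correct
        else:
            flips[example_id] = "unchanged"
    return flips


baseline = {"ex1": True, "ex2": False, "ex3": True}
comparison = {"ex1": False, "ex2": True, "ex3": True}
print(classify_flips(baseline, comparison))
# {'ex1': 'regression', 'ex2': 'improvement', 'ex3': 'unchanged'}
```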

feat(experiments): add baseline to compare experiments page by axiomofjoy · Pull Request #8461 · Arize-ai/phoenix