1. Set environment variables to connect to your Phoenix instance:
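A minimal Python sketch of this step, assuming a Phoenix Cloud instance and the `PHOENIX_COLLECTOR_ENDPOINT` and `PHOENIX_API_KEY` environment variables (swap in your own endpoint and key, e.g. for a self-hosted instance):

```python
import os

# Assumed values: point these at your own Phoenix instance and API key.
os.environ["PHOENIX_COLLECTOR_ENDPOINT"] = "https://app.phoenix.arize.com"
os.environ["PHOENIX_API_KEY"] = "your-phoenix-api-key"
```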
2. You'll need to install the evals library that's part of Phoenix. It's available for both Python and TypeScript; the sketches below use Python.
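For the Python package, a typical install looks like `pip install arize-phoenix-evals openai` (`openai` is added because we'll use an OpenAI model as the judge in step 4); the exact package names are assumptions based on the current PyPI distributions.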
3. Since we are running our evaluations on the trace data from our first project, we'll need to pull that data into our code.
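A sketch of this step using the Phoenix Python client. The project name "default" is an assumption; substitute the project you traced in the tracing quickstart:

```python
import phoenix as px

# Pull the spans recorded in your project into a pandas DataFrame.
# The DataFrame is indexed by span ID, which step 6 relies on.
client = px.Client()
spans_df = client.get_spans_dataframe(project_name="default")
print(spans_df.head())
```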
4. In this example, we will define, create, and run our own evaluator. There are a number of different evaluators you can run, but this quickstart will go through an LLM-as-a-Judge model.

1) Define your LLM judge model. We'll use OpenAI as our evaluation model for this example, but Phoenix also supports a number of other models. If you haven't yet set your OpenAI API key in a previous step, first add it to your environment.
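A Python sketch of this sub-step, assuming the `OpenAIModel` class from `phoenix.evals` and `gpt-4o` as the judge (illustrative choices, not the only ones Phoenix supports); the prompt template and its `{input}`/`{output}` variables are likewise assumptions:

```python
import os
from getpass import getpass

from phoenix.evals import OpenAIModel

# Add your OpenAI API key to the environment if it isn't already set.
if not os.environ.get("OPENAI_API_KEY"):
    os.environ["OPENAI_API_KEY"] = getpass("OpenAI API key: ")

# The judge model that will grade each trace.
eval_model = OpenAIModel(model="gpt-4o")

# A simple judge prompt; {input} and {output} will be filled from the
# spans DataFrame when we run the evaluation in the next step.
EVAL_TEMPLATE = """
You are evaluating whether an assistant's response answers the user's question.

[Question]: {input}
[Response]: {output}

Respond with a single word: "correct" or "incorrect".
"""
```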
5. Now that we have defined our evaluator, we're ready to evaluate our traces.
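A sketch of the evaluation run using `llm_classify` from `phoenix.evals`. The column renames are assumptions about how the span attributes are named in your spans DataFrame; adjust them to match your data:

```python
from phoenix.evals import llm_classify

# Map span attributes onto the template variables used in EVAL_TEMPLATE.
eval_df = spans_df.rename(
    columns={
        "attributes.input.value": "input",
        "attributes.output.value": "output",
    }
)

results_df = llm_classify(
    dataframe=eval_df,
    model=eval_model,
    template=EVAL_TEMPLATE,
    rails=["correct", "incorrect"],  # the only labels the judge may return
    provide_explanation=True,        # also capture the judge's reasoning
)
```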
6. You'll now be able to log your evaluations to Phoenix and see them in your project view.
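A sketch of the logging step, assuming `SpanEvaluations` from `phoenix.trace` and that `results_df` kept the span-ID index from the spans DataFrame (which is how labels are attached to the right spans); the evaluation name is arbitrary:

```python
import phoenix as px
from phoenix.trace import SpanEvaluations

# Send the labels (and explanations) back to Phoenix, keyed by span ID.
px.Client().log_evaluations(
    SpanEvaluations(eval_name="Response Correctness", dataframe=results_df)
)
```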
Next Steps
- LLM as a Judge: learn how LLM-based evaluation works and best practices.
- Pre-built Evaluators: use pre-tested evaluators for hallucinations, relevance, toxicity, and more.
- Custom Evaluators: build custom evaluators tailored to your use case.
- Datasets & Experiments: run evaluations systematically with datasets and experiments.

