Pass example_id_key to create_dataset when you want Phoenix to diff incoming rows against the existing dataset version and apply the minimal set of adds, edits, and deletes, rather than always appending new examples.
When to Use Stable IDs
Use example_id_key when:
- Your examples have stable IDs (e.g., a primary key from your database or a content hash).
- You want to update a production dataset without losing experiment or evaluation links tied to existing examples.
- You need the new dataset version to mirror your source of truth exactly, including deletions.
If your examples do not have stable IDs, omit example_id_key. Phoenix will assign a server-generated ID to each row and append all rows to the dataset.
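When the source data has no primary key, a content hash is one way to get a stable ID. The following is a minimal sketch using only the Python standard library. The helper name content_hash_id, the column names question and answer, and the choice to hash only the input column are assumptions made for illustration, not part of Phoenix.

```python
import hashlib
import json


def content_hash_id(row: dict, fields: list[str]) -> str:
    """Derive a deterministic ID from the selected fields of a row."""
    # Sorted keys and fixed separators make the serialized form canonical,
    # so the same content always hashes to the same ID.
    canonical = json.dumps(
        {name: row[name] for name in fields},
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical rows; the column names are placeholders.
rows = [
    {"question": "What is 2 + 2?", "answer": "4"},
    {"question": "What is 3 + 3?", "answer": "6"},
]
for row in rows:
    # Hash only the input column so that editing the answer keeps the same ID.
    row["id"] = content_hash_id(row, ["question"])
```

Hashing only the input fields means a corrected answer is treated as an edit to the same example. Hashing every field would instead give the edited row a new ID, which a diff would see as one deletion plus one addition.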
Diff Against an Existing Dataset
Pass example_id_key to create_dataset. Phoenix creates the dataset on the first call, and on subsequent calls it diffs the incoming rows against the current version.
example_id_key must not overlap with input_keys, output_keys, metadata_keys, or split_key. Phoenix raises a ValueError if it does.
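The call and the overlap rule can be sketched together as follows. This is an illustration under assumptions, not the client's own code: check_id_key and sync_dataset are hypothetical helpers, the column names are placeholders, split_key is treated as a single optional column name, and the name, dataframe, input_keys, and output_keys keyword arguments are assumed from the Phoenix client's create_dataset signature, so verify them against your installed version.

```python
def check_id_key(example_id_key, input_keys=(), output_keys=(),
                 metadata_keys=(), split_key=None):
    """Local pre-flight check mirroring the documented overlap rule."""
    used = set(input_keys) | set(output_keys) | set(metadata_keys)
    if split_key is not None:
        used.add(split_key)
    if example_id_key in used:
        raise ValueError(
            f"{example_id_key!r} is already used as an input, output, "
            "metadata, or split key"
        )


def sync_dataset(client, dataframe):
    """Create the dataset on the first call; diff against it on later calls."""
    # `client` is expected to be a Phoenix client instance (assumption).
    input_keys = ["question"]   # hypothetical column names
    output_keys = ["answer"]
    check_id_key("id", input_keys, output_keys)
    return client.datasets.create_dataset(
        name="my-eval-dataset",
        dataframe=dataframe,
        input_keys=input_keys,
        output_keys=output_keys,
        example_id_key="id",    # column holding the stable ID
    )
```

Running the check locally surfaces a misconfigured key list before any request is made. Phoenix raises its own ValueError regardless.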
Append Without Deleting
Use add_examples_to_dataset when you want to add or update rows but leave existing examples that are absent from the upload untouched. With example_id_key, incoming rows whose IDs match existing examples still update those examples, but unlike create_dataset, no example is deleted just because its ID is missing from the upload.
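The append behavior with IDs amounts to an upsert. The sketch below models it with plain dictionaries keyed by example ID. It illustrates the semantics described above and is not Phoenix's implementation; the function name and sample data are hypothetical.

```python
def upsert(existing: dict[str, dict], incoming: dict[str, dict]) -> dict[str, dict]:
    """Model of append-with-IDs: add new IDs, update matching IDs, delete nothing."""
    merged = dict(existing)   # every existing example is carried forward
    merged.update(incoming)   # matching IDs are overwritten, new IDs are added
    return merged


existing = {"a": {"answer": "4"}, "b": {"answer": "6"}}
incoming = {"b": {"answer": "six"}, "c": {"answer": "8"}}
merged = upsert(existing, incoming)
print(sorted(merged))  # ['a', 'b', 'c']
```

Example a is absent from the upload and survives, b is updated in place, and c is added.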
Diff Semantics
When create_dataset is called with example_id_key against an existing dataset, Phoenix produces a new version that mirrors the upload exactly:
- Incoming ID not in dataset — example is created.
- Incoming ID matches an existing example — example is updated if its content changed; otherwise the existing example is carried forward unchanged.
- Existing example absent from the upload — example is deleted from the new version.
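The three rules above can be modeled as a small classification over example IDs. This is a sketch of the semantics only, assuming each example is represented as a dictionary keyed by its stable ID. The function plan_diff and the sample data are hypothetical, and Phoenix's server-side logic may differ in detail.

```python
def plan_diff(existing: dict[str, dict], incoming: dict[str, dict]) -> dict[str, list[str]]:
    """Classify IDs the way a full-replace diff would treat them."""
    shared = existing.keys() & incoming.keys()
    return {
        # Incoming ID not in the dataset: example is created.
        "created": sorted(incoming.keys() - existing.keys()),
        # Matching ID with changed content: example is updated.
        "updated": sorted(k for k in shared if incoming[k] != existing[k]),
        # Matching ID with identical content: carried forward as-is.
        "unchanged": sorted(k for k in shared if incoming[k] == existing[k]),
        # Existing ID absent from the upload: deleted from the new version.
        "deleted": sorted(existing.keys() - incoming.keys()),
    }


existing = {"a": {"answer": "4"}, "b": {"answer": "6"}, "c": {"answer": "8"}}
incoming = {"a": {"answer": "4"}, "b": {"answer": "six"}, "d": {"answer": "10"}}
plan = plan_diff(existing, incoming)
print(plan)
```

Running a local plan like this before uploading is one way to catch an accidental mass deletion, for example when the upload was filtered by mistake.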
| Method | Behavior |
|---|---|
| create_dataset with example_id_key | Full-replace diff. The new version exactly matches the upload. |
| create_dataset without example_id_key | Appends all rows; Phoenix generates IDs. |
| add_examples_to_dataset with example_id_key | Adds new rows and updates rows whose IDs match. Examples absent from the upload are preserved. |
| add_examples_to_dataset without example_id_key | Appends all rows; Phoenix generates IDs. Existing examples are preserved. |
Compatibility With Older Servers
Diffing requires Phoenix >= 15.0.0. Against older servers, the client falls back to a plain create and emits a UserWarning. The fallback succeeds when the dataset does not yet exist, but if a dataset with that name is already on the server, the create is rejected with a name conflict. The diff-and-update workflow is therefore only available on Phoenix >= 15.0.0.
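If a silent fallback is unacceptable, for example in a CI job, one option is to promote the UserWarning to an exception around the call. The sketch below uses only the standard warnings module. The helper call_strict is hypothetical, and the wording of the warning the client emits is not assumed.

```python
import warnings


def call_strict(fn, *args, **kwargs):
    """Run fn with UserWarning promoted to an exception."""
    with warnings.catch_warnings():
        # Inside this block, any UserWarning raised by fn becomes an error,
        # so a fallback to plain create fails loudly instead of passing.
        warnings.simplefilter("error", UserWarning)
        return fn(*args, **kwargs)
```

Usage would look like call_strict(client.datasets.create_dataset, ...) with your usual arguments. Note that this also turns any unrelated UserWarning raised during the call into an exception.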
Return Value
Both create_dataset and add_examples_to_dataset return a Dataset object representing the new version:
| Attribute | Description |
|---|---|
| id | Dataset ID. |
| name | Dataset name. |
| description | Dataset description, or None. |
| version_id | ID of the version produced by this call. |
| example_count | Number of examples in this version. |
| examples | List of DatasetExample objects in this version. |
| metadata | Dataset-level metadata. |
| created_at | When the dataset was first created. |
| updated_at | When the dataset was last updated. |
To see what a call changed, compare example_count (or the IDs in examples) against the previous version returned by client.datasets.get_dataset(dataset="my-eval-dataset") before the call.
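A before-and-after comparison of example IDs can be sketched as below. The helper compare_versions is hypothetical and works on plain iterables of ID strings. How you extract the IDs from the DatasetExample objects depends on your client version, so that step is left to the caller.

```python
def compare_versions(before_ids, after_ids) -> dict[str, list[str]]:
    """Summarize which example IDs were added, removed, or kept."""
    before, after = set(before_ids), set(after_ids)
    return {
        "added": sorted(after - before),
        "removed": sorted(before - after),
        "kept": sorted(before & after),
    }


# Hypothetical ID lists collected before and after an upload.
summary = compare_versions(["a", "b", "c"], ["a", "b", "d"])
print(summary)  # {'added': ['d'], 'removed': ['c'], 'kept': ['a', 'b']}
```

An ID in the kept list may still have had its content updated, so this check reports membership changes only.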
