TypeScript createEvaluator

The createEvaluator utility in @arizeai/phoenix-evals provides a type-safe way to build custom code evaluators for experiments in TypeScript. Define evaluators with full type inference for inputs, outputs, and expected values.

Basic Usage

Create simple evaluators that validate experiment outputs:
import { createEvaluator } from "@arizeai/phoenix-evals";

const inBounds = createEvaluator<{ output: number }>(
  ({ output }) => {
    // Score 1 when the output falls within [1, 100], else 0.
    return output >= 1 && output <= 100 ? 1 : 0;
  },
  { name: "in_bounds" }
);
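
The same pattern works for any scoring rule. For example, a minimal check that the output string is non-empty (the non_empty name is illustrative):
const nonEmpty = createEvaluator<{ output: string }>(
  ({ output }) => (output.trim().length > 0 ? 1 : 0),
  { name: "non_empty" }
);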

Multiple Parameters

Evaluators can receive the example's input, output, expected value, and metadata. This example scores the output against the expected value:
import { createEvaluator } from "@arizeai/phoenix-evals";
import { distance } from "fastest-levenshtein";

const editDistance = createEvaluator<{ output: string; expected: string }>(
  ({ output, expected }) => distance(output, expected), // raw edit distance; lower is better
  { name: "edit_distance" }
);
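
The other parameters follow the same destructuring pattern. A sketch that also reads input and metadata, assuming the example's input is a string and metadata is a plain object (the skip key and on_topic name are illustrative):
const onTopic = createEvaluator<{
  input: string;
  output: string;
  metadata: Record<string, unknown>;
}>(
  ({ input, output, metadata }) => {
    // Skip examples explicitly flagged in metadata (illustrative key).
    if (metadata["skip"]) return 0;
    // Score 1 when the output mentions the first word of the input.
    const keyword = input.split(/\s+/)[0]?.toLowerCase() ?? "";
    return keyword !== "" && output.toLowerCase().includes(keyword) ? 1 : 0;
  },
  { name: "on_topic" }
);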

Evaluator Options

Use the options argument to customize how the evaluator appears in the Experiments UI: name sets the display name, and kind identifies it as a code-based evaluator:
const containsLink = createEvaluator<{ output: string }>(
  // Score 1 when the output contains an http(s) URL.
  ({ output }) => (/https?:\/\/[^\s]+/.test(output) ? 1 : 0),
  { name: "contains_link", kind: "CODE" }
);
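
Scores need not be binary. As a sketch, the edit distance from the previous section can be normalized into a 0-1 similarity so it displays consistently alongside pass/fail evaluators (fastest-levenshtein assumed installed, as above):
import { distance } from "fastest-levenshtein";

const similarity = createEvaluator<{ output: string; expected: string }>(
  ({ output, expected }) => {
    // Normalize Levenshtein distance by the longer string's length;
    // guard against division by zero when both strings are empty.
    const maxLen = Math.max(output.length, expected.length) || 1;
    return 1 - distance(output, expected) / maxLen;
  },
  { name: "similarity", kind: "CODE" }
);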

Running in Experiments

Pass evaluators directly to runExperiment:
import { runExperiment } from "@arizeai/phoenix-client/experiments";
import { createEvaluator } from "@arizeai/phoenix-evals";

const hasGreeting = createEvaluator<{ output: string }>(
  ({ output }) =>
    ["hello", "hi", "hey"].some((w) => output.toLowerCase().includes(w)) ? 1 : 0,
  { name: "has_greeting", kind: "CODE" }
);

const exactMatch = createEvaluator<{ output: string; expected: string }>(
  ({ output, expected }) => output.trim() === expected.trim() ? 1 : 0,
  { name: "exact_match", kind: "CODE" }
);

const experiment = await runExperiment({
  dataset: myDataset,
  task: myTask,
  evaluators: [hasGreeting, exactMatch],
});
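
Here myDataset and myTask are placeholders. A task is an async function that maps a dataset example to an output; the example shape below (a single input field) is an assumption for illustration:
// Hypothetical task: the exact example shape depends on your dataset schema.
const myTask = async (example: { input: string }) =>
  `Hello! You asked about: ${example.input}`;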

More information: Using Evaluators Documentation