The llm entrypoint provides reusable evaluator factories that call an AI SDK model and return structured evaluation results.

Relevant Source Files

  • src/llm/index.ts

Example

import { openai } from "@ai-sdk/openai";
import { createFaithfulnessEvaluator } from "@arizeai/phoenix-evals";

const faithfulness = createFaithfulnessEvaluator({
  model: openai("gpt-4o-mini"),
});

const result = await faithfulness.evaluate({
  input: "What is the capital of France?",
  context: "France is a country in Europe. Paris is its capital city.",
  output: "The capital of France is Paris.",
});
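The call resolves to a structured evaluation result. A minimal sketch of consuming it, assuming the result exposes label, score, and explanation fields (these field names are assumptions suggested by src/types/evals.ts, not a confirmed contract):

```typescript
// Hypothetical result fields; label, score, and explanation are
// assumptions about the shape defined in src/types/evals.ts.
console.log(result.label); // a categorical verdict, e.g. "faithful"
console.log(result.score); // a numeric score derived from the label
console.log(result.explanation); // the model's reasoning for the verdict
```

If the actual type differs, src/types/evals.ts in the source map below is the authoritative definition.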

Built-In Evaluator Factories

  • createConcisenessEvaluator
  • createCorrectnessEvaluator
  • createDocumentRelevanceEvaluator
  • createFaithfulnessEvaluator
  • createRefusalEvaluator
  • createClassificationEvaluator
  • createToolSelectionEvaluator
  • createToolInvocationEvaluator
  • createToolResponseHandlingEvaluator
Each factory accepts a model option, so a single model instance can be shared across several evaluators:

import { openai } from "@ai-sdk/openai";
import {
  createCorrectnessEvaluator,
  createRefusalEvaluator,
} from "@arizeai/phoenix-evals";

const model = openai("gpt-4o-mini");

const correctness = createCorrectnessEvaluator({ model });
const refusal = createRefusalEvaluator({ model });
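Custom classifications follow the same pattern via createClassificationEvaluator. A sketch under assumptions: the promptTemplate and choices option names are guesses at the factory's signature (see src/llm/createClassificationEvaluator.ts), not a verified API.

```typescript
import { openai } from "@ai-sdk/openai";
import { createClassificationEvaluator } from "@arizeai/phoenix-evals";

// Hypothetical custom classifier: promptTemplate and choices are
// assumed option names, with choices mapping each label to a score.
const sentiment = createClassificationEvaluator({
  model: openai("gpt-4o-mini"),
  promptTemplate:
    "Classify the sentiment of the following text as positive or negative.\n\nText: {{output}}",
  choices: { positive: 1, negative: 0 },
});

const verdict = await sentiment.evaluate({
  output: "I love this product!",
});
```

The built-in factories above are convenience wrappers over this same classification mechanism, each shipping a predefined prompt and label set.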

Source Map

  • src/llm/createClassificationEvaluator.ts
  • src/llm/ClassificationEvaluator.ts
  • src/llm/LLMEvaluator.ts
  • src/llm/createFaithfulnessEvaluator.ts
  • src/types/evals.ts