Overview

The MatchesRegex evaluator is a code-based evaluator that checks if the output contains substrings matching a specified regular expression pattern. It’s useful for validating output format, detecting specific content patterns, or checking for required elements.
This evaluator is only available as a built-in for Python. For TypeScript, see the usage example below showing how to create an equivalent evaluator using createEvaluator.

When to Use

Use the MatchesRegex evaluator when you need to:
  • Validate output format - Check that responses follow expected patterns (URLs, emails, dates)
  • Detect specific content - Find phone numbers, IDs, or other structured data in outputs
  • Enforce formatting rules - Verify outputs contain required elements
  • Pattern-based quality checks - Check for presence of citations, code blocks, or other patterns
This is a code-based evaluator using Python’s re module. For exact string matching, use exact_match instead.
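Because the evaluator looks for matching substrings rather than requiring the whole output to match, a pattern meant to validate the entire output needs explicit anchors. The sketch below uses only Python's re module, not the evaluator itself, and assumes the evaluator's substring matching behaves like re.search. The variable names are illustrative.

```python
import re

# Unanchored: succeeds if a date appears anywhere in the text.
date_anywhere = re.compile(r"\d{4}-\d{2}-\d{2}")

# Anchored: succeeds only if the text consists of the date and nothing else.
date_only = re.compile(r"^\d{4}-\d{2}-\d{2}$")

text = "The event is scheduled for 2024-03-15."

found_anywhere = date_anywhere.search(text) is not None
found_anchored = date_only.search(text) is not None
exact = date_only.search("2024-03-15") is not None

print(found_anywhere, found_anchored, exact)  # True False True
```

If the whole output must equal a fixed string, exact_match is the simpler choice; anchors are for cases where the whole output must fit a pattern.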

Supported Levels

| Level | Supported | Notes |
| --- | --- | --- |
| Span | Yes | Evaluate any span output against regex patterns. |

Input Requirements

The MatchesRegex evaluator requires one input:
| Field | Type | Description |
| --- | --- | --- |
| output | string | The text to evaluate against the regex pattern |

Constructor Arguments

| Argument | Type | Description |
| --- | --- | --- |
| pattern | str or Pattern | The regex pattern (string or compiled) |
| name | str (optional) | Custom evaluator name (default: "matches_regex") |
| include_explanation | bool (optional) | Include match details in explanation (default: True) |

Output Interpretation

The evaluator returns a Score object with the following properties:
| Property | Value | Description |
| --- | --- | --- |
| score | 1.0 or 0.0 | 1.0 if the pattern matches, 0.0 if there is no match |
| explanation | string | Number of matches found or "no match" message |
| kind | "code" | Indicates this is a code-based evaluator |
| direction | "maximize" | Higher scores are better |
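To make the scoring rule concrete, here is a minimal standard-library sketch of how a binary score and a match-count explanation could be derived from a pattern. The function sketch_regex_score and its explanation wording are hypothetical; this is an approximation of the behavior described above, not the library's implementation.

```python
import re
from typing import Pattern, Tuple, Union


def sketch_regex_score(
    pattern: Union[str, Pattern[str]], output: str
) -> Tuple[float, str]:
    """Hypothetical stand-in for the scoring rule, not Phoenix code."""
    # Accept either a string or an already compiled pattern.
    compiled = re.compile(pattern) if isinstance(pattern, str) else pattern

    # Count non-overlapping matches anywhere in the output.
    match_count = sum(1 for _ in compiled.finditer(output))

    if match_count > 0:
        return 1.0, f"found {match_count} match(es)"
    return 0.0, "no match"


print(sketch_regex_score(r"https?://\S+", "see https://a.b and http://c.d"))
print(sketch_regex_score(re.compile(r"\d+"), "no digits here"))
```

Note that the score stays at 1.0 however many matches are found; only the explanation reflects the count.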

Usage Examples

import re
from phoenix.evals.metrics import MatchesRegex

# Create evaluator with a URL detection pattern
url_pattern = re.compile(r"https?://[^\s]+")
contains_url = MatchesRegex(pattern=url_pattern)

# Inspect the evaluator's requirements
print(contains_url.describe())

# Evaluate output with a URL
eval_input = {"output": "Check out https://github.com/Arize-ai/phoenix!"}
scores = contains_url.evaluate(eval_input)
print(scores[0])
# Score(name='matches_regex', score=1.0, explanation='There are 1 matches...', ...)

# Evaluate output without a URL
eval_input = {"output": "This text has no links."}
scores = contains_url.evaluate(eval_input)
print(scores[0].score)  # 0.0

Common Pattern Examples

import re
from phoenix.evals.metrics import MatchesRegex

# Email detection
email_pattern = re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")
contains_email = MatchesRegex(pattern=email_pattern, name="contains_email")

# Phone number detection (US format)
phone_pattern = re.compile(r"\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}")
contains_phone = MatchesRegex(pattern=phone_pattern, name="contains_phone")

# Code block detection (markdown)
code_block_pattern = re.compile(r"```[\s\S]*?```")
contains_code = MatchesRegex(pattern=code_block_pattern, name="contains_code_block")

# JSON object detection
json_pattern = re.compile(r"\{[^{}]*\}")
contains_json = MatchesRegex(pattern=json_pattern, name="contains_json")
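Before wiring a pattern into an evaluator, it can help to check what it actually matches using re directly. The sketch below runs three of the patterns above against sample strings chosen for illustration, and shows two limitations of the JSON pattern: it matches only the innermost object of nested JSON, and it also matches an empty pair of braces in ordinary text.

```python
import re

# Patterns copied from the examples above.
email_pattern = re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")
phone_pattern = re.compile(r"\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}")
json_pattern = re.compile(r"\{[^{}]*\}")

email_hits = email_pattern.findall("Contact support@example.com for help.")
phone_hits = phone_pattern.findall("Call (555) 123-4567 today.")

# [^{}]* excludes braces, so only the innermost object of nested JSON matches.
nested_hits = json_pattern.findall('{"a": {"b": 1}}')

# An empty pair of braces also matches, even though the text is not JSON.
brace_hits = json_pattern.findall("Use {} as a placeholder.")

print(email_hits)   # ['support@example.com']
print(phone_hits)   # ['(555) 123-4567']
print(nested_hits)  # ['{"b": 1}']
print(brace_hits)   # ['{}']
```

These patterns detect text that looks like the target format; they do not validate it. If the output must be well-formed JSON, parsing it with json.loads is a stricter check than any regex.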

Using String Patterns

You can pass the pattern as a plain string instead of a compiled regex:

from phoenix.evals.metrics import MatchesRegex

# String pattern (will be compiled automatically)
date_evaluator = MatchesRegex(
    pattern=r"\d{4}-\d{2}-\d{2}",
    name="contains_date"
)

eval_input = {"output": "The event is scheduled for 2024-03-15."}
scores = date_evaluator.evaluate(eval_input)
print(scores[0].score)  # 1.0
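The constructor arguments listed earlier do not include a flags parameter, so options such as case-insensitive matching presumably have to travel with the pattern itself. Two standard ways to do that with Python's re module are an inline flag inside the string or a pre-compiled pattern. The sketch below uses re only; the pattern and sample text are illustrative.

```python
import re

text = "ERROR: timeout while fetching"

# Option 1: inline flag, usable when the pattern is passed as a string.
inline = re.compile(r"(?i)error:\s+\w+")

# Option 2: compile with flags, then pass the compiled pattern.
flagged = re.compile(r"error:\s+\w+", re.IGNORECASE)

# Without either option the lowercase pattern does not match.
case_sensitive = re.search(r"error:\s+\w+", text)

inline_hit = inline.search(text).group()
flagged_hit = flagged.search(text).group()

print(inline_hit)       # ERROR: timeout
print(flagged_hit)      # ERROR: timeout
print(case_sensitive)   # None
```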

Using with Phoenix

Evaluating Traces

Run evaluations on traces collected in Phoenix and log results as annotations:

Running Experiments

Use the MatchesRegex evaluator in Phoenix experiments:

API Reference