Server-side evaluators run entirely in the Phoenix UI — no local code or API key setup required. Add them to any dataset from your project’s Evaluators tab and Phoenix runs them automatically on every new experiment run. There are two types of pre-built server-side evaluators:
  • Code evaluators are deterministic. They apply a rule or algorithm to your data and return a result without calling any model.
  • LLM evaluators use a judge model via a managed prompt template. Phoenix handles model configuration and API access — you only need to configure the input column mappings.

Code Evaluators

Contains

Check whether a text contains one or more specified words.
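Phoenix runs this check server-side, but the underlying logic is simple to sketch. A minimal Python sketch (not Phoenix's implementation; the actual evaluator's matching options, such as case handling, may differ):

```python
def contains(output: str, words: list[str]) -> bool:
    # True if every specified word appears as a substring of the text.
    # Case-sensitive here; assumed behavior, not Phoenix's exact rules.
    return all(word in output for word in words)

print(contains("The quick brown fox", ["quick", "fox"]))  # True
print(contains("The quick brown fox", ["slow"]))          # False
```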

Exact Match

Check whether two strings are identical.
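Conceptually this is plain string equality. A minimal sketch (whether the real evaluator normalizes whitespace or case is not specified here, so none is applied):

```python
def exact_match(output: str, expected: str) -> bool:
    # Strict equality — no trimming, casefolding, or other normalization.
    return output == expected

print(exact_match("Paris", "Paris"))  # True
print(exact_match("Paris", "paris"))  # False
```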

Regex Match

Check whether a text matches a regular expression pattern.
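The equivalent local check uses a regular-expression search. A sketch assuming the pattern may match anywhere in the text (whether Phoenix anchors the pattern or searches within the string may differ):

```python
import re

def regex_match(output: str, pattern: str) -> bool:
    # True if the pattern matches anywhere in the text.
    return re.search(pattern, output) is not None

print(regex_match("call 555-1234", r"\d{3}-\d{4}"))  # True
print(regex_match("no digits here", r"\d+"))         # False
```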

Levenshtein Distance

Measure the edit distance between two strings.
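Levenshtein distance is the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into the other. A standard dynamic-programming sketch of the metric (Phoenix computes this for you server-side):

```python
def levenshtein(a: str, b: str) -> int:
    # Row-by-row DP over the edit-distance table, keeping only two rows.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,                 # deletion
                curr[j - 1] + 1,             # insertion
                prev[j - 1] + (ca != cb),    # substitution (free if equal)
            ))
        prev = curr
    return prev[-1]

print(levenshtein("kitten", "sitting"))  # 3
```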

JSON Distance

Measure the number of structural differences between two JSON values.
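One illustrative way to count structural differences is to walk both values in parallel and count every leaf where they disagree. This is a sketch of the idea only; Phoenix's exact counting rules (e.g. how list reordering or missing keys are scored) may differ:

```python
def json_distance(a, b) -> int:
    # Recursively compare two parsed JSON values, counting differing leaves.
    if isinstance(a, dict) and isinstance(b, dict):
        keys = set(a) | set(b)
        return sum(json_distance(a.get(k), b.get(k)) for k in keys)
    if isinstance(a, list) and isinstance(b, list):
        n = max(len(a), len(b))
        return sum(
            json_distance(a[i] if i < len(a) else None,
                          b[i] if i < len(b) else None)
            for i in range(n)
        )
    return 0 if a == b else 1

print(json_distance({"a": 1, "b": 2}, {"a": 1, "b": 3}))  # 1
```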

LLM Evaluators

Correctness

Evaluate whether LLM responses are factually accurate and complete.
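Phoenix manages the judge prompt and model access server-side; you only map dataset columns to the template's inputs. Conceptually, the mapped columns are rendered into a grading prompt like the hypothetical one below (illustrative only, not Phoenix's actual template):

```python
# Hypothetical correctness-judge template — the variable names (input,
# reference, output) stand in for whatever columns you map in the UI.
CORRECTNESS_TEMPLATE = """\
You are grading an LLM response for factual accuracy and completeness.

Question: {input}
Reference answer: {reference}
Response to grade: {output}

Answer with a single label: "correct" or "incorrect".
"""

rendered = CORRECTNESS_TEMPLATE.format(
    input="What is the capital of France?",
    reference="Paris",
    output="The capital of France is Paris.",
)
print(rendered)
```

The rendered prompt is then sent to the judge model, whose label becomes the evaluation result. The Tool Selection and Tool Invocation evaluators follow the same pattern with tool-call columns mapped in place of the answer columns.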

Tool Selection

Evaluate whether the LLM selected the correct tool for a given task.

Tool Invocation

Evaluate whether tool calls have correct arguments and formatting.