Server-side evaluators run entirely in the Phoenix UI — no local code or API key setup required. Add them to any dataset from your project’s Evaluators tab and Phoenix runs them automatically on every new experiment run. There are two types of pre-built server-side evaluators:
  • Code evaluators are deterministic. They apply a rule or algorithm to your data and return a result without calling any model.
  • LLM evaluators use a judge model via a managed prompt template. Phoenix handles model configuration and API access — you only need to configure the input column mappings.

Code Evaluators

Contains

Check whether a text contains one or more specified words.
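Phoenix runs this check server-side, but the underlying logic is simple to sketch. A minimal Python sketch (not Phoenix's implementation; the actual evaluator's matching options, such as case handling, may differ):

```python
def contains(output: str, words: list[str]) -> bool:
    # True if every specified word appears as a substring of the text.
    # Case-sensitive here; assumed behavior, not Phoenix's exact rules.
    return all(word in output for word in words)

print(contains("The quick brown fox", ["quick", "fox"]))  # True
print(contains("The quick brown fox", ["slow"]))          # False
```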

Exact Match

Check whether two strings are identical.
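Conceptually this is plain string equality. A minimal sketch (whether the real evaluator normalizes whitespace or case is not specified here, so none is applied):

```python
def exact_match(output: str, expected: str) -> bool:
    # Strict equality — no trimming, casefolding, or other normalization.
    return output == expected

print(exact_match("Paris", "Paris"))  # True
print(exact_match("Paris", "paris"))  # False
```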

Regex Match

Check whether a text matches a regular expression pattern.
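The equivalent local check uses a regular-expression search. A sketch assuming the pattern may match anywhere in the text (whether Phoenix anchors the pattern or searches within the string may differ):

```python
import re

def regex_match(output: str, pattern: str) -> bool:
    # True if the pattern matches anywhere in the text.
    return re.search(pattern, output) is not None

print(regex_match("call 555-1234", r"\d{3}-\d{4}"))  # True
print(regex_match("no digits here", r"\d+"))         # False
```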

Levenshtein Distance

Measure the edit distance between two strings.
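Levenshtein distance is the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into the other. A standard dynamic-programming sketch of the metric (Phoenix computes this for you server-side):

```python
def levenshtein(a: str, b: str) -> int:
    # Row-by-row DP over the edit-distance table, keeping only two rows.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,                 # deletion
                curr[j - 1] + 1,             # insertion
                prev[j - 1] + (ca != cb),    # substitution (free if equal)
            ))
        prev = curr
    return prev[-1]

print(levenshtein("kitten", "sitting"))  # 3
```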

JSON Distance

Measure the number of structural differences between two JSON values.
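One illustrative way to count structural differences is to walk both values in parallel and count every leaf where they disagree. This is a sketch of the idea only; Phoenix's exact counting rules (e.g. how list reordering or missing keys are scored) may differ:

```python
def json_distance(a, b) -> int:
    # Recursively compare two parsed JSON values, counting differing leaves.
    if isinstance(a, dict) and isinstance(b, dict):
        keys = set(a) | set(b)
        return sum(json_distance(a.get(k), b.get(k)) for k in keys)
    if isinstance(a, list) and isinstance(b, list):
        n = max(len(a), len(b))
        return sum(
            json_distance(a[i] if i < len(a) else None,
                          b[i] if i < len(b) else None)
            for i in range(n)
        )
    return 0 if a == b else 1

print(json_distance({"a": 1, "b": 2}, {"a": 1, "b": 3}))  # 1
```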

LLM Evaluators

Correctness

Evaluate whether LLM responses are factually accurate and complete.
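Phoenix manages the judge prompt and model access server-side; you only map dataset columns to the template's inputs. Conceptually, the mapped columns are rendered into a grading prompt like the hypothetical one below (illustrative only, not Phoenix's actual template):

```python
# Hypothetical correctness-judge template — the variable names (input,
# reference, output) stand in for whatever columns you map in the UI.
CORRECTNESS_TEMPLATE = """\
You are grading an LLM response for factual accuracy and completeness.

Question: {input}
Reference answer: {reference}
Response to grade: {output}

Answer with a single label: "correct" or "incorrect".
"""

rendered = CORRECTNESS_TEMPLATE.format(
    input="What is the capital of France?",
    reference="Paris",
    output="The capital of France is Paris.",
)
print(rendered)
```

The rendered prompt is then sent to the judge model, whose label becomes the evaluation result. The Tool Selection and Tool Invocation evaluators follow the same pattern with tool-call columns mapped in place of the answer columns.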

Tool Selection

Evaluate whether the LLM selected the correct tool for a given task.

Tool Invocation

Evaluate whether tool calls have correct arguments and formatting.