- Code evaluators are deterministic. They apply a rule or algorithm to your data and return a result without calling any model.
- LLM evaluators use a judge model via a managed prompt template. Phoenix handles model configuration and API access; you only configure how your data columns map to the template's inputs.
Code Evaluators
Contains
Check whether a text contains one or more specified words.
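As a rough illustration (not Phoenix's internal implementation), a contains check reduces to substring membership; the `match_all` flag and case-insensitive comparison here are assumptions about typical configuration options:

```python
def contains(text: str, words: list[str], match_all: bool = False) -> bool:
    # Case-insensitive substring check for each specified word.
    hits = [w.lower() in text.lower() for w in words]
    # match_all=True requires every word; otherwise any one suffices.
    return all(hits) if match_all else any(hits)
```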
Exact Match
Check whether two strings are identical.
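A minimal sketch of the comparison; the `ignore_case` option is an assumption, included only to show how such a check is commonly parameterized:

```python
def exact_match(output: str, expected: str, ignore_case: bool = False) -> bool:
    # Strict string equality, optionally case-folded.
    if ignore_case:
        return output.casefold() == expected.casefold()
    return output == expected
```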
Regex Match
Check whether a text matches a regular expression pattern.
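Illustratively, this is Python's `re.search`, which matches the pattern anywhere in the text (as opposed to `re.fullmatch`, which requires the whole string to match):

```python
import re

def regex_match(text: str, pattern: str) -> bool:
    # True if the pattern occurs anywhere in the text.
    return re.search(pattern, text) is not None
```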
Levenshtein Distance
Measure the edit distance between two strings.
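Edit distance counts the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into the other. A standard dynamic-programming sketch:

```python
def levenshtein(a: str, b: str) -> int:
    # prev[j] holds the edit distance between a[:i-1] and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute ca -> cb
            ))
        prev = curr
    return prev[-1]
```

For example, "kitten" and "sitting" are 3 edits apart (two substitutions and one insertion).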
JSON Distance
Measure the number of structural differences between two JSON values.
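One way to count structural differences (a sketch under the assumption that the distance counts mismatched keys, list slots, and leaf values; Phoenix's exact scoring may differ):

```python
def json_distance(a, b) -> int:
    # Count leaf-level and structural mismatches between two parsed JSON values.
    if type(a) is not type(b):
        return 1  # differing types count as one structural difference
    if isinstance(a, dict):
        # Keys present on only one side, plus recursive diffs on shared keys.
        return sum(
            1 if (k not in a or k not in b) else json_distance(a[k], b[k])
            for k in set(a) | set(b)
        )
    if isinstance(a, list):
        # Compare positionally; extra trailing elements each count once.
        n = max(len(a), len(b))
        return sum(
            1 if i >= len(a) or i >= len(b) else json_distance(a[i], b[i])
            for i in range(n)
        )
    return int(a != b)  # scalar leaves: 0 if equal, 1 if not
```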
LLM Evaluators
Correctness
Evaluate whether LLM responses are factually accurate and complete.
Tool Selection
Evaluate whether the LLM selected the correct tool for a given task.
Tool Invocation
Evaluate whether tool calls have correct arguments and formatting.
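Under the hood, each LLM evaluator fills your mapped columns into a judge prompt and classifies the model's response. The template and helper below are purely illustrative, not Phoenix's actual prompt or API; they only show the column-mapping mechanic described above:

```python
# Hypothetical correctness-judge prompt; the {input}/{reference}/{output}
# placeholders stand in for whatever column mappings you configure.
CORRECTNESS_TEMPLATE = """\
You are grading an answer for factual accuracy and completeness.
Question: {input}
Reference answer: {reference}
Submitted answer: {output}
Respond with exactly one word: correct or incorrect."""

def build_correctness_prompt(input: str, reference: str, output: str) -> str:
    # Substitute the mapped columns into the judge template.
    return CORRECTNESS_TEMPLATE.format(
        input=input, reference=reference, output=output
    )
```

The constrained one-word response ("correct" or "incorrect") is what makes the judge's output easy to parse into a score.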

