All phoenix-evals Evaluators have the following properties:
Sync and async evaluate methods for evaluating a single record or example
Single-record evals return a list of Score objects. Often this list has length 1 (e.g. exact_match), but some evaluators return multiple scores (e.g. precision-recall).
A discoverable input_schema that describes the inputs the evaluator requires to run.
Evaluators accept an arbitrary eval_input payload and an optional input_mapping that maps or transforms the payload into the shape they require.
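
To make these properties concrete, here is a minimal, self-contained sketch of the pattern. It does not import phoenix-evals and is not the library's implementation: the `Score` fields, the `ExactMatch` class, the dict-based `input_schema`, the `async_evaluate` method name, and the rule that a mapping value is either a key name or a callable are all assumptions made for illustration.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Mapping, Optional, Union


@dataclass
class Score:
    # Hypothetical, simplified score record (illustration only).
    name: str
    score: float


# Assumption: a mapping value is either a key in eval_input
# or a callable that derives the value from the whole payload.
InputMapping = Mapping[str, Union[str, Callable[[Mapping[str, Any]], Any]]]


class ExactMatch:
    # Discoverable description of the required inputs (field name -> type).
    input_schema: Dict[str, type] = {"output": str, "expected": str}

    def _remap(
        self, eval_input: Mapping[str, Any], input_mapping: Optional[InputMapping]
    ) -> Dict[str, Any]:
        resolved: Dict[str, Any] = {}
        for field in self.input_schema:
            # Fall back to the field's own name when no mapping is given.
            source = (input_mapping or {}).get(field, field)
            resolved[field] = (
                source(eval_input) if callable(source) else eval_input[source]
            )
        return resolved

    def evaluate(
        self, eval_input: Mapping[str, Any], input_mapping: Optional[InputMapping] = None
    ) -> List[Score]:
        # Single-record evaluation always returns a list, here of length 1.
        inputs = self._remap(eval_input, input_mapping)
        matched = inputs["output"] == inputs["expected"]
        return [Score(name="exact_match", score=float(matched))]

    async def async_evaluate(
        self, eval_input: Mapping[str, Any], input_mapping: Optional[InputMapping] = None
    ) -> List[Score]:
        # Async variant; this sketch simply delegates to the sync method.
        return self.evaluate(eval_input, input_mapping)


# The record's keys do not match the schema, so a mapping bridges the two.
record = {"response": " Paris ", "answer": "Paris"}
mapping = {"output": lambda r: r["response"].strip(), "expected": "answer"}

evaluator = ExactMatch()
scores = evaluator.evaluate(record, mapping)
async_scores = asyncio.run(evaluator.async_evaluate(record, mapping))
```

In this sketch the caller never reshapes the record by hand: the mapping renames `answer` to `expected` and derives `output` with a callable, while the evaluator body only ever sees the fields named in its `input_schema`.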