Migration Guide
The Hallucination evaluator has been superseded by the Faithfulness evaluator, which uses clearer terminology and a more intuitive scoring direction.
Key Differences
| Aspect | Hallucination (Deprecated) | Faithfulness (Recommended) |
|---|---|---|
| Labels | factual / hallucinated | faithful / unfaithful |
| Score direction | Minimize (0.0 = good) | Maximize (1.0 = good) |
| Score meaning | 0.0 = factual, 1.0 = hallucinated | 1.0 = faithful, 0.0 = unfaithful |
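The mapping in the table can be sketched as a small translation layer for code that still holds old results. This is a minimal sketch, not part of either evaluator: the helper names `to_faithfulness_label`, `to_faithfulness_score`, and `passes`, and the `0.5` threshold, are hypothetical, and the score conversion assumes the two scores are exact complements (`1.0 - score`), which follows from the endpoints in the table but is not stated for intermediate values.

```python
# Hypothetical helpers for translating old Hallucination results
# into the Faithfulness convention described in the table above.

# Label mapping taken from the "Labels" row of the table.
LABEL_MAP = {"factual": "faithful", "hallucinated": "unfaithful"}


def to_faithfulness_label(hallucination_label: str) -> str:
    """Translate a deprecated label into its Faithfulness equivalent."""
    return LABEL_MAP[hallucination_label]


def to_faithfulness_score(hallucination_score: float) -> float:
    """Flip the score direction.

    Assumption: the scores are complementary, so 0.0 (factual)
    becomes 1.0 (faithful) and 1.0 (hallucinated) becomes 0.0.
    """
    return 1.0 - hallucination_score


def passes(faithfulness_score: float, min_score: float = 0.5) -> bool:
    """Higher is better, so a passing check is a lower bound.

    The 0.5 default is an illustrative threshold, not a recommended value.
    """
    return faithfulness_score >= min_score


if __name__ == "__main__":
    old_label, old_score = "factual", 0.0
    print(to_faithfulness_label(old_label), to_faithfulness_score(old_score))
```

Under the old convention, the equivalent check would have been an upper bound on the hallucination score; after migrating, any such comparison needs to flip from "at most" to "at least".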
Migration Example
Before (deprecated):
After (recommended):
Updating Score Interpretation
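If you have existing code that interprets hallucination scores, you’ll need to update your logic: the score direction is inverted, so a check that previously required a low score (0.0 = factual) should now require a high score (1.0 = faithful).
Why the Change?
The Faithfulness evaluator provides several improvements:
- Intuitive scoring: Higher scores = better outcomes, which aligns with most evaluation metrics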
- Clearer terminology: “Faithful/unfaithful” more accurately describes the relationship between response and context
- Consistency: Aligns with other evaluators that use maximize direction
See Also
- Faithfulness Evaluator - The recommended replacement

