All eval templates are tested against golden datasets available as part of the LLM eval library's benchmarked data, targeting precision of 70-90% and F1 of 70-85%.
Hallucination Eval
Hallucinations on answers to public and private data
Tested on: Hallucination QA Dataset, Hallucination RAG Dataset
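As a rough illustration of how the precision and F1 targets above are measured, the sketch below scores hypothetical eval predictions against golden labels, treating "hallucinated" as the positive class. The label names and data are assumptions for illustration, not values from the benchmarked datasets.

```python
# Minimal sketch (hypothetical data): scoring an eval template's
# predictions against golden labels with precision and F1.
# "hallucinated" is treated as the positive class.

def precision_f1(golden, predicted, positive="hallucinated"):
    tp = sum(1 for g, p in zip(golden, predicted) if g == positive and p == positive)
    fp = sum(1 for g, p in zip(golden, predicted) if g != positive and p == positive)
    fn = sum(1 for g, p in zip(golden, predicted) if g == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, f1

# Hypothetical golden labels vs. eval output.
golden    = ["hallucinated", "factual", "hallucinated", "factual", "hallucinated"]
predicted = ["hallucinated", "factual", "factual", "hallucinated", "hallucinated"]

p, f1 = precision_f1(golden, predicted)
print(f"precision={p:.2f} f1={f1:.2f}")  # precision=0.67 f1=0.67
```

A run in the target range would report precision between 0.70 and 0.90 and F1 between 0.70 and 0.85 on the golden data.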

