RAG Evaluation Checklist (Triad)

  • [ ] Context relevance (retrieved chunks topically relevant)
  • [ ] Groundedness (answer supported by retrieved text)
  • [ ] Answer relevance (addresses the query)
  • [ ] Latency: p50 and p95 (ms); token counts (prompt and response)
  • [ ] Citations contain stable doc IDs + spans
  • [ ] Regression set: 10–20 fixed questions with expected sources
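
The regression-set and latency items above can be automated with a small harness. The following is a minimal sketch, not a definitive implementation: `RegressionCase`, `run_regression`, and the `pipeline(question) -> (cited_doc_ids, latency_ms)` interface are hypothetical names chosen for illustration, and the nearest-rank method is just one of several common percentile definitions. The doc IDs and latencies in the usage section are made-up placeholder data.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Set, Tuple


@dataclass
class RegressionCase:
    question: str
    expected_sources: Set[str]  # stable doc IDs expected among the citations


def percentile(values: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile; pct is in (0, 100]."""
    if not values:
        raise ValueError("percentile() needs at least one value")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def run_regression(
    cases: List[RegressionCase],
    pipeline: Callable[[str], Tuple[List[str], float]],
) -> Dict[str, object]:
    """Run every fixed question and report missing sources plus latency.

    `pipeline` is an assumed interface: it takes a question and returns
    the cited doc IDs together with the observed latency in milliseconds.
    """
    failures = []
    latencies = []
    for case in cases:
        cited, latency_ms = pipeline(case.question)
        latencies.append(latency_ms)
        missing = case.expected_sources - set(cited)
        if missing:
            failures.append((case.question, sorted(missing)))
    return {
        "failures": failures,
        "p50_ms": percentile(latencies, 50),
        "p95_ms": percentile(latencies, 95),
    }


# Usage with a stand-in pipeline (placeholder data, not real measurements).
def fake_pipeline(question: str) -> Tuple[List[str], float]:
    canned = {
        "What is the refund window?": (["policy-12"], 120.0),
        "Who approves travel?": (["hr-3"], 340.0),
    }
    return canned[question]


cases = [
    RegressionCase("What is the refund window?", {"policy-12"}),
    RegressionCase("Who approves travel?", {"hr-7"}),
]
report = run_regression(cases, fake_pipeline)
print(report)
```

A case fails when any expected source is absent from the citations, so the check covers retrieval and citation stability but not groundedness or answer relevance; those still need a separate judgment step.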