Open-source LLM-as-judge eval suite with root cause analysis and failure mining

(github.com)

2 points | by colinfly 10 hours ago

1 comment