Tag: ai-testing
skill
★ 50
Eval-Driven Development Framework for AI Agents
This skill provides a formal framework for implementing Eval-Driven Development (EDD) within AI coding sessions. It enables developers to define capability and regression tests, track agent reliability using metrics like pass@k, and generat…
tool
Comprehensive AI Data and Model Quality Evaluator
Dingo provides a comprehensive framework for evaluating data and AI outputs using both deterministic rule-based checks and advanced LLM-based metrics. It supports complex workflows, including RAG evaluation and autonomous fact-checking, via…