Tag: ai-testing

Eval-Driven Development Framework for AI Agents

This skill provides a formal framework for implementing Eval-Driven Development (EDD) within AI coding sessions. It enables developers to define capability and regression tests, track agent reliability using metrics like pass@k, and generat…

tan-yong-sheng/ai-vision-mcp evaluation-framework edd ai-testing regression-testing
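
The entry above mentions tracking agent reliability with metrics like pass@k. As a rough illustration only, the sketch below computes the commonly used unbiased pass@k estimator, 1 - C(n-c, k) / C(n, k), from n sampled attempts of which c passed. The function name `pass_at_k` and its parameters are hypothetical; the listing does not say how the skill itself computes the metric.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k from n sampled attempts, c of which passed.

    Hypothetical helper for illustration; not taken from the listed skill.
    """
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # Fewer than k failures exist, so any k-subset contains at least one pass.
    if n - c < k:
        return 1.0
    # Probability that a random k-subset of the n attempts has no passing run,
    # subtracted from 1.
    return 1.0 - comb(n - c, k) / comb(n, k)


if __name__ == "__main__":
    # Example: 10 attempts, 3 passed, estimate for k = 1 and k = 5.
    print(pass_at_k(10, 3, 1))
    print(pass_at_k(10, 3, 5))
```

For k = 1 the estimate reduces to the plain pass rate c / n, which makes a convenient sanity check when wiring such a metric into a regression suite.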

tool

Comprehensive AI Data and Model Quality Evaluator

Dingo provides a comprehensive framework for evaluating data and AI outputs using both deterministic rule-based checks and advanced LLM-based metrics. It supports complex workflows, including RAG evaluation and autonomous fact-checking, via…

DataEval/dingo data-quality llm-evaluation rule-based rag-metrics