ai mitts

Tag: evaluation-framework

skill ★ 8

Structured Multi-Alternative Comparison

A systematic framework for evaluating multiple alternatives using consistent criteria, a comparison matrix, and evidence-based decision recommendations.

n24q02m/wet-mcp decision-making comparison-matrix evaluation-framework structured-analysis
skill ★ 50

Eval-Driven Development Framework for AI Agents

This skill provides a formal framework for implementing Eval-Driven Development (EDD) within AI coding sessions. It enables developers to define capability and regression tests, track agent reliability using metrics like pass@k, and generat…

tan-yong-sheng/ai-vision-mcp evaluation-framework edd ai-testing regression-testing
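
For readers unfamiliar with the pass@k metric named in the description above, here is a minimal sketch of one common way to compute it. The listing does not show how the skill itself calculates pass@k, so this is an assumption: it uses the widely cited unbiased estimator, 1 - C(n-c, k) / C(n, k), and the function name `pass_at_k` and its parameters are hypothetical.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k from n sampled attempts, c of which passed.

    Assumed formula (not taken from the listed skill):
    1 - C(n - c, k) / C(n, k), i.e. one minus the probability that
    k attempts drawn without replacement from the n are all failures.
    """
    if n <= 0 or not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require n > 0, 0 <= c <= n and 1 <= k <= n")
    # math.comb returns 0 when k > n - c, so the estimate is 1.0
    # whenever there are fewer than k failing attempts.
    return 1.0 - comb(n - c, k) / comb(n, k)


if __name__ == "__main__":
    # Hypothetical numbers: 10 attempts at a task, 3 of them passed.
    for k in (1, 3, 5):
        print(f"pass@{k} = {pass_at_k(10, 3, k):.3f}")
```

With k = 1 the estimate reduces to the plain success rate c / n; larger k values describe the chance that at least one of k attempts succeeds.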
skill ★ 73,580

Bootstrap Realtime Evaluation Environments

Automates the scaffolding of new realtime evaluation environments within the OpenAI cookbook by configuring harnesses, prompts, tools, and datasets. It includes automated validation via smoke and full evaluation runs to ensure the new setup…

openai/openai-cookbook realtime-evals scaffolding openai-cookbook automated-testing

Agentic skills & tools, vector-searched.


About

Auto-discovered from public GitHub. Summarised by Ollama. Searched with nomic-embed-text.

© 2026 aimitts.