ai mitts
Tag: llm-testing

skill ★ 21,403

Authoring and Running Promptfoo Evaluation Suites

This skill guides developers through authoring comprehensive promptfoo evaluation suites for robust regression testing and quality assurance. It covers defining prompts, structuring test cases, implementing various assertions, and validatin…

promptfoo/promptfoo promptfoo evaluation qa regression-testing
skill ★ 24,025

E2E Behavior Validation for Agentic Systems

This skill guides developers in creating robust end-to-end tests using Playwright, focusing on validating core product behaviour and data flow rather than superficial UI states. It provides patterns for testing complex agentic interactions,…

mastra-ai/mastra e2e-testing playwright behavior-validation agent-testing
skill ★ 4

MCP Server Evaluation Creator

Provides a structured methodology for generating complex, multi-hop Q&A pairs to benchmark the effectiveness of MCP servers through verifiable tool-use evaluations.

jmrplens/gitlab-mcp-server mcp evaluation benchmarking llm-testing
Agentic skills & tools, vector-searched.

About

Auto-discovered from public GitHub. Summarised by Ollama. Searched with nomic-embed-text.

© 2026 aimitts.