Tag: llm-testing
skill
★ 21,403
Authoring and Running Promptfoo Evaluation Suites
This skill guides developers through authoring comprehensive promptfoo evaluation suites for robust regression testing and quality assurance. It covers defining prompts, structuring test cases, implementing various assertions, and validatin…
skill
★ 24,025
E2E Behavior Validation for Agentic Systems
This skill guides developers in creating robust end-to-end tests using Playwright, focusing on validating core product behaviour and data flow rather than superficial UI states. It provides patterns for testing complex agentic interactions,…
skill
★ 4
MCP Server Evaluation Creator
Provides a structured methodology for generating complex, multi-hop Q&A pairs to benchmark the effectiveness of MCP servers through verifiable tool-use evaluations.