Tag: regression-testing
Authoring and Running Promptfoo Evaluation Suites
This skill guides developers through authoring comprehensive promptfoo evaluation suites for robust regression testing and quality assurance. It covers defining prompts, structuring test cases, implementing various assertions, and validatin…
Comprehensive prompt evaluation and regression testing tool
This framework facilitates the creation and execution of robust prompt evaluation suites, supporting deterministic assertions, model-graded rubrics, and diverse providers like LLMs and HTTP endpoints. It ensures comprehensive coverage for r…
Continuous regression monitoring during development
This skill provides continuous regression monitoring by observing specified directories for file changes and automatically re-running evaluation checks. It displays a live scorecard, allowing developers to track pass/fail status and score d…
AI agent regression testing with EvalView
Detect regressions in AI agent behaviour by comparing current outputs and tool calls against golden baselines. It identifies changes in outputs, tool usage, and significant score drops.
Eval-Driven Development Framework for AI Agents
This skill provides a formal framework for implementing Eval-Driven Development (EDD) within AI coding sessions. It enables developers to define capability and regression tests, track agent reliability using metrics like pass@k, and generat…
Structured bug fixing and feature remediation workflow
This skill executes a structured workflow to diagnose and resolve bugs within a feature or module. It supports optional error classification (implementation, spec, or architectural) and ensures all fixes are validated with regression tests.
Structured workflow for bug fixing and testing
This skill automates the process of fixing bugs within a feature or module by following a structured workflow. It supports optional error classification (implementation, spec, or architectural) and ensures all fixes are validated with regre…
MCP schema enum compatibility regression testing
This skill addresses schema generation issues where nullable enum signatures cause MCP clients to reject tool definitions. It ensures strict-client compatibility by managing enum arrays, particularly by treating optional enum inputs as stri…
Dependency Validation for Excel MCP Server
A skill for verifying the integrity of dependency or toolchain upgrades within the Excel MCP server repository by executing sequential build, test, and packaging commands.
Software Testing Best Practices and Guidelines
Provides a framework for writing robust integration tests, emphasising the use of real-world fixtures, mocking external services, and implementing regression tests for bug fixes.
XcodeBuildMCP Snapshot Fixture Review
This skill audits XcodeBuildMCP snapshot fixture changes to maintain contract integrity across MCP, CLI, and JSON output interfaces.
Automated Diff-Driven QA Validation
This skill automates code validation by analysing git diffs to determine the appropriate testing strategy for frontend, backend, or mixed changes. It executes targeted browser automation, API requests, or repository-native tests to report p…