skill

Automated Experiment Benchmarking Skill

benchmarking experiment-tracking machine-learning metrics-extraction data-science

Summary

Defines the comparison metrics, extracts their baseline values from notebook outputs, and records them in a structured JSON log for downstream evaluation.
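
The extraction-and-logging flow summarized above might look roughly like the following. This is a minimal sketch under stated assumptions, not the skill's actual implementation: it assumes notebooks are standard nbformat 4 `.ipynb` JSON files and that metrics are printed as `name: value` or `name = value`. The metric names, function names, and log layout are all hypothetical.

```python
import json
import re
from pathlib import Path

# Hypothetical metric names; the summary does not say which metrics are compared.
METRIC_NAMES = ("accuracy", "f1", "loss")


def iter_output_text(notebook):
    """Yield the plain text of every code-cell output in an nbformat-4 notebook dict."""
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        for output in cell.get("outputs", []):
            if output.get("output_type") == "stream":
                text = output.get("text", "")
            else:
                # execute_result / display_data keep text under data["text/plain"]
                text = output.get("data", {}).get("text/plain", "")
            # nbformat allows multi-line text as a string or a list of strings
            yield "".join(text) if isinstance(text, list) else text


def extract_baselines(notebook, metric_names=METRIC_NAMES):
    """Return {metric: value}, keeping the last printed value for each metric."""
    baselines = {}
    for text in iter_output_text(notebook):
        for name in metric_names:
            # Assumed print format: "<name>: <number>" or "<name> = <number>"
            pattern = rf"\b{re.escape(name)}\s*[:=]\s*([-+]?\d*\.?\d+(?:[eE][-+]?\d+)?)"
            for match in re.finditer(pattern, text, flags=re.IGNORECASE):
                baselines[name] = float(match.group(1))
    return baselines


def write_log(notebook_path, log_path):
    """Extract baselines from a notebook file and write them to a JSON log."""
    notebook = json.loads(Path(notebook_path).read_text(encoding="utf-8"))
    # Hypothetical log layout: one record per notebook
    record = {"notebook": str(notebook_path), "baselines": extract_baselines(notebook)}
    Path(log_path).write_text(json.dumps(record, indent=2), encoding="utf-8")
    return record
```

Keeping the last match per metric is a design choice that suits notebooks printing a value every epoch; a best-value or per-cell policy would work equally well if the downstream evaluation expects it.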