AI Agent Workflow Evaluation Framework (v1.0.0)
A deterministic, CI/CD-ready framework for measuring how accurately AI agents follow complex, multi-step workflow instructions. It features a three-tier hybrid evaluation engine, progressive scoring, and dual-metric reporting.