Add AWB (AI Workflow Benchmark) to Testing & Security #54

Merged
eltociear merged 1 commit into eltociear:main from xmpuspus:add-awb
Jun 10, 2026
@xmpuspus

Contributor

Adds AWB to the Testing & Security section next to other evaluation/observability tools (Hawkeye, Vet, Eval Marketplace).

AWB benchmarks the AI coding tools already listed under AI Code Editors & IDEs and Terminal & CLI Agents (Claude Code, Cursor, Aider, Gemini CLI, Codex CLI, Windsurf, Copilot, Pi) on 100 real OSS tasks at pinned commit SHAs, scoring them across 7 capability dimensions plus a derived cost-discipline measure.

Happy to relocate or split into a new "Benchmarks & Evaluation" section if you'd prefer.

@eltociear eltociear merged commit 8349c3a into eltociear:main Jun 10, 2026
