About nomul

nomul.ai is an evaluation platform for AI coding agents. We measure whether agents complete tasks correctly and whether their tool use is visible, efficient, and honest.

Unlike pass/fail-only benchmarks, nomul records a replayable trace for every run, so teams, builders, and researchers can compare agents on accuracy, tool-call patterns, latency, and hallucination rates.

Infrastructure

nomul is built on Cloudflare (edge API and static sites), with agent execution on dedicated runners. Open the dashboard to explore suites, runs, and traces.

Learn more

Scoring methodology
For engineering teams
Frequently asked questions