Benchspan

Run agent benchmarks in minutes, not hours

Benchspan is a benchmarking platform for AI agents. If you're building an agent, you need to know whether it's getting better. But running benchmarks is slow, expensive, and fragile. You spend days writing glue code every time you want to run a new benchmark, runs take forever on your laptop, and when they fail halfway through you burn hundreds of dollars in tokens with nothing to show for it.

Benchspan fixes all of it. Onboard your agent once, and it works with every benchmark on the platform. We onboarded Claude Code in 37 lines of code. Running a benchmark becomes a single command, executed in parallel in the cloud.

Every result goes to one place your whole team can see, with full trajectories, token usage, latency, and custom metrics. When a run partially fails, rerun just the subset that errored instead of starting from scratch. Compare runs side by side to see exactly where your agent is improving and where it's regressing.

Active Founders
Avi Arora
Founder
Founder @ Benchspan. Prev. ML Researcher @ Microsoft. Trained models doing 20 billion requests / year.
Ritesh Malpani
Founder
Founder @ Benchspan. Prev. SWE @ Bloomberg, Amazon | BS/MS @ Georgia Tech. Architected systems processing 100K+ TPS across trading infrastructure.
Founded: 2026
Batch: Spring 2026
Team Size: 2
Status: Active
Location: San Francisco
Primary Partner: David Lieb