The LLM Data Company

Datagen Tooling for Evals & RL

The LLM Data Company creates tooling to write, version and execute evals for models and agents. We help you measure performance and define rewards for reinforcement learning.
Active Founders
Daanish Khazi
Co-Founder
Building TLDC
Gavin Bains
Co-Founder
Building TLDC
Joseph Besgen
Co-Founder
Building TLDC
The LLM Data Company
Founded: 2025
Batch: Spring 2025
Team Size: 3
Status: Active
Location: San Francisco
Primary Partner: Diana Hu
Company Launches
The LLM Data Company - Workspace for Evals

https://www.youtube.com/watch?v=56LpwoPxqjc

Evaluating model performance is an opaque art, and vanilla LLM judges do not provide the signal needed to measure output quality. Modern GRPO-style RL techniques offer rapid, generalizable improvements to model performance, but they require well-specified eval datasets to serve as rewards. We help you quickly specify high-signal evals across a variety of unstructured tasks, with aligned graders, fine-grained rubrics, and fast iteration.
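To make the link between rubrics and RL rewards concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not doteval's grader or schema: the `Criterion` type, the weighting scheme, and both function names are hypothetical. It collapses per-criterion pass/fail verdicts into a scalar reward in [0, 1], then normalizes rewards within a group of sampled responses to the same prompt, which is the group-relative step that GRPO-style methods use to form advantages. Implementations differ on details such as population versus sample standard deviation; the sketch uses the population form.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass(frozen=True)
class Criterion:
    """One rubric line item (hypothetical structure, not doteval's schema)."""
    name: str
    weight: float


def rubric_reward(criteria: list[Criterion], passed: dict[str, bool]) -> float:
    """Weighted fraction of rubric criteria that a response satisfied.

    Criteria missing from `passed` count as failed.
    """
    total = sum(c.weight for c in criteria)
    if total <= 0:
        raise ValueError("rubric weights must sum to a positive number")
    earned = sum(c.weight for c in criteria if passed.get(c.name, False))
    return earned / total


def group_relative_advantages(rewards: list[float], eps: float = 1e-8) -> list[float]:
    """Normalize rewards within one group of responses to the same prompt.

    Each advantage is (reward - group mean) / (group std + eps).
    If every response earns the same reward, all advantages are 0.
    """
    if not rewards:
        raise ValueError("need at least one reward")
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption of this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]
```

A group in which every response earns the same reward yields all-zero advantages and so carries no learning signal, which is one way to see why coarse pass/fail judging is a weak reward and fine-grained rubrics matter.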

Our eval workspace, doteval, offers a Cursor-like experience for editing evals-as-code against our .yaml schema. Version evals across checkpoints, replace manual effort with AI-generated diffs, and compare eval runs in tight execution loops to align them with your IP. This way, you can confidently determine whether you should upgrade to Claude 4 or stay on Gemini 2.5 Pro, or whether this new 12,000-word prompt actually delivers a net improvement over that 8,000-word prompt. You can also one-click export the spec to use as the training set for RL and post-train your own model.
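As a rough illustration of what comparing two eval runs involves, the Python sketch below pairs per-example scores from a baseline and a candidate (two models, or two prompts) and summarizes the difference. It is not doteval's implementation: the function name, the dictionary-based run format, and the example IDs are assumptions made for this sketch.

```python
def compare_runs(baseline: dict[str, float], candidate: dict[str, float]) -> dict[str, float]:
    """Paired comparison of two eval runs keyed by example ID.

    Only examples present in both runs are compared, so a run that
    skipped or added examples does not skew the result.
    """
    shared = sorted(baseline.keys() & candidate.keys())
    if not shared:
        raise ValueError("the two runs share no example IDs")
    deltas = [candidate[k] - baseline[k] for k in shared]
    return {
        "n": len(shared),
        "mean_delta": sum(deltas) / len(deltas),  # > 0 means candidate scored higher on average
        "wins": sum(d > 0 for d in deltas),       # examples where candidate beat baseline
        "losses": sum(d < 0 for d in deltas),
        "ties": sum(d == 0 for d in deltas),
    }
```

Reporting wins and losses alongside the mean helps separate a broad improvement from a change that helps a few examples a lot while regressing many others slightly.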


Today, we work with frontier AI teams to benchmark performance for complex model tasks. If you are interested in early access to doteval, need help creating an evaluation dataset, or are trying out GRPO or RFT and want something to quickly train against, please reach out at:
founders@thellmdatacompany.com