Frontier coding data for training and evaluating LLMs
About Datacurve
Datacurve supplies frontier coding data to top AI labs and enterprises to train and evaluate the next generation of coding LLMs. Datacurve has scaled from $0 to multiple seven figures in ARR in 6 months with a team of 3 people, and it continues to grow rapidly.
Who you are
More on Datacurve’s mission
A lack of abundant post-training data is one of the biggest bottlenecks to achieving autonomous SWEs and to breaking through plateaus in coding LLMs' capabilities. We built a gamified coding platform that attracts skilled engineers from all over the world to produce high-quality data.
Our goal is to enable next-generation coding LLMs at the foundation model level through an abundance of quality data. We will create a future where coding LLMs aren't just productivity-boosting devtools but are capable of giving anyone from any industry the production power to build and engineer solutions.
Who we are
Datacurve was founded by Waterloo CS dropouts. Serena (CEO) interned at Cohere with the CTO and pioneered early coding and synthetic data work there. Charley (CTO) interned at Google before dropping out. The best way to describe what it's like to work here is a long hackathon with friends. We are extremely ambitious. We don't care, we will just do it. Join us.
Backed by Y Combinator, Afore Capital, Pioneer Fund, Amjad Masad (Replit), Oriol Vinyals (Gemini Technical Lead), Cohere, and Vercel.
Datacurve (YC W24) is a platform that produces high-quality coding data for foundation model companies. Datacurve's gamified coding platform pays elite engineers to solve fun, LeetCode-style problems. Customers buy data from Datacurve to train better LLMs.