Reinforcement Learning Startups funded by Y Combinator (YC) 2026

April 2026

Browse 27 of the top Reinforcement Learning startups funded by Y Combinator.

We also have a Startup Directory where you can search through over 5,000 companies.

Carrot Labs
W2026
• Active • 2 employees • San Francisco, CA, USA
We build specialized LLMs for your business’s specific workflows and use cases, then continuously hone them against your success metrics, capturing your proprietary know-how in the model so it gets more valuable and harder to copy as you grow.
artificial-intelligence
reinforcement-learning
automation

GrazeMate
W2026
• Active • 3 employees • Sydney NSW, Australia
GrazeMate builds autonomous drones that herd cattle. On command, our drones fly to a paddock, position themselves around the mob, and move them where they need to go. What used to take a full day of helicopters, motorbikes, and horses now runs on a schedule. We work with some of the largest cattle ranches in the world. While the drones are herding, they're also estimating animal weights, measuring grass biomass, monitoring water levels, and flagging sick animals. We're building physical AI that lets a grazier manage thousands of head across millions of acres from their phone.
agriculture
reinforcement-learning
computer-vision
drones

Haladir
W2026
• Active • 4 employees
Haladir is the operational AI layer for logistics. We unify data across WMS, TMS, OMS, etc., deploy event-driven agents with execution authority, embed solver-grade optimization into 3PL and distributor operations, and produce RL environments for frontier AI labs. We help logistics operators embed observed AI at every level of the supply chain, generating insights and delivering results. Today's AI brought intelligence. The next frontier is judgement. We define operational superintelligence as AI that consistently makes optimal operational decisions in complex environments. The first component is speed: decisions that take operations managers and OR analysts weeks to make are made continuously, in seconds. The second is reliability: every decision is guaranteed to satisfy operational constraints. The third is scope: optimizing across thousands of constraints that no human or team could possibly reason about.
reinforcement-learning
data-engineering
logistics
operations

Cortex AI
F2025
• Active • 3 employees • San Francisco, CA, USA
Cortex AI builds the world’s most diverse and large-scale real-world workplace robot & egocentric dataset — where the physical world becomes the next training and evaluation set for embodied AI. We power frontier labs developing robotics foundation models and general-purpose robots by providing the data they need: 1️⃣ Egocentric Data — real-workplace human video with hand/body pose, depth, and subtask labels. 2️⃣ Robot Data — trajectories collected from manipulators and humanoids in real industry settings. 3️⃣ Human-in-the-Loop Rollouts & Evals — real-world deployments with remote operators who recover robots when they fail, capturing data that feeds back into training and continuously improves models. Additionally, through the Cortex Marketplace, workplaces get paid to host data-collection and evaluation sessions, while labs access the in-the-wild data that truly matters. This draws on Lucas’s previous experience as co-founder of Carousell, a C2C marketplace that scaled to a $1B+ valuation.
robotics
reinforcement-learning
artificial-intelligence

hillclimb
F2025
• Active • 4 employees • San Francisco, CA, USA
We work with frontier AI labs to help train their agents to become AI research scientists
reinforcement-learning

Topological
S2025
• Active • 2 employees
Topological is developing physics-based foundation models for CAD optimization. We help hardware teams iterate at the same speed that software teams do. Our technology accelerates the engineering workflow with AI, scaling design and optimization to identify ideal designs for complex problems within their physical constraints. Our first model, UToP-v1, is a SOTA topology optimization model that understands physics, geometry, and manufacturability. It can generate the most efficient design given a problem’s physical requirements. It has <5% compliance error and is 1930x faster than current methods. We're reimagining mechanical engineering and computational design with precision spatial AI.
ai
3d-printing
design-tools
reinforcement-learning
robotics

Idler
S2025
• Active • 13 employees • San Francisco, CA, USA
Idler builds reinforcement learning environments that teach AI models to code at expert human levels. We create training environments based on real-world coding scenarios that prepare models for the complex challenges they'll face in production.
reinforcement-learning

Janus
P2025
• Active • 2 employees • San Francisco, CA, USA
Janus automates AI evaluations by using high-fidelity simulation environments, catching failures in reasoning, compliance, tool usage, and performance. The resulting datasets benchmark products and feed post-training loops to continuously improve performance over time.
ai
reinforcement-learning
aiops
developer-tools
monitoring

Theta
P2025
• Active • 7 employees • San Francisco, CA, USA
artificial-intelligence
reinforcement-learning
b2b

Aviro
P2025
• Active • 2 employees • San Francisco, CA, USA
Aviro builds RL environments for long-horizon tool use across ML research, live web, and enterprise knowledge work. We partner with frontier labs and Fortune 100 companies to train models to act as high-stakes operators and perform research-grade work.
reinforcement-learning

TypeOS
P2025
• Active • 2 employees
Built to advance human taste to the frontier, TypeOS is the best way to write with AI.
consumer
productivity
b2b
reinforcement-learning
ai

Cartpole
P2025
• Active • 1 employee • San Francisco, CA, USA
We're creating reinforcement learning environments for training frontier models.
reinforcement-learning
ml
ai
data-labeling

Klavis AI
P2025
• Active • 3 employees
Powering frontier AI labs with real world MCP environments and complex, long-horizon agentic tool-use data.
reinforcement-learning
data-science
ai
artificial-intelligence

Freesolo
P2025
• Active • 4 employees • San Francisco, CA, USA
Freesolo works with companies to encode user-trajectory knowledge into specialized models that outperform the state of the art at much lower latency and cost.
reinforcement-learning
artificial-intelligence
b2b

hud
W2025
• Active • 15 employees • San Francisco, CA, USA
HUD (YC W25) is developing agentic evals and RL environments for frontier AI labs, focused on Computer Use Agents (CUAs) that browse the web. Our CUA Evals framework is the first comprehensive evaluation tool for CUAs. People don't actually know if AI agents are working reliably. To make AI agents work in the real world, we need detailed evals for a huge range of tasks. We're backed by Y Combinator, and work closely with frontier AI labs to provide agent evaluation and training infrastructure at scale.
artificial-intelligence
reinforcement-learning

Agentin AI
W2025
• Active • 2 employees • San Francisco, CA, USA
At Agentin AI, we build Agents that move data and take actions across enterprise systems like Salesforce, NetSuite, and SAP. These agents are difficult to build because each enterprise heavily customizes its systems, but we solved that by training our Agents to learn and adapt from failures, applying reinforcement learning techniques we developed.
enterprise
reinforcement-learning
ai

TrainLoop
W2025
• Active • 6 employees • San Francisco, CA, USA
TrainLoop makes it effortless for developers to supercharge LLM performance through reinforcement learning.
developer-tools
generative-ai
reinforcement-learning

Osmosis
W2025
• Active • 6 employees
Osmosis helps companies use reinforcement learning to fine-tune open source models that outperform foundation models.
reinforcement-learning
machine-learning
artificial-intelligence
infrastructure

Synth
F2024
• Active • 2 employees • San Francisco, CA, USA
Choose a coding agent harness, model, and task dataset, then optimize context and prompts to get the best performance on long-horizon tasks.
ai
reinforcement-learning

Vibrant Labs
W2024
• Active • 2 employees • San Francisco, CA, USA
We work on benchmarking and improving the long-horizon capabilities of AI Agents. We build out specialised environments to improve the long-horizon capabilities of browser and computer use agents.
generative-ai
open-source
developer-tools
ai
reinforcement-learning

JustAI
W2024
• Active • 4 employees • San Francisco, CA, USA
Always-on AI agents for 1-1 personalization at scale
ai
marketing
personalization
reinforcement-learning
workflow-automation

Velos
W2023
• Active • 3 employees • San Francisco, CA, USA
Velos helps non-technical operations teams automate complex, manual back-office tasks with AI workers instead of overseas teams. Unlike traditional robotic process automation (RPA) platforms like UiPath, Velos automations use machine learning to reliably handle ambiguities in their tasks, eliminating the need for an army of maintenance engineers and consultants to build and maintain your automations. We're automating the repetitive work people hate to do.
generative-ai
reinforcement-learning
artificial-intelligence
automation
robotic-process-automation

Atmeto
W2023
• Active • 3 employees • Los Angeles, CA, USA
Founded in 2022, Atmeto was started as a place to develop and apply machine learning to solve the world's biggest problem: climate change. Our current priority is getting the grid to run on 100% clean energy, which is currently limited by battery storage (specifically, the algorithms that control it). We're redefining these algorithms to unlock gigawatts of untapped energy storage capacity, enabling the grid to run on more clean energy from wind and solar.
climate
climatetech
reinforcement-learning
energy-storage
energy

WorldQL
W2022
• Active • 3 employees • San Francisco, CA, USA
gaming
developer-tools
reinforcement-learning
artificial-intelligence

rct AI
W2019
• Active • 40 employees • Los Angeles, CA, USA
rct AI provides AI solutions to the game industry and is building the true Metaverse with AI-generated content. Using cutting-edge technologies, especially deep learning and reinforcement learning, rct AI creates a truly dynamic and intelligent user experience on both the consumer side and the production side. The founding team previously built a company together, Raventech, which was acquired by Baidu (NASDAQ:BIDU) in 2017.
gaming
reinforcement-learning
metaverse

Sepal AI
S2024
• Acquired • 15 employees • San Francisco, CA, USA
Sepal is a data research company on a mission to advance human knowledge and capabilities through safe AI. We partner with the world’s leading AI labs and enterprises to help their models get better at the tasks people actually want them to do. We’ve built a Cloud-Native Agent Dataset Factory which turns the process of generating evaluation and training data from manual, inconsistent, and labor-intensive into something automated, standardized, and scalable. Sepal AI was founded in 2024 by engineers and operators from Vercel and Turing. We went through Y Combinator, raised several million dollars from leading investors, and already count multiple Fortune 500s and top AI research labs as paying customers.
data-labeling
aiops
reinforcement-learning
ai

Resonance
W2024
• Acquired • 2 employees • San Francisco, CA, USA
Resonance hyper-personalizes MarTech campaign content and automatically refreshes and stores high performing content for re-use.
reinforcement-learning
artificial-intelligence
saas
subscriptions
marketing
