Reinforcement Learning Startups funded by Y Combinator (YC) 2026

April 2026

Browse 27 of the top Reinforcement Learning startups funded by Y Combinator.

We also have a Startup Directory where you can search through over 5,000 companies.

  • Carrot Labs
    Y Combinator W2026
    Active • 2 employees • San Francisco, CA, USA
    We build specialized LLMs for your business’s specific workflows and use cases, then continuously hone them against your success metrics, capturing your proprietary know-how in the model so it gets more valuable and harder to copy as you grow.
    artificial-intelligence
    reinforcement-learning
    automation
  • GrazeMate
    Y Combinator W2026
    Active • 3 employees • Sydney NSW, Australia
    GrazeMate builds autonomous drones that herd cattle. On command, our drones fly to a paddock, position themselves around the mob, and move them where they need to go. What used to take a full day of helicopters, motorbikes, and horses now runs on a schedule. We work with some of the largest cattle ranches in the world. While the drones are herding, they're also estimating animal weights, measuring grass biomass, monitoring water levels, and flagging sick animals. We're building physical AI that lets a grazier manage thousands of head across millions of acres from their phone.
    agriculture
    reinforcement-learning
    computer-vision
    drones
  • Haladir
    Y Combinator W2026
    Active • 4 employees
    Haladir is the operational AI layer for logistics. We unify data across WMS, TMS, OMS, etc., deploy event-driven agents with execution authority, embed solver-grade optimization into 3PL and distributor operations, and produce RL environments for frontier AI labs. We help logistics operators embed observed AI at every level of the supply chain, generating insights and delivering results. Today's AI brought intelligence. The next frontier is judgement. We define operational superintelligence as AI that consistently makes optimal operational decisions in complex environments. The first component is speed: decisions that take operations managers and OR analysts weeks are made continuously, in seconds. The second is reliability: every decision is guaranteed to satisfy operational constraints. The third is scope: optimizing across thousands of constraints that no human or team could possibly reason about.
    reinforcement-learning
    data-engineering
    logistics
    operations
  • Cortex AI
    Y Combinator F2025
    Active • 3 employees • San Francisco, CA, USA
    Cortex AI builds the world’s most diverse, large-scale dataset of real-world workplace robot and egocentric data, where the physical world becomes the next training and evaluation set for embodied AI. We power frontier labs developing robotics foundation models and general-purpose robots by providing the data they need: 1️⃣ Egocentric Data: real-workplace human video with hand/body pose, depth, and subtask labels. 2️⃣ Robot Data: trajectories collected from manipulators and humanoids in real industry settings. 3️⃣ Human-in-the-Loop Rollouts & Evals: real-world deployments with remote operators who recover robots when they fail, capturing data that feeds back into training and continuously improves models. Additionally, through the Cortex Marketplace, workplaces get paid to host data-collection and evaluation sessions, while labs access the in-the-wild data that truly matters. This draws on Lucas’s previous experience as co-founder of Carousell, a C2C marketplace that scaled to a $1B+ valuation.
    robotics
    reinforcement-learning
    artificial-intelligence
  • hillclimb
    Y Combinator F2025
    Active • 4 employees • San Francisco, CA, USA
    We work with frontier AI labs to help train their agents to become AI research scientists.
    reinforcement-learning
  • Topological
    Y Combinator S2025
    Active • 2 employees
    Topological is developing physics-based foundation models for CAD optimization. We help hardware teams iterate at the same speed that software teams do. Our technology accelerates the engineering workflow with AI, scaling design and optimization to identify ideal designs for complex problems under their physical constraints, with greater speed and performance. Our first model, UToP-v1, is a SOTA topology optimization model that understands physics, geometry, and manufacturability. It can generate the most efficient design given a problem’s physical requirements. It has <5% compliance error and is 1930x faster than current methods. We're reimagining mechanical engineering and computational design with precision spatial AI.
    ai
    3d-printing
    design-tools
    reinforcement-learning
    robotics
  • Idler
    Y Combinator S2025
    Active • 13 employees • San Francisco, CA, USA
    Idler builds reinforcement learning environments that teach AI models to code at expert human levels. We create training environments based on real-world coding scenarios that prepare models for the complex challenges they'll face in production.
    reinforcement-learning
  • Janus
    Y Combinator P2025
    Active • 2 employees • San Francisco, CA, USA
    Janus automates AI evaluations by using high-fidelity simulation environments, catching failures in reasoning, compliance, tool usage, and performance. The resulting datasets benchmark products and feed post-training loops to continuously improve performance over time.
    ai
    reinforcement-learning
    aiops
    developer-tools
    monitoring
  • Aviro
    Y Combinator P2025
    Active • 2 employees • San Francisco, CA, USA
    Aviro builds RL environments for long-horizon tool use across ML research, live web, and enterprise knowledge work. We partner with frontier labs and Fortune 100 companies to train models to act as high-stakes operators and perform research-grade work.
    reinforcement-learning
  • TypeOS
    Y Combinator P2025
    Active • 2 employees
    Built to advance human taste to the frontier, TypeOS is the best way to write with AI.
    consumer
    productivity
    b2b
    reinforcement-learning
    ai
  • Cartpole
    Y Combinator P2025
    Active • 1 employee • San Francisco, CA, USA
    We're creating reinforcement learning environments for training frontier models.
    reinforcement-learning
    ml
    ai
    data-labeling
  • Klavis AI
    Y Combinator P2025
    Active • 3 employees
    Powering frontier AI labs with real world MCP environments and complex, long-horizon agentic tool-use data.
    reinforcement-learning
    data-science
    ai
    artificial-intelligence
  • Freesolo
    Y Combinator P2025
    Active • 4 employees • San Francisco, CA, USA
    Freesolo works with companies to encode user-trajectory knowledge into specialized models that outperform the state of the art at much lower latency and cost.
    reinforcement-learning
    artificial-intelligence
    b2b
  • hud
    Y Combinator W2025
    Active • 15 employees • San Francisco, CA, USA
    HUD (YC W25) is developing agentic evals and RL environments for Computer Use Agents (CUAs) that browse the web for frontier AI labs. Our CUA Evals framework is the first comprehensive evaluation tool for CUAs. People don't actually know if AI agents are working reliably. To make AI agents work in the real world, we need detailed evals for a huge range of tasks. We're backed by Y Combinator, and work closely with frontier AI labs to provide agent evaluation and training infrastructure at scale.
    artificial-intelligence
    reinforcement-learning
  • Agentin AI
    Y Combinator W2025
    Active • 2 employees • San Francisco, CA, USA
    At Agentin AI, we build agents that move data and take actions across enterprise systems like Salesforce, NetSuite, and SAP. These agents are difficult to build because each enterprise heavily customizes its systems, but we solved that by training our agents to learn and adapt from failures, applying reinforcement learning techniques we developed.
    enterprise
    reinforcement-learning
    ai
  • TrainLoop
    Y Combinator W2025
    Active • 6 employees • San Francisco, CA, USA
    TrainLoop makes it effortless for developers to supercharge LLM performance through reinforcement learning.
    developer-tools
    generative-ai
    reinforcement-learning
  • Osmosis
    Y Combinator W2025
    Active • 6 employees
    Osmosis helps companies use reinforcement learning to fine-tune open source models that outperform foundation models.
    reinforcement-learning
    machine-learning
    artificial-intelligence
    infrastructure
  • Synth
    Y Combinator F2024
    Active • 2 employees • San Francisco, CA, USA
    Choose a coding agent harness, model, and task dataset, then optimize context and prompts to get the best performance on long-horizon tasks.
    ai
    reinforcement-learning
  • Vibrant Labs
    Y Combinator W2024
    Active • 2 employees • San Francisco, CA, USA
    We benchmark and improve the long-horizon capabilities of AI agents by building specialised environments for browser and computer-use agents.
    generative-ai
    open-source
    developer-tools
    ai
    reinforcement-learning
  • JustAI
    Y Combinator W2024
    Active • 4 employees • San Francisco, CA, USA
    Always-on AI agents for 1-1 personalization at scale
    ai
    marketing
    personalization
    reinforcement-learning
    workflow-automation
  • Velos
    Y Combinator W2023
    Active • 3 employees • San Francisco, CA, USA
    Velos helps non-technical operations teams automate complex, manual back-office tasks with AI workers instead of overseas teams. Unlike traditional robotic process automation (RPA) platforms like UiPath, Velos automations use machine learning to reliably handle ambiguities in their tasks, eliminating the need for an army of maintenance engineers and consultants to build and maintain your automations. We're automating the repetitive work people hate to do.
    generative-ai
    reinforcement-learning
    artificial-intelligence
    automation
    robotic-process-automation
  • Atmeto
    Y Combinator W2023
    Active • 3 employees • Los Angeles, CA, USA
    Founded in 2022, Atmeto was started as a place to develop and apply machine learning to solve the world's biggest problem: climate change. Our current priority is getting the grid to run on 100% clean energy, which is currently limited by battery storage (specifically, the algorithms that control it). We're redefining these algorithms to unlock gigawatts of untapped energy storage capacity, enabling the grid to run on more clean energy from wind and solar.
    climate
    climatetech
    reinforcement-learning
    energy-storage
    energy
  • WorldQL
    Y Combinator W2022
    Active • 3 employees • San Francisco, CA, USA
    gaming
    developer-tools
    reinforcement-learning
    artificial-intelligence
  • rct AI
    Y Combinator W2019
    Active • 40 employees • Los Angeles, CA, USA
    rct AI provides AI solutions to the game industry and is building the true Metaverse with AI-generated content. Using cutting-edge technologies, especially deep learning and reinforcement learning, rct AI creates a truly dynamic and intelligent user experience on both the consumer side and the production side. The founding team previously built a company together, Raventech, which was acquired by Baidu (NASDAQ: BIDU) in 2017.
    gaming
    reinforcement-learning
    metaverse
  • Sepal AI
    Y Combinator S2024
    Acquired • 15 employees • San Francisco, CA, USA
    Sepal is a data research company on a mission to advance human knowledge and capabilities through safe AI. We partner with the world’s leading AI labs and enterprises to help their models get better at the tasks people actually want them to do. We’ve built a Cloud-Native Agent Dataset Factory which turns the process of generating evaluation and training data from manual, inconsistent, and labor-intensive into something automated, standardized, and scalable. Sepal AI was founded in 2024 by engineers and operators from Vercel and Turing. We went through Y Combinator, raised several million dollars from leading investors, and already count multiple Fortune 500s and top AI research labs as paying customers.
    data-labeling
    aiops
    reinforcement-learning
    ai
  • Resonance
    Y Combinator W2024
    Acquired • 2 employees • San Francisco, CA, USA
    Resonance hyper-personalizes MarTech campaign content and automatically refreshes and stores high-performing content for reuse.
    reinforcement-learning
    artificial-intelligence
    saas
    subscriptions
    marketing