Trainy

Infrastructure for managing GPU clusters for training/serving.

Goodbye Slurm, Hello Konduktor. Trainy Konduktor is a software platform for AI teams to schedule workloads with priority, control resource allocation, and improve GPU reliability. With Konduktor, teams submit jobs to a healthy pool of GPUs, assign job priority with a simple user interface, and never worry about hardware faults again.
Active Founders
Roanak Baviskar
Founder
Studied CS & Mathematics at UC Santa Cruz. Led the Audio team at Hive AI, where we trained and deployed 500M-parameter-scale models to production.
Andrew Aikawa
Founder
Co-founder and CTO at Trainy, building a training platform to make deep learning go faster. Previously a lead Machine Learning Engineer for Hive AI's object detection products. I completed my Physics Ph.D. at UC Berkeley ('22), where my thesis focused on applying computer vision and deep learning to nanoscience. Physics & Computer Science B.A., UC Berkeley '17.
Company Launches
πŸͺ Pluto – OSS experiment tracker for Neptune users
See original launch post

→ Try the live playground – explore Pluto with real sample experiments, no signup required.

NeptuneAI is shutting down on March 5th after being acquired by OpenAI. We built Pluto so you can migrate safely without rushing.


👋 Who we are

Hey YC, we're Roanak and Andrew! We founded Trainy to help AI teams efficiently manage and utilize GPU infrastructure. When we heard Neptune was shutting down, our customers started panicking about losing years of experiment history and having to rewrite their logging code on a tight deadline.

So we built Pluto.


🔥 The problem

Neptune's March 5th shutdown is forcing research teams into one of the worst kinds of migrations: rushed, high-stakes, and hard to verify.

Most alternatives require you to:

  • Rewrite your logging code
  • Lose your historical runs (or spend days exporting and reformatting)
  • Hope the new tool handles your multi-GPU workloads the same way

And you have to do all of this before March 5th or your data gets deleted.

🛠 How Pluto solves it

1. Dual-log to both platforms simultaneously

Add one import, keep your existing Neptune code running, and Pluto logs side-by-side. Validate that everything matches on your real training workloads before you cut over.
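
Here's roughly what that looks like in practice. This is a sketch, not the final client API: the `pluto` import and its `init_run`/`log`/`stop` calls below are illustrative placeholders, while the `neptune` calls are Neptune's standard 1.x client and stand in for code you already have.

```python
# Illustrative dual-logging sketch: existing Neptune logging stays untouched,
# and each metric is mirrored into a second run so the two can be compared.
# NOTE: `pluto` and its init_run/log/stop calls are hypothetical placeholders,
# not a documented API.
import neptune
import pluto  # placeholder for Pluto's client library

neptune_run = neptune.init_run(project="my-team/llm-pretrain")
pluto_run = pluto.init_run(project="my-team/llm-pretrain")  # hypothetical

for step in range(1000):
    loss = 1.0 / (step + 1)                  # stand-in for your real training loss
    neptune_run["train/loss"].append(loss)   # unchanged Neptune logging
    pluto_run.log({"train/loss": loss}, step=step)  # mirrored into Pluto

neptune_run.stop()
pluto_run.stop()
```

Running both loggers on a few real jobs lets you diff the dashboards before deleting the Neptune calls.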

2. Export your Neptune history

Pluto includes a Neptune exporter that brings your old runs into Pluto. Years of experiment data, preserved.
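
Conceptually, an export pulls each run's index and metric series out of Neptune's read-only API before the shutdown and re-ingests them. Below is a minimal sketch of just the fetch half, using only Neptune's documented 1.x client; the metric path "train/loss" is an example, and the re-import into Pluto is omitted here.

```python
# Minimal sketch: fetch a project's run index and one metric series from
# Neptune in read-only mode and dump them locally as a migration snapshot.
import neptune

project = neptune.init_project(project="my-team/llm-pretrain", mode="read-only")
runs = project.fetch_runs_table().to_pandas()   # one row per run
runs.to_csv("neptune_runs_index.csv", index=False)

for run_id in runs["sys/id"]:
    run = neptune.init_run(project="my-team/llm-pretrain",
                           with_id=run_id, mode="read-only")
    # fetch_values() returns a DataFrame of the logged values with timestamps
    run["train/loss"].fetch_values().to_csv(f"{run_id}_train_loss.csv", index=False)
    run.stop()
```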

3. Built for the workloads Neptune users actually run

We're optimizing for what Neptune users loved: UI responsiveness at scale, reliable logging throughput under multi-node/multi-GPU workloads, and the ability to track thousands of per-layer metrics without lag.

Deployment options:

  • Self-hosted – for sensitive data or air-gapped environments; instructions are in the server README
  • Hosted – $250/seat/month (free for existing Trainy customers)

πŸ™ Our ask

If you're a Neptune user, please try Pluto with dual-logging before the March shutdown.

We're specifically looking for feedback on:

  • Anything that would block your cutover
  • Features you rely on that we're missing
  • UI/performance differences at scale

Early Bird offer: Start dual-logging in the next 2 weeks → 3 months hosted free.

👉 Once you’ve made an account on pluto.trainy.ai, email roanak@trainy.ai or grab a time to chat and we’ll add seats/storage.


We're listed on Neptune's official transition hub.

Thank you!

Previous Launches
Dashboards to help ML engineers training large models isolate performance bottlenecks and boost training speed.
Trainy
Founded: 2023
Batch: Summer 2023
Team Size: 3
Status: Active
Location: San Francisco
Primary Partner: Diana Hu