
Sentry for AI Agents

Monitor your AI agents the right way. AI engineers use Raindrop to get alerts about hidden issues and successes in their AI agents. Raindrop sends you alerts when your AI misbehaves and links straight to the events, so you can dig into the conversations or traces, understand the root cause, and fix it, fast.
Active Founders

Zubin Koticha, Founder

Building Raindrop: Sentry for AI products. Previously cofounder and CEO of Opyn, the first and largest DeFi options platform, which grew to Series B stage with $15+ billion in volume. UC Berkeley.

Alexis Gauba, Founder

Building Raindrop: Sentry for AI agents. Previously Co-Founder at Opyn, the first and largest DeFi options platform (Series B, $15B+ volume), inventing a new financial asset class known as the power perpetual (an option that has no expiry). Dropped out of UC Berkeley EECS.

Ben Hylak, Founder

building dawn -- a platform + api where companies can categorize anything. I was previously on the Human Interface team at Apple for 4 years, building out visionOS. before that, dabbled with robotics + avionics
Jobs at Raindrop
San Francisco, CA, US
$80K - $200K
0.40% - 1.50%
6+ years
Raindrop
Founded: 2023
Batch: Winter 2024
Team Size: 4
Status: Active
Location: San Francisco
Primary Partner: Diana Hu
Company Launches
Raindrop Deep Search

https://www.youtube.com/watch?v=pN82WxN-_G0

Today, we're excited to launch Raindrop Deep Search.

It's like Deep Research for your production AI data.

Search for anything, and Raindrop automatically trains little models to accurately classify any topic or issue, across millions of events.

The Problem

We've heard from thousands of AI engineers, and they're struggling to track issues with their agents.

Imagine a user reports a problem: your agent is saying it can't search the web for documentation. You need to know if this is a one-off problem or a much bigger issue... but how? Keyword search, or even semantic search, doesn't tell the full story.

Can't we just use Traditional Evals?

Offline evals work well as unit tests. But since they're running on preset data, you have no visibility into what's actually happening in production.

Online evals just run these unit tests on a tiny sample of production data, leaving you blind to how widespread problems are.

Introducing Raindrop Deep Search

That's why we built Deep Search. It's like Deep Research for your production data.

How Deep Search works:

1. Describe the issue (e.g., agent failing to search the web)

2. Deep Search finds examples out of millions of events

3. Refine search with feedback

4. Start tracking the issue

Deep Search runs across all of your production data to give you an accurate metric of issue frequency.
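
As a rough illustration of why a full scan matters when measuring issue frequency, here is a minimal Python sketch. It is not Raindrop's implementation: the synthetic events, the `is_issue` keyword check (a stand-in for a trained classifier), and the sample size of 50 are all assumptions made for the example.

```python
import random


def issue_frequency(events, is_issue):
    """Return the fraction of events flagged by the classifier."""
    if not events:
        return 0.0
    return sum(1 for event in events if is_issue(event)) / len(events)


def is_issue(event):
    # Hypothetical stand-in for a trained classifier.
    return "can't search" in event


# Synthetic data (assumption): 10,000 events, 1% of which show the issue.
events = ["agent says it can't search the web"] * 100 + ["normal response"] * 9_900

# Full scan: every event is classified, so the rate is exact.
full_rate = issue_frequency(events, is_issue)

# Small sample: only 50 events are classified, so the estimate is coarse.
rng = random.Random(0)
sample = rng.sample(events, 50)
sampled_rate = issue_frequency(sample, is_issue)

print(f"full scan: {full_rate:.2%}, 50-event sample: {sampled_rate:.2%}")
```

With 50 sampled events the estimate can only be a multiple of 2%, so a 1% issue rate cannot be measured exactly from that sample, whatever the draw.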

Traditional classification systems require humans to manually label thousands of data points. To avoid that, Raindrop Deep Search introduces a new research breakthrough, bespoke few-shot classifiers, which need only a few examples.

It's essentially bootstrapping weaker systems from stronger systems, ultimately training custom small models that analyze millions of events a day. You can think of it like creating materialized views for natural language.
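
One way to picture bootstrapping a small model from a stronger system is the toy sketch below. It is an illustration under stated assumptions, not Raindrop's method: `strong_labeler` is a hypothetical stand-in for an expensive, accurate labeler, the "small model" is a plain nearest-centroid classifier over bag-of-words counts, and the four seed texts are invented.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words vector (stand-in for a real text encoder)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid(texts):
    """Sum the word counts of several texts into one vector."""
    total = Counter()
    for text in texts:
        total.update(embed(text))
    return total


def strong_labeler(text):
    # Hypothetical: imagine a slow, costly, accurate labeler here.
    lowered = text.lower()
    return "search" in lowered and ("can't" in lowered or "unable" in lowered)


def train_small_classifier(seed_texts):
    """Label a few seeds with the strong system, then fit a cheap classifier."""
    positives = [t for t in seed_texts if strong_labeler(t)]
    negatives = [t for t in seed_texts if not strong_labeler(t)]
    pos_centroid, neg_centroid = centroid(positives), centroid(negatives)

    def classify(text):
        vector = embed(text)
        return cosine(vector, pos_centroid) > cosine(vector, neg_centroid)

    return classify


# Invented seed examples (assumption): a handful is enough for this toy model.
seeds = [
    "I can't search the web for documentation",
    "I am unable to search the web right now",
    "Here is the summary of your document",
    "The weather today is sunny",
]
classify = train_small_classifier(seeds)

print(classify("Sorry, I can't search the web for that"))
print(classify("Here is the weather summary"))
```

The design point is the split of work: the expensive labeler sees only the seed examples, while the cheap classifier is the one that would run over every event.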

Once you start tracking the issue, you can use Raindrop to dive into traces and tool calls to find the root cause. You can then quickly confirm whether your fixes are effective by monitoring issue frequency and receiving real-time Slack alerts.
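
The monitoring step can be sketched as a simple threshold check on daily issue rates. This is a hypothetical example, not Raindrop's alerting logic: the function name `check_issue_rate`, the 1% threshold, and the daily counts are all assumptions. The payload uses the `{"text": ...}` JSON shape that Slack incoming webhooks accept, but the sketch only builds the message and does not send anything.

```python
import json


def check_issue_rate(daily_counts, threshold):
    """Return one JSON alert payload per day whose issue rate exceeds the threshold.

    daily_counts is a list of (day_label, issue_events, total_events) tuples.
    """
    alerts = []
    for day, issues, total in daily_counts:
        rate = issues / total if total else 0.0
        if rate > threshold:
            message = f"{day}: issue rate {rate:.1%} exceeds {threshold:.1%}"
            alerts.append(json.dumps({"text": message}))
    return alerts


# Invented counts (assumption): a fix ships before day-3 and the rate drops.
daily_counts = [
    ("day-1", 120, 10_000),
    ("day-2", 130, 10_000),
    ("day-3", 8, 10_000),
]
alerts = check_issue_rate(daily_counts, threshold=0.01)

for payload in alerts:
    print(payload)
```

In this made-up data the first two days trigger an alert and the third does not, which is the kind of signal that would suggest a fix took effect.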

You can try out Deep Search at raindrop.ai.

We're excited to hear what you think!

Other Company Launches

Raindrop - Sentry for AI Products

AI engineers use Raindrop to get alerts about hidden issues and wins in their AI products.

Dawn - Analytics for AI products

We transform user requests and model outputs into metrics you actually care about.