Unlocking data behind complex documents

Reducto offers robust and reliable document ingestion for any workflow. Our API allows you to convert complex, unstructured documents into structured outputs that are perfect for RAG, process automation, and more.

Jobs at Reducto

San Francisco, CA, US
$100K - $230K
0.25% - 2.00%
1+ years
Team Size: 2
Group Partner: Diana Hu

Active Founders

Adit Abraham

Co-founder/CEO at Reducto. Before Reducto I studied CS at MIT, made Ads/Search things as a PM for Google, did ML research at MIT's Media Lab, and spent an unreasonable amount of time playing Pokemon Showdown.

Raunak Chowdhuri

Co-founder/CTO at Reducto. Before Reducto I studied CS at MIT, founded/scaled a comp. chem consulting company to 200k ARR, published computer vision papers with 100+ citations before finishing high school, and spent a little too much time shit posting on Twitter.

Company Launches

Hey everyone - Raunak and Adit here 👋🏼

We met four years ago while studying computer science at MIT, and we’ve spent the past few years building ML products at companies like Google and NVIDIA. We started building Reducto after struggling with document ingestion while consulting for teams integrating LLMs into their applications.

📃 The Problem

Nearly 80% of enterprise data is in unstructured formats like PDFs

PDFs are the status quo for enterprise knowledge in nearly every industry. Insurance claims, financial statements, invoices, and health records are all stored in a format that’s simply impractical for use in digital workflows. This isn’t an inconvenience; it’s a critical bottleneck that leads to dozens of wasted hours every week.

Traditional approaches fail at reliably extracting information in complex PDFs

OCR and even more sophisticated ML approaches work for simple text documents but are unreliable for anything more complex. Text from different columns is jumbled together, figures are ignored, and tables are a nightmare to get right. Overcoming this usually requires a large engineering effort dedicated to building specialized pipelines for every document type you work with.
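To make the column problem concrete, here is a minimal sketch of how a two-column page gets jumbled when words are read strictly top to bottom. This is an illustration only, not how Reducto works: the `Word` type, both functions, the hard-coded `column_split_x`, and the sample coordinates are all hypothetical.

```python
from typing import NamedTuple


class Word(NamedTuple):
    text: str
    x: float  # left edge of the word on the page
    y: float  # top edge (smaller values are higher on the page)


def naive_order(words):
    # Read top-to-bottom, left-to-right across the full page width.
    # On a multi-column page this interleaves the columns.
    return [w.text for w in sorted(words, key=lambda w: (w.y, w.x))]


def column_aware_order(words, column_split_x):
    # Read the entire left column first, then the right column.
    # Assumes exactly two columns separated at a known x position.
    def position(w):
        return (w.y, w.x)

    left = sorted((w for w in words if w.x < column_split_x), key=position)
    right = sorted((w for w in words if w.x >= column_split_x), key=position)
    return [w.text for w in left + right]


# Hypothetical two-column page: left column at x=50, right column at x=320.
page = [
    Word("Claims", 50, 100), Word("Policy", 320, 100),
    Word("are", 50, 120), Word("terms", 320, 120),
    Word("reviewed.", 50, 140), Word("apply.", 320, 140),
]

print(" ".join(naive_order(page)))
print(" ".join(column_aware_order(page, column_split_x=200)))
```

The naive pass mixes the two sentences together, while the column-aware pass keeps each one intact. Real documents rarely come with a known split position, which is part of why per-document-type pipelines get expensive.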

🚀 Our Solution

Reducto breaks document layouts into subsections and then contextually parses each depending on the type of content.

This is made possible by a combination of vision models, LLMs, and a suite of heuristics we built over time. Put simply, we can help you:

  • Accurately extract text and tables even with nonstandard layouts
  • Automatically convert graphs to tabular data and summarize images in documents
  • Extract important fields from complex forms with simple, natural language instructions
  • Build powerful retrieval pipelines using Reducto’s document metadata
  • Intelligently chunk information using the document’s layout data
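
As a rough illustration of what chunking on layout data can look like, the sketch below groups typed layout blocks into chunks, starting a new chunk at each heading and keeping every table whole. It is a simplified assumption about the general technique, not Reducto's implementation or API: the `Block` type, the `kind` labels, `chunk_by_layout`, and `max_chars` are all hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Block:
    kind: str     # hypothetical labels: "heading", "text", or "table"
    content: str


def chunk_by_layout(blocks, max_chars=200):
    chunks = []
    current = []
    size = 0

    def flush():
        # Close the chunk being built, if it has any content.
        nonlocal current, size
        if current:
            chunks.append("\n".join(current))
        current, size = [], 0

    for block in blocks:
        if block.kind == "heading":
            flush()  # a heading always starts a new chunk
        elif block.kind == "table":
            flush()
            chunks.append(block.content)  # a table is never split or merged
            continue
        elif size + len(block.content) > max_chars:
            flush()  # plain text would overflow the current chunk
        current.append(block.content)
        size += len(block.content)

    flush()
    return chunks


# Hypothetical parsed document.
blocks = [
    Block("heading", "Summary"),
    Block("text", "Total claims rose."),
    Block("table", "Q1 | 10\nQ2 | 12"),
    Block("heading", "Details"),
    Block("text", "See appendix."),
]

for chunk in chunk_by_layout(blocks):
    print(repr(chunk))
```

Compared with splitting on a fixed character count, this keeps each heading attached to the text that follows it and avoids cutting a table in half, which tends to matter for retrieval quality.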

You can try a demo here.

🙏🏼 Our Ask

We’re looking to onboard more teams across insurance, healthcare, and finance and would really appreciate your help. If you or someone you know is struggling with high-volume document workflows (5000+ pages), please reach out to us at founders@reducto.ai.