Unlocking data behind complex documents

Founding Product Engineer

$100K - $230K / 0.25% - 2.00%
San Francisco, CA, US
Experience: 1+ years
Raunak Chowdhuri

About the role

As our founding engineer, you'll play a pivotal role in shaping the core product while leading efforts to ensure scalability, optimize performance, and continuously improve the product as needs and challenges evolve. This is an in-person role based in San Francisco, CA.

  • Build and maintain user-facing interfaces on top of our document processing API (like our Document Playground) to serve less technical users and enterprises.
  • Act as a product leader, taking full ownership of parts of our product and driving strategy and innovation.
  • Make improvements to API design and pre-processing algorithms (chunking, structured extraction, etc.) based on customer feedback.
  • Build internal tooling/visualizations to better understand/analyze failure cases.
  • Work directly with founders to shape and influence the direction of the product and contribute to the engineering strategy.

🚀 You’ll be successful in this role if you…

  • Are an autonomous and resourceful engineer with 2-5 years of experience building real-world applications.
  • Can rapidly go from 0 to 1 in building apps in TypeScript/Next.js.
  • Have a solid fundamental understanding of Python and algorithms.
  • Have a high bar for quality and craftsmanship.
  • Have excellent communication skills. You’re able to collaborate well with team members and work directly with our customers.

⭐ Bonus points

  • Prior experience as a startup entrepreneur or founding engineer.
  • An interest in keeping up to date with the latest developments in ML/AI.

About Reducto

Nearly 80% of enterprise data is in unstructured formats like PDFs

PDFs are the status quo for enterprise knowledge in nearly every industry. Insurance claims, financial statements, invoices, and health records are all stored in a structure that’s simply impractical for use in digital workflows. This isn’t an inconvenience—it’s a critical bottleneck that leads to dozens of wasted hours every week.

Traditional approaches fail at reliably extracting information in complex PDFs

OCR and even more sophisticated ML approaches work for simple text documents but are unreliable for anything more complex. Text from different columns is jumbled together, figures are ignored, and tables are a nightmare to get right. Overcoming this usually requires a large engineering effort dedicated to building specialized pipelines for every document type you work with.

Reducto breaks document layouts into subsections and then contextually parses each depending on the type of content. This is made possible by a combination of vision models, LLMs, and a suite of heuristics we built over time. Put simply, we can help you:

  • Accurately extract text and tables even with nonstandard layouts
  • Automatically convert graphs to tabular data and summarize images in documents
  • Extract important fields from complex forms with simple, natural language instructions
  • Build powerful retrieval pipelines using Reducto’s document metadata
  • Intelligently chunk information using the document’s layout data
Team Size: 2
Adit Abraham
Raunak Chowdhuri