The simplest way to extract web data at scale

At Reworkd, we're working on multimodal LLM agents that serve as the simplest way to extract web data at scale. Customers come to us with lists of hundreds to thousands of websites along with a data schema. Our agents traverse these websites, understand their structure, and generate code to extract data from them. We've been working on LLM agents since their inception and have received over 30k stars on GitHub and 1M+ users across previous agent products. If you're interested in our pilot program, shoot us an email!

Jobs at Reworkd

San Francisco, CA, US
$130K - $170K
3+ years
Team Size: 4
Location: San Francisco
Group Partner: Dalton Caldwell

Active Founders

Asim Shrestha

Software engineer and open source enthusiast. Also a co-founder @ Reworkd AI

Srijan Subedi

Co-founder @ Reworkd AI. Combined major in Science @ UBC. Previously worked at STEMCELL Technologies

Adam Watkins, Founder

Co-Founder & CTO of Reworkd AI - Pushing the boundaries of AGI agents. Deeply passionate about open-source, software architecture, engineering leadership, and emojis 🚀😀

Company Launches

tl;dr: Reworkd automates your entire web data pipeline, end-to-end. It understands websites, writes code, runs scrapers, and validates results — all from one simple system.

😩 The Problem

Collecting, monitoring, and maintaining a web data pipeline can be complex and time-consuming, especially at scale. Traditional methods often struggle with issues such as pagination, dynamic content, bot detection, and site changes, all of which can compromise data quality and availability.

To address web data needs, businesses often face a choice between building an internal engineering team and outsourcing the work to a low-cost region. The former can be expensive, while the latter is often unsustainable and requires significant management oversight.

🚀 The Solution

Recognizing the inefficiencies of traditional data collection methods, Reworkd was developed to simplify your web data pipeline. Simply provide us with a list of websites and the schema you want the data mapped to, and we will handle the rest.
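As an illustration only, the "list of websites plus schema" input could look something like the sketch below. The source does not specify an actual input format, so the URLs, the field names, and the `validate` helper are all hypothetical. The helper shows one simple way extracted records might be checked against the schema.

```python
# Hypothetical input: the real format is not described in this post.
websites = ["https://example.com/products", "https://example.org/catalog"]

# Hypothetical schema: field name -> expected Python type.
schema = {"name": str, "price": float, "in_stock": bool}


def validate(record: dict, schema: dict) -> list:
    """Return a list of readable problems; an empty list means the record fits."""
    problems = []
    for name, expected in schema.items():
        if name not in record:
            problems.append(f"missing field: {name}")
        elif not isinstance(record[name], expected):
            actual = type(record[name]).__name__
            problems.append(f"{name}: expected {expected.__name__}, got {actual}")
    # Flag fields the schema does not know about.
    for name in record:
        if name not in schema:
            problems.append(f"unexpected field: {name}")
    return problems
```

A record such as `{"name": "Widget", "price": 9.99, "in_stock": True}` passes with no problems, while a record with a string price or a missing field yields one message per issue.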

At its core, Reworkd uses LLM code generation to enable companies to rapidly scale their extraction efforts across thousands of websites. Additionally, we offer:

  • Self-Healing Scrapers: These scrapers automatically adjust to website updates to maintain data integrity.
  • Scheduling and Deduplication: This feature ensures you have a complete and current view of all websites, providing a historical perspective on data changes.
  • Automatic Proxies: With Reworkd, there’s no need to choose between residential, data center, or other proxy types—we manage this for you.
  • Complex Data Types: We take care of downloading and hosting files, ensuring data availability even as source websites evolve.
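To make the deduplication and history idea concrete, here is a minimal sketch that assumes records are plain dictionaries identified by a key such as a URL. It is not a description of how Reworkd implements the feature; the `fingerprint` function and `History` class are hypothetical names used only for illustration. Each scheduled run hashes the record content, skips it when nothing changed, and stores a new version when it did.

```python
import hashlib
import json
from dataclasses import dataclass, field


def fingerprint(record: dict) -> str:
    """Stable hash of a record's content; key order does not matter."""
    canonical = json.dumps(record, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


@dataclass
class History:
    # Maps a record key (for example its URL) to the distinct versions seen.
    versions: dict = field(default_factory=dict)

    def ingest(self, key: str, record: dict) -> str:
        """Store the record if it is new or changed; report what happened."""
        seen = self.versions.setdefault(key, [])
        digest = fingerprint(record)
        if seen and seen[-1]["digest"] == digest:
            return "duplicate"  # identical to the latest stored version
        status = "changed" if seen else "new"
        seen.append({"digest": digest, "record": record})
        return status
```

Calling `ingest` on every scheduled run keeps only distinct versions per key, so the stored list doubles as a change history for that record.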

🙏 Our Ask

  • Book a Chat! Have a few minutes? Schedule a time with us and let’s discuss how we can help scale your data needs efficiently.
  • Support our launch tweet and follow us on LinkedIn and Twitter.
  • Share Reworkd with anyone you know who is facing challenges in scaling their web data pipeline.

Other Company Launches

🤖 Reworkd AI - The open-source Zapier of AI agents

We help automate core business workflows with the help of AI Agents