by Garry Tan, 3/22/2023
Nearly ten years ago I wrote about the API-ization of everything. It seemed obvious then that software that talks to other software would be critical for building world-changing startups. What was less obvious then, and is more obvious now, is that those APIs would need to be connected to harness the full potential of everyday apps.
One of the best companies at connecting APIs is Zapier, which went through the YC Summer 2012 batch. Zapier, the leader in easy automation, makes it simple to automate workflows and move data between apps. Setup takes less than six minutes and there’s not a single line of code. And with over 5,000 of the most popular B2B and consumer apps integrated, they’re already powering over 10 million integration possibilities.
I got a chance to sit down with Bryan Helmig (@bryanhelmig), the CTO and founder of Zapier, to talk about APIs and interoperability, and to learn more about the company’s first-ever public API: Natural Language Actions (NLA). With this new API, they’re making it possible to plug integrations directly into your product, and it's optimized for LLMs.
Bryan, thanks for joining me and finding time to catch up. Given the launch of your new Natural Language (NLA) API, I’m sure there was some insight or trend you were seeing that guided your build.
Bryan: Absolutely. AI apps have become the fastest-growing category of apps on Zapier’s platform… ever. We're seeing huge demand from our users and partner ecosystem to plug AI and large language models into their existing tools, workflows, and automation. And Zapier is well positioned to help – 81 billion workflow tasks have already been created on our platform.
We actually started by prototyping LLM products into our own tech stack. We had two previous product experiments before NLA. The first was a fully chat-based Zap setup flow. With current-generation models, this often felt like playing "20 questions" with the model – not a great user experience. But it made us realize that other developers were likely facing the same challenges, and that Zapier could really deliver a seamless and simple developer experience in a way that no other company could.
From there, we focused on how to wrap up and simplify each individual API endpoint you might find across Zapier's 20k+ actions. We then allowed the model to call each one as a separate “tool”. That was the fundamental design principle we used internally, and it helped us to expose this as the new NLA API – for any developer to add integrations into their products or internal tools in 5-10 minutes.
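To make the "one endpoint, one tool" idea concrete, here is a minimal sketch in Python. It is not Zapier's implementation: the `Tool` class, the two stand-in actions, and `call_tool` are all hypothetical names invented for illustration. The point is only that each action gets a name, a description the model can read, and a single uniform call signature.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Tool:
    """One wrapped action that a model can choose to call (hypothetical shape)."""
    name: str
    description: str            # natural-language hint the model reads when picking a tool
    run: Callable[[str], str]   # takes an instruction, returns a short text result


def send_message(instruction: str) -> str:
    # Hypothetical stand-in for a real integration call.
    return f"message sent: {instruction}"


def add_row(instruction: str) -> str:
    # Hypothetical stand-in for a real integration call.
    return f"row added: {instruction}"


# Registry keyed by tool name, so every action is reachable the same way.
TOOLS: Dict[str, Tool] = {
    tool.name: tool
    for tool in (
        Tool("send_message", "Send a chat message to a channel.", send_message),
        Tool("add_row", "Append a row to a spreadsheet.", add_row),
    )
}


def call_tool(name: str, instruction: str) -> str:
    """Dispatch the tool the model picked, by name."""
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name].run(instruction)
```

Because every tool shares the same signature, adding another action in this sketch means adding one registry entry rather than teaching the caller a new API.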
For a team that’s the expert in APIs, launching Zapier’s first public API is a big deal. What about LLMs made this project different from how you’ve approached APIs in the past?
Prior to LLMs, we never felt like we could deliver the magical developer experience that we wanted to. Under the hood, Zapier wraps up a ton of complexity from our ecosystem – our platform handles around 20 types of API auths, custom fields, versioning and migrations, arbitrary payload sizes, binary data. You name it. Making a Zapier API would have meant passing along all that complexity to our end users.
But now, AI and LLMs bring an interesting inflection point for Zapier: The new Natural Language Actions API abstracts all that complexity away from devs. In fact, the API has only one required parameter: "instructions". NLA can also be used in the more "classic" way by calling it with hard-coded parameters instead of natural language parsing, but the natural language capabilities make it especially useful for people building products on top of LLMs. Ultimately we are using LLMs to make APIs easier to use for both humans and other LLMs!
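As a rough illustration of what "one required parameter" can look like from the caller's side, here is a small Python sketch that builds a request body. Only the `instructions` name comes from the interview; the function name, the `fixed_params` argument, and the example `channel` field are assumptions, and the real request shape is defined by Zapier's API documentation, not by this snippet.

```python
import json
from typing import Any, Dict, Optional


def build_action_payload(
    instructions: str,
    fixed_params: Optional[Dict[str, Any]] = None,
) -> str:
    """Serialize a request body in which only `instructions` is required.

    `fixed_params` is a hypothetical way to model the "classic" usage,
    where the caller hard-codes some values instead of relying on
    natural language parsing.
    """
    if not instructions.strip():
        raise ValueError("instructions must be a non-empty string")
    # Fixed parameters go in first so they can never overwrite `instructions`.
    payload: Dict[str, Any] = {**(fixed_params or {}), "instructions": instructions}
    return json.dumps(payload, sort_keys=True)


# Natural-language usage: nothing but the instruction.
minimal = build_action_payload("Send a message to Bryan saying the demo is ready")

# "Classic" usage: the same call with one value pinned by the caller.
pinned = build_action_payload("Post the demo update", {"channel": "#general"})
```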
And what are some of the exciting things you’re seeing people build with your APIs?
There's this amazing story about a contractor with dyslexia who teamed up with a client of his who happened to be familiar with Zapier. They built a Zap with OpenAI’s GPT-3 to write better business emails. It totally transformed his communication and even helped him land a massive $200,000 contract! It’s those stories of AI and automation coming together to help individual people that make me excited to be building on this technology today.
But, really, we’re just scratching the surface. We can’t predict what all the builders on our Zapier platform will create. I mean, when we launched multi-step Zaps 5 years ago, we set a "sanity" limit of 30 [workflow] steps. We thought that would clearly be enough for anybody. But in less than 24 hours, users were inundating us with requests to raise the limit. And as we dug in deeper, we found these beautiful, mind-blowing and complex Zaps – things we couldn’t have ever imagined. With LLMs in the mix, we’re hoping we’ll enable that same level of creativity and power, now from the developer community.
So with all of the power that LLMs bring to the table, can you share what’s actually happening under the hood? How have you kept it simple?
At its core, we leverage OpenAI’s GPT-3.5 series to understand and process natural language instructions from the user, map them to a specific API call, and return the response from the API – all in a way that’s optimized for LLMs.
First, users give explicit permission to the model to access certain actions. We try to make this super fast and simple, to feel like an OAuth flow to the end user. When a user is setting this up, they’re able to see what the required fields are and either let the AI guess or manually specify the values. Then once in a developer’s platform, the only required field for the user is the natural language instruction. We take that instruction from a user and let the model figure out how to fill in the required fields. The model then constructs an API call.
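The "let the AI guess or manually specify" choice can be sketched as a small resolution step. This is an assumption-laden illustration, not how NLA works internally: `resolve_fields`, the `GUESS` marker, and `fake_model_guess` (a stub standing in for the language model) are all hypothetical.

```python
from typing import Callable, Dict, Optional

# Marker meaning "no fixed value was set; let the model fill this field in".
GUESS = None


def resolve_fields(
    field_config: Dict[str, Optional[str]],
    instruction: str,
    guess: Callable[[str, str], str],
) -> Dict[str, str]:
    """Fill every required field from either a fixed value or a model guess."""
    resolved: Dict[str, str] = {}
    for name, fixed in field_config.items():
        if fixed is GUESS:
            # No manual value: ask the model to infer it from the instruction.
            resolved[name] = guess(name, instruction)
        else:
            # The user pinned this value during setup, so the model is not consulted.
            resolved[name] = fixed
    return resolved


def fake_model_guess(field_name: str, instruction: str) -> str:
    # Stub standing in for an LLM call; a real system would prompt a model here.
    return f"<{field_name} inferred from: {instruction}>"


config = {"channel": "#sales", "text": GUESS}
call_args = resolve_fields(config, "tell sales we closed the deal", fake_model_guess)
```

In this sketch the pinned `channel` passes through untouched, and only `text` is delegated to the model, which mirrors the split between what the user fixes up front and what the instruction supplies later.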
Before we can send the results back, we also need to make them readable for both LLMs and humans. Many APIs return really complex data in their responses, which would not only push an LLM over its token limit but also confuse both the model and the user. (As an example, a Gmail API call returns over 10,000 tokens!) We've done work on our end to trim down the results to expose just the relevant pieces. The NLA API currently guarantees arbitrary API payloads will fit into 350 tokens or fewer. This makes it incredibly easy to use and build on the NLA API without worrying about the data input or output with the APIs.
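One simple way to picture trimming a response down to a token budget is shown below. It is a sketch under loud assumptions: tokens are approximated as whitespace-separated words (real tokenizers count differently), only the values are counted against the budget, and the `keep` list of relevant fields is supplied by hand. The interview does not describe how Zapier actually selects or shortens fields.

```python
from typing import Any, Dict, Iterable


def approx_tokens(text: str) -> int:
    # Crude assumption: one token per whitespace-separated word.
    return len(text.split())


def trim_response(
    response: Dict[str, Any],
    keep: Iterable[str],
    budget: int = 350,
) -> Dict[str, str]:
    """Keep only the listed fields, truncating values to stay within the budget."""
    trimmed: Dict[str, str] = {}
    remaining = budget
    for key in keep:
        if key not in response or remaining <= 0:
            continue
        words = str(response[key]).split()
        # Earlier fields in `keep` get priority; later ones get what is left.
        trimmed[key] = " ".join(words[:remaining])
        remaining -= min(len(words), remaining)
    return trimmed


raw = {
    "subject": "Hello there",
    "body": "word " * 1000,          # far larger than the budget
    "headers": "X-Debug: lots of noise",
}
small = trim_response(raw, keep=["subject", "body"], budget=10)
```

The ordering of `keep` acts as a priority list here: the subject survives whole and the body absorbs whatever budget remains, while fields not listed are dropped entirely.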
And for any aspiring API developer reading this – either looking to use your new APIs or even building their own – any tips from the guys who live and breathe APIs all day?
Definitely. The big thing many APIs "get wrong" is being overly complex, overly unique, and overly hard to get started with. You’ve talked about how Stripe and Lob have gotten payments and shipping right by simplifying complexity; we leaned on similar examples for inspiration. If you’re building an API, you should too.
We're definitely big fans of libraries like django-ninja or FastAPI for creating compelling APIs with baked-in types and documentation. We're using that sort of technology under the hood as well, both for design consistency and for scalability.
In developing NLA, we've tried to be strict about not letting internal complexity filter down to end developers. NLA supports both OAuth and API keys for quickly getting started, and we have several off-the-shelf examples in the API documentation, including a published LangChain integration.
If you want to get started, any developer can create an API key right away. We’re excited to see what you can imagine, and please share – tag me on Twitter (@bryanhelmig) and show me what you’ve got. And even better, I’d love feedback on what we’ve built, and we’re here to answer questions. And if you’re an API geek like the rest of us at Zapier… we’re hiring.
Garry is the President & CEO of Y Combinator. Previously, he was the co-founder & Managing Partner of Initialized Capital. Before that, he co-founded Posterous (YC S08) which was acquired by Twitter.