Guide Labs

Interpretable foundation models that are easy to align

At Guide Labs, we build interpretable foundation models that can reliably explain their reasoning and are easy to align. We provide access to these models via an API. Over the past six years, our team has built and deployed interpretable models at Meta and Google. For each generated output, our models provide: 1) human-understandable explanations for each output token, 2) the parts of the input (prompt) that are most important to each part of the generated output, and 3) the training data inputs that directly led to the model's generated output. Because our models can explain their outputs, they are easier to debug, steer, and align.
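
As an illustration only, here is a minimal Python sketch of what calling such an API could look like; the endpoint, request fields, and response structure are hypothetical assumptions, not a documented Guide Labs interface.

    # Hypothetical sketch: the endpoint, request fields, and response structure
    # below are assumptions used to illustrate the three explanation types
    # described above; they are not a documented Guide Labs API.
    import requests

    API_URL = "https://api.example.com/v1/generate"  # placeholder endpoint
    API_KEY = "YOUR_API_KEY"                         # placeholder credential

    payload = {
        "prompt": "Summarize this patient's lab results and flag any anomalies.",
        "return_explanations": [
            "concepts",            # 1) human-understandable explanation per output token
            "prompt_attribution",  # 2) which parts of the prompt drove each output span
            "data_attribution",    # 3) which training examples led to the output
        ],
    }

    response = requests.post(
        API_URL,
        json=payload,
        headers={"Authorization": f"Bearer {API_KEY}"},
        timeout=30,
    )
    response.raise_for_status()
    result = response.json()

    print(result["output"])                              # generated text
    print(result["explanations"]["concepts"])            # concept labels per output token
    print(result["explanations"]["prompt_attribution"])  # prompt-importance scores
    print(result["explanations"]["data_attribution"])    # influential training examples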

Guide Labs
Founded: 2023
Team Size: 2
Location: San Francisco
Group Partner: Nicolas Dessaigne

Active Founders

Julius Adebayo

I am a machine learning researcher developing methods to explain and align large-scale generative models. In the past, I developed a popular test for assessing the reliability of a model's explanations. More recently, I developed the first interpretable diffusion model and a large language model that can explain the reasoning behind their outputs. These models can be more easily aligned via feedback on their explanations. Along the way, I have worked at Google, Meta, and Genentech.

Fulton Wang

I have been doing research on explainability / interpretability in machine learning since before it was a thing, starting from my PhD studies at MIT. I've worked with doctors to build models they can understand, developed and scaled a new model debugging tool, and previously worked at Meta.

Company Launches

At Guide Labs, we build interpretable foundation models that can reliably explain their reasoning and are easy to align.

The Problem: foundation models are black boxes and difficult to align

Current transformer-based large language models (LLMs) and diffusion generative models are largely inscrutable and do not provide reliable explanations for their outputs. In medicine, lending, and drug discovery, it is not enough to provide only an answer; domain experts also want to know why the model arrived at its output.

  • Current foundation models don’t explain their outputs. Would you trust a black-box model to propose medications for your illness or decide whether you should get a job interview?
  • You can't debug a system you don't understand: When you call a model API and the response is incorrect, what do you do? Change the prompt? What part of your prompt should you change? Switch to a new model API?
  • Difficult to reliably align or control model outputs: Even when you've identified the cause of a problem, how do you control the model so that it no longer makes that mistake?

Our Solution

We’ve developed interpretable foundation models that can explain their reasoning and are easy to align.

These models:

  • provide human-understandable explanations for each output token;
  • indicate which parts of the prompt are most important to each part of the output; and,
  • specify which training examples led to the model's output.

Using all these explanations, we can:

  • identify the part of the prompt that causes the model to err;
  • isolate the samples that cause those errors; and,
  • use explanations to control and align the model to fix its errors (a rough sketch of this workflow follows below).
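
As a rough sketch of that workflow, reusing the hypothetical response structure from the earlier example, the helper below shows how prompt and training-data attributions could be filtered to locate the likely cause of a wrong answer; all field names and the threshold are assumptions.

    # Hypothetical sketch built on the response structure assumed earlier;
    # the field names ("score", "id", "text") and the threshold are illustrative.

    def debug_incorrect_output(result, score_threshold=0.2):
        """Return the prompt spans and training examples most implicated in an output."""
        explanations = result["explanations"]

        # 1) Identify the parts of the prompt that most influenced the incorrect output.
        suspect_prompt_spans = [
            span["text"]
            for span in explanations["prompt_attribution"]
            if span["score"] >= score_threshold
        ]

        # 2) Isolate the training examples that most contributed to that output.
        suspect_training_examples = [
            example["id"]
            for example in explanations["data_attribution"]
            if example["score"] >= score_threshold
        ]

        # 3) These two lists then drive the fix: rewrite or remove the offending
        #    prompt spans, and flag the training examples for review or for
        #    feedback on the model's explanations during alignment.
        return suspect_prompt_spans, suspect_training_examples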

About Us

We are a team of machine learning researchers with PhDs from MIT who have worked on interpretable models for the past six years, publishing more than 10 papers on the topic at top machine learning conferences. We recently developed interpretable diffusion and large language models that can explain their outputs using human-understandable concepts. We have previously built and trained interpretable models at Google and Meta.

Our Ask

We are interested in working with companies across lending/finance, healthcare/medicine, insurance, and drug discovery who are looking for alternative models that can reliably explain their outputs. Please reach out to us at info@guidelabs.ai.