At Guide Labs, we build interpretable foundation models that can reliably explain their reasoning and are easy to align. We provide access to these models via an API. Over the past 6 years, our team has built and deployed interpretable models at Meta and Google. Our models provide three kinds of explanation: 1) a human-understandable explanation for each output token, 2) which parts of the input (prompt) are most important for each part of the generated output, and 3) which inputs in the training data directly led to the model's generated output. Because our models can explain their outputs, they are easier to debug, steer, and align.
I am a machine learning researcher developing methods to explain and align large-scale generative models. In the past, I developed a popular test for assessing the reliability of a model's explanations. More recently, I developed the first interpretable diffusion model and large language model that can explain the reasoning behind their outputs. These models can be more easily aligned via feedback on their explanations. Along the way, I have worked at Google, Meta, and Genentech.
I have been doing research on explainability / interpretability in machine learning since before it was a thing, starting from my PhD studies at MIT. I've worked with doctors to build models they can understand, developed and scaled a new model debugging tool, and previously worked at Meta.
At Guide Labs, we build interpretable foundation models that can reliably explain their reasoning and are easy to align.
Current transformer-based large language models (LLMs) and diffusion generative models are largely inscrutable and do not provide reliable explanations for their outputs. In medicine, lending, and drug discovery, an answer alone is not enough; domain experts also need to know why the model arrived at its output.
We’ve developed interpretable foundation models that can explain their reasoning and are easy to align.
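These models explain each output token in human-understandable terms, identify which parts of the input (prompt) are most important for each part of the generated output, and identify which inputs in the training data directly led to that output.
Using all these explanations, we can debug, steer, and align the models more easily.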
We are a team of machine learning researchers with PhDs from MIT who have worked on interpretable models for the past 6 years. We have published more than 10 papers on the topic at top machine learning conferences. We recently developed interpretable diffusion and large language models that can explain their outputs using human-understandable concepts. We have previously built and trained interpretable models at Google and Meta.
We are interested in working with companies in lending/finance, healthcare/medicine, insurance, and drug discovery that are looking for alternative models that can reliably explain their outputs. Please reach out to us at info@guidelabs.ai.