FiddleCube helps developers fine-tune & deploy LLMs with synthetic data. We use AI to generate private, high-quality datasets for customers who want custom LLMs but don’t have the resources to create annotated training datasets for fine-tuning & reinforcement learning.
Creating high-quality datasets at FiddleCube. Fascinated by AI alignment. Curious about health-tech, design, and fitness. Full-stack engineer, part-time illustrator.
As the founder of FiddleCube, obsessed with creating high-quality datasets. Taking baby steps towards AI alignment. Prior to this, worked as a software engineer for nearly a decade at companies like LinkedIn, Uber, and Google. Experienced in building software systems that are highly reliable, low-latency, and fault-tolerant at planet scale.
TL;DR: Fine-tuning LLMs requires high-quality datasets. FiddleCube automagically generates fine-tuning datasets from your data.
User Data Source > Fine-tuning Datasets (FiddleCube) > Fine-tuning
Head over to fiddlecube.ai to get started!
Hi everyone, we are Neha and Kaushik. We’re building FiddleCube to make high-quality datasets accessible to everyone.
🦸 Kaushik spent most of the last decade building tech at companies like Google, Uber, and LinkedIn.
🧙🏻 Neha has spent a similar amount of time as a dev at multiple startups, most recently at Uber.
👫🏻🫶🏻 We met at Uber, eventually got married, and decided to build a startup together, following our passion for AI.
In the real world, LLMs need to be aligned to follow human instructions. They need to respond in a manner that is:
Remarkable outcomes have been achieved towards this end by fine-tuning and reinforcement learning with high-quality datasets. However, creating these datasets takes significant time, manual effort, and money.
FiddleCube leverages a suite of AI models to create high-quality datasets for fine-tuning and reinforcement learning.
We create rich, diverse, high-quality datasets that produce better models from a smaller corpus of data.
Give the model a personality, voice, and tone. For example, you can create a safe Dora the Explorer or Peppa Pig model that speaks to children.
For specific use cases like making API calls or generating code, fine-tuning has demonstrably produced better results. You can fine-tune an LLM on a corpus of code or API data to significantly improve its ability at these tasks.
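To make the API-call case concrete, here is a minimal sketch of what a small fine-tuning corpus for this use case might look like on disk. It assumes the common instruction/response layout stored as JSON Lines; the field names, the `get_weather` and `create_ticket` functions, and the `to_jsonl` helper are all hypothetical and chosen for illustration. This is not FiddleCube's actual output format.

```python
import json

# Hypothetical training examples: each pairs a natural-language request with
# the API call the fine-tuned model should learn to emit.
# The field names and API names are assumptions made for this sketch.
examples = [
    {
        "instruction": "Get the current weather in Paris in Celsius.",
        "response": {
            "name": "get_weather",
            "arguments": {"city": "Paris", "unit": "celsius"},
        },
    },
    {
        "instruction": "Open a high-priority ticket titled 'Login page is down'.",
        "response": {
            "name": "create_ticket",
            "arguments": {"title": "Login page is down", "priority": "high"},
        },
    },
]


def to_jsonl(records):
    """Serialize records as JSON Lines: one JSON object per line.

    The target API call is stored as a JSON string, so the model is trained
    to produce text that a caller can parse back into a structured call.
    """
    lines = []
    for record in records:
        row = {
            "instruction": record["instruction"],
            "response": json.dumps(record["response"], sort_keys=True),
        }
        lines.append(json.dumps(row, ensure_ascii=False))
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_jsonl(examples))
```

Each output line is one self-contained training example, which keeps the corpus easy to stream, shuffle, and validate before a fine-tuning run.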
Fine-tuned LLMs can be much smaller than foundation models. You can use them to increase throughput and reduce latency and cost.
LLMs perform poorly in certain domains like vernacular languages. These domains lack a sufficient corpus of high-quality data. Fine-tuning using generated datasets has shown remarkable improvements over the state of the art in these cases.
Are you fine-tuning an LLM, or looking to fine-tune Llama 2, MPT, or Falcon? We would love to know your use case. Drop a comment on what you are doing, or reach out to us privately!
Book a slot on our calendar 🗓️ or drop us a line using:
- Email 📧 : kaushik@fiddlecube.ai
- Typeform 📝
and we will get back to you!