notdiamond-0001 is our first model router available on Hugging Face. notdiamond-0001 takes any input and determines whether to send it to GPT-3.5 or GPT-4, optimizing for the highest accuracy while drastically reducing your costs and latency.
We’ve spent the last month working to make sure notdiamond-0001 meets the following five criteria:
To get started with notdiamond-0001, download it on Hugging Face.
We believe in a multi-model future. The world won't have one single, giant model that everyone sends everything to—instead, there will be many foundation models, millions of fine-tuned variants of those models, and countless custom inference engines running on top of them. We believe this is not only a better future for AI, but a safer one as well. We started Not Diamond to enable this multi-model future, starting with safe and robust infrastructure for routing between models.
Why routing? Over the past few months, we’ve talked to hundreds of developers and companies building on top of LLMs, from early-stage startups to Fortune 500 companies. For nearly everyone, model routing is a big, hairy, audacious problem. It sucks. Teams are using heuristics to route deterministically with if/else statements and regular expressions, trying to train their own classifiers to route inputs, A/B testing model selections, or handwriting prompts in an attempt to use slow and bulky agents as routers. When teams do manage to get a router working, it frequently breaks whenever the underlying models update. Meanwhile, those who have managed to build functional routers have seen huge gains in their product quality and margins. We decided there had to be a better way.
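To make the heuristic approach concrete, here is a minimal sketch of the kind of hand-written if/else and regex router described above. It is not how notdiamond-0001 works. The patterns, the length cutoff, and the model name strings are all assumptions chosen for illustration.

```python
import re

# Hypothetical keyword patterns that a team might treat as "hard" prompts.
# These are illustrative assumptions, not rules taken from any real router.
HARD_PATTERNS = [
    re.compile(r"\b(prove|derive|refactor|debug)\b", re.IGNORECASE),
    re.compile(r"step[- ]by[- ]step", re.IGNORECASE),
]

# Assumed length cutoff: longer prompts go to the larger model.
MAX_SIMPLE_CHARS = 400


def route(prompt: str) -> str:
    """Return the model name a hand-written heuristic would pick."""
    if len(prompt) > MAX_SIMPLE_CHARS:
        return "gpt-4"
    if any(pattern.search(prompt) for pattern in HARD_PATTERNS):
        return "gpt-4"
    return "gpt-3.5-turbo"


if __name__ == "__main__":
    print(route("What is the capital of France?"))
    print(route("Please debug this function for me."))
```

Rules like these are cheap to write, but every threshold and keyword is tuned against the current behavior of each model, which is why they tend to break when the underlying models update.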
If you’re using GPT-4, notdiamond-0001 will lead to an immediate and drastic reduction in your inference costs and latency without any degradation in quality. Or, if you’re using GPT-3.5, you can enjoy a much higher response quality without significantly increasing your bill.
As a team, we've built venture-scale companies, developed products for billions of users, and published cutting-edge research in top AI journals, and we’re excited to be backed by some of the world's best AI developers and founders.
We’re actively hiring, so drop us a line if you want to help us build a multi-model future.