PRICING

Do more with less

Built for teams at every scale

Early access

For developers and small teams

  • Free to try
  • Multi-harness support
  • Savings and usage dashboard
  • Continuous learning from harness usage
  • Privacy-preserving routing

Enterprise

For enterprise teams ready to scale

  • Organization-wide analytics
  • Advanced security and admin controls
  • MDM deployments
  • SSO with SAML
  • SLAs and 24/7 support

Frequently asked questions

Intelligent model routing predicts which AI model to use for each prompt, maximizing accuracy while reducing cost.

Not Diamond is not a gateway. Rather, we integrate with your existing gateway infrastructure. An AI gateway manages access to various models, while intelligent model routing determines which model to use on each prompt to get the best quality at the lowest cost.

Not Diamond is purpose-built for long-horizon coding agent workloads and works with a variety of coding agent harnesses. If you have custom harness integration needs, please reach out.

Yes. Not Diamond is consistent with Anthropic's commercial terms of service and delivers significant savings when used solely with Anthropic models.

Not Diamond looks at payload semantics, KV cache state, implicit feedback signals, session outcomes, sub-agent architecture, compaction events, and the sequence of all past turns and recommendations. It assesses these against a given model pool and cost objective to recommend a model and reasoning effort for each step in a session.

Teams using Not Diamond achieve at least 20-40% cost savings, and often more, without any degradation in quality relative to the frontier by using Not Diamond.

Not Diamond leverages reinforcement learning to continually learn from feedback signals and session outcomes, thus improving recommendations in real-time. This allows us to personalize recommendations to each developer and to transfer learnings across all developers in an organization or across the entire platform. In this way we continuously deliver the highest quality model outputs at the lowest price.

With privacy preserving routing enabled, Not Diamond bases routing decisions solely on derived request metadata and locally computed features, meaning we never see your payload, inputs, or outputs. Enabling privacy preserving routing has no negative impact on the quality of routing decisions.

Not Diamond is SOC-2 and ISO 27001 compliant. We provide custom SLAs and 24/7 on-call support to the most sophisticated AI teams in the world.

The average added per step latency of our router is 100-200ms.

We charge a small fixed fee per million tokens which is cheaper than the cheapest LLM. In the most conservative case we deliver 10x ROI against the savings we provide.

Model routing is a difficult problem which spans research, data science, engineering, and infrastructure. Additionally, maintaining a model routing solution requires tracking many moving targets as the model and agent landscape continuously evolves. This is why we believe it makes sense to work with a model routing vendor. If you want to build routing yourself, we recommend investing in a strong research and engineering team who is prepared to continuously update the solution over time.

If you would like to set up a pilot to test Not Diamond, please book time with our team and we can discuss your needs and requirements.

Frontier quality at a fraction of the price

Let the machine build the machine