For any input, Not Diamond automatically determines which model is best-suited to respond, delivering state-of-the-art performance that beats every foundation model on every major benchmark.
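To run the examples below, install the SDK and set your API keys. A minimal setup sketch, assuming the SDK reads NOTDIAMOND_API_KEY plus the standard keys of any providers you route to from the environment:

# pip install notdiamond
import os

# Assumed environment variables: NOTDIAMOND_API_KEY for routing, plus the
# keys of the providers you route to (calls go out client-side, so provider
# keys stay with you).
os.environ["NOTDIAMOND_API_KEY"] = "YOUR_NOTDIAMOND_API_KEY"
os.environ["OPENAI_API_KEY"] = "YOUR_OPENAI_API_KEY"
os.environ["ANTHROPIC_API_KEY"] = "YOUR_ANTHROPIC_API_KEY"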
from notdiamond import NotDiamond

# Define the Not Diamond routing client
client = NotDiamond()

# The best LLM is determined by Not Diamond
result, session_id, provider = client.chat.completions.create(
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Concisely explain merge sort."}
    ],
    model=['openai/gpt-4o', 'anthropic/claude-3-5-sonnet-20240620']
)

print("LLM called: ", provider.model)   # The LLM routed to
print("LLM output: ", result.content)   # The LLM response
By default, Not Diamond maximizes quality above all else. However, you can also define explicit cost and latency tradeoffs to optimize for speed or cost savings, using faster, cheaper models only when doing so doesn't degrade quality.
result, session_id, provider = client.chat.completions.create(
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Concisely explain merge sort."}
    ],
    model=['openai/gpt-4o', 'openai/gpt-3.5-turbo'],
    tradeoff="cost"  # Use cheaper models without degrading quality
)
Not Diamond provides an elegant interface through which you can upload any evaluation dataset and receive back a customized router within minutes.
from notdiamond.toolkit import CustomRouter

# Initialize the CustomRouter object for training
trainer = CustomRouter()

# Train the router using your dataset
preference_id = trainer.fit(
    dataset=pzn_train,           # The dataset containing inputs, responses, and scores
    prompt_column="Input",       # Column name for the input prompts
    response_column="response",  # Column name for the model responses
    score_column="score"         # Column name for the scores
)

print("Custom router preference ID: ", preference_id)
Not Diamond can hyper-personalize routing decisions to each of your end-users’ individual preferences in real time, based on their feedback.
from notdiamond.metrics.metric import Metric  # Metric import path is an assumption; adjust to your SDK version

# Define the metric used to collect end-user feedback
metric = Metric("accuracy")

result, session_id, provider = client.chat.completions.create(
    messages=[
        {"role": "system", "content": "You are a world class programmer."},
        {"role": "user", "content": "Write a merge sort in Python."}
    ],
    model=llm_providers,                 # Candidate models (see the LLMConfig example below)
    preference_id="YOUR_PREFERENCE_ID",  # e.g. the ID returned by CustomRouter.fit above
    metric=metric
)

# The user submits a thumbs up
score = metric.feedback(session_id=session_id, llm_config=provider, value=1)
Not Diamond seamlessly supports prompt optimization workflows, both manual and automatic, so that you always call the best model with the best prompt.
from notdiamond.llms.config import LLMConfig

# Each candidate model carries its own tailored system prompt
llm_providers = [
    LLMConfig(
        provider="openai",
        model="gpt-4o",
        system_prompt="Summarize this essay:"
    ),
    LLMConfig(
        provider="anthropic",
        model="claude-3-5-sonnet-20240620",
        system_prompt="Distill the essence of this document:"
    ),
]
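Routing then selects both the model and its tailored prompt. A sketch reusing the client from above, with essay_text standing in for your document (a hypothetical variable):

# Route across the configured models; the winning model is invoked
# with its own system_prompt from the LLMConfig list above.
result, session_id, provider = client.chat.completions.create(
    messages=[{"role": "user", "content": essay_text}],  # essay_text: hypothetical input
    model=llm_providers
)

print("LLM called: ", provider.model)
print("LLM output: ", result.content)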
Not Diamond is not a proxy: all LLM calls go out client-side. Bolster your data privacy by enabling similarity-preserving fuzzy hashing for our API, or deploy Not Diamond directly to your own infrastructure for maximum security.
result, session_id, provider = client.chat.completions.create(
    messages=[
        {"role": "system", "content": "You are a technical analyst."},
        {"role": "user", "content": "Review this confidential internal document..."}
    ],
    model=llm_providers,
    hash_content=True  # Turn on fuzzy hashing
)
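Because routing is decoupled from inference, you can also request only the routing decision and issue the LLM call yourself. A sketch, assuming the SDK's model_select method, which returns a recommendation without invoking the model:

# Ask Not Diamond only which model to use; no LLM call is made here
session_id, provider = client.chat.completions.model_select(
    messages=[
        {"role": "system", "content": "You are a technical analyst."},
        {"role": "user", "content": "Review this confidential internal document..."}
    ],
    model=llm_providers,
    hash_content=True
)

print("Recommended model: ", provider.model)  # Call this model with your own client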