Run A/B tests on models with zero latency. The same session always gets the same model (sticky assignment).
Create and manage your model configs in the dashboard.

Setup

from fallom import models

models.init(api_key="your-fallom-api-key")

Basic Usage

from fallom import models

# Get the assigned model for this session
# (session_id can be any stable string identifying the session)
model = models.get("summarizer-config", session_id)
# Returns e.g. "gpt-4o" or "claude-3-5-sonnet", per your config weights

agent = Agent(model=model)  # pass the assigned model to your agent
agent.run(message)
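
Because assignment is sticky, repeated lookups with the same session_id return the same model, so a session never switches variants mid-conversation:

# Same session_id -> same model, on every call
assert models.get("summarizer-config", session_id) == models.get("summarizer-config", session_id)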

Version Pinning

Pin to a specific config version, or use latest (default):
# Use latest version (default)
model = models.get("my-config", session_id)

# Pin to specific version
model = models.get("my-config", session_id, version=2)

Fallback for Resilience

Always provide a fallback so your app works even if Fallom is down:
model = models.get(
    "my-config",
    session_id,
    fallback="gpt-4o-mini"  # Used if config not found or Fallom unreachable
)

Targeting Overrides

Override weighted distribution for specific users or segments:
model = models.get(
    "my-config",
    session_id,
    fallback="gpt-4o-mini",
    customer_id="user-123",  # For individual targeting
    context={                 # For rule-based targeting
        "plan": "enterprise",
        "region": "us-west"
    }
)

User Targeting

Target specific users or segments to specific model variants. This is useful for:
  • Beta testing - Roll out new models to specific users first
  • Enterprise features - Give premium users access to better models
  • Gradual rollouts - Target by region, plan, or any custom attribute

How It Works

Targeting rules are evaluated client-side for zero latency, in the following order (a sketch follows the list):
  1. Individual Targets - Exact match on customerId or any field
  2. Rules - Condition-based targeting (all conditions in a rule must match)
  3. Fallback - If no targeting matches, use weighted random distribution
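
Conceptually, resolution works like the sketch below. This is an illustrative Python sketch, not the SDK's actual source: the targeting shape follows the JSON shown under Configuration, while the variants, model, and weight field names are assumptions.

import random

def condition_matches(cond: dict, context: dict) -> bool:
    # Simplified to two operators; see Supported Operators below for the full set
    value = context.get(cond["field"])
    if cond["operator"] == "eq":
        return value == cond["value"]
    if cond["operator"] == "in":
        return value in cond["value"]
    return False

def resolve_model(config: dict, context: dict) -> str:
    targeting = config.get("targeting", {})
    if targeting.get("enabled"):
        # 1. Individual targets: exact match on customerId or any field
        for t in targeting.get("individualTargets", []):
            if context.get(t["field"]) == t["value"]:
                return config["variants"][t["variantIndex"]]["model"]
        # 2. Rules: all conditions in a rule must match
        for rule in targeting.get("rules", []):
            if all(condition_matches(c, context) for c in rule["conditions"]):
                return config["variants"][rule["variantIndex"]]["model"]
    # 3. Fallback: weighted random over variants (the SDK keeps this sticky per session)
    variants = config["variants"]
    return random.choices(variants, [v["weight"] for v in variants])[0]["model"]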

Configuration

Configure targeting in the dashboard when editing a model config:
{
  "enabled": true,
  "individualTargets": [
    { "field": "customerId", "value": "vip-user-123", "variantIndex": 1 }
  ],
  "rules": [
    {
      "conditions": [
        { "field": "plan", "operator": "eq", "value": "enterprise" }
      ],
      "variantIndex": 1
    }
  ]
}
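
With this config, either of the following calls resolves to the variant at index 1 (values mirror the example above; fallback omitted for brevity):

# Matches the individual target on customerId
model = models.get("my-config", session_id, customer_id="vip-user-123")

# Matches the enterprise rule via context
model = models.get("my-config", session_id, context={"plan": "enterprise"})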

Supported Operators

Operator     Description           Example
eq           Equals                plan = "enterprise"
neq          Not equals            plan ≠ "free"
in           In list               plan in ["enterprise", "business"]
nin          Not in list           region not in ["cn", "ru"]
contains     Contains substring    email contains "@acme.com"
startsWith   Starts with           region starts with "eu-"
endsWith     Ends with             email ends with ".gov"
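
For example, a rule that routes business-tier European users to variant 1 combines in and startsWith conditions (illustrative values):

{
  "conditions": [
    { "field": "plan", "operator": "in", "value": ["enterprise", "business"] },
    { "field": "region", "operator": "startsWith", "value": "eu-" }
  ],
  "variantIndex": 1
}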

Custom Model Providers

A/B test between standard API models and custom-hosted models (self-hosted, Novita, Together, Fireworks, Ollama, etc.).

Dashboard Setup

Create a config with custom model names using any naming convention:
Variant    Model                  Weight
Control    gpt-4o                 50%
Custom     custom:my-llama-70b    50%
Use a prefix like custom:, together:, or local: to identify non-standard providers.

Provider Routing

Create a helper function to route model IDs to the correct provider:
import { createOpenAI } from "@ai-sdk/openai";

function createModelClient(modelId: string) {
  // Custom-hosted models
  if (modelId.startsWith("custom:")) {
    return createOpenAI({
      apiKey: process.env.CUSTOM_API_KEY,
      baseURL: "https://your-custom-endpoint.com/v1",
    })(modelId.replace("custom:", ""));
  }

  // Together AI
  if (modelId.startsWith("together:")) {
    return createOpenAI({
      apiKey: process.env.TOGETHER_API_KEY,
      baseURL: "https://api.together.xyz/v1",
    })(modelId.replace("together:", ""));
  }

  // Default to OpenAI
  return createOpenAI()(modelId);
}

Full Example

import { models } from "@fallom/trace";
import { createOpenAI } from "@ai-sdk/openai";
import { generateText } from "ai";

// Initialize once at startup
models.init({ apiKey: process.env.FALLOM_API_KEY });

async function chat(sessionId: string, message: string) {
  // Get A/B tested model
  const modelId = await models.get("my-agent", sessionId, {
    fallback: "gpt-4o-mini",
  });

  // Route to correct provider
  const model = createModelClient(modelId);

  // Use with Vercel AI SDK
  const result = await generateText({
    model,
    prompt: message,
  });

  return result.text;
}

Use Cases

  • Cost optimization - A/B test expensive vs cheap models
  • Latency testing - Compare self-hosted vs API latency
  • Gradual migration - Roll out new model providers safely
  • Fallback routing - Scale custom models to 0% instantly if issues arise

Resilience Guarantees

  • Zero latency - Targeting is evaluated client-side; no network call
  • Background sync - Config sync never blocks your requests
  • Graceful degradation - Returns your fallback on any error
  • Sticky sessions - The same session always gets the same model
