Run A/B tests on models with zero latency. The same session always gets the same model (sticky assignment).
Create and manage your model configs in the dashboard.

Setup

from fallom import models

models.init(api_key="your-fallom-api-key")

Basic Usage

from fallom import models

# Get assigned model for this session
model = models.get("summarizer-config", session_id)
# Returns: "gpt-4o" or "claude-3-5-sonnet" based on your config weights

agent = Agent(model=model)
agent.run(message)
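
Sticky assignment means the variant choice is deterministic per session. As an illustration only (not Fallom's actual implementation), a deterministic weighted pick can be built from a stable hash of the config key and session ID:

import hashlib

def sticky_pick(config_key: str, session_id: str, variants: list[tuple[str, float]]) -> str:
    """Illustrative sketch: map a session to a weighted variant deterministically.

    variants is a list of (model, weight) pairs whose weights sum to 1.0,
    e.g. [("gpt-4o", 0.5), ("claude-3-5-sonnet", 0.5)].
    """
    # Stable hash: the same (config, session) pair always lands in the same bucket
    digest = hashlib.sha256(f"{config_key}:{session_id}".encode()).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64  # uniform float in [0, 1)

    cumulative = 0.0
    for model, weight in variants:
        cumulative += weight
        if bucket < cumulative:
            return model
    return variants[-1][0]  # guard against floating-point rounding

Because the hash is stable, re-running a session never flips its variant mid-conversation.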

Version Pinning

Pin to a specific config version, or use latest (default):
# Use latest version (default)
model = models.get("my-config", session_id)

# Pin to specific version
model = models.get("my-config", session_id, version=2)

Fallback for Resilience

Always provide a fallback so your app works even if Fallom is down:
model = models.get(
    "my-config",
    session_id,
    fallback="gpt-4o-mini"  # Used if config not found or Fallom unreachable
)

User Targeting

Override the weighted distribution for specific users or segments:
model = models.get(
    "my-config",
    session_id,
    fallback="gpt-4o-mini",
    customer_id="user-123",  # For individual targeting
    context={                 # For rule-based targeting
        "plan": "enterprise",
        "region": "us-west"
    }
)

Route specific users or segments to specific model variants. This is useful for:
  • Beta testing - Roll out new models to specific users first
  • Enterprise features - Give premium users access to better models
  • Gradual rollouts - Target by region, plan, or any custom attribute

How It Works

Targeting rules are evaluated client-side for zero latency, in this order (see the sketch after the list):
  1. Individual Targets - Exact match on customerId or any field
  2. Rules - Condition-based targeting (all conditions in a rule must match)
  3. Fallback - If no targeting matches, use weighted random distribution
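
This precedence can be pictured as a small resolver. The sketch below is illustrative rather than Fallom's actual code; match_condition stands in for the operator logic covered under Supported Operators:

def match_condition(actual, operator, expected):
    # Only "eq" shown here; the full operator set is sketched under Supported Operators
    return operator == "eq" and actual == expected

def resolve_variant(targeting: dict, context: dict, customer_id=None):
    """Illustrative sketch: return a variant index if targeting matches, else None."""
    if not targeting.get("enabled"):
        return None

    # 1. Individual targets: exact match on customerId or any context field
    for target in targeting.get("individualTargets", []):
        actual = customer_id if target["field"] == "customerId" else context.get(target["field"])
        if actual == target["value"]:
            return target["variantIndex"]

    # 2. Rules: every condition in a rule must match for the rule to apply
    for rule in targeting.get("rules", []):
        if all(match_condition(context.get(c["field"]), c["operator"], c["value"])
               for c in rule["conditions"]):
            return rule["variantIndex"]

    # 3. No match: caller falls back to the weighted random (sticky) distribution
    return None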

Configuration

Configure targeting in the dashboard when editing a model config:
{
  "enabled": true,
  "individualTargets": [
    { "field": "customerId", "value": "vip-user-123", "variantIndex": 1 }
  ],
  "rules": [
    {
      "conditions": [
        { "field": "plan", "operator": "eq", "value": "enterprise" }
      ],
      "variantIndex": 1
    }
  ]
}
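
With this example config, a call carrying either customerId "vip-user-123" or a context with plan set to "enterprise" resolves to variant index 1. Using the Python call shown earlier (config name assumed for illustration):

model = models.get(
    "my-config",
    session_id,
    fallback="gpt-4o-mini",
    customer_id="vip-user-123",       # matches the individual target above
    context={"plan": "enterprise"},   # matches the enterprise rule above
)
# Either match routes this session to variant index 1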

Supported Operators

Operator     Description          Example
eq           Equals               plan = "enterprise"
neq          Not equals           plan ≠ "free"
in           In list              plan in ["enterprise", "business"]
nin          Not in list          region not in ["cn", "ru"]
contains     Contains substring   email contains "@acme.com"
startsWith   Starts with          region starts with "eu-"
endsWith     Ends with            email ends with ".gov"
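
Filling in the match_condition stub from the sketch above, the operator semantics might be implemented like this (illustrative, not Fallom's code):

def match_condition(actual, operator, expected):
    """Illustrative sketch: evaluate one targeting condition."""
    if operator == "eq":
        return actual == expected
    if operator == "neq":
        return actual != expected
    if operator == "in":
        return actual in expected
    if operator == "nin":
        return actual not in expected
    # String operators: fail closed on non-string context values
    if not isinstance(actual, str):
        return False
    if operator == "contains":
        return expected in actual
    if operator == "startsWith":
        return actual.startswith(expected)
    if operator == "endsWith":
        return actual.endswith(expected)
    return False  # unknown operator: treat as non-matching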

Custom Model Providers

A/B test between standard API models and custom-hosted models (self-hosted, Novita, Together, Fireworks, Ollama, etc.).

Dashboard Setup

Create a config with custom model names using any naming convention:
Variant    Model                  Weight
Control    gpt-4o                 50%
Custom     custom:my-llama-70b    50%

Use a prefix like custom:, together:, or local: to identify non-standard providers.

Provider Routing

Create a helper function to route model IDs to the correct provider:
import { createOpenAI } from "@ai-sdk/openai";

function createModelClient(modelId: string) {
  // Custom-hosted models
  if (modelId.startsWith("custom:")) {
    return createOpenAI({
      apiKey: process.env.CUSTOM_API_KEY,
      baseURL: "https://your-custom-endpoint.com/v1",
    })(modelId.replace("custom:", ""));
  }

  // Together AI
  if (modelId.startsWith("together:")) {
    return createOpenAI({
      apiKey: process.env.TOGETHER_API_KEY,
      baseURL: "https://api.together.xyz/v1",
    })(modelId.replace("together:", ""));
  }

  // Default to OpenAI
  return createOpenAI()(modelId);
}

Full Example

import { models } from "@fallom/trace";
import { createOpenAI } from "@ai-sdk/openai";
import { generateText } from "ai";

// Initialize once at startup
models.init({ apiKey: process.env.FALLOM_API_KEY });

async function chat(sessionId: string, message: string) {
  // Get A/B tested model
  const modelId = await models.get("my-agent", sessionId, {
    fallback: "gpt-4o-mini",
  });

  // Route to the correct provider (createModelClient defined above)
  const model = createModelClient(modelId);

  // Use with Vercel AI SDK
  const result = await generateText({
    model,
    prompt: message,
  });

  return result.text;
}

Use Cases

  • Cost optimization - A/B test expensive vs cheap models
  • Latency testing - Compare self-hosted vs API latency
  • Gradual migration - Roll out new model providers safely
  • Fallback routing - Scale custom models to 0% instantly if issues arise

Resilience Guarantees

  • Zero Latency - Targeting is evaluated client-side; no network call at request time.
  • Background Sync - Config sync runs in the background and never blocks your requests.
  • Graceful Degradation - Returns your fallback on any error.
  • Sticky Sessions - The same session always gets the same model.

Next Steps

  • Prompt Management - Test different prompts alongside model variants.
  • View Analytics - Analyze experiment results in your dashboard.