Fallom provides a wrapper for the Vercel AI SDK that automatically traces all your LLM calls, including streaming with time-to-first-token metrics.
## Installation

```bash
npm install @fallom/trace ai @ai-sdk/openai
```
## Quick Start

```typescript
import { trace } from "@fallom/trace";
import * as ai from "ai";
import { createOpenAI } from "@ai-sdk/openai";

// 1. Initialize Fallom
await trace.init({ apiKey: "your-fallom-api-key" });

// 2. Wrap the AI SDK
const { generateText, streamText, generateObject, streamObject } =
  trace.wrapAISDK(ai);

// 3. Create your provider
const openai = createOpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

// 4. Set session context (configKey, sessionId, userId)
trace.setSession("my-app", "session-123", "user-456");

// 5. Use as normal - automatically traced!
const { text } = await generateText({
  model: openai("gpt-4o"),
  prompt: "What is the capital of France?",
});
```
## With OpenRouter

```typescript
import { trace } from "@fallom/trace";
import * as ai from "ai";
import { createOpenAI } from "@ai-sdk/openai";

// Initialize Fallom
await trace.init({ apiKey: process.env.FALLOM_API_KEY });

// Wrap the AI SDK functions
const { generateText, streamText, generateObject, streamObject } =
  trace.wrapAISDK(ai);

// Create OpenRouter provider
const openrouter = createOpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.OPENROUTER_API_KEY,
});

// Set session context with user tracking
// Args: configKey, sessionId, userId (optional)
trace.setSession("my-app", "conversation-123", "user-456");

// All calls are automatically traced with session + user
const { text } = await generateText({
  model: openrouter("openai/gpt-4o-mini"),
  prompt: "Hello!",
});
console.log(text);
// Trace includes: model, tokens, latency, session_id, customer_id
```
## setSession Parameters

```typescript
trace.setSession(
  configKey: string,  // Groups traces in the dashboard (e.g., "my-app", "chatbot")
  sessionId: string,  // Conversation/session ID (e.g., "conv-123")
  userId?: string     // Optional: end-user ID for per-user analytics
);
```
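In a server handling many users, call `setSession` before each traced request so every call lands under the right conversation and user. A minimal sketch, reusing the wrapped `generateText` and `openai` provider from the Quick Start (the handler shape and the `"support-bot"` config key are illustrative):

```typescript
// Hypothetical request handler: scope each request to its own session
async function handleChat(conversationId: string, userId: string, message: string) {
  trace.setSession("support-bot", conversationId, userId);
  const { text } = await generateText({
    model: openai("gpt-4o"),
    prompt: message,
  });
  return text;
}
```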
## Streaming with TTFT

Streaming responses automatically capture time to first token (TTFT):

```typescript
import { trace } from "@fallom/trace";
import * as ai from "ai";
import { createOpenAI } from "@ai-sdk/openai";

await trace.init({ apiKey: process.env.FALLOM_API_KEY });
const { streamText } = trace.wrapAISDK(ai);

const openai = createOpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

trace.setSession("my-app", "session-123", "user-456");

const result = await streamText({
  model: openai("gpt-4o"),
  prompt: "Write a short poem about coding.",
});

// Consume the stream
for await (const chunk of result.textStream) {
  process.stdout.write(chunk);
}

// Trace is sent after the stream completes with:
// - Total duration
// - Time to first token
// - Token counts
// - Session and user IDs
```
## Structured Output

`generateObject` and `streamObject` are also supported:

```typescript
import { trace } from "@fallom/trace";
import * as ai from "ai";
import { createOpenAI } from "@ai-sdk/openai";
import { z } from "zod";

await trace.init({ apiKey: process.env.FALLOM_API_KEY });
const { generateObject } = trace.wrapAISDK(ai);

const openrouter = createOpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.OPENROUTER_API_KEY,
});

trace.setSession("my-app", "session-123", "user-456");

const { object } = await generateObject({
  model: openrouter("openai/gpt-4o-mini"),
  schema: z.object({
    name: z.string(),
    age: z.number(),
  }),
  prompt: "Generate a random person.",
});

console.log(object); // { name: "Alice", age: 28 }
```
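`streamObject` works the same way, emitting partial objects as the model streams. A minimal sketch, reusing the wrapped SDK, the `openrouter` provider, and the schema from the example above:

```typescript
const { streamObject } = trace.wrapAISDK(ai);

const result = await streamObject({
  model: openrouter("openai/gpt-4o-mini"),
  schema: z.object({
    name: z.string(),
    age: z.number(),
  }),
  prompt: "Generate a random person.",
});

// Partial objects arrive as the stream progresses; the trace is sent on completion
for await (const partial of result.partialObjectStream) {
  console.log(partial);
}
```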
## Tool Calling

The Vercel AI SDK supports tool/function calling with `maxSteps` for multi-step agent behavior. All tool calls are automatically traced:

```typescript
import { trace } from "@fallom/trace";
import * as ai from "ai";
import { createOpenAI } from "@ai-sdk/openai";
import { z } from "zod";

await trace.init({ apiKey: "your-fallom-api-key" });
const { generateText } = trace.wrapAISDK(ai);

const openai = createOpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

// Define tools
const weatherTool = ai.tool({
  description: "Get the current weather for a location",
  parameters: z.object({
    location: z.string().describe("The city and country"),
  }),
  execute: async ({ location }) => ({
    location,
    temperature: 72,
    condition: "sunny",
  }),
});

const calculatorTool = ai.tool({
  description: "Perform math calculations",
  parameters: z.object({
    expression: z.string().describe("Math expression like '2 + 2'"),
  }),
  execute: async ({ expression }) => {
    const result = eval(expression); // Use a safe evaluator in production
    return { expression, result };
  },
});

trace.setSession("my-agent", "session-123", "user-456");

// Agent-style execution with multiple tool call rounds
const { text, steps } = await generateText({
  model: openai("gpt-4o"),
  tools: { weather: weatherTool, calculator: calculatorTool },
  maxSteps: 5, // Allow multiple tool call rounds
  prompt: "What's the weather in Tokyo? Also, what is 15 * 7?",
});

console.log(text);
console.log(`Completed in ${steps.length} steps`);
```
For tool-calling requests, the trace additionally captures:

| Field | Description |
|-------|-------------|
| Tool Calls | Functions called by the model |
| Tool Results | Responses from tool execution |
| Steps | Each round of tool interaction |
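To see the same data locally, you can walk the `steps` array returned by `generateText`. A quick sketch, assuming AI SDK v4-style step objects with `toolCalls` and `toolResults` arrays:

```typescript
// Inspect the tool activity that gets traced
for (const step of steps) {
  for (const call of step.toolCalls) {
    console.log(`Tool call: ${call.toolName}`, call.args);
  }
  for (const result of step.toolResults) {
    console.log("Tool result:", result.result);
  }
}
```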
## Model A/B Testing

Test different models with consistent session assignment:

```typescript
import { trace, models } from "@fallom/trace";
import * as ai from "ai";
import { createOpenAI } from "@ai-sdk/openai";

// Initialize both trace and models
await trace.init({ apiKey: "your-fallom-api-key" });
models.init({ apiKey: "your-fallom-api-key" });

const { generateText } = trace.wrapAISDK(ai);

const openrouter = createOpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.OPENROUTER_API_KEY,
});

const sessionId = "user-123-conversation-456";

// Get the assigned model for this session (sticky assignment)
const modelId = await models.get("my-experiment", sessionId, {
  fallback: "openai/gpt-4o-mini",
});

trace.setSession("my-experiment", sessionId);

const { text } = await generateText({
  model: openrouter(modelId), // Uses the A/B test assigned model
  prompt: "Hello!",
});
```
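Because assignment is sticky, repeated lookups with the same session ID resolve to the same variant, so a conversation never switches models midway. A minimal illustration (that the equality holds assumes the experiment's assignments stay stable, which this guide implies but does not state outright):

```typescript
// Sticky assignment: the same session resolves to the same model variant
const again = await models.get("my-experiment", sessionId, {
  fallback: "openai/gpt-4o-mini",
});
console.log(modelId === again); // expected: true while assignments are stable
```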
## Prompt Management

Use managed prompts with the Vercel AI SDK:

```typescript
import { trace, prompts } from "@fallom/trace";
import * as ai from "ai";
import { createOpenAI } from "@ai-sdk/openai";

// Initialize both trace and prompts
await trace.init({ apiKey: process.env.FALLOM_API_KEY });
prompts.init({ apiKey: process.env.FALLOM_API_KEY });

const { generateText } = trace.wrapAISDK(ai);

const openrouter = createOpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.OPENROUTER_API_KEY,
});

trace.setSession("my-app", "session-123", "user-456");

// Get a managed prompt (auto-tagged to the trace)
const prompt = await prompts.get("fun-facts", {
  variables: { topic: "science" },
});

const { text } = await generateText({
  model: openrouter("openai/gpt-4o-mini"),
  system: prompt.system,
  prompt: prompt.user,
});
```
## What Gets Traced

| Field | Description |
|-------|-------------|
| Model | The model used |
| Duration | Total request time (ms) |
| Time to First Token | Streaming latency (ms), streaming only |
| Tokens | Prompt, completion, and total token counts |
| Prompts | Input messages/prompt |
| Completions | Model output |
| Session | Config key and session ID |
| Customer | Optional customer/user ID |
## With Other Providers

The wrapper works with any Vercel AI SDK provider. The snippets below assume you have already initialized Fallom, wrapped the SDK, and set a session as in the Quick Start.

**Anthropic**

```typescript
import { createAnthropic } from "@ai-sdk/anthropic";

const anthropic = createAnthropic({
  apiKey: process.env.ANTHROPIC_API_KEY,
});

const { text } = await generateText({
  model: anthropic("claude-3-5-sonnet-20241022"),
  prompt: "Hello!",
});
```

**Google**

```typescript
import { createGoogleGenerativeAI } from "@ai-sdk/google";

const google = createGoogleGenerativeAI({
  apiKey: process.env.GOOGLE_API_KEY,
});

const { text } = await generateText({
  model: google("gemini-1.5-pro"),
  prompt: "Hello!",
});
```

**Mistral**

```typescript
import { createMistral } from "@ai-sdk/mistral";

const mistral = createMistral({
  apiKey: process.env.MISTRAL_API_KEY,
});

const { text } = await generateText({
  model: mistral("mistral-large-latest"),
  prompt: "Hello!",
});
```
## Alternative: OpenRouter Broadcast

If you're using OpenRouter and prefer zero-SDK tracing, use OpenRouter Broadcast. However, the Vercel AI SDK doesn't pass custom body fields through to OpenRouter, so you'll need to use the OpenAI SDK directly for session tracking:

```typescript
import OpenAI from "openai";

// Use the OpenAI SDK for broadcast (not the Vercel AI SDK)
const openrouter = new OpenAI({
  apiKey: process.env.OPENROUTER_API_KEY,
  baseURL: "https://openrouter.ai/api/v1",
  defaultHeaders: {
    "X-Broadcast-URL": "https://broadcast.fallom.com/v1/traces",
    "X-Broadcast-Auth": "Bearer YOUR_FALLOM_API_KEY",
  },
});

// For structured output, use JSON mode
const response = await openrouter.chat.completions.create({
  model: "openai/gpt-4o-mini",
  messages: [
    { role: "system", content: "Output JSON: {name: string, age: number}" },
    { role: "user", content: "Generate a person." },
  ],
  response_format: { type: "json_object" },
  // @ts-ignore - OpenRouter extensions for session tracking
  session_id: "conversation-123",
  user: "customer-456",
});

const person = JSON.parse(response.choices[0].message.content!);
```
| Feature | SDK Wrapper (`wrapAISDK`) | Broadcast (OpenAI SDK) |
|---------|---------------------------|------------------------|
| Vercel AI SDK | ✅ Full support | ❌ Use OpenAI SDK |
| Session tracking | `trace.setSession()` | Body fields |
| Model A/B testing | ✅ | ❌ |
| Prompt management | ✅ | ❌ |
| Time to first token | ✅ (streaming) | ❌ |
| Setup complexity | More code | Just headers |
**Recommendation:** Use `trace.wrapAISDK()` with the Vercel AI SDK. It provides full tracing and session tracking, and works with all Vercel AI SDK functions, including `generateObject` and `streamObject`.
## Next Steps