Fallom provides a wrapper for the Vercel AI SDK that automatically traces all your LLM calls, including streaming with time-to-first-token metrics.
AI SDK v6 Support: Fallom fully supports AI SDK v6 including the new ToolLoopAgent class, tool approval workflows, and all v6 features. Both v5 and v6 patterns work with our SDK.
Installation
npm install @fallom/trace ai @ai-sdk/openai
Version Compatibility
| AI SDK Version | Status | Notes |
|---|---|---|
| v6.x | ✅ Fully Supported | Includes ToolLoopAgent, tool approval, DevTools |
| v5.x | ✅ Fully Supported | Use inputSchema for tools, stopWhen for multi-step agents |
| v4.x | ✅ Supported | Use parameters for tools |
We recommend upgrading to AI SDK v6 for the latest features including reusable agents and improved type safety. Run npx @ai-sdk/codemod upgrade v6 to migrate automatically.
Two Approaches (Pick ONE)
Choose ONE approach per call. Do NOT mix them: using both wrapAISDK and traceModel together will create duplicate traces (see the sketch below the table).
| Approach | Best For | Captures |
|---|---|---|
| wrapAISDK(ai) | Most users | ✅ Prompts, completions, tokens, costs, previews, finish reason |
| traceModel(model) | PostHog-style, simpler | ⚠️ Tokens only (no prompt/completion content) |
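For example, this anti-pattern double-counts every call (a minimal sketch; the session, ai, and openai names match the full examples below):
// ❌ Anti-pattern: SDK wrapper + model wrapper = duplicate traces
const { generateText } = session.wrapAISDK(ai);
const tracedModel = session.traceModel(openai("gpt-4o"));
await generateText({ model: tracedModel, prompt: "Hello!" }); // traced twice!
// ✅ Pick one: pass the plain model to the wrapped function instead
await generateText({ model: openai("gpt-4o"), prompt: "Hello!" });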
Option 1: Wrap the SDK (Recommended)
Full tracing - captures prompts, completions, tokens, costs, and all metadata.
import fallom from "@fallom/trace";
import * as ai from "ai";
import { createOpenAI } from "@ai-sdk/openai";
// Initialize Fallom
await fallom.init({ apiKey: "your-fallom-api-key" });
// Create a session
const session = fallom.session({
configKey: "my-app",
sessionId: "session-123",
customerId: "user-456",
});
// Wrap the AI SDK
const { generateText, streamText } = session.wrapAISDK(ai);
// Create your provider
const openai = createOpenAI({ apiKey: process.env.OPENAI_API_KEY });
// Use as normal - fully traced!
const { text } = await generateText({
model: openai("gpt-4o"),
prompt: "What is the capital of France?",
});
console.log(text);
// Trace includes: prompt, completion, tokens, costs, finish reason
Option 2: Wrap the Model (PostHog Style)
Simpler integration - captures tokens and timing only.
This approach does NOT capture prompt or completion content. Use Option 1 if you need full observability.
import fallom from "@fallom/trace";
import { generateText } from "ai"; // Import original SDK
import { createOpenAI } from "@ai-sdk/openai";
// Initialize Fallom
await fallom.init({ apiKey: "your-fallom-api-key" });
// Create a session
const session = fallom.session({
configKey: "my-app",
sessionId: "session-123",
customerId: "user-456",
});
// Create your provider
const openai = createOpenAI({ apiKey: process.env.OPENAI_API_KEY });
// Wrap the MODEL
const tracedModel = session.traceModel(openai("gpt-4o"));
// Use original SDK with traced model
const { text } = await generateText({
model: tracedModel,
prompt: "What is the capital of France?",
});
console.log(text);
// Trace includes: tokens, timing (no prompt/completion content)
With OpenRouter
import fallom from "@fallom/trace";
import * as ai from "ai";
import { createOpenAI } from "@ai-sdk/openai";
await fallom.init({ apiKey: process.env.FALLOM_API_KEY });
const session = fallom.session({
configKey: "my-app",
sessionId: "conversation-123",
customerId: "user-456",
});
// Wrap the AI SDK functions
const { generateText, streamText } = session.wrapAISDK(ai);
// Create OpenRouter provider
const openrouter = createOpenAI({
baseURL: "https://openrouter.ai/api/v1",
apiKey: process.env.OPENROUTER_API_KEY,
});
// All calls are automatically traced with session + user
const { text } = await generateText({
model: openrouter("openai/gpt-4o-mini"),
prompt: "Hello!",
});
console.log(text);
// Trace includes: model, tokens, latency, session_id, customer_id
Streaming with TTFT
Streaming responses automatically capture time to first token (TTFT):
import fallom from "@fallom/trace";
import * as ai from "ai";
import { createOpenAI } from "@ai-sdk/openai";
await fallom.init({ apiKey: process.env.FALLOM_API_KEY });
const session = fallom.session({
configKey: "my-app",
sessionId: "session-123",
customerId: "user-456",
});
const { streamText } = session.wrapAISDK(ai);
const openai = createOpenAI({
apiKey: process.env.OPENAI_API_KEY,
});
const result = await streamText({
model: openai("gpt-4o"),
prompt: "Write a short poem about coding.",
});
// Consume the stream
for await (const chunk of result.textStream) {
process.stdout.write(chunk);
}
// Trace is sent after stream completes with:
// - Total duration
// - Time to first token
// - Token counts
// - Session and user IDs
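If you also need the final text or token counts in your own code, the streamText result exposes them as promises that resolve once the stream has been consumed (v4-style usage field names shown; they differ slightly in v5+):
// Resolves after the stream has finished
const [fullText, usage] = await Promise.all([result.text, result.usage]);
console.log(usage); // e.g. { promptTokens: 12, completionTokens: 48, totalTokens: 60 }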
Structured Output
generateObject and streamObject are also supported:
import fallom from "@fallom/trace";
import * as ai from "ai";
import { createOpenAI } from "@ai-sdk/openai";
import { z } from "zod";
await fallom.init({ apiKey: process.env.FALLOM_API_KEY });
const session = fallom.session({
configKey: "my-app",
sessionId: "session-123",
customerId: "user-456",
});
const { generateObject } = session.wrapAISDK(ai);
const openrouter = createOpenAI({
baseURL: "https://openrouter.ai/api/v1",
apiKey: process.env.OPENROUTER_API_KEY,
});
const { object } = await generateObject({
model: openrouter("openai/gpt-4o-mini"),
schema: z.object({
name: z.string(),
age: z.number(),
}),
prompt: "Generate a random person.",
});
console.log(object); // { name: "Alice", age: 28 }
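streamObject follows the same pattern and streams partial objects as fields are generated. A minimal sketch, assuming streamObject is destructured from session.wrapAISDK alongside generateObject:
const { streamObject } = session.wrapAISDK(ai);
const result = await streamObject({
  model: openrouter("openai/gpt-4o-mini"),
  schema: z.object({
    name: z.string(),
    age: z.number(),
  }),
  prompt: "Generate a random person.",
});
// Partial objects arrive as the model fills in fields
for await (const partialObject of result.partialObjectStream) {
  console.log(partialObject); // {} → { name: "Alice" } → { name: "Alice", age: 28 }
}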
Tool Calling
The Vercel AI SDK supports tool/function calling with maxSteps for multi-turn agent behavior. All tool calls are automatically traced:
import fallom from "@fallom/trace";
import * as ai from "ai";
import { createOpenAI } from "@ai-sdk/openai";
import { z } from "zod";
await fallom.init({ apiKey: "your-fallom-api-key" });
const session = fallom.session({
configKey: "my-agent",
sessionId: "session-123",
customerId: "user-456",
});
const { generateText } = session.wrapAISDK(ai);
const openai = createOpenAI({
apiKey: process.env.OPENAI_API_KEY,
});
// Define tools
const weatherTool = ai.tool({
description: "Get the current weather for a location",
parameters: z.object({
location: z.string().describe("The city and country"),
}),
execute: async ({ location }) => ({
location,
temperature: 72,
condition: "sunny",
}),
});
const calculatorTool = ai.tool({
description: "Perform math calculations",
parameters: z.object({
expression: z.string().describe("Math expression like '2 + 2'"),
}),
execute: async ({ expression }) => {
const result = eval(expression); // Use a safe evaluator in production
return { expression, result };
},
});
// Agent-style execution with multiple tool calls
const { text, steps } = await generateText({
model: openai("gpt-4o"),
tools: { weather: weatherTool, calculator: calculatorTool },
maxSteps: 5, // Allow multiple tool call rounds
prompt: "What's the weather in Tokyo? Also, what is 15 * 7?",
});
console.log(text);
console.log(`Completed in ${steps.length} steps`);
Each tool-enabled call adds the following fields to the trace:
| Field | Description |
|---|---|
| Tool Calls | Functions called by the model |
| Tool Results | Responses from tool execution |
| Steps | Each round of tool interaction |
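To inspect this data in your own code, each entry in steps carries the tool calls and results for that round (a sketch assuming the v4/v5 step shape):
for (const step of steps) {
  for (const call of step.toolCalls ?? []) {
    console.log(`tool call: ${call.toolName}`, call.args);
  }
  for (const toolResult of step.toolResults ?? []) {
    console.log("tool result:", toolResult.result);
  }
}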
AI SDK v6: ToolLoopAgent
AI SDK v6 introduces the ToolLoopAgent class for building reusable agents. Fallom supports tracing ToolLoopAgent by wrapping the underlying AI SDK functions:
import fallom from "@fallom/trace";
import * as ai from "ai";
import { ToolLoopAgent } from "ai";
import { createOpenAI } from "@ai-sdk/openai";
import { z } from "zod";
await fallom.init({ apiKey: "your-fallom-api-key" });
const session = fallom.session({
configKey: "my-agent",
sessionId: "session-123",
customerId: "user-456",
});
// Wrap the AI SDK
const { generateText, streamText } = session.wrapAISDK(ai);
const openai = createOpenAI({
apiKey: process.env.OPENAI_API_KEY,
});
// Define your tools
const weatherTool = ai.tool({
description: "Get the current weather for a location",
parameters: z.object({
location: z.string().describe("The city and country"),
}),
execute: async ({ location }) => ({
location,
temperature: 72,
condition: "sunny",
}),
});
// Create a ToolLoopAgent with traced generateText
const weatherAgent = new ToolLoopAgent({
model: openai("gpt-4o"),
instructions: "You are a helpful weather assistant.",
tools: { weather: weatherTool },
// Pass the wrapped generateText for tracing
generateText: generateText,
});
// Use the agent - all calls are automatically traced!
const result = await weatherAgent.generate({
prompt: "What's the weather in San Francisco?",
});
console.log(result.text);
ToolLoopAgent with Streaming
// Create agent with traced streamText for streaming
const streamingAgent = new ToolLoopAgent({
model: openai("gpt-4o"),
instructions: "You are a helpful assistant.",
tools: { weather: weatherTool },
streamText: streamText, // Use wrapped streamText
});
const result = await streamingAgent.stream({
prompt: "What's the weather in New York?",
});
for await (const chunk of result.textStream) {
process.stdout.write(chunk);
}
Call Options (v6)
AI SDK v6 supports type-safe call options for dynamic agent configuration:
import { z } from "zod";
const supportAgent = new ToolLoopAgent({
model: openai("gpt-4o"),
callOptionsSchema: z.object({
userId: z.string(),
accountType: z.enum(["free", "pro", "enterprise"]),
}),
prepareCall: ({ options, ...settings }) => ({
...settings,
instructions: `You are a helpful support agent.
- User Account type: ${options.accountType}
- User ID: ${options.userId}`,
}),
generateText: generateText, // Traced
});
// Call with typed options
const result = await supportAgent.generate({
prompt: "How do I upgrade my account?",
options: {
userId: "user_123",
accountType: "free",
},
});
Tool Approval (v6)
AI SDK v6 supports tool approval workflows. Use onStepFinish to implement a human-in-the-loop gate:
const { text, steps } = await generateText({
model: openai("gpt-4o"),
tools: { weather: weatherTool },
maxSteps: 5,
prompt: "What's the weather in Tokyo?",
onStepFinish: async ({ stepType, toolCalls, toolResults }) => {
// Implement your approval logic here
// In production, you might pause and wait for user approval
if (toolCalls && toolCalls.length > 0) {
console.log(`Tools called: ${toolCalls.map(tc => tc.toolName).join(", ")}`);
// await waitForUserApproval(toolCalls);
}
},
});
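The waitForUserApproval helper referenced above is hypothetical; one possible implementation gates on stdin (swap in your own UI, Slack bot, or approval queue):
import readline from "node:readline/promises";
// Hypothetical example: block until a human approves the pending tool calls
async function waitForUserApproval(toolCalls: Array<{ toolName: string }>) {
  const rl = readline.createInterface({ input: process.stdin, output: process.stdout });
  const answer = await rl.question(
    `Approve tool calls [${toolCalls.map((tc) => tc.toolName).join(", ")}]? (y/n) `
  );
  rl.close();
  if (answer.trim().toLowerCase() !== "y") {
    throw new Error("Tool execution rejected by reviewer");
  }
}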
Model A/B Testing
Test different models with consistent session assignment:
import fallom from "@fallom/trace";
import * as ai from "ai";
import { createOpenAI } from "@ai-sdk/openai";
await fallom.init({ apiKey: "your-fallom-api-key" });
const session = fallom.session({
configKey: "my-experiment",
sessionId: "user-123-conversation-456",
});
const { generateText } = session.wrapAISDK(ai);
const openrouter = createOpenAI({
baseURL: "https://openrouter.ai/api/v1",
apiKey: process.env.OPENROUTER_API_KEY,
});
// Get assigned model for this session (sticky assignment)
const modelId = await session.getModel({ fallback: "openai/gpt-4o-mini" });
const { text } = await generateText({
model: openrouter(modelId), // Uses A/B test assigned model
prompt: "Hello!",
});
Prompt Management
Use managed prompts with the Vercel AI SDK:
import fallom, { prompts } from "@fallom/trace";
import * as ai from "ai";
import { createOpenAI } from "@ai-sdk/openai";
await fallom.init({ apiKey: process.env.FALLOM_API_KEY });
const session = fallom.session({
configKey: "my-app",
sessionId: "session-123",
customerId: "user-456",
});
const { generateText } = session.wrapAISDK(ai);
const openrouter = createOpenAI({
baseURL: "https://openrouter.ai/api/v1",
apiKey: process.env.OPENROUTER_API_KEY,
});
// Get managed prompt (auto-tagged to trace)
const prompt = await prompts.get("fun-facts", {
variables: { topic: "science" },
});
const { text } = await generateText({
model: openrouter("openai/gpt-4o-mini"),
system: prompt.system,
prompt: prompt.user,
});
What Gets Traced
| Field | Description |
|---|---|
| Model | The model used |
| Duration | Total request time (ms) |
| Time to First Token | Streaming latency (ms) - streaming only |
| Tokens | Prompt, completion, and total token counts |
| Prompts | Input messages/prompt |
| Completions | Model output |
| Session | Config key and session ID |
| Customer | Optional customer/user ID |
With Other Providers
The wrapper works with any Vercel AI SDK provider:
import { createAnthropic } from "@ai-sdk/anthropic";
const anthropic = createAnthropic({
apiKey: process.env.ANTHROPIC_API_KEY,
});
const { text } = await generateText({
model: anthropic("claude-3-5-sonnet-20241022"),
prompt: "Hello!",
});
import { createGoogleGenerativeAI } from "@ai-sdk/google";
const google = createGoogleGenerativeAI({
apiKey: process.env.GOOGLE_API_KEY,
});
const { text } = await generateText({
model: google("gemini-1.5-pro"),
prompt: "Hello!",
});
import { createMistral } from "@ai-sdk/mistral";
const mistral = createMistral({
apiKey: process.env.MISTRAL_API_KEY,
});
const { text } = await generateText({
model: mistral("mistral-large-latest"),
prompt: "Hello!",
});
Alternative: OpenRouter Broadcast
If you’re using OpenRouter and prefer zero-SDK tracing, use OpenRouter Broadcast. However, Vercel AI SDK doesn’t pass custom body fields to OpenRouter, so you’ll need to use the OpenAI SDK directly for session tracking:
import OpenAI from "openai";
// Use OpenAI SDK for broadcast (not Vercel AI SDK)
const openrouter = new OpenAI({
apiKey: process.env.OPENROUTER_API_KEY,
baseURL: "https://openrouter.ai/api/v1",
defaultHeaders: {
"X-Broadcast-URL": "https://broadcast.fallom.com/v1/traces",
"X-Broadcast-Auth": "Bearer YOUR_FALLOM_API_KEY",
},
});
// For structured output, use JSON mode
const response = await openrouter.chat.completions.create({
model: "openai/gpt-4o-mini",
messages: [
{ role: "system", content: "Output JSON: {name: string, age: number}" },
{ role: "user", content: "Generate a person." },
],
response_format: { type: "json_object" },
// @ts-ignore - OpenRouter extensions for session tracking
session_id: "conversation-123",
user: "customer-456",
});
const person = JSON.parse(response.choices[0].message.content!);
| Feature | SDK Wrapper (session.wrapAISDK) | Broadcast (OpenAI SDK) |
|---|---|---|
| Vercel AI SDK | ✅ Full support | ❌ Use OpenAI SDK |
| Session tracking | fallom.session() | Body fields |
| Model A/B Testing | ✅ | ❌ |
| Prompt Management | ✅ | ❌ |
| Time to first token | ✅ (streaming) | ❌ |
| Setup complexity | More code | Just headers |
Recommendation: Use session.wrapAISDK() for Vercel AI SDK. It provides
full tracing, session tracking, and works with all Vercel AI SDK functions
including generateObject and streamObject.
Next Steps