Fallom provides a wrapper for the Vercel AI SDK that automatically traces all your LLM calls, including streaming with time-to-first-token metrics.
AI SDK v6 Support: Fallom fully supports AI SDK v6, including the new ToolLoopAgent class and tool approval workflows. Both v5 and v6 patterns work with our SDK.

Installation

npm install @fallom/trace ai @ai-sdk/openai

Version Compatibility

| AI SDK Version | Status | Notes |
|---|---|---|
| v6.x | ✅ Fully Supported | Includes ToolLoopAgent, tool approval, DevTools |
| v5.x | ✅ Fully Supported | Use inputSchema for tools, maxSteps for agents |
| v4.x | ✅ Supported | Use parameters for tools |
We recommend upgrading to AI SDK v6 for the latest features including reusable agents and improved type safety. Run npx @ai-sdk/codemod upgrade v6 to migrate automatically.
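The codemod handles renames automatically; for reference, the main tool-definition change is that v4's parameters field became inputSchema in v5 and v6. A minimal sketch with placeholder shapes (not the SDK's real types — real tools pass a zod schema to ai.tool()):

```typescript
// Illustrative shapes only: the schema values are placeholders,
// not real zod schemas.

// AI SDK v4 style: tool arguments are declared under `parameters`.
const v4ToolShape = {
  description: "Get the current weather for a location",
  parameters: { location: "string" },
};

// AI SDK v5+ style: the same field is named `inputSchema`.
const v5ToolShape = {
  description: "Get the current weather for a location",
  inputSchema: { location: "string" },
};

console.log(Object.keys(v4ToolShape)); // field names in the v4 shape
console.log(Object.keys(v5ToolShape)); // field names in the v5+ shape
```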

Two Approaches (Pick ONE)

Choose ONE approach per call. Do NOT mix them. Using both wrapAISDK and traceModel together will create duplicate traces.
| Approach | Best For | Captures |
|---|---|---|
| wrapAISDK(ai) | Most users | ✅ Prompts, completions, tokens, costs, previews, finish reason |
| traceModel(model) | PostHog-style, simpler | ⚠️ Tokens only (no prompt/completion content) |

Option 1: Wrap the SDK (Recommended)

Full tracing - captures prompts, completions, tokens, costs, and all metadata.
import fallom from "@fallom/trace";
import * as ai from "ai";
import { createOpenAI } from "@ai-sdk/openai";

// Initialize Fallom
await fallom.init({ apiKey: "your-fallom-api-key" });

// Create a session
const session = fallom.session({
  configKey: "my-app",
  sessionId: "session-123",
  customerId: "user-456",
});

// Wrap the AI SDK
const { generateText, streamText } = session.wrapAISDK(ai);

// Create your provider
const openai = createOpenAI({ apiKey: process.env.OPENAI_API_KEY });

// Use as normal - fully traced!
const { text } = await generateText({
  model: openai("gpt-4o"),
  prompt: "What is the capital of France?",
});

console.log(text);
// Trace includes: prompt, completion, tokens, costs, finish reason

Option 2: Wrap the Model (PostHog Style)

Simpler integration - captures tokens and timing only.
This approach does NOT capture prompt or completion content. Use Option 1 if you need full observability.
import fallom from "@fallom/trace";
import { generateText } from "ai"; // Import original SDK
import { createOpenAI } from "@ai-sdk/openai";

// Initialize Fallom
await fallom.init({ apiKey: "your-fallom-api-key" });

// Create a session
const session = fallom.session({
  configKey: "my-app",
  sessionId: "session-123",
  customerId: "user-456",
});

// Create your provider
const openai = createOpenAI({ apiKey: process.env.OPENAI_API_KEY });

// Wrap the MODEL
const tracedModel = session.traceModel(openai("gpt-4o"));

// Use original SDK with traced model
const { text } = await generateText({
  model: tracedModel,
  prompt: "What is the capital of France?",
});

console.log(text);
// Trace includes: tokens, timing (no prompt/completion content)

With OpenRouter

import fallom from "@fallom/trace";
import * as ai from "ai";
import { createOpenAI } from "@ai-sdk/openai";

await fallom.init({ apiKey: process.env.FALLOM_API_KEY });

const session = fallom.session({
  configKey: "my-app",
  sessionId: "conversation-123",
  customerId: "user-456",
});

// Wrap the AI SDK functions
const { generateText, streamText } = session.wrapAISDK(ai);

// Create OpenRouter provider
const openrouter = createOpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.OPENROUTER_API_KEY,
});

// All calls are automatically traced with session + user
const { text } = await generateText({
  model: openrouter("openai/gpt-4o-mini"),
  prompt: "Hello!",
});

console.log(text);
// Trace includes: model, tokens, latency, session_id, customer_id

Streaming with TTFT

Streaming responses automatically capture time to first token (TTFT):
import fallom from "@fallom/trace";
import * as ai from "ai";
import { createOpenAI } from "@ai-sdk/openai";

await fallom.init({ apiKey: process.env.FALLOM_API_KEY });

const session = fallom.session({
  configKey: "my-app",
  sessionId: "session-123",
  customerId: "user-456",
});

const { streamText } = session.wrapAISDK(ai);

const openai = createOpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

const result = await streamText({
  model: openai("gpt-4o"),
  prompt: "Write a short poem about coding.",
});

// Consume the stream
for await (const chunk of result.textStream) {
  process.stdout.write(chunk);
}

// Trace is sent after stream completes with:
// - Total duration
// - Time to first token
// - Token counts
// - Session and user IDs

Structured Output

generateObject and streamObject are also supported:
import fallom from "@fallom/trace";
import * as ai from "ai";
import { createOpenAI } from "@ai-sdk/openai";
import { z } from "zod";

await fallom.init({ apiKey: process.env.FALLOM_API_KEY });

const session = fallom.session({
  configKey: "my-app",
  sessionId: "session-123",
  customerId: "user-456",
});

const { generateObject } = session.wrapAISDK(ai);

const openrouter = createOpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.OPENROUTER_API_KEY,
});

const { object } = await generateObject({
  model: openrouter("openai/gpt-4o-mini"),
  schema: z.object({
    name: z.string(),
    age: z.number(),
  }),
  prompt: "Generate a random person.",
});

console.log(object); // { name: "Alice", age: 28 }

Tool Calling (Agents)

The Vercel AI SDK supports tool/function calling with maxSteps for multi-turn agent behavior. All tool calls are automatically traced:
import fallom from "@fallom/trace";
import * as ai from "ai";
import { createOpenAI } from "@ai-sdk/openai";
import { z } from "zod";

await fallom.init({ apiKey: "your-fallom-api-key" });

const session = fallom.session({
  configKey: "my-agent",
  sessionId: "session-123",
  customerId: "user-456",
});

const { generateText } = session.wrapAISDK(ai);

const openai = createOpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

// Define tools
const weatherTool = ai.tool({
  description: "Get the current weather for a location",
  parameters: z.object({
    location: z.string().describe("The city and country"),
  }),
  execute: async ({ location }) => ({
    location,
    temperature: 72,
    condition: "sunny",
  }),
});

const calculatorTool = ai.tool({
  description: "Perform math calculations",
  parameters: z.object({
    expression: z.string().describe("Math expression like '2 + 2'"),
  }),
  execute: async ({ expression }) => {
    const result = eval(expression); // Use a safe evaluator in production
    return { expression, result };
  },
});

// Agent-style execution with multiple tool calls
const { text, steps } = await generateText({
  model: openai("gpt-4o"),
  tools: { weather: weatherTool, calculator: calculatorTool },
  maxSteps: 5, // Allow multiple tool call rounds
  prompt: "What's the weather in Tokyo? Also, what is 15 * 7?",
});

console.log(text);
console.log(`Completed in ${steps.length} steps`);

What Gets Traced for Tool Calls

| Field | Description |
|---|---|
| Tool Calls | Functions called by the model |
| Tool Results | Responses from tool execution |
| Steps | Each round of tool interaction |
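To see what ends up in these fields, you can walk the steps array that generateText returns. The sketch below runs against a mocked, simplified step shape (the types and data are illustrative, not the SDK's full result types) rather than a live call:

```typescript
// Simplified, illustrative shapes loosely following the AI SDK's step results.
type ToolCall = { toolName: string; args: Record<string, unknown> };
type ToolResult = { toolName: string; result: unknown };
type Step = { toolCalls: ToolCall[]; toolResults: ToolResult[] };

// Summarize which tools ran in each round, mirroring the traced fields above.
function summarizeSteps(steps: Step[]): string[] {
  return steps.map((step, i) => {
    const names = step.toolCalls.map((tc) => tc.toolName).join(", ") || "none";
    return `step ${i + 1}: tools=[${names}]`;
  });
}

// Mocked two-step agent run (data is made up for illustration).
const mockSteps: Step[] = [
  {
    toolCalls: [
      { toolName: "weather", args: { location: "Tokyo" } },
      { toolName: "calculator", args: { expression: "15 * 7" } },
    ],
    toolResults: [
      { toolName: "weather", result: { temperature: 72 } },
      { toolName: "calculator", result: { result: 105 } },
    ],
  },
  { toolCalls: [], toolResults: [] }, // final text-only step
];

console.log(summarizeSteps(mockSteps));
// → ["step 1: tools=[weather, calculator]", "step 2: tools=[none]"]
```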

AI SDK v6: ToolLoopAgent

AI SDK v6 introduces the ToolLoopAgent class for building reusable agents. Fallom supports tracing ToolLoopAgent by wrapping the underlying AI SDK functions:
import fallom from "@fallom/trace";
import * as ai from "ai";
import { ToolLoopAgent } from "ai";
import { createOpenAI } from "@ai-sdk/openai";
import { z } from "zod";

await fallom.init({ apiKey: "your-fallom-api-key" });

const session = fallom.session({
  configKey: "my-agent",
  sessionId: "session-123",
  customerId: "user-456",
});

// Wrap the AI SDK
const { generateText, streamText } = session.wrapAISDK(ai);

const openai = createOpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

// Define your tools
const weatherTool = ai.tool({
  description: "Get the current weather for a location",
  inputSchema: z.object({
    location: z.string().describe("The city and country"),
  }),
  execute: async ({ location }) => ({
    location,
    temperature: 72,
    condition: "sunny",
  }),
});

// Create a ToolLoopAgent with traced generateText
const weatherAgent = new ToolLoopAgent({
  model: openai("gpt-4o"),
  instructions: "You are a helpful weather assistant.",
  tools: { weather: weatherTool },
  // Pass the wrapped generateText for tracing
  generateText: generateText,
});

// Use the agent - all calls are automatically traced!
const result = await weatherAgent.generate({
  prompt: "What's the weather in San Francisco?",
});

console.log(result.text);

ToolLoopAgent with Streaming

// Create agent with traced streamText for streaming
const streamingAgent = new ToolLoopAgent({
  model: openai("gpt-4o"),
  instructions: "You are a helpful assistant.",
  tools: { weather: weatherTool },
  streamText: streamText, // Use wrapped streamText
});

const result = await streamingAgent.stream({
  prompt: "What's the weather in New York?",
});

for await (const chunk of result.textStream) {
  process.stdout.write(chunk);
}

Call Options (v6)

AI SDK v6 supports type-safe call options for dynamic agent configuration:
import { z } from "zod";

const supportAgent = new ToolLoopAgent({
  model: openai("gpt-4o"),
  callOptionsSchema: z.object({
    userId: z.string(),
    accountType: z.enum(["free", "pro", "enterprise"]),
  }),
  prepareCall: ({ options, ...settings }) => ({
    ...settings,
    instructions: `You are a helpful support agent.
- User Account type: ${options.accountType}
- User ID: ${options.userId}`,
  }),
  generateText: generateText, // Traced
});

// Call with typed options
const result = await supportAgent.generate({
  prompt: "How do I upgrade my account?",
  options: {
    userId: "user_123",
    accountType: "free",
  },
});

Tool Approval (Human-in-the-Loop)

AI SDK v6 supports tool approval workflows. Use onStepFinish to implement human-in-the-loop:
const { text, steps } = await generateText({
  model: openai("gpt-4o"),
  tools: { weather: weatherTool },
  maxSteps: 5,
  prompt: "What's the weather in Tokyo?",
  onStepFinish: async ({ stepType, toolCalls, toolResults }) => {
    // Implement your approval logic here
    // In production, you might pause and wait for user approval
    if (toolCalls && toolCalls.length > 0) {
      console.log(`Tools called: ${toolCalls.map(tc => tc.toolName).join(", ")}`);
      // await waitForUserApproval(toolCalls);
    }
  },
});

Model A/B Testing

Test different models with consistent session assignment:
import fallom from "@fallom/trace";
import * as ai from "ai";
import { createOpenAI } from "@ai-sdk/openai";

await fallom.init({ apiKey: "your-fallom-api-key" });

const session = fallom.session({
  configKey: "my-experiment",
  sessionId: "user-123-conversation-456",
});

const { generateText } = session.wrapAISDK(ai);

const openrouter = createOpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.OPENROUTER_API_KEY,
});

// Get assigned model for this session (sticky assignment)
const modelId = await session.getModel({ fallback: "openai/gpt-4o-mini" });

const { text } = await generateText({
  model: openrouter(modelId), // Uses A/B test assigned model
  prompt: "Hello!",
});

Prompt Management

Use managed prompts with the Vercel AI SDK:
import fallom, { prompts } from "@fallom/trace";
import * as ai from "ai";
import { createOpenAI } from "@ai-sdk/openai";

await fallom.init({ apiKey: process.env.FALLOM_API_KEY });

const session = fallom.session({
  configKey: "my-app",
  sessionId: "session-123",
  customerId: "user-456",
});

const { generateText } = session.wrapAISDK(ai);

const openrouter = createOpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.OPENROUTER_API_KEY,
});

// Get managed prompt (auto-tagged to trace)
const prompt = await prompts.get("fun-facts", {
  variables: { topic: "science" },
});

const { text } = await generateText({
  model: openrouter("openai/gpt-4o-mini"),
  system: prompt.system,
  prompt: prompt.user,
});

What Gets Traced

| Field | Description |
|---|---|
| Model | The model used |
| Duration | Total request time (ms) |
| Time to First Token | Streaming latency (ms) - streaming only |
| Tokens | Prompt, completion, and total token counts |
| Prompts | Input messages/prompt |
| Completions | Model output |
| Session | Config key and session ID |
| Customer | Optional customer/user ID |

With Other Providers

The wrapper works with any Vercel AI SDK provider:
import { createAnthropic } from "@ai-sdk/anthropic";

const anthropic = createAnthropic({
  apiKey: process.env.ANTHROPIC_API_KEY,
});

const { text } = await generateText({
  model: anthropic("claude-3-5-sonnet-20241022"),
  prompt: "Hello!",
});

Alternative: OpenRouter Broadcast

If you’re using OpenRouter and prefer zero-SDK tracing, use OpenRouter Broadcast. However, Vercel AI SDK doesn’t pass custom body fields to OpenRouter, so you’ll need to use the OpenAI SDK directly for session tracking:
import OpenAI from "openai";

// Use OpenAI SDK for broadcast (not Vercel AI SDK)
const openrouter = new OpenAI({
  apiKey: process.env.OPENROUTER_API_KEY,
  baseURL: "https://openrouter.ai/api/v1",
  defaultHeaders: {
    "X-Broadcast-URL": "https://broadcast.fallom.com/v1/traces",
    "X-Broadcast-Auth": "Bearer YOUR_FALLOM_API_KEY",
  },
});

// For structured output, use JSON mode
const response = await openrouter.chat.completions.create({
  model: "openai/gpt-4o-mini",
  messages: [
    { role: "system", content: "Output JSON: {name: string, age: number}" },
    { role: "user", content: "Generate a person." },
  ],
  response_format: { type: "json_object" },
  // @ts-ignore - OpenRouter extensions for session tracking
  session_id: "conversation-123",
  user: "customer-456",
});

const person = JSON.parse(response.choices[0].message.content!);

| Feature | SDK Wrapper (session.wrapAISDK) | Broadcast (OpenAI SDK) |
|---|---|---|
| Vercel AI SDK | ✅ Full support | ❌ Use OpenAI SDK |
| Session tracking | fallom.session() | Body fields |
| Model A/B Testing | ✅ | ❌ |
| Prompt Management | ✅ | ❌ |
| Time to first token | ✅ (streaming) | ❌ |
| Setup complexity | More code | Just headers |
Recommendation: Use session.wrapAISDK() for Vercel AI SDK. It provides full tracing, session tracking, and works with all Vercel AI SDK functions including generateObject and streamObject.

Next Steps