Fallom provides a wrapper for the Vercel AI SDK that automatically traces all your LLM calls, including streaming with time-to-first-token metrics.
## Installation

```bash
npm install @fallom/trace ai @ai-sdk/openai
```
## Quick Start

```typescript
import { trace } from "@fallom/trace";
import * as ai from "ai";
import { createOpenAI } from "@ai-sdk/openai";

// 1. Initialize Fallom
await trace.init({ apiKey: "your-fallom-api-key" });

// 2. Wrap the AI SDK
const { generateText, streamText, generateObject, streamObject } =
  trace.wrapAISDK(ai);

// 3. Create your provider
const openai = createOpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

// 4. Set session context (configKey, sessionId, userId)
trace.setSession("my-app", "session-123", "user-456");

// 5. Use as normal - automatically traced!
const { text } = await generateText({
  model: openai("gpt-4o"),
  prompt: "What is the capital of France?",
});
```
## With OpenRouter

```typescript
import { trace } from "@fallom/trace";
import * as ai from "ai";
import { createOpenAI } from "@ai-sdk/openai";

// Initialize Fallom
await trace.init({ apiKey: process.env.FALLOM_API_KEY });

// Wrap the AI SDK functions
const { generateText, streamText, generateObject, streamObject } =
  trace.wrapAISDK(ai);

// Create OpenRouter provider
const openrouter = createOpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.OPENROUTER_API_KEY,
});

// Set session context with user tracking
// Args: configKey, sessionId, userId (optional)
trace.setSession("my-app", "conversation-123", "user-456");

// All calls are automatically traced with session + user
const { text } = await generateText({
  model: openrouter("openai/gpt-4o-mini"),
  prompt: "Hello!",
});
console.log(text);
// Trace includes: model, tokens, latency, session_id, customer_id
```
## setSession Parameters

```typescript
trace.setSession(
  configKey: string,  // Groups traces in the dashboard (e.g., "my-app", "chatbot")
  sessionId: string,  // Conversation/session ID (e.g., "conv-123")
  userId?: string     // Optional: end-user ID for per-user analytics
);
```
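In a server handling many users, call `setSession` before each traced request so every call lands under the right conversation and user. A minimal sketch, reusing the wrapped `generateText` and `openai` provider from the Quick Start (the handler shape and the `"support-bot"` config key are illustrative):

```typescript
// Hypothetical request handler: scope each request to its own session
async function handleChat(conversationId: string, userId: string, message: string) {
  trace.setSession("support-bot", conversationId, userId);
  const { text } = await generateText({
    model: openai("gpt-4o"),
    prompt: message,
  });
  return text;
}
```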
## Streaming with TTFT

Streaming responses automatically capture time to first token (TTFT):

```typescript
import { trace } from "@fallom/trace";
import * as ai from "ai";
import { createOpenAI } from "@ai-sdk/openai";

await trace.init({ apiKey: process.env.FALLOM_API_KEY });
const { streamText } = trace.wrapAISDK(ai);

const openai = createOpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

trace.setSession("my-app", "session-123", "user-456");

const result = await streamText({
  model: openai("gpt-4o"),
  prompt: "Write a short poem about coding.",
});

// Consume the stream
for await (const chunk of result.textStream) {
  process.stdout.write(chunk);
}

// Trace is sent after the stream completes with:
// - Total duration
// - Time to first token
// - Token counts
// - Session and user IDs
```
## Structured Output

`generateObject` and `streamObject` are also supported:

```typescript
import { trace } from "@fallom/trace";
import * as ai from "ai";
import { createOpenAI } from "@ai-sdk/openai";
import { z } from "zod";

await trace.init({ apiKey: process.env.FALLOM_API_KEY });
const { generateObject } = trace.wrapAISDK(ai);

const openrouter = createOpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.OPENROUTER_API_KEY,
});

trace.setSession("my-app", "session-123", "user-456");

const { object } = await generateObject({
  model: openrouter("openai/gpt-4o-mini"),
  schema: z.object({
    name: z.string(),
    age: z.number(),
  }),
  prompt: "Generate a random person.",
});

console.log(object); // { name: "Alice", age: 28 }
```
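`streamObject` works the same way, emitting partial objects as the model streams. A minimal sketch, reusing the wrapped SDK, the `openrouter` provider, and the schema from the example above:

```typescript
const { streamObject } = trace.wrapAISDK(ai);

const result = await streamObject({
  model: openrouter("openai/gpt-4o-mini"),
  schema: z.object({
    name: z.string(),
    age: z.number(),
  }),
  prompt: "Generate a random person.",
});

// Partial objects arrive as the stream progresses; the trace is sent on completion
for await (const partial of result.partialObjectStream) {
  console.log(partial);
}
```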
## Tool Calling

The Vercel AI SDK supports tool/function calling with `maxSteps` for multi-step agent behavior. All tool calls are automatically traced:

```typescript
import { trace } from "@fallom/trace";
import * as ai from "ai";
import { createOpenAI } from "@ai-sdk/openai";
import { z } from "zod";

await trace.init({ apiKey: "your-fallom-api-key" });
const { generateText } = trace.wrapAISDK(ai);

const openai = createOpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

// Define tools
const weatherTool = ai.tool({
  description: "Get the current weather for a location",
  parameters: z.object({
    location: z.string().describe("The city and country"),
  }),
  execute: async ({ location }) => ({
    location,
    temperature: 72,
    condition: "sunny",
  }),
});

const calculatorTool = ai.tool({
  description: "Perform math calculations",
  parameters: z.object({
    expression: z.string().describe("Math expression like '2 + 2'"),
  }),
  execute: async ({ expression }) => {
    const result = eval(expression); // Use a safe evaluator in production
    return { expression, result };
  },
});

trace.setSession("my-agent", "session-123", "user-456");

// Agent-style execution with multiple tool call rounds
const { text, steps } = await generateText({
  model: openai("gpt-4o"),
  tools: { weather: weatherTool, calculator: calculatorTool },
  maxSteps: 5, // Allow multiple tool call rounds
  prompt: "What's the weather in Tokyo? Also, what is 15 * 7?",
});

console.log(text);
console.log(`Completed in ${steps.length} steps`);
```
For tool-calling requests, the trace additionally captures:

| Field | Description |
|-------|-------------|
| Tool Calls | Functions called by the model |
| Tool Results | Responses from tool execution |
| Steps | Each round of tool interaction |
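To see the same data locally, you can walk the `steps` array returned by `generateText`. A quick sketch, assuming AI SDK v4-style step objects with `toolCalls` and `toolResults` arrays:

```typescript
// Inspect the tool activity that gets traced
for (const step of steps) {
  for (const call of step.toolCalls) {
    console.log(`Tool call: ${call.toolName}`, call.args);
  }
  for (const result of step.toolResults) {
    console.log("Tool result:", result.result);
  }
}
```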
## Model A/B Testing

Test different models with consistent session assignment:

```typescript
import { trace, models } from "@fallom/trace";
import * as ai from "ai";
import { createOpenAI } from "@ai-sdk/openai";

// Initialize both trace and models
await trace.init({ apiKey: "your-fallom-api-key" });
models.init({ apiKey: "your-fallom-api-key" });

const { generateText } = trace.wrapAISDK(ai);

const openrouter = createOpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.OPENROUTER_API_KEY,
});

const sessionId = "user-123-conversation-456";

// Get the assigned model for this session (sticky assignment)
const modelId = await models.get("my-experiment", sessionId, {
  fallback: "openai/gpt-4o-mini",
});

trace.setSession("my-experiment", sessionId);

const { text } = await generateText({
  model: openrouter(modelId), // Uses the A/B test assigned model
  prompt: "Hello!",
});
```
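Because assignment is sticky, repeated lookups with the same session ID resolve to the same variant, so a conversation never switches models midway. A minimal illustration (that the equality holds assumes the experiment's assignments stay stable, which this guide implies but does not state outright):

```typescript
// Sticky assignment: the same session resolves to the same model variant
const again = await models.get("my-experiment", sessionId, {
  fallback: "openai/gpt-4o-mini",
});
console.log(modelId === again); // expected: true while assignments are stable
```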
## Prompt Management

Use managed prompts with the Vercel AI SDK:

```typescript
import { trace, prompts } from "@fallom/trace";
import * as ai from "ai";
import { createOpenAI } from "@ai-sdk/openai";

// Initialize both trace and prompts
await trace.init({ apiKey: process.env.FALLOM_API_KEY });
prompts.init({ apiKey: process.env.FALLOM_API_KEY });

const { generateText } = trace.wrapAISDK(ai);

const openrouter = createOpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.OPENROUTER_API_KEY,
});

trace.setSession("my-app", "session-123", "user-456");

// Get a managed prompt (auto-tagged to the trace)
const prompt = await prompts.get("fun-facts", {
  variables: { topic: "science" },
});

const { text } = await generateText({
  model: openrouter("openai/gpt-4o-mini"),
  system: prompt.system,
  prompt: prompt.user,
});
```
## What Gets Traced

| Field | Description |
|-------|-------------|
| Model | The model used |
| Duration | Total request time (ms) |
| Time to First Token | Streaming latency (ms), streaming only |
| Tokens | Prompt, completion, and total token counts |
| Prompts | Input messages/prompt |
| Completions | Model output |
| Session | Config key and session ID |
| Customer | Optional customer/user ID |
## With Other Providers

The wrapper works with any Vercel AI SDK provider. The snippets below assume you have already initialized Fallom, wrapped the SDK, and set a session as in the Quick Start.

**Anthropic**

```typescript
import { createAnthropic } from "@ai-sdk/anthropic";

const anthropic = createAnthropic({
  apiKey: process.env.ANTHROPIC_API_KEY,
});

const { text } = await generateText({
  model: anthropic("claude-3-5-sonnet-20241022"),
  prompt: "Hello!",
});
```

**Google**

```typescript
import { createGoogleGenerativeAI } from "@ai-sdk/google";

const google = createGoogleGenerativeAI({
  apiKey: process.env.GOOGLE_API_KEY,
});

const { text } = await generateText({
  model: google("gemini-1.5-pro"),
  prompt: "Hello!",
});
```

**Mistral**

```typescript
import { createMistral } from "@ai-sdk/mistral";

const mistral = createMistral({
  apiKey: process.env.MISTRAL_API_KEY,
});

const { text } = await generateText({
  model: mistral("mistral-large-latest"),
  prompt: "Hello!",
});
```
## Alternative: OpenRouter Broadcast

If you're using OpenRouter and prefer zero-SDK tracing, use OpenRouter Broadcast. However, the Vercel AI SDK doesn't pass custom body fields through to OpenRouter, so you'll need to use the OpenAI SDK directly for session tracking:

```typescript
import OpenAI from "openai";

// Use the OpenAI SDK for broadcast (not the Vercel AI SDK)
const openrouter = new OpenAI({
  apiKey: process.env.OPENROUTER_API_KEY,
  baseURL: "https://openrouter.ai/api/v1",
  defaultHeaders: {
    "X-Broadcast-URL": "https://broadcast.fallom.com/v1/traces",
    "X-Broadcast-Auth": "Bearer YOUR_FALLOM_API_KEY",
  },
});

// For structured output, use JSON mode
const response = await openrouter.chat.completions.create({
  model: "openai/gpt-4o-mini",
  messages: [
    { role: "system", content: "Output JSON: {name: string, age: number}" },
    { role: "user", content: "Generate a person." },
  ],
  response_format: { type: "json_object" },
  // @ts-ignore - OpenRouter extensions for session tracking
  session_id: "conversation-123",
  user: "customer-456",
});

const person = JSON.parse(response.choices[0].message.content!);
```
| Feature | SDK Wrapper (`wrapAISDK`) | Broadcast (OpenAI SDK) |
|---------|---------------------------|------------------------|
| Vercel AI SDK | ✅ Full support | ❌ Use OpenAI SDK |
| Session tracking | `trace.setSession()` | Body fields |
| Model A/B testing | ✅ | ❌ |
| Prompt management | ✅ | ❌ |
| Time to first token | ✅ (streaming) | ❌ |
| Setup complexity | More code | Just headers |
**Recommendation:** Use `trace.wrapAISDK()` with the Vercel AI SDK. It provides full tracing and session tracking, and works with all Vercel AI SDK functions, including `generateObject` and `streamObject`.
## Next Steps