Documentation Index

Fetch the complete documentation index at https://docs.fallom.com/llms.txt and use it to discover all available pages before exploring further.
Fallom automatically traces all your LLM calls, capturing tokens, costs, latency, and full prompt/completion content.
TypeScript SDK

OpenAI

Wrap your OpenAI client for automatic tracing:

```typescript
import fallom from "@fallom/trace";
import OpenAI from "openai";

await fallom.init({ apiKey: process.env.FALLOM_API_KEY });

// Create a session for this conversation/request
const session = fallom.session({
  configKey: "my-app",
  sessionId: "session-123",
  customerId: "user-456",
});

const openai = session.wrapOpenAI(new OpenAI());

const response = await openai.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Hello!" }],
});
```
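Streaming goes through the same wrapper. A minimal sketch, assuming the wrapped client forwards `stream: true` to OpenAI unchanged (the TTFT field under What Gets Captured suggests streaming requests are traced):

```typescript
// Sketch: streaming through the wrapped client (assumes the wrapper
// passes stream options through and records time to first token).
const stream = await openai.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Tell me a story." }],
  stream: true,
});

for await (const chunk of stream) {
  // Each chunk carries a content delta; the trace should finalize
  // (duration, tokens, cost) once the stream completes.
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}
```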
Anthropic

Wrap your Anthropic client:

```typescript
import fallom from "@fallom/trace";
import Anthropic from "@anthropic-ai/sdk";

await fallom.init({ apiKey: process.env.FALLOM_API_KEY });

const session = fallom.session({
  configKey: "my-app",
  sessionId: "session-123",
  customerId: "user-456",
});

const anthropic = session.wrapAnthropic(new Anthropic());

const response = await anthropic.messages.create({
  model: "claude-sonnet-4-20250514",
  max_tokens: 1024,
  messages: [{ role: "user", content: "Hello!" }],
});
```
Vercel AI SDK

Wrap the entire AI SDK module:

```typescript
import fallom from "@fallom/trace";
import * as ai from "ai";
import { createOpenAI } from "@ai-sdk/openai";

await fallom.init({ apiKey: process.env.FALLOM_API_KEY });

const session = fallom.session({
  configKey: "my-app",
  sessionId: "session-123",
  customerId: "user-456",
});

// Wrap the SDK so calls through generateText/streamText are traced
const { generateText, streamText } = session.wrapAISDK(ai);

// Any provider works; here, an OpenAI-compatible provider pointed at OpenRouter
const openrouter = createOpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.OPENROUTER_API_KEY,
});

const { text } = await generateText({
  model: openrouter("openai/gpt-4o-mini"),
  prompt: "Hello!",
});
```
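The wrapped `streamText` is used the same way. A minimal sketch, assuming `wrapAISDK` preserves the standard `ai` signatures (the `textStream` async iterable is the stock Vercel AI SDK API, not Fallom-specific):

```typescript
// Sketch: streaming via the wrapped AI SDK (assumes wrapAISDK returns
// functions with the standard `ai` package signatures).
const result = streamText({
  model: openrouter("openai/gpt-4o-mini"),
  prompt: "Write a haiku about tracing.",
});

for await (const chunk of result.textStream) {
  process.stdout.write(chunk);
}
```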
See Vercel AI SDK Integration for more details.

OpenRouter

OpenRouter uses the OpenAI-compatible API:

```typescript
import fallom from "@fallom/trace";
import OpenAI from "openai";

await fallom.init({ apiKey: process.env.FALLOM_API_KEY });

const session = fallom.session({
  configKey: "my-app",
  sessionId: "session-123",
  customerId: "user-456",
});

const openrouter = session.wrapOpenAI(
  new OpenAI({
    baseURL: "https://openrouter.ai/api/v1",
    apiKey: process.env.OPENROUTER_API_KEY,
  })
);

const response = await openrouter.chat.completions.create({
  model: "openai/gpt-4o-mini",
  messages: [{ role: "user", content: "Hello!" }],
});
```
Python SDK

OpenAI

```python
import os

import fallom
from openai import OpenAI

fallom.init(api_key=os.environ["FALLOM_API_KEY"])

# Create a session for this conversation/request
session = fallom.session(
    config_key="my-app",
    session_id="session-123",
    customer_id="user-456",
)

openai = session.wrap_openai(OpenAI())

response = openai.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)
```
Anthropic

```python
import os

import fallom
from anthropic import Anthropic

fallom.init(api_key=os.environ["FALLOM_API_KEY"])

session = fallom.session(
    config_key="my-app",
    session_id="session-123",
    customer_id="user-456",
)

anthropic = session.wrap_anthropic(Anthropic())

response = anthropic.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}],
)
```
Google AI

```python
import os

import fallom
import google.generativeai as genai

fallom.init(api_key=os.environ["FALLOM_API_KEY"])

session = fallom.session(
    config_key="my-app",
    session_id="session-123",
    customer_id="user-456",
)

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
model = genai.GenerativeModel("gemini-1.5-flash")
gemini = session.wrap_google_ai(model)

response = gemini.generate_content("Hello!")
```
OpenRouter

```python
import os

import fallom
from openai import OpenAI

fallom.init(api_key=os.environ["FALLOM_API_KEY"])

session = fallom.session(
    config_key="my-app",
    session_id="session-123",
    customer_id="user-456",
)

openrouter = session.wrap_openai(
    OpenAI(
        base_url="https://openrouter.ai/api/v1",
        api_key=os.environ["OPENROUTER_API_KEY"],
    )
)

response = openrouter.chat.completions.create(
    model="openai/gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello!"}],
)
```
Session Context
Sessions group related LLM calls together (e.g., a conversation or agent run):
```typescript
import fallom from "@fallom/trace";
import OpenAI from "openai";
import Anthropic from "@anthropic-ai/sdk";

// Create a session for this conversation/request
const session = fallom.session({
  configKey: "my-agent", // Groups traces in dashboard
  sessionId: "session-123", // Conversation/request ID
  customerId: "user-456", // Optional: end-user identifier
});

// All wrapped clients use this session context
const openai = session.wrapOpenAI(new OpenAI());
const anthropic = session.wrapAnthropic(new Anthropic());
```

Concurrent Sessions

Sessions are isolated, so they are safe for concurrent requests:

```typescript
async function handleRequest(userId: string, conversationId: string) {
  const session = fallom.session({
    configKey: "my-agent",
    sessionId: conversationId,
    customerId: userId,
  });

  const openai = session.wrapOpenAI(new OpenAI());

  // This session's context is isolated
  return await openai.chat.completions.create({ ... });
}

// Safe to run concurrently!
await Promise.all([
  handleRequest("user-1", "conv-1"),
  handleRequest("user-2", "conv-2"),
]);
```
```python
import fallom
from anthropic import Anthropic
from openai import OpenAI

# Create a session for this conversation/request
session = fallom.session(
    config_key="my-agent",  # Groups traces in dashboard
    session_id="session-123",  # Conversation/request ID
    customer_id="user-456",  # Optional: end-user identifier
)

# All wrapped clients use this session context
openai = session.wrap_openai(OpenAI())
anthropic = session.wrap_anthropic(Anthropic())
```

Concurrent Sessions

Sessions are isolated, so they are safe for concurrent requests:

```python
import concurrent.futures

import fallom
from openai import OpenAI


def handle_request(user_id: str, conversation_id: str):
    session = fallom.session(
        config_key="my-agent",
        session_id=conversation_id,
        customer_id=user_id,
    )

    openai = session.wrap_openai(OpenAI())

    # This session's context is isolated
    return openai.chat.completions.create(...)


# Safe to run concurrently!
with concurrent.futures.ThreadPoolExecutor() as executor:
    futures = [
        executor.submit(handle_request, "user-1", "conv-1"),
        executor.submit(handle_request, "user-2", "conv-2"),
    ]
```
Add custom metadata and tags for filtering:

```python
session = fallom.session(
    config_key="my-agent",
    session_id="session-123",
    customer_id="user-456",
    metadata={
        "deployment": "dedicated",
        "request_type": "transcript",
        "user_tier": "premium",
    },
    tags=["production", "high-priority", "premium"],
)
```
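The example above is Python; here is a hedged TypeScript sketch, assuming `metadata` and `tags` exist on `fallom.session` with the same shape as the Python parameters (this page does not confirm the TypeScript option names):

```typescript
// Assumption: the TypeScript SDK accepts metadata/tags mirroring the
// Python parameters shown above; treat these option names as unverified.
const session = fallom.session({
  configKey: "my-agent",
  sessionId: "session-123",
  customerId: "user-456",
  metadata: {
    deployment: "dedicated",
    request_type: "transcript",
    user_tier: "premium",
  },
  tags: ["production", "high-priority", "premium"],
});
```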
| Parameter | Description |
| --- | --- |
| configKey | Your experiment/config identifier (e.g., "summarizer") |
| sessionId | Unique ID for this session (e.g., conversation ID) |
| customerId | Optional user identifier for per-user analytics |
What Gets Captured
Every LLM call automatically includes:
| Field | Description |
| --- | --- |
| Model | The model used (e.g., gpt-4o, claude-3-opus) |
| Duration | Total request time in milliseconds |
| Time to First Token | TTFT for streaming requests |
| Tokens | Input, output, and cached token counts |
| Cost | Calculated from token usage + model pricing |
| Prompts | Full input messages |
| Completions | Model responses |
| Session | Config key, session ID, customer ID |
| Status | OK or ERROR |
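For intuition, the captured fields can be pictured as one record per call. The shape below is illustrative only; the property names are hypothetical, not Fallom's actual wire format:

```typescript
// Hypothetical trace record, for intuition only. Property names are
// invented for illustration and are not Fallom's actual schema.
interface TraceRecord {
  model: string;                // e.g., "gpt-4o"
  durationMs: number;           // total request time
  timeToFirstTokenMs?: number;  // TTFT, streaming requests only
  tokens: { input: number; output: number; cached: number };
  costUsd: number;              // token usage x model pricing
  prompts: unknown[];           // full input messages
  completion: unknown;          // model response
  session: { configKey: string; sessionId: string; customerId?: string };
  status: "OK" | "ERROR";
}
```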
Multimodal (Images)
Images in prompts are automatically handled:
- URL images: stored as-is
- Base64 images: uploaded to secure storage and replaced with a URL
```typescript
const response = await openai.chat.completions.create({
  model: "gpt-4o",
  messages: [
    {
      role: "user",
      content: [
        { type: "text", text: "What's in this image?" },
        { type: "image_url", image_url: { url: "https://..." } },
      ],
    },
  ],
});
// The trace captures the image URL for replay in the dashboard
```
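A base64 input uses the same call shape. A sketch, assuming the wrapper behaves as described above and swaps the data URL for a storage URL in the trace:

```typescript
// Sketch: base64 image input. Per the note above, Fallom is expected to
// upload the base64 payload and record a storage URL in the trace.
import fs from "node:fs";

const base64Image = fs.readFileSync("photo.jpg").toString("base64");

const response = await openai.chat.completions.create({
  model: "gpt-4o",
  messages: [
    {
      role: "user",
      content: [
        { type: "text", text: "Describe this photo." },
        {
          type: "image_url",
          image_url: { url: `data:image/jpeg;base64,${base64Image}` },
        },
      ],
    },
  ],
});
```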
Configuration

```typescript
await fallom.init({
  apiKey: "your-fallom-api-key",

  // Optional settings
  baseUrl: "https://traces.fallom.com", // Custom endpoint
  captureContent: true, // Capture prompt/completion text
  debug: false, // Enable debug logging
});
```
```python
import fallom

fallom.init(
    api_key="your-fallom-api-key",

    # Optional settings
    traces_url="https://traces.fallom.com",  # Custom traces endpoint
    configs_url="https://configs.fallom.com",  # Custom configs endpoint
    prompts_url="https://prompts.fallom.com",  # Custom prompts endpoint
    capture_content=True,  # Capture prompt/completion text
    debug=False,  # Enable debug logging
)
```
Disable Content Capture
For privacy, you can disable capturing prompt/completion content:
```typescript
await fallom.init({
  apiKey: "your-fallom-api-key",
  captureContent: false, // Only capture metadata (model, tokens, latency)
});
```
Next Steps

- Model A/B Testing: test different models with traced calls.
- Prompt Management: manage and A/B test your prompts.
- OpenRouter Broadcast: send traces without an SDK.
- Vercel AI SDK: integration guide for the Vercel AI SDK.