Fallom automatically traces all your LLM calls, capturing tokens, costs, latency, and full prompt/completion content.

TypeScript SDK

Wrap your OpenAI client for automatic tracing:
import fallom from "@fallom/trace";
import OpenAI from "openai";

await fallom.init({ apiKey: process.env.FALLOM_API_KEY });

// Create a session for this conversation/request
const session = fallom.session({
  configKey: "my-app",
  sessionId: "session-123",
  customerId: "user-456",
});

const openai = session.wrapOpenAI(new OpenAI());

const response = await openai.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Hello!" }],
});
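
Streaming requests go through the same wrapped client; the Time to First Token field described under What Gets Captured below is measured from streams like this. A minimal sketch using the standard OpenAI streaming API (assuming the wrapper passes the stream through unchanged):
const stream = await openai.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Tell me a story." }],
  stream: true,
});

for await (const chunk of stream) {
  // Print tokens as they arrive; the first chunk marks the TTFT
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}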

Python SDK

The Python SDK follows the same pattern:
import os
import fallom
from openai import OpenAI

fallom.init(api_key=os.environ["FALLOM_API_KEY"])

# Create a session for this conversation/request
session = fallom.session(
    config_key="my-app",
    session_id="session-123",
    customer_id="user-456",
)

openai = session.wrap_openai(OpenAI())

response = openai.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}]
)

Session Context

Sessions group related LLM calls together (e.g., a conversation or agent run):
import fallom from "@fallom/trace";
import OpenAI from "openai";
import Anthropic from "@anthropic-ai/sdk";

// Create a session for this conversation/request
const session = fallom.session({
  configKey: "my-agent",       // Groups traces in dashboard
  sessionId: "session-123",    // Conversation/request ID
  customerId: "user-456",      // Optional: end-user identifier
});

// All wrapped clients use this session context
const openai = session.wrapOpenAI(new OpenAI());
const anthropic = session.wrapAnthropic(new Anthropic());

Concurrent Sessions

Sessions are isolated, so they're safe for concurrent requests:
async function handleRequest(userId: string, conversationId: string) {
  const session = fallom.session({
    configKey: "my-agent",
    sessionId: conversationId,
    customerId: userId,
  });

  const openai = session.wrapOpenAI(new OpenAI());
  
  // This session's context is isolated from other sessions
  return await openai.chat.completions.create({
    model: "gpt-4o",
    messages: [{ role: "user", content: "Hello!" }],
  });
}

// Safe to run concurrently!
await Promise.all([
  handleRequest("user-1", "conv-1"),
  handleRequest("user-2", "conv-2"),
]);

Session parameters:
  • configKey - Your experiment/config identifier (e.g., "summarizer")
  • sessionId - Unique ID for this session (e.g., a conversation ID)
  • customerId - Optional user identifier for per-user analytics

What Gets Captured

Every LLM call automatically includes:
  • Model - The model used (e.g., gpt-4o, claude-3-opus)
  • Duration - Total request time in milliseconds
  • Time to First Token - TTFT for streaming requests
  • Tokens - Input, output, and cached token counts
  • Cost - Calculated from token usage + model pricing
  • Prompts - Full input messages
  • Completions - Model responses
  • Session - Config key, session ID, customer ID
  • Status - OK or ERROR
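
Failed calls are traced too and recorded with status ERROR. A sketch, assuming the wrapped client rethrows the provider's error so your own error handling is unchanged:
try {
  await openai.chat.completions.create({
    model: "gpt-4o",
    messages: [{ role: "user", content: "Hello!" }],
  });
} catch (err) {
  // The call is still recorded in Fallom with status ERROR
  console.error(err);
}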

Multimodal (Images)

Images in prompts are automatically handled:
  • URL images - Stored as-is
  • Base64 images - Uploaded to secure storage and replaced with a URL (see the sketch below)
const response = await openai.chat.completions.create({
  model: "gpt-4o",
  messages: [
    {
      role: "user",
      content: [
        { type: "text", text: "What's in this image?" },
        { type: "image_url", image_url: { url: "https://..." } },
      ],
    },
  ],
});
// The trace captures the image URL for replay in the dashboard
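
For base64 inputs, send a standard data URL; per the bullet above, the payload is uploaded to storage and the trace stores a URL instead. A sketch using the OpenAI data-URL format (the file path is illustrative):
import { readFileSync } from "node:fs";

const b64 = readFileSync("photo.png").toString("base64");

const response = await openai.chat.completions.create({
  model: "gpt-4o",
  messages: [
    {
      role: "user",
      content: [
        { type: "text", text: "Describe this image." },
        { type: "image_url", image_url: { url: `data:image/png;base64,${b64}` } },
      ],
    },
  ],
});
// The base64 payload is replaced with a storage URL in the trace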

Configuration

await fallom.init({
  apiKey: "your-fallom-api-key",
  
  // Optional settings
  baseUrl: "https://traces.fallom.com",  // Custom endpoint
  captureContent: true,                   // Capture prompt/completion text
  debug: false,                           // Enable debug logging
});
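
A common pattern (a sketch, not a requirement) is to read the key from the environment and enable debug logging outside production:
await fallom.init({
  apiKey: process.env.FALLOM_API_KEY,
  debug: process.env.NODE_ENV !== "production",
});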

Disable Content Capture

For privacy, you can disable capturing prompt/completion content:
await fallom.init({
  apiKey: "your-fallom-api-key",
  captureContent: false, // Only capture metadata (model, tokens, latency)
});

Next Steps