AI Toolkit

monitor

AI observability with Langfuse tracing, evaluation, and cost tracking

Overview

The monitor module wraps Langfuse for AI observability. Trace every LLM call, evaluate quality, track costs, and export metrics.

Peer dependencies: langfuse

npm install langfuse
yarn add langfuse
pnpm add langfuse

Quick Start

import { createMonitor, trace } from '@jamaalbuilds/ai-toolkit/monitor';

const monitor = await createMonitor({
  publicKey: process.env.LANGFUSE_PUBLIC_KEY!,
  secretKey: process.env.LANGFUSE_SECRET_KEY!,
});

const result = await trace(monitor, 'my-query', async (span) => {
  span.update({ model: 'llama-3.3-70b', input: 'Hello world' });
  return await ai.generate('Hello world');
});

API Reference

createMonitor(config?)

Create a monitor client. Reads from LANGFUSE_PUBLIC_KEY and LANGFUSE_SECRET_KEY env vars by default.

async function createMonitor(config?: MonitorConfig): Promise<MonitorClient>

Parameter            Type                Description
config.publicKey     string              Langfuse public key
config.secretKey     string              Langfuse secret key
config.baseUrl       string              Custom Langfuse URL (self-hosted)
config.traceStore    TraceStoreConfig    In-memory trace store config
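For a self-hosted Langfuse instance, pass baseUrl explicitly alongside the keys. A configuration sketch (the URL is a placeholder; option names follow the table above):

```typescript
import { createMonitor } from '@jamaalbuilds/ai-toolkit/monitor';

// Self-hosted Langfuse: point the client at your own instance.
const monitor = await createMonitor({
  publicKey: process.env.LANGFUSE_PUBLIC_KEY!,
  secretKey: process.env.LANGFUSE_SECRET_KEY!,
  baseUrl: 'https://langfuse.internal.example.com', // placeholder URL
});
```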

trace(monitor, name, fn)

Create a traced operation with automatic timing and error scoring.

async function trace<T>(
  monitor: MonitorClient,
  name: string,
  fn: (span: TraceSpan) => Promise<T>,
): Promise<TraceResult<T>>

const { result, traceId } = await trace(monitor, 'rag-pipeline', async (span) => {
  span.update({ input: query, model: 'gpt-4o' });
  const answer = await generate(ai, query);
  span.update({ output: answer.text, usage: { promptTokens: 100, completionTokens: 50 } });
  return answer;
});

evaluate(monitor, options)

Score a trace for quality evaluation.

async function evaluate(monitor: MonitorClient, options: EvaluateOptions): Promise<void>
Parameter           Type             Description
options.traceId     string           Trace to evaluate
options.name        string           Score name (e.g., 'relevance')
options.value       number           Score value (0-1)
options.dataType    ScoreDataType    'NUMERIC' | 'BOOLEAN' | 'CATEGORICAL'

await evaluate(monitor, {
  traceId: result.traceId,
  name: 'relevance',
  value: 0.95,
});

getCostReport(monitor)

Get aggregated cost data across all traced operations.

function getCostReport(monitor: MonitorClient): CostReport

const report = getCostReport(monitor);
console.log(`Total: $${report.totalEstimatedCostUsd}`);
for (const [model, data] of Object.entries(report.byModel)) {
  console.log(`${model}: ${data.operations} operations, $${data.estimatedCostUsd}`);
}

getTraces(monitor), getTrace(monitor, id)

Retrieve stored traces from the in-memory trace store.

onTrace(monitor, callback)

Subscribe to new traces as they complete.
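The trace-store functions above can be illustrated with a minimal self-contained sketch. This is not the toolkit's implementation — the store class, its method shapes, and the unsubscribe return value are assumptions for illustration; only the StoredTrace field names come from the Types section of this page.

```typescript
// Minimal in-memory trace store sketch (hypothetical implementation).
interface StoredTrace {
  id: string;
  name: string;
  input?: unknown;
  output?: unknown;
  duration: number; // milliseconds
  scores: Record<string, number>;
}

type TraceCallback = (t: StoredTrace) => void;

class InMemoryTraceStore {
  private traces: StoredTrace[] = [];
  private listeners: TraceCallback[] = [];

  // Called when a traced operation completes.
  add(t: StoredTrace): void {
    this.traces.push(t);
    for (const cb of this.listeners) cb(t); // notify subscribers
  }

  // onTrace-style subscription; returns an unsubscribe function.
  onTrace(cb: TraceCallback): () => void {
    this.listeners.push(cb);
    return () => {
      this.listeners = this.listeners.filter((l) => l !== cb);
    };
  }

  getTraces(): StoredTrace[] {
    return [...this.traces];
  }

  getTrace(id: string): StoredTrace | undefined {
    return this.traces.find((t) => t.id === id);
  }
}
```

With this shape, getTraces(monitor) and getTrace(monitor, id) correspond to reads over the store, and onTrace(monitor, callback) to the subscription.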

exportMetrics(monitor)

Export metrics in a structured format for dashboards.
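The export format is not specified on this page. As a hedged sketch, one plausible aggregation over stored traces — the field names (totalTraces, errorRate, byName) and the aggregateMetrics helper are assumptions for illustration, not the toolkit's schema:

```typescript
// Hypothetical aggregation of completed traces into dashboard metrics.
interface TraceRecord {
  name: string;
  duration: number; // milliseconds
  error?: boolean;
}

interface MetricsExport {
  totalTraces: number;
  errorRate: number;
  avgDurationMs: number;
  byName: Record<string, { count: number; avgDurationMs: number }>;
}

function aggregateMetrics(traces: TraceRecord[]): MetricsExport {
  const byName: Record<string, { count: number; totalMs: number }> = {};
  let errors = 0;
  let totalMs = 0;
  for (const t of traces) {
    totalMs += t.duration;
    if (t.error) errors++;
    const bucket = (byName[t.name] ??= { count: 0, totalMs: 0 });
    bucket.count++;
    bucket.totalMs += t.duration;
  }
  const n = traces.length || 1; // avoid division by zero on empty input
  return {
    totalTraces: traces.length,
    errorRate: errors / n,
    avgDurationMs: totalMs / n,
    byName: Object.fromEntries(
      Object.entries(byName).map(([name, v]) => [
        name,
        { count: v.count, avgDurationMs: v.totalMs / v.count },
      ]),
    ),
  };
}
```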

createLogger(service, options?)

Create a structured logger.

import { createLogger } from '@jamaalbuilds/ai-toolkit/monitor';

const logger = createLogger('my-app', { level: 'info' });
logger.info('Server started', { port: 3000 });

Types

  • MonitorClient — client with trace, evaluate, getCostReport
  • TraceSpan — span with update() method
  • TraceResult<T> — result, traceId
  • CostReport — totalOperations, totalTokens, totalEstimatedCostUsd, byModel, byModule, timeRange
  • StoredTrace — id, name, input, output, duration, scores