Version: Next 🚧

Tracing Overview

CubePi emits OpenTelemetry spans that follow the GenAI Semantic Conventions v1.41, so any OTel-compatible backend (Jaeger, Tempo, Honeycomb, Datadog, AWS X-Ray, Azure Monitor, …) can ingest agent runs without custom instrumentation.

Attach a Tracer to an Agent and every prompt produces a tree of spans you can pivot, query, and join with the rest of your service traces:

trace
└── invoke_agent 14425.8ms [0x1cd97cdb] ← one per agent.prompt()
    ├── cubepi.turn 1283.1ms [0x5cfda93e] ← one per LLM round-trip
    │   ├── chat deepseek-v4-flash 1208.7ms tok 6845/68 [0x0d130229]
    │   └── execute_tool subagent 9610.2ms subagent [0x38bdd10a]
    │       └── invoke_agent 9601.0ms [0x8094f99b] ← subagent run, nested
    │           └── cubepi.turn 9598.4ms [0x57c5cfc7]
    │               ├── chat deepseek-v4-flash 1190.3ms [0x8205ca6b]
    │               └── execute_tool web_search 6500.2ms web_search [0xca4e59fc]
    └── cubepi.turn 491.9ms ERROR [0xce25f242]
        └── chat deepseek-v4-flash 427.2ms ERROR [0x0bff68ec]
            └── error: Error code: 400 - ... `tool_use` ids were found without
                `tool_result` blocks immediately after: call_01_...
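
A tree like the one above can be rebuilt from flat span records by following parent links. The following is a minimal sketch of that idea, not CubePi's own renderer: the record keys (`span_id`, `parent_id`, `name`, `duration_ms`) are hypothetical, and the actual JSONL field names may differ.

```python
from collections import defaultdict


def render_tree(spans):
    """Return the lines of an indented tree built from flat span records."""
    # Group spans by their parent so children can be looked up quickly.
    children = defaultdict(list)
    for span in spans:
        children[span["parent_id"]].append(span)

    lines = []

    def walk(parent_id, prefix):
        kids = children.get(parent_id, [])
        for i, span in enumerate(kids):
            last = i == len(kids) - 1
            branch = "└── " if last else "├── "
            lines.append(f"{prefix}{branch}{span['name']} {span['duration_ms']}ms")
            # Descendants of the last child need no vertical guide line.
            walk(span["span_id"], prefix + ("    " if last else "│   "))

    walk(None, "")  # root spans have no parent
    return lines


# Hypothetical records mirroring a few rows of the tree above.
spans = [
    {"span_id": "1cd97cdb", "parent_id": None, "name": "invoke_agent", "duration_ms": 14425.8},
    {"span_id": "5cfda93e", "parent_id": "1cd97cdb", "name": "cubepi.turn", "duration_ms": 1283.1},
    {"span_id": "0d130229", "parent_id": "5cfda93e", "name": "chat deepseek-v4-flash", "duration_ms": 1208.7},
    {"span_id": "ce25f242", "parent_id": "1cd97cdb", "name": "cubepi.turn", "duration_ms": 491.9},
]
print("\n".join(render_tree(spans)))
```

Because a nested subagent run inherits the parent trace, its spans hang off the `execute_tool` span in the same way and need no special handling in a renderer of this shape.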

Each layer carries standard gen_ai.* attributes: gen_ai.operation.name, gen_ai.request.model, gen_ai.provider.name, gen_ai.usage.input_tokens, gen_ai.usage.output_tokens, gen_ai.response.finish_reasons, …
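
To make the attribute names concrete, here is how the first chat span in the tree might look as a flat attribute mapping. This is an illustration only: the model name comes from the tree, the token counts assume that `tok 6845/68` means input/output tokens, and the provider name and finish reason are assumed values.

```python
# Illustrative attribute set for the first `chat` span in the tree above.
chat_span_attributes = {
    "gen_ai.operation.name": "chat",
    "gen_ai.request.model": "deepseek-v4-flash",
    "gen_ai.provider.name": "deepseek",                # assumed value
    "gen_ai.usage.input_tokens": 6845,                 # assumes "tok 6845/68" is input/output
    "gen_ai.usage.output_tokens": 68,
    "gen_ai.response.finish_reasons": ["tool_calls"],  # assumed value
}


def total_tokens(attrs):
    """Sum input and output token counts from a span's attributes."""
    return attrs["gen_ai.usage.input_tokens"] + attrs["gen_ai.usage.output_tokens"]


print(total_tokens(chat_span_attributes))
```

Since the keys are the standard gen_ai.* names, a query of this kind should work the same way in any backend that indexes span attributes.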

What ships out of the box

  • Tracer — builds an SDK TracerProvider, attaches one BatchSpanProcessor per exporter, wires the CubePi event stream into spans.
  • Meter — sibling for OTel histograms: gen_ai.client.operation.duration, gen_ai.client.operation.time_to_first_chunk, gen_ai.client.token.usage.
  • JsonlSpanExporter — write one JSON line per span to ./cubepi-traces/<date>/<trace_id>.jsonl. Files are sharded by trace_id, so one file holds a whole trace — the run plus any nested subagent runs (which inherit the trace). Useful for local dev and offline debugging; works with any OTel viewer that reads JSONL, and with the cubepi trace CLI.
  • OTLP — bring your own exporter via opentelemetry-exporter-otlp-proto-http (HTTP) or …-grpc and hand it to Tracer(exporters=[…]).
  • W3C trace context propagation — outgoing MCP calls inject the active traceparent as an HTTP header so an instrumented MCP server can continue the trace.
  • tracer.attached(agent) / meter.attached(agent) — async context managers that RAII-wrap attach/detach, so cleanup is one async with block instead of an explicit try/finally.
  • atexit flush hook — Tracer(atexit_flush=True) (default) registers a process-exit handler that sync-flushes any buffered spans, so callers who forget await tracer.shutdown() still get their spans exported on normal exit / Ctrl-C / unhandled exception.
  • tracing_context() — set per-run tags and metadata (cubepi.tags = ("beta-arm",), cubepi.metadata.user_id = "u-42") via a contextvar-scoped block. Concurrent agents each see their own values.

What it costs

  • One pure-Python recorder per agent run subscribing to the agent's event stream and the provider's listener registry — no monkey-patching, no extra threads.
  • One OTel SDK span per layer above. BatchSpanProcessor batches export off the hot path.
  • No payloads are recorded by default. gen_ai.input.messages, gen_ai.output.messages, raw request/response, and tool args/results all require explicit opt-in via record_content=True so you don't accidentally ship PII to your backend. See Content & Redaction.
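
The opt-in rule can be pictured as a gate in front of every content attribute, with redaction applied before anything is stored. This is a simplified sketch under stated assumptions: `record_message` is a hypothetical helper, the redaction callable is assumed to take and return a string (the signature that `redact=` actually expects is not shown here), and the message is stored as plain text rather than as structured messages.

```python
import re

# Deliberately simple email pattern, for illustration only.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact_emails(text: str) -> str:
    """Replace anything that looks like an email address."""
    return EMAIL.sub("[REDACTED_EMAIL]", text)


def record_message(attrs, text, record_content=False, redact=None):
    """Hypothetical helper: attach message content only on explicit opt-in."""
    if not record_content:
        return attrs  # default: no payload is recorded
    attrs = dict(attrs)  # copy so the caller's mapping is left untouched
    attrs["gen_ai.input.messages"] = redact(text) if redact else text
    return attrs


print(record_message({}, "mail bob@example.com"))
print(record_message({}, "mail bob@example.com", record_content=True, redact=redact_emails))
```

With the default arguments nothing is recorded at all, so redaction only matters once content recording has been switched on.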

When to use each piece

| You want | Use |
| --- | --- |
| Trace one local agent run and inspect a JSONL file | Tracer + JsonlSpanExporter |
| Ship to Jaeger / Tempo / Honeycomb / Datadog | Tracer + OTLP exporter |
| Latency + token histograms next to the spans | Meter alongside Tracer |
| Record prompts / model outputs for evaluation | Tracer(record_content=True) |
| Redact PII before it leaves the process | Tracer(redact=…) |
| Tag runs with user_id / session_id / A-B arm | tracing_context(tags=…, metadata=…) |
| One-liner cleanup, no try/finally | async with tracer.attached(agent): … |
| Forget to call shutdown() and not lose spans | Tracer(atexit_flush=True) (default) |
| Continue a trace from an upstream service | Tracer(resource=…) + W3C traceparent (auto for MCP, manual for HTTP) |
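
For the manual HTTP case, the W3C traceparent header is a single string of the form `version-trace_id-parent_id-flags`. The sketch below formats and parses version-00 headers with the standard library; the function names are hypothetical and this is not how CubePi itself injects the header.

```python
import re

# Version-00 layout: 2 hex version, 32 hex trace id, 16 hex parent id, 2 hex flags.
_TRACEPARENT = re.compile(r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def format_traceparent(trace_id: int, span_id: int, sampled: bool = True) -> str:
    """Build a version-00 traceparent header value."""
    flags = 0x01 if sampled else 0x00
    return f"00-{trace_id:032x}-{span_id:016x}-{flags:02x}"


def parse_traceparent(header: str):
    """Return the header's fields, or None if it is malformed."""
    match = _TRACEPARENT.match(header.strip())
    if not match:
        return None
    version, trace_id, span_id, flags = match.groups()
    # Version ff and all-zero ids are invalid per the W3C spec.
    if version == "ff" or int(trace_id, 16) == 0 or int(span_id, 16) == 0:
        return None
    return {
        "trace_id": trace_id,
        "span_id": span_id,
        "sampled": bool(int(flags, 16) & 0x01),
    }


header = format_traceparent(0x4BF92F3577B34DA6A3CE929D0E0E4736, 0x00F067AA0BA902B7)
print(header)
print(parse_traceparent(header))
```

In practice the OpenTelemetry Python API can write this header for the active span via `opentelemetry.propagate.inject(carrier)`, so hand-formatting is mainly useful for debugging or for services that are not instrumented.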

Where to go next