OpenTelemetry Standardizes LLM Tracing: Implementation Guide for GenAI Semantic Conventions
These articles are AI-generated summaries. Please check the original sources for full details.
OpenTelemetry just standardized LLM tracing. Here’s what it actually looks like in code.
OpenTelemetry has released experimental GenAI Semantic Conventions to standardize how LLM spans are named and attributed across different tools. The specification addresses a fragmented landscape in which each LLM tool, such as Langfuse or Helicone, used its own incompatible tracing format.
Why This Matters
The transition to standardized GenAI tracing resolves the ‘walled garden’ problem where traces were only visible in specific vendor dashboards. By adopting these conventions, engineering teams can switch between backends like Datadog, Arize Phoenix, and Jaeger without reconfiguring their entire instrumentation layer. Failure to align with these standards leads to vendor lock-in and invisible traces when metadata is stored in reserved namespaces or incorrect paths.
Key Insights
- Span names must follow the {operation} {name} format, such as chat gpt-4o or execute_tool web_search, to be recognized by GenAI-aware backends.
- Tool attributes must be placed at the gen_ai.tool.* level rather than nested under agents, as seen in the toad-eye v1 to v2 migration.
- The OTel spec states that instrumentations SHOULD NOT capture prompt or completion content by default, to prevent PII leaks; capturing it requires explicit opt-in.
- A gap analysis shows that OTel covers the ‘what’ of an event, while custom namespaces like gen_ai.toad_eye.cost are still required for ‘how much’ metrics.
- The OTel NodeSDK silently disables trace export if spanProcessors is passed as an empty array, a pitfall that can lead to passing tests with zero actual observability.
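One way to guard against the empty spanProcessors pitfall in the last bullet is to build the SDK options through a small helper that omits the key when no processors are configured. The sketch below is deliberately dependency-free: SpanProcessorLike, SdkOptions, and buildSdkOptions are hypothetical names for illustration, not part of the OpenTelemetry API, and the resulting object would still need to be handed to the NodeSDK constructor.

```typescript
// Hypothetical stand-ins for the real SDK types, so the sketch runs on its own.
interface SpanProcessorLike {
  onEnd(span: unknown): void;
}

interface SdkOptions {
  serviceName: string;
  spanProcessors?: SpanProcessorLike[];
}

// Build SDK options, leaving `spanProcessors` out entirely when the list is
// empty, and failing loudly when the caller says export is required.
function buildSdkOptions(
  serviceName: string,
  processors: SpanProcessorLike[],
  requireExport = false,
): SdkOptions {
  if (processors.length === 0) {
    if (requireExport) {
      throw new Error("No span processors configured: traces would not be exported");
    }
    return { serviceName };
  }
  return { serviceName, spanProcessors: processors };
}
```

Passing requireExport as true in production startup code turns a silent misconfiguration into an immediate error. What the SDK does when the key is absent depends on its version and environment, so this is a guard, not a guarantee of export.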
Working Examples
Applying the official GenAI Semantic Conventions for an agent tool call.
// Standardized Span Naming and Attributes
span.setAttribute("gen_ai.operation.name", "chat");
span.setAttribute("gen_ai.request.model", "gpt-4o");
span.setAttribute("gen_ai.agent.name", "weather-bot");
span.setAttribute("gen_ai.tool.name", "search");
span.setAttribute("gen_ai.tool.type", "function");
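The snippet above sets attributes but not the span name itself. A minimal sketch of the {operation} {name} naming rule from Key Insights might look like this; genAiSpanName is a hypothetical helper, not an OpenTelemetry API, and rejecting empty parts is an assumption of this sketch, not a requirement taken from the spec.

```typescript
// Hypothetical helper: builds a span name in the "{operation} {name}" format.
function genAiSpanName(operation: string, name: string): string {
  const op = operation.trim();
  const target = name.trim();
  // The operation is a single token such as "chat" or "execute_tool".
  if (op === "" || /\s/.test(op)) {
    throw new Error(`Invalid operation name: "${operation}"`);
  }
  // Assumption of this sketch: a model or tool name is always required.
  if (target === "") {
    throw new Error("A model or tool name is required");
  }
  return `${op} ${target}`;
}

const chatSpanName = genAiSpanName("chat", "gpt-4o"); // "chat gpt-4o"
const toolSpanName = genAiSpanName("execute_tool", "web_search"); // "execute_tool web_search"
```

Routing every span name through one helper keeps ad hoc names such as gen_ai.openai.gpt-4o from slipping into the instrumentation.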
Migration strategy to support both legacy and standardized attributes during a version transition.
// Dual-emit approach for backward compatibility
// New (OTel spec-compliant)
span.setAttribute("gen_ai.tool.name", toolName);
// Old (deprecated, still emitted for backward compat)
span.setAttribute("gen_ai.agent.tool.name", toolName);
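The dual-emit lines above can be wrapped in a single helper so the deprecated key is switched off in one place once consumers have migrated. This is a sketch under stated assumptions: SpanLike is a minimal stand-in for the real span type, setToolName is a hypothetical helper, and how emitLegacy is derived (for example from OTEL_SEMCONV_STABILITY_OPT_IN) is left to the caller, since the values toad-eye accepts are not given here.

```typescript
// Minimal structural type so the sketch does not depend on @opentelemetry/api.
interface SpanLike {
  setAttribute(key: string, value: string): unknown;
}

const TOOL_NAME = "gen_ai.tool.name"; // new, spec-compliant key
const LEGACY_TOOL_NAME = "gen_ai.agent.tool.name"; // old, deprecated key

// Hypothetical helper: always emits the new key and, while `emitLegacy`
// is true, the deprecated key as well.
function setToolName(span: SpanLike, toolName: string, emitLegacy: boolean): void {
  span.setAttribute(TOOL_NAME, toolName);
  if (emitLegacy) {
    span.setAttribute(LEGACY_TOOL_NAME, toolName);
  }
}
```

Dropping backward compatibility then means flipping one flag, not hunting for every call site that sets the old attribute.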
Practical Applications
- System: toad-eye v2.4. Behavior: Implements dual-emission of both old and new attribute names, controlled by the OTEL_SEMCONV_STABILITY_OPT_IN environment variable.
- Pitfall: Using custom span names like gen_ai.openai.gpt-4o. Consequence: The span becomes invisible to GenAI-aware backends that expect the chat {model} format.
- Use Case: Privacy-first instrumentation where prompt recording is disabled by default, only enabling JSON string capture via gen_ai.input.messages when explicitly configured.
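The privacy-first use case can be sketched as a recorder that does nothing unless capture is explicitly enabled. SpanLike, ChatMessage, and recordInputMessages are hypothetical names for illustration, and the message shape is a simplification, not the structure defined by the spec.

```typescript
// Minimal structural type so the sketch does not depend on @opentelemetry/api.
interface SpanLike {
  setAttribute(key: string, value: string): unknown;
}

// Simplified, hypothetical message shape for this sketch.
interface ChatMessage {
  role: string;
  content: string;
}

// Records prompt content as a JSON string only when the caller opts in.
// The default is off, so content is never captured by accident.
function recordInputMessages(
  span: SpanLike,
  messages: ChatMessage[],
  captureContent = false,
): void {
  if (!captureContent) {
    return;
  }
  span.setAttribute("gen_ai.input.messages", JSON.stringify(messages));
}
```

Keeping the default at false means a missing or misread configuration value fails safe: the span is still emitted, just without prompt content.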