Assumes you are using openinference-instrumentation>=0.1.34
When working with spans that are automatically instrumented via OpenInference in your LLM applications, you often need to capture span contexts in order to apply feedback or annotations. The capture_span_context context manager captures every OpenInference span created within its scope, making it easy to apply feedback to specific spans in downstream operations. It allows you to (see the minimal sketch after this list):
  • Capture all spans created within a specific code block
  • Retrieve span contexts for later use in feedback systems
  • Maintain a clean separation between span creation and annotation logic
  • Apply feedback to spans without needing to track span IDs manually
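For example, a minimal sketch of the basic pattern (assuming llm is a client that has already been auto-instrumented via OpenInference):
from openinference.instrumentation import capture_span_context

with capture_span_context() as capture:
    # Any OpenInference spans created in this block are captured
    response = llm.invoke("Generate a summary")

# The captured span contexts remain available after the block exits
first_span_id = capture.get_first_span_id()
last_span_id = capture.get_last_span_id()
all_span_contexts = capture.get_span_contexts()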

Understanding Span Topology

When your LLM application executes, it creates a hierarchy of spans representing different operations. For example, when using a framework, you might have:
framework (root span)
  ├── span 1 (query processing)
  │     └── span 2 (LLM call)
  └── span 3 (response formatting)
The capture_span_context context manager helps you easily access (illustrated in the sketch after this list):
  • First span: The root span of your operation (useful for high-level feedback and evaluations)
  • Last span: The most recent span created (often the final LLM call, useful for LLM-specific feedback)
  • All spans: A complete list of all spans created within the context (useful for comprehensive analysis)
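To make the capture order concrete, here is a minimal sketch using manually created spans (assuming OpenInference's TracerProvider, whose spans capture_span_context can see; span names mirror the tree above):
from openinference.instrumentation import TracerProvider, capture_span_context

tracer = TracerProvider().get_tracer(__name__)

with capture_span_context() as capture:
    with tracer.start_as_current_span("framework"):       # root span
        with tracer.start_as_current_span("span 1"):      # query processing
            with tracer.start_as_current_span("span 2"):  # LLM call
                pass
        with tracer.start_as_current_span("span 3"):      # response formatting
            pass

print(capture.get_first_span_id())        # ID of the root span ("framework")
print(capture.get_last_span_id())         # ID of the most recently created span
print(len(capture.get_span_contexts()))   # 4 spans captured in total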

Usage

Accessing First and Last Spans

from openinference.instrumentation import capture_span_context
from opentelemetry.trace.span import format_span_id
from phoenix.client import Client

client = Client()

def process_llm_request_with_feedback(prompt: str):
    with capture_span_context() as capture:
        # This creates multiple spans in a hierarchy when using a framework
        # (llm is assumed to be a client auto-instrumented via OpenInference)
        response = llm.invoke(prompt)
        # Get user feedback (simulated)
        user_feedback = get_user_feedback(response)

        # Method 1: Get first span ID (root span - good for user feedback)
        first_span_id = capture.get_first_span_id()
        if first_span_id:
            # Apply user feedback to the first span
            client.annotations.add_span_annotation(
                annotation_name="user_feedback",
                annotator_kind="HUMAN",
                span_id=first_span_id,
                label=user_feedback.label,
                score=user_feedback.score,
                explanation=user_feedback.explanation
            )

        # Method 2: Get last span ID (most recent span - often the LLM call)
        last_span_id = capture.get_last_span_id()
        if last_span_id:
            # Apply feedback to the most recent span
            client.annotations.add_span_annotation(
                annotation_name="llm_response_quality",
                annotator_kind="HUMAN",
                span_id=last_span_id,
                label="helpful",
                score=4,
                explanation="The LLM provided a helpful and relevant response"
            )

When to Use First vs Last Spans

Use the first span (get_first_span_id()) when:
  • Adding user feedback about the overall experience
  • Recording evaluation scores for the entire request/response cycle
Use the last span (get_last_span_id()) when:
  • The last span represents an LLM invocation
  • You want to annotate the final output or generation step
  • Applying feedback specifically to the model’s response quality
  • Recording model-specific metrics or evaluations

Working with All Captured Spans

You can also access all spans for more complex annotation scenarios:
with capture_span_context() as capture:
    # Make LLM call (auto-instrumented)
    response = llm.invoke("Generate a summary")

    # Get all captured span contexts
    span_contexts = capture.get_span_contexts()

    # Apply different feedback logic to different spans
    for i, span_context in enumerate(span_contexts):
        span_id = format_span_id(span_context.span_id)

        client.annotations.add_span_annotation(
            annotation_name="span_order",
            annotator_kind="CODE",
            span_id=span_id,
            label=f"span_{i}",
            metadata={"position": i, "total_spans": len(span_contexts)}
        )

Working with Multiple Span Types

You can filter spans based on their attributes:
with capture_span_context() as capture:
    # Make LLM call (auto-instrumented)
    response = llm.invoke("Generate a summary")

    span_contexts = capture.get_span_contexts()

    # Filter for specific span types (placeholder check: this keeps any
    # context that exposes attributes; real filtering depends on what your
    # instrumentation attaches to each span context)
    llm_spans = [
        ctx for ctx in span_contexts
        if hasattr(ctx, 'attributes')
    ]

    # Apply different feedback logic to different span types
    for span_context in llm_spans:
        apply_llm_feedback(span_context)
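The apply_llm_feedback helper above is left undefined; a minimal sketch (reusing client and format_span_id from earlier, with hypothetical label and score values) might look like:
def apply_llm_feedback(span_context):
    # Convert the raw integer span ID into the hex string the client expects
    span_id = format_span_id(span_context.span_id)
    client.annotations.add_span_annotation(
        annotation_name="llm_response_quality",
        annotator_kind="CODE",
        span_id=span_id,
        label="acceptable",  # hypothetical label
        score=1,             # hypothetical score
    )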
