Follow along with code: This guide has a companion TypeScript project with runnable examples. Find it here.
3.1 Setting Up Sessions
See the full session-enabled agent code here.
Adding sessions requires two things:
- A session ID: A unique identifier for each conversation (usually a UUID)
- Context propagation: Making sure child spans inherit the session ID
Install Dependencies
You’ll need the OpenInference core package to set session context:
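If you’re following along with the companion project, the install step likely looks something like this (the package names below assume the OpenInference JavaScript packages and are not quoted from the guide):

```bash
npm install @arizeai/openinference-core @arizeai/openinference-semantic-conventions @opentelemetry/api
```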
Add Session Tracking to Your Agent
Here’s how to modify your support agent to support sessions; a sketch follows the list below. The key pieces are:
- `SemanticConventions.SESSION_ID`: The standard attribute name for session IDs
- `setSession()`: Propagates the session ID to all child spans
- `context.with()`: Ensures the session context is active during execution
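Here is a minimal sketch of what that can look like, assuming an OpenTelemetry tracer is already configured; `handleMessage` and `runAgent` are illustrative names, not the companion project’s exact code:

```typescript
import { context, trace } from "@opentelemetry/api";
import { setSession } from "@arizeai/openinference-core";
import { SemanticConventions } from "@arizeai/openinference-semantic-conventions";

const tracer = trace.getTracer("support-agent");

// Stand-in for your existing agent logic (hypothetical).
async function runAgent(message: string): Promise<string> {
  return `You asked: ${message}`;
}

export async function handleMessage(sessionId: string, userMessage: string): Promise<string> {
  return tracer.startActiveSpan("support_agent", async (span) => {
    // Tag the root span with the standard session attribute.
    span.setAttribute(SemanticConventions.SESSION_ID, sessionId);

    // Propagate the session ID so every child span created while the
    // agent runs inherits it.
    const reply = await context.with(
      setSession(context.active(), { sessionId }),
      () => runAgent(userMessage)
    );

    span.end();
    return reply;
  });
}
```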
Track Conversation History
For multi-turn conversations, you also need to track what’s been said. Here’s a simple message type:
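A sketch of what that type might look like (the field names here are assumptions rather than the companion project’s exact shape):

```typescript
// One turn in the conversation.
interface ChatMessage {
  role: "user" | "assistant";
  content: string;
}

// The history you carry between turns and pass back to the agent.
type ConversationHistory = ChatMessage[];
```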
3.2 Running Multi-Turn Conversations
See the multi-turn conversation demo code here.
The demo runs three conversation scenarios:
- Order Inquiry: Customer asks about an order, then asks follow-up questions
- FAQ Conversation: Multiple FAQ questions in one session
- Mixed Conversation: Switching between order and FAQ topics
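As a rough sketch, each scenario reuses a single session ID across all of its turns; this assumes the hypothetical `handleMessage()` helper from the setup sketch above:

```typescript
import { randomUUID } from "crypto";
// Hypothetical import path; use wherever handleMessage() lives in your project.
import { handleMessage } from "./agent";

async function runOrderInquiryScenario(): Promise<void> {
  // One session ID for the whole conversation; every turn shares it.
  const sessionId = randomUUID();

  // Turn 1: the initial order question.
  await handleMessage(sessionId, "What's the status of order ORD-67890?");

  // Turn 2: a follow-up that only makes sense with context from turn 1.
  await handleMessage(sessionId, "When is it expected to arrive?");
}

runOrderInquiryScenario().catch(console.error);
```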
What You’ll See in Phoenix
Now you can view and analyze your traces, grouped by user session!
3.3 Session-Level Evaluations
See the session evaluation code here.
Session-level evaluations answer questions like:
- Is memory being preserved? Does the agent remember order IDs, customer preferences, and context from earlier in the conversation?
- Are issues getting resolved? Do conversations end with the customer’s problem solved, or do they trail off unresolved?
- Where do conversations break down? Which sessions show signs of confusion, repetition, or context loss?
Conversation Coherence Evaluator
This evaluator checks if the agent maintained context throughout the conversation:
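A sketch of how such an LLM-as-judge evaluator can be written, assuming the `ChatMessage` type above and the OpenAI Node client; the prompt wording, model name, and label set below are illustrative, not the guide’s exact template:

```typescript
import OpenAI from "openai";
import type { ChatMessage } from "./messages"; // hypothetical path for the type above

const client = new OpenAI();

const COHERENCE_PROMPT = `You are evaluating a multi-turn support conversation.
Decide whether the assistant maintained context across turns (for example,
remembering order IDs, customer preferences, and earlier answers).
Respond with a single word: "coherent" or "incoherent".

Conversation:
{conversation}`;

export async function evaluateCoherence(
  history: ChatMessage[]
): Promise<{ label: string; score: number }> {
  // Flatten the conversation into a readable transcript for the judge model.
  const transcript = history.map((m) => `${m.role}: ${m.content}`).join("\n");

  const response = await client.chat.completions.create({
    model: "gpt-4o-mini",
    messages: [
      { role: "user", content: COHERENCE_PROMPT.replace("{conversation}", transcript) },
    ],
  });

  const label = (response.choices[0].message.content ?? "").trim().toLowerCase();
  return { label, score: label === "coherent" ? 1.0 : 0.0 };
}
```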
Resolution Evaluator
This evaluator determines if the customer’s issue was actually resolved:
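A sketch along the same lines, reusing the `client` and `ChatMessage` type from the coherence sketch; the labels mirror the resolved/unresolved annotation values you’ll see below, but the prompt itself is illustrative:

```typescript
const RESOLUTION_PROMPT = `You are reviewing a multi-turn support conversation.
Decide whether the customer's issue was actually resolved by the end of the
conversation, or whether it trailed off unresolved.
Respond with a single word: "resolved" or "unresolved".

Conversation:
{conversation}`;

export async function evaluateResolution(
  history: ChatMessage[]
): Promise<{ label: string; score: number }> {
  const transcript = history.map((m) => `${m.role}: ${m.content}`).join("\n");

  const response = await client.chat.completions.create({
    model: "gpt-4o-mini",
    messages: [
      { role: "user", content: RESOLUTION_PROMPT.replace("{conversation}", transcript) },
    ],
  });

  const label = (response.choices[0].message.content ?? "").trim().toLowerCase();
  return { label, score: label === "resolved" ? 1.0 : 0.0 };
}
```

Running Session Evaluations
To run the evaluators, iterate over each session’s full conversation history and record both results. A minimal sketch follows; the `sessionHistories` map is a hypothetical stand-in for however you collect per-session transcripts, and in the tutorial the results are recorded as session-level annotations in Phoenix:

```typescript
export async function runSessionEvaluations(
  sessionHistories: Map<string, ChatMessage[]>
): Promise<void> {
  for (const [sessionId, history] of sessionHistories) {
    const coherence = await evaluateCoherence(history);
    const resolution = await evaluateResolution(history);

    // The tutorial attaches these as session-level annotations in Phoenix;
    // here we just print them.
    console.log(sessionId, {
      conversation_coherence: coherence,
      resolution_status: resolution,
    });
  }
}
```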
Viewing and Analyzing Session-Level Evals
Now that we’ve run our session-level evaluators, let’s see how our support bot performs across user sessions. Take one example session:
- Turn 1: The user asks about order ORD-67890. The agent correctly looks up the order and reports it’s processing with a December 15 ETA.
- Turn 2: The user switches topics entirely: “How do I cancel my subscription?” This is an FAQ question, not an order question. The agent handles it via RAG, providing the correct cancellation instructions.
- Turn 3: Here’s the real test. The user says “Back to my order - what’s the carrier?” They don’t repeat the order ID; they just say “my order.” Did the agent remember? Yes. It correctly referenced ORD-67890 and provided the carrier status (pending) without asking the user to repeat themselves.

The session-level annotations confirm what we see:
- conversation_coherence: coherent (score: 1.0) - The explanation notes that “the agent correctly referenced the order ID and consistent details across turns… and also handled the separate subscription question without losing track.”
- resolution_status: resolved (score: 1.0) - The explanation confirms “the agent answered the user’s questions: provided order status and ETA, explained cancellation steps, and clarified that the carrier is currently pending.”
Summary
You’ve used sessions to transform your tracing data from isolated queries into conversation threads. Here are the benefits you’ve realized:

| Without Sessions | With Sessions |
|---|---|
| Individual traces, disconnected | Full conversation history |
| Can’t see context loss | “Bot forgot what I said” is visible |
| Per-turn metrics only | Total tokens, turns to resolution |
| Evaluate single responses | Evaluate entire conversations |
To recap the workflow:
- Add session IDs to your agent (one-time setup)
- Track conversation history between turns
- View sessions in the Phoenix Sessions tab
- Evaluate conversations with coherence and resolution evaluators
- Debug patterns by clicking into problematic sessions
Congratulations!
This marks the end of the tracing tutorial. You’ve learned how to gain observability into your LLM applications:
- Chapter 1: Tracing every LLM call, tool execution, and retrieval
- Chapter 2: Annotating traces with human feedback and LLM-as-Judge
- Chapter 3: Tracking multi-turn conversations as sessions
Next Steps
From here, you might want to explore:
- Exporting Data: Export annotated traces for fine-tuning
- Multimodal Tracing: Tracing for multimodal applications
- Cost Tracking: Track LLM/Agent costs smartly

