The UserBotLatencyObserver measures the time between when a user stops speaking and when the bot starts responding, emitting events for custom handling and optional OpenTelemetry tracing integration.

Features

  • Tracks user speech start/stop timing using VAD frames
  • Measures bot response latency from the actual moment the user stopped speaking
  • Emits on_latency_measured events for custom processing
  • Automatically records latency as OpenTelemetry span attributes when tracing is enabled
  • Automatically resets between conversation turns

Usage

Basic Latency Monitoring

Add latency monitoring to your pipeline and handle the event:
from pipecat.observers.user_bot_latency_observer import UserBotLatencyObserver
from pipecat.pipeline.task import PipelineParams, PipelineTask

latency_observer = UserBotLatencyObserver()

# Called once per turn with the measured latency in seconds
@latency_observer.event_handler("on_latency_measured")
async def on_latency_measured(observer, latency):
    print(f"User-to-bot latency: {latency:.3f}s")

# `pipeline` is your existing Pipeline instance
task = PipelineTask(
    pipeline,
    params=PipelineParams(observers=[latency_observer]),
)

OpenTelemetry Integration

When tracing is enabled, latency measurements are automatically recorded as turn.user_bot_latency_seconds attributes on OpenTelemetry turn spans. No additional configuration is needed.

How It Works

The observer tracks conversation flow through these key events:
  1. User starts speaking (VADUserStartedSpeakingFrame) → Resets latency tracking
  2. User stops speaking (VADUserStoppedSpeakingFrame) → Records timestamp, accounting for VAD stop_secs delay
  3. Bot starts speaking (BotStartedSpeakingFrame) → Calculates latency and emits on_latency_measured event
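
The three steps above can be sketched as a small state machine. This is an illustrative stand-in and not Pipecat's implementation: the LatencyTracker class, its method names, and the vad_stop_secs parameter are hypothetical. It also assumes that "accounting for VAD stop_secs delay" means subtracting the silence window from the time the stop frame arrives.

```python
from typing import Optional


class LatencyTracker:
    """Hypothetical stand-in for the observer's timing logic (not Pipecat code)."""

    def __init__(self, vad_stop_secs: float = 0.0):
        # Assumption: the VAD reports "stopped" only after this much trailing silence.
        self._vad_stop_secs = vad_stop_secs
        self._user_stopped_at: Optional[float] = None

    def on_user_started(self, now: float) -> None:
        # Step 1: a new user utterance discards any pending measurement.
        self._user_stopped_at = None

    def on_user_stopped(self, now: float) -> None:
        # Step 2: back-date the timestamp by the VAD silence window.
        self._user_stopped_at = now - self._vad_stop_secs

    def on_bot_started(self, now: float) -> Optional[float]:
        # Step 3: report the latency once, then reset for the next turn.
        if self._user_stopped_at is None:
            return None
        latency = now - self._user_stopped_at
        self._user_stopped_at = None
        return latency
```

Passing timestamps in explicitly keeps the sketch deterministic; a real observer would read them from a clock as frames arrive.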

Event Handlers

on_latency_measured

Called each time a user-to-bot latency measurement is captured.
from loguru import logger

@latency_observer.event_handler("on_latency_measured")
async def on_latency_measured(observer, latency):
    # latency is a float representing seconds
    logger.info(f"Response latency: {latency:.3f}s")
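
Because the handler receives each measurement as a plain float, it can also feed a running summary instead of logging every value. The sketch below is one possible approach. LatencyStats is a hypothetical helper and not part of Pipecat.

```python
import statistics
from typing import List


class LatencyStats:
    """Hypothetical helper that accumulates latency samples in seconds."""

    def __init__(self) -> None:
        self.samples: List[float] = []

    def add(self, latency: float) -> None:
        # Intended to be called from an on_latency_measured handler.
        self.samples.append(latency)

    def mean(self) -> float:
        # Average latency across all recorded turns.
        return statistics.fmean(self.samples)

    def worst(self) -> float:
        # Slowest response seen so far.
        return max(self.samples)
```

Calling stats.add(latency) inside the on_latency_measured handler would record one sample per conversation turn.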

Deprecated: UserBotLatencyLogObserver

UserBotLatencyLogObserver is deprecated. Use UserBotLatencyObserver directly with its on_latency_measured event handler instead.

Limitations

  • Only measures speech-to-speech latency (not text processing time)
  • Relies on the VAD and bot speaking frames arriving in the order described in How It Works; out-of-order or missing frames can skew the measurement