The UserBotLatencyObserver measures the time between when a user stops speaking and when the bot starts responding, emitting events for custom handling and optional OpenTelemetry tracing integration.
Features
- Tracks user speech start/stop timing using VAD frames
- Measures bot response latency from the actual moment the user stopped speaking
- Emits on_latency_measured events for custom processing
- Automatically records latency as OpenTelemetry span attributes when tracing is enabled
- Automatically resets between conversation turns
Usage
Basic Latency Monitoring
Add latency monitoring to your pipeline and handle the event:
from pipecat.observers.user_bot_latency_observer import UserBotLatencyObserver
from pipecat.pipeline.task import PipelineParams, PipelineTask

latency_observer = UserBotLatencyObserver()

@latency_observer.event_handler("on_latency_measured")
async def on_latency_measured(observer, latency):
    # latency is the measured user-to-bot delay, in seconds
    print(f"User-to-bot latency: {latency:.3f}s")

# `pipeline` is your existing Pipeline instance
task = PipelineTask(
    pipeline,
    params=PipelineParams(observers=[latency_observer]),
)
OpenTelemetry Integration
When tracing is enabled, latency measurements are automatically recorded as turn.user_bot_latency_seconds attributes on OpenTelemetry turn spans. No additional configuration is needed.
How It Works
The observer tracks conversation flow through these key events:
- User starts speaking (VADUserStartedSpeakingFrame) → Resets latency tracking
- User stops speaking (VADUserStoppedSpeakingFrame) → Records timestamp, accounting for VAD stop_secs delay
- Bot starts speaking (BotStartedSpeakingFrame) → Calculates latency and emits on_latency_measured event
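The three events above can be modeled as a small state machine. The following is a minimal sketch in plain Python, not Pipecat's actual implementation: the `LatencyTracker` class and its method names are hypothetical, timestamps are passed in explicitly, and it assumes that "accounting for VAD stop_secs delay" means subtracting `stop_secs` from the time at which the stop event arrives.

```python
from typing import List, Optional


class LatencyTracker:
    """Toy model of the three-event flow (hypothetical, not Pipecat code)."""

    def __init__(self, stop_secs: float = 0.0) -> None:
        # Assumed VAD stop delay: silence required before a stop is reported.
        self.stop_secs = stop_secs
        self._user_stopped_at: Optional[float] = None
        self.measurements: List[float] = []

    def user_started_speaking(self) -> None:
        # A new user utterance discards any pending stop timestamp.
        self._user_stopped_at = None

    def user_stopped_speaking(self, now: float) -> None:
        # Assumption: the stop event arrives stop_secs after speech actually
        # ended, so the real end of speech is backdated by that amount.
        self._user_stopped_at = now - self.stop_secs

    def bot_started_speaking(self, now: float) -> Optional[float]:
        # Without a preceding stop event there is nothing to measure.
        if self._user_stopped_at is None:
            return None
        latency = now - self._user_stopped_at
        self.measurements.append(latency)
        self._user_stopped_at = None  # reset for the next turn
        return latency


tracker = LatencyTracker(stop_secs=0.5)
tracker.user_started_speaking()
tracker.user_stopped_speaking(now=10.0)          # speech really ended at 9.5
first = tracker.bot_started_speaking(now=11.0)   # 11.0 - 9.5
second = tracker.bot_started_speaking(now=12.0)  # no new user stop recorded
```

In this sketch `first` is 1.5 seconds, while `second` is `None` because no user stop preceded the second bot start. That second case illustrates why the measurement depends on frames arriving in the expected order.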
Event Handlers
on_latency_measured
Called each time a user-to-bot latency measurement is captured.
from loguru import logger  # Pipecat logs through loguru

@latency_observer.event_handler("on_latency_measured")
async def on_latency_measured(observer, latency):
    # latency is a float representing seconds
    logger.info(f"Response latency: {latency:.3f}s")
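Beyond logging, a handler can feed each measurement into an aggregate. The following is a minimal sketch of one way to do that. The `LatencyStats` helper is hypothetical and not part of Pipecat, and the sample values are made up to stand in for real handler calls.

```python
import statistics
from typing import Dict, List


class LatencyStats:
    """Hypothetical helper that accumulates latency samples, in seconds."""

    def __init__(self) -> None:
        self.samples: List[float] = []

    def record(self, latency: float) -> None:
        # Store one measurement as delivered to the event handler.
        self.samples.append(latency)

    def summary(self) -> Dict[str, float]:
        # Report count, mean, and worst case over all samples so far.
        if not self.samples:
            return {"count": 0}
        return {
            "count": len(self.samples),
            "mean": statistics.fmean(self.samples),
            "max": max(self.samples),
        }


stats = LatencyStats()
for sample in (0.5, 1.0, 1.5):  # made-up values standing in for handler calls
    stats.record(sample)
summary = stats.summary()
```

To use it with the observer, call `stats.record(latency)` from inside the on_latency_measured handler and read `stats.summary()` when the session ends.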
Deprecated: UserBotLatencyLogObserver
UserBotLatencyLogObserver is deprecated. Use UserBotLatencyObserver directly with its on_latency_measured event handler instead.
Limitations
- Only measures speech-to-speech latency (not text processing time)
- Requires frames to arrive in the expected order (user stops speaking, then bot starts speaking) to measure accurately