Observability Setup
Every pipeline run is instrumented out of the box; exporting the data requires the opentelemetry build feature and an OTLP endpoint, as described under Enable OTLP. Each VM stage emits spans and metrics via OTLP, giving you end-to-end visibility across isolated execution boundaries, from pipeline orchestration down to individual tool calls inside each micro-VM.
What's Included
- OTLP traces — Per-box spans, tool call events, pipeline-level trace context.
- Metrics — Token counts, cost, and duration per stage.
- Structured logs — [vm:NAME]-prefixed and trace-correlated for easy filtering.
- Guest telemetry — procfs metrics (CPU, memory) exported from the guest to the host via vsock.
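The [vm:NAME] prefix makes it possible to filter logs per VM with plain string handling. The sketch below assumes each line starts with `[vm:NAME]` followed by the message; the exact log layout is not specified here, and `split_vm_prefix` and `lines_for_vm` are hypothetical helper names, not part of the project.

```rust
/// Splits a log line of the assumed form `[vm:NAME] message` into (NAME, message).
/// Returns None when the line does not start with a `[vm:...]` prefix.
fn split_vm_prefix(line: &str) -> Option<(&str, &str)> {
    let rest = line.strip_prefix("[vm:")?;
    let (name, message) = rest.split_once(']')?;
    if name.is_empty() {
        return None;
    }
    // Tolerate an optional space between the prefix and the message.
    Some((name, message.trim_start()))
}

/// Keeps only the messages emitted by the VM called `vm_name`.
fn lines_for_vm<'a>(log: &'a str, vm_name: &str) -> Vec<&'a str> {
    log.lines()
        .filter_map(split_vm_prefix)
        .filter(|(name, _)| *name == vm_name)
        .map(|(_, message)| message)
        .collect()
}
```

Calling `lines_for_vm(log_text, "data_analyst")` would return only the messages from that stage, which is roughly what a `grep` on the prefix gives you on the command line.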
Enable OTLP
Build with the opentelemetry feature flag:
cargo build --features opentelemetry
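Then set the endpoint environment variable when running. Pass the same feature flag to cargo run; otherwise Cargo rebuilds the binary with default features only:
VOIDBOX_OTLP_ENDPOINT=http://localhost:4317 \
cargo run --features opentelemetry --bin voidbox -- run --file agent.yaml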
Configuration
| Environment Variable | Description |
|---|---|
| VOIDBOX_OTLP_ENDPOINT | OTLP gRPC endpoint (e.g. http://localhost:4317) |
| OTEL_SERVICE_NAME | Service name for traces (default: void-box) |
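To illustrate how the two variables combine, here is a minimal sketch of resolving them with the documented default. It is not the project's actual configuration code: `OtlpConfig`, `resolve_config`, and `config_from_env` are hypothetical names, and treating an unset or empty endpoint as "export disabled" is an assumption.

```rust
use std::env;

/// Telemetry settings resolved from the two documented environment variables.
#[derive(Debug, PartialEq)]
struct OtlpConfig {
    /// None means no endpoint was set (assumed here to leave export off).
    endpoint: Option<String>,
    /// Falls back to "void-box", the default listed in the table.
    service_name: String,
}

/// Builds the config from any lookup function, so it can be exercised
/// without touching the real process environment.
fn resolve_config(lookup: impl Fn(&str) -> Option<String>) -> OtlpConfig {
    // Treat empty or whitespace-only values the same as unset ones.
    let non_empty = |key: &str| lookup(key).filter(|v| !v.trim().is_empty());
    OtlpConfig {
        endpoint: non_empty("VOIDBOX_OTLP_ENDPOINT"),
        service_name: non_empty("OTEL_SERVICE_NAME")
            .unwrap_or_else(|| "void-box".to_string()),
    }
}

/// Convenience wrapper that reads the process environment.
fn config_from_env() -> OtlpConfig {
    resolve_config(|key| env::var(key).ok())
}
```

Passing the lookup as a closure keeps the defaulting logic testable; `config_from_env()` is the variant a binary would call at startup.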
Trace Structure
Traces follow a hierarchical structure from the pipeline level down to individual tool calls within each VM stage:
Pipeline span
  └─ Stage 1 span (box_name="data_analyst")
       ├─ tool_call event: Read("input.json")
       ├─ tool_call event: Bash("curl ...")
       └─ attributes: tokens_in, tokens_out, cost_usd, model
  └─ Stage 2 span (box_name="quant_analyst")
       └─ ...
Each stage span carries attributes for token counts, cost, model used, and duration. Tool call events are recorded as span events within the stage span.
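As a mental model for this hierarchy, the sketch below represents a pipeline trace as plain Rust data and aggregates the per-stage attributes. The type and method names (`PipelineSpan`, `StageSpan`, `ToolCallEvent`, `total_tokens`, and so on) are hypothetical and chosen for illustration; only the attribute names come from the tree above, and duration is left out for brevity.

```rust
/// One tool invocation, recorded as an event on its stage span.
struct ToolCallEvent {
    tool: String,
    argument: String,
}

/// A single VM stage with the attributes named in the trace tree.
struct StageSpan {
    box_name: String,
    model: String,
    tokens_in: u64,
    tokens_out: u64,
    cost_usd: f64,
    tool_calls: Vec<ToolCallEvent>,
}

/// The root span: one per pipeline run, owning its stage spans.
struct PipelineSpan {
    stages: Vec<StageSpan>,
}

impl PipelineSpan {
    /// Sum of input and output tokens across all stages.
    fn total_tokens(&self) -> u64 {
        self.stages.iter().map(|s| s.tokens_in + s.tokens_out).sum()
    }

    /// Sum of the per-stage cost attribute.
    fn total_cost_usd(&self) -> f64 {
        self.stages.iter().map(|s| s.cost_usd).sum()
    }

    /// Renders an indented text outline of the hierarchy.
    fn render(&self) -> String {
        let mut out = String::from("Pipeline span\n");
        for stage in &self.stages {
            out.push_str(&format!("  Stage span (box_name=\"{}\")\n", stage.box_name));
            for call in &stage.tool_calls {
                out.push_str(&format!(
                    "    tool_call event: {}(\"{}\")\n",
                    call.tool, call.argument
                ));
            }
            out.push_str(&format!(
                "    attributes: tokens_in={}, tokens_out={}, cost_usd={}, model={}\n",
                stage.tokens_in, stage.tokens_out, stage.cost_usd, stage.model
            ));
        }
        out
    }
}
```

Because the attributes live on the stage spans, pipeline-level totals such as cost are derived by summing over stages, which is the same aggregation a dashboard query would perform on the exported spans.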
Guest Telemetry
The guest-agent inside each micro-VM periodically reads /proc/stat and /proc/meminfo, then sends TelemetryBatch messages over vsock to the host. On the host side, the TelemetryAggregator ingests these batches and exports them as OTLP metrics.
Guest telemetry gives you per-VM resource utilization without any agent-side instrumentation. CPU and memory metrics flow automatically as long as the guest-agent is running.
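To make the procfs step concrete, here is a rough sketch of how CPU utilization and memory figures can be derived from the contents of /proc/stat and /proc/meminfo. It is not the guest-agent's actual code: `CpuTimes`, `parse_cpu_times`, `cpu_utilization`, and `meminfo_kb` are hypothetical names, and counting iowait as idle time is one common convention, assumed here.

```rust
/// Cumulative CPU tick counters from the aggregate `cpu` line of /proc/stat.
#[derive(Debug, Clone, Copy, PartialEq)]
struct CpuTimes {
    idle: u64,  // idle + iowait ticks
    total: u64, // all ticks from user through steal
}

/// Parses the first `cpu ` line (the sum over all cores).
fn parse_cpu_times(proc_stat: &str) -> Option<CpuTimes> {
    let line = proc_stat.lines().find(|l| l.starts_with("cpu "))?;
    // Fields: user nice system idle iowait irq softirq steal (guest columns skipped,
    // since guest time is already included in user time).
    let fields = line
        .split_whitespace()
        .skip(1)
        .take(8)
        .map(|f| f.parse::<u64>().ok())
        .collect::<Option<Vec<u64>>>()?;
    if fields.len() < 4 {
        return None;
    }
    let idle = fields[3] + fields.get(4).copied().unwrap_or(0);
    Some(CpuTimes { idle, total: fields.iter().sum() })
}

/// Fraction of non-idle time between two samples, in the range 0.0..=1.0.
fn cpu_utilization(prev: CpuTimes, curr: CpuTimes) -> Option<f64> {
    let total = curr.total.checked_sub(prev.total)?;
    let idle = curr.idle.checked_sub(prev.idle)?;
    if total == 0 || idle > total {
        return None;
    }
    Some(1.0 - idle as f64 / total as f64)
}

/// Looks up a field such as "MemTotal" or "MemAvailable" in /proc/meminfo, in kB.
fn meminfo_kb(proc_meminfo: &str, key: &str) -> Option<u64> {
    proc_meminfo.lines().find_map(|line| {
        let (name, rest) = line.split_once(':')?;
        if name != key {
            return None;
        }
        rest.split_whitespace().next()?.parse::<u64>().ok()
    })
}
```

The counters in /proc/stat are cumulative since boot, so utilization only makes sense as a delta between two reads, which is why a periodic sampler is needed. In a real agent the inputs would come from `std::fs::read_to_string("/proc/stat")` and `std::fs::read_to_string("/proc/meminfo")`.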
Playground
The repository includes a ready-to-run observability stack in the playground/ directory, pre-configured with:
- Grafana — Dashboards for pipeline traces and metrics.
- Tempo — Distributed trace backend for OTLP ingestion.
- Prometheus — Metrics collection and storage.
See the playground/ directory in the repository for setup instructions.
