AgentVault accepts OTLP-formatted agent telemetry via a simple HTTP POST endpoint. Any agent that can produce OpenTelemetry spans can report metrics to AgentVault — whether you use the standard OTel SDK, a custom exporter, or the built-in TelemetryReporter from @agentvault/crypto.

What Telemetry Powers

AgentVault uses ingested spans to:
  • Compute trust scores — reliability, error rate, and response time dimensions feed into the agent’s trust tier
  • Populate the observability dashboard — trace visualization, span timelines, and aggregate metrics
  • Feed external collectors — the OTel push export worker forwards spans to any OTLP-compatible backend
Telemetry is agent-scoped. Every ingest request is tied to a hub identity, and all data is tenant-isolated at the database level via Row-Level Security.

Ingest Endpoint

POST https://api.agentvault.chat/api/v1/telemetry/ingest
Content-Type: application/json

Request Body

{
  "hub_id": "<hub-identity-uuid>",
  "spans": [ ...OTLP span objects... ]
}
| Field | Type | Description |
| --- | --- | --- |
| `hub_id` | UUID | The agent's hub identity ID (visible in the AgentVault dashboard under Agent Identity) |
| `spans` | array | List of OTLP-formatted span objects |

Response

{
  "ingested": 3,
  "hub_id": "f47ac10b-58cc-4372-a567-0e02b2c3d479"
}
| Status | Meaning |
| --- | --- |
| 201 | Spans ingested successfully |
| 404 | `hub_id` does not belong to the authenticated tenant |
| 422 | Malformed request body |
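Putting the request together: a minimal Python sketch that builds the ingest payload by hand. The hub ID and span values are placeholders, and the actual POST is shown as a comment since the auth header depends on your setup (see Authentication below).

```python
import time
import uuid

API_BASE = "https://api.agentvault.chat"
HUB_ID = "f47ac10b-58cc-4372-a567-0e02b2c3d479"  # placeholder hub identity ID

def build_ingest_payload(hub_id: str, spans: list) -> dict:
    """Wrap OTLP span objects in the ingest request body."""
    return {"hub_id": hub_id, "spans": spans}

now_ns = time.time_ns()
span = {
    "traceId": uuid.uuid4().hex,        # 32 hex chars
    "spanId": uuid.uuid4().hex[:16],    # 16 hex chars
    "name": "llm.inference",
    "kind": "SPAN_KIND_INTERNAL",
    "startTimeUnixNano": str(now_ns - 1_200_000_000),  # started 1.2 s ago
    "endTimeUnixNano": str(now_ns),
    "status": {"code": 0},
    "attributes": [
        {"key": "ai.agent.llm.model", "value": {"stringValue": "gpt-4o"}},
    ],
}

payload = build_ingest_payload(HUB_ID, [span])
# POST it, e.g. with requests:
# requests.post(f"{API_BASE}/api/v1/telemetry/ingest", json=payload,
#               headers={"X-Api-Key": "av_agent_sk_..."}, timeout=10)
```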

Authentication

The ingest endpoint accepts three authentication methods. The examples on this page use two of them: a bearer token (`Authorization: Bearer av_agent_sk_...`) and an API key header (`X-Api-Key`).

Span Format

The endpoint accepts OTLP camelCase field names. Timestamps may be given either as nanosecond Unix timestamp strings (`startTimeUnixNano`) or as ISO 8601 strings (via the snake_case `start_time` field).
{
  "traceId": "4bf92f3577b34da6a3ce929d0e0e4736",
  "spanId": "00f067aa0ba902b7",
  "parentSpanId": "00f067aa0ba902b6",
  "name": "llm.inference",
  "kind": "SPAN_KIND_INTERNAL",
  "startTimeUnixNano": "1709123456000000000",
  "endTimeUnixNano": "1709123457200000000",
  "status": { "code": 0 },
  "attributes": [
    { "key": "ai.agent.llm.model", "value": { "stringValue": "gpt-4o" } },
    { "key": "ai.agent.llm.latency_ms", "value": { "intValue": 1200 } },
    { "key": "ai.agent.llm.tokens_input", "value": { "intValue": 512 } },
    { "key": "ai.agent.llm.tokens_output", "value": { "intValue": 148 } },
    { "key": "ai.agent.llm.provider", "value": { "stringValue": "openai" } }
  ]
}
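Both timestamp forms can be derived from the same instant. The example span's `startTimeUnixNano` above corresponds to 2024-02-28 12:30:56 UTC; converting a datetime into each accepted form takes only the standard library:

```python
from datetime import datetime, timezone

start = datetime(2024, 2, 28, 12, 30, 56, tzinfo=timezone.utc)

# Nanosecond Unix timestamp (string), for the camelCase startTimeUnixNano field
start_ns = str(int(start.timestamp()) * 1_000_000_000)

# ISO 8601 string, for the snake_case start_time variant
start_iso = start.isoformat().replace("+00:00", "Z")
```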

Semantic Conventions

Use the ai.agent.* namespace for all agent-specific attributes. These conventions power AgentVault’s trust scoring and observability pipeline.
LLM attributes

| Attribute | Type | Description |
| --- | --- | --- |
| `ai.agent.llm.model` | string | Model name (e.g. `gpt-4o`, `claude-3-5-sonnet`) |
| `ai.agent.llm.provider` | string | Provider name (e.g. `openai`, `anthropic`) |
| `ai.agent.llm.latency_ms` | int | End-to-end inference latency in milliseconds |
| `ai.agent.llm.tokens_input` | int | Prompt token count |
| `ai.agent.llm.tokens_output` | int | Completion token count |

Tool attributes

| Attribute | Type | Description |
| --- | --- | --- |
| `ai.agent.tool.name` | string | Tool or function name |
| `ai.agent.tool.success` | bool | Whether the call succeeded |
| `ai.agent.tool.latency_ms` | int | Tool execution latency in milliseconds |

Error attributes

| Attribute | Type | Description |
| --- | --- | --- |
| `ai.agent.error.type` | string | Error class (e.g. `TimeoutError`, `RateLimitError`) |
| `ai.agent.error.message` | string | Human-readable error description |

Task attributes

| Attribute | Type | Description |
| --- | --- | --- |
| `ai.agent.task.name` | string | High-level task name |
| `ai.agent.task.status` | string | Completion status (`completed`, `failed`, `cancelled`) |

Message attributes

| Attribute | Type | Description |
| --- | --- | --- |
| `ai.agent.message.direction` | string | `inbound` or `outbound` |
| `ai.agent.message.type` | string | `text`, `attachment`, `structured`, etc. |

Span status codes

| Code | Meaning |
| --- | --- |
| 0 | OK / unset |
| 1 | OK (explicit) |
| 2 | Error |
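OTLP wraps each attribute value in a typed object (`stringValue`, `intValue`, `boolValue`). A small helper like the following (hypothetical, not part of any AgentVault SDK) keeps that boilerplate out of span-building code:

```python
def to_otlp_attributes(attrs: dict) -> list:
    """Convert plain Python values to OTLP attribute objects."""
    out = []
    for key, value in attrs.items():
        if isinstance(value, bool):      # check bool before int: bool subclasses int
            wrapped = {"boolValue": value}
        elif isinstance(value, int):
            wrapped = {"intValue": value}
        else:
            wrapped = {"stringValue": str(value)}
        out.append({"key": key, "value": wrapped})
    return out

# Example: a tool-call span's attributes
tool_attrs = to_otlp_attributes({
    "ai.agent.tool.name": "web_search",
    "ai.agent.tool.success": True,
    "ai.agent.tool.latency_ms": 340,
})
```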

Integration Examples

Choose between the built-in SDK (recommended) or wiring up the standard OTel SDK directly. If your agent uses @agentvault/crypto or @agentvault/client, the TelemetryReporter class handles span building, OTLP serialization, buffering, and automatic periodic flushing in one object.
npm install @agentvault/crypto
import { TelemetryReporter } from "@agentvault/crypto";

const reporter = new TelemetryReporter({
  apiBase:    "https://api.agentvault.chat",
  hubId:      "f47ac10b-58cc-4372-a567-0e02b2c3d479",
  authHeader: "Bearer av_agent_sk_...",
});

// Start flushing every 30 seconds in the background
reporter.startAutoFlush();

// Report spans with typed helpers -- no OTLP boilerplate
reporter.reportLlmCall({
  model:        "gpt-4o",
  provider:     "openai",
  latencyMs:    1200,
  tokensInput:  512,
  tokensOutput: 148,
});

reporter.reportToolCall({
  toolName:  "web_search",
  latencyMs: 340,
  success:   true,
});

reporter.reportError({
  errorType:    "RateLimitError",
  errorMessage: "429 from OpenAI -- retrying in 5s",
});

// Flush remaining spans before shutdown
await reporter.flush();
reporter.stopAutoFlush();
TelemetryReporter is also integrated automatically in SecureChannel (plugin) and AgentVaultClient (client SDK). Spans are reported as a side-effect of normal messaging operations without any additional setup.

Standard OTel SDK

Use the standard OpenTelemetry SDK with a custom exporter that posts to AgentVault’s ingest endpoint.
1. Install dependencies

pip install opentelemetry-sdk opentelemetry-exporter-otlp-proto-http requests
2. Create a custom exporter

import requests
from opentelemetry.sdk.trace.export import SpanExporter, SpanExportResult

HUB_ID  = "f47ac10b-58cc-4372-a567-0e02b2c3d479"
API_KEY = "av_agent_sk_..."
API_BASE = "https://api.agentvault.chat"

class AgentVaultExporter(SpanExporter):
    """Minimal OTLP-compatible exporter that posts spans to AgentVault."""

    def export(self, spans):
        otlp_spans = []
        for span in spans:
            ctx = span.get_span_context()
            otlp_spans.append({
                "traceId":           format(ctx.trace_id, "032x"),
                "spanId":            format(ctx.span_id, "016x"),
                "name":              span.name,
                "kind":              "SPAN_KIND_INTERNAL",
                "startTimeUnixNano": str(span.start_time),
                "endTimeUnixNano":   str(span.end_time),
                "status":            {"code": 0 if span.status.is_ok else 2},
                "attributes": [
                    {"key": k, "value": {"stringValue": str(v)}}
                    for k, v in span.attributes.items()
                ] if span.attributes else [],
            })

        resp = requests.post(
            f"{API_BASE}/api/v1/telemetry/ingest",
            json={"hub_id": HUB_ID, "spans": otlp_spans},
            headers={"X-Api-Key": API_KEY},
            timeout=10,
        )
        return SpanExportResult.SUCCESS if resp.ok else SpanExportResult.FAILURE

    def shutdown(self):
        pass
3. Initialize the tracer and report spans

from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import SimpleSpanProcessor
from opentelemetry.sdk.resources import Resource
import time

resource = Resource(attributes={"service.name": "my-agent"})
provider = TracerProvider(resource=resource)
provider.add_span_processor(SimpleSpanProcessor(AgentVaultExporter()))
trace.set_tracer_provider(provider)

tracer = trace.get_tracer("my-agent")

# Report an LLM call
with tracer.start_as_current_span("llm.inference") as span:
    span.set_attribute("ai.agent.llm.model",         "gpt-4o")
    span.set_attribute("ai.agent.llm.provider",      "openai")
    span.set_attribute("ai.agent.llm.tokens_input",  512)
    span.set_attribute("ai.agent.llm.tokens_output", 148)
    span.set_attribute("ai.agent.llm.latency_ms",    1200)
    # ... your actual LLM call here ...
    time.sleep(1.2)

# Report a tool call
with tracer.start_as_current_span("tool.execute") as span:
    span.set_attribute("ai.agent.tool.name",       "web_search")
    span.set_attribute("ai.agent.tool.success",    True)
    span.set_attribute("ai.agent.tool.latency_ms", 340)

Query API

Once spans are ingested, retrieve them via the query endpoints (owner auth required).
GET /api/v1/telemetry/{hub_id}?limit=100&since=2026-03-01T00:00:00Z
GET /api/v1/telemetry/{hub_id}/summary
| Parameter | Type | Description |
| --- | --- | --- |
| `limit` | int | Max results (default 100, max 1000) |
| `offset` | int | Pagination offset |
| `span_kind` | string | Filter by kind (`internal`, `client`, `server`, etc.) |
| `trace_id` | string | Filter to a single trace |
| `since` | ISO 8601 | Only return spans after this timestamp |
The /summary endpoint returns aggregate metrics — total spans, error count, error rate, and average duration — useful for quick health checks.
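As a sketch, the query URL can be assembled with the standard library; the filter names come straight from the parameter table, while the hub ID and the commented-out auth header are placeholders:

```python
from urllib.parse import urlencode

API_BASE = "https://api.agentvault.chat"
HUB_ID = "f47ac10b-58cc-4372-a567-0e02b2c3d479"  # placeholder

def telemetry_query_url(hub_id: str, **filters) -> str:
    """Build a telemetry query URL, dropping any unset filters."""
    qs = urlencode({k: v for k, v in filters.items() if v is not None})
    return f"{API_BASE}/api/v1/telemetry/{hub_id}" + (f"?{qs}" if qs else "")

url = telemetry_query_url(HUB_ID, limit=100, since="2026-03-01T00:00:00Z",
                          trace_id=None)  # unset filters are omitted
# GET this URL with owner credentials, e.g.:
# requests.get(url, headers={"Authorization": "Bearer ..."}, timeout=10)
```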

Rate Limits

Telemetry ingest shares the standard API rate limit: 60 requests per minute per API key. Batch multiple spans into a single request to stay well under the limit.
The built-in TelemetryReporter buffers spans and flushes them in one POST every 30 seconds, keeping you safely under the limit without any manual batching.
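If you are not using TelemetryReporter, the same batching behavior is straightforward to approximate. A minimal sketch (the size and interval thresholds here are illustrative choices, not API requirements):

```python
import time

class SpanBuffer:
    """Collect spans and flush them in one POST per interval,
    staying well under the 60 requests/minute rate limit."""

    def __init__(self, flush_interval_s: float = 30.0, max_spans: int = 500):
        self.flush_interval_s = flush_interval_s
        self.max_spans = max_spans
        self._spans = []
        self._last_flush = time.monotonic()

    def add(self, span: dict) -> bool:
        """Buffer a span; return True if a flush is now due."""
        self._spans.append(span)
        return (len(self._spans) >= self.max_spans
                or time.monotonic() - self._last_flush >= self.flush_interval_s)

    def drain(self) -> list:
        """Take all buffered spans; the caller POSTs them in one request."""
        batch, self._spans = self._spans, []
        self._last_flush = time.monotonic()
        return batch
```

Call `add()` from your instrumentation; whenever it returns `True`, `drain()` the buffer and send the whole batch in a single ingest request.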