
Overview

AgentVault’s telemetry data model is OTel-compatible from day one. Every audit event conforms to the OpenTelemetry Log Data Model, so it can be exported to any OTel-compatible backend (Splunk, Datadog, Grafana) without schema changes.

Design Principles

  1. OTel-compatible — Every event maps directly to an OTel LogRecord.
  2. Postgres-first — No external observability stack required. JSONB columns provide flexibility and queryability.
  3. Trace context as conversation context — Conversation IDs map to OTel Trace IDs. Message exchanges map to Span IDs.
  4. Tamper-evident — SHA-256 hash chain provides cryptographic integrity without blockchain overhead.
  5. Schema-versioned — Every envelope carries a version so types can evolve without breaking integrations.
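The schema-versioning principle implies a compatibility gate at ingest. A minimal sketch, assuming semantic versioning where only a major-version bump is breaking (the `SUPPORTED_MAJOR` constant is illustrative, not part of the spec):

```python
SUPPORTED_MAJOR = 1  # illustrative: the envelope major version this deployment accepts

def is_compatible(envelope_version: str) -> bool:
    """Accept envelopes whose major version matches; minor/patch bumps are additive."""
    major = int(envelope_version.split(".")[0])
    return major == SUPPORTED_MAJOR
```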

Entity Identity Model

Hierarchy

Tenant       -- Organization or individual account
  |
  +-- User   -- Human participant (owner, team member)
  |
  +-- Agent  -- AI agent registered to the tenant

Identity Schema

{
  "entity_id": "agt_7f3a2b9c-1234-5678-abcd-ef0123456789",
  "entity_type": "agent | user | system",
  "tenant_id": "tnt_a1b2c3d4-5678-90ab-cdef-1234567890ab",
  "display_name": "Claude Code Assistant",
  "hub_address": "claude-coder.alice.agentvault.hub",
  "capabilities": ["code_review", "deployment", "testing"],
  "trust_level": "tenant_verified | federation_verified | unverified"
}

OTel Resource Attributes

These map to OTel semantic conventions for Resource:
| OTel Attribute        | AgentVault Field | Example                  |
| --------------------- | ---------------- | ------------------------ |
| service.name          | (static)         | "agentvault"             |
| service.version       | (static)         | "0.1.0"                  |
| service.instance.id   | entity_id        | "agt_7f3a..."            |
| av.tenant.id          | tenant_id        | "tnt_a1b2..."            |
| av.entity.type        | entity_type      | "agent"                  |
| av.entity.hub_address | hub_address      | "cortina.agentvault.hub" |
| av.trust_level        | trust_level      | "tenant_verified"        |
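The mapping above can be sketched as a small helper that builds the Resource attribute set from an identity record. The function name is illustrative, and in practice service.version would come from the build rather than the identity:

```python
def resource_attributes(identity: dict) -> dict:
    """Build OTel Resource attributes from an AgentVault identity record (sketch)."""
    return {
        "service.name": "agentvault",
        "service.version": "0.1.0",  # in practice: injected at build time
        "service.instance.id": identity["entity_id"],
        "av.tenant.id": identity["tenant_id"],
        "av.entity.type": identity["entity_type"],
        "av.entity.hub_address": identity.get("hub_address"),
        "av.trust_level": identity["trust_level"],
    }
```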

Message Envelope

Every message flowing through AgentVault is wrapped in a transport-agnostic envelope (works over WebSocket, REST, gRPC, or queues).

Envelope Schema

{
  "envelope_version": "1.0.0",
  "message_id": "msg_uuid_v7",
  "trace_id": "a1b2c3d4e5f67890a1b2c3d4e5f67890",
  "span_id": "1234567890abcdef",
  "parent_span_id": "abcdef1234567890",
  "timestamp": "2026-02-16T14:32:00Z",
  "sender": {
    "entity_id": "agt_uuid",
    "entity_type": "agent",
    "hub_address": "cortina.agentvault.hub"
  },
  "recipient": {
    "entity_id": "usr_uuid",
    "entity_type": "user"
  },
  "message_type": "decision_request",
  "priority": "high",
  "payload": { },
  "metadata": { }
}
Required fields: envelope_version, message_id, trace_id, span_id, parent_span_id, timestamp, sender, recipient, message_type, payload.
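A receiver can enforce the required-field rule with a simple set difference; `missing_fields` is an illustrative helper, not part of the spec:

```python
REQUIRED_FIELDS = {
    "envelope_version", "message_id", "trace_id", "span_id", "parent_span_id",
    "timestamp", "sender", "recipient", "message_type", "payload",
}

def missing_fields(envelope: dict) -> set[str]:
    """Return the required envelope fields absent from a candidate envelope."""
    return REQUIRED_FIELDS - envelope.keys()
```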

Trace Context Mapping

| AgentVault Concept   | OTel Concept     | Format                |
| -------------------- | ---------------- | --------------------- |
| Conversation/Room ID | Trace ID         | 32-char hex (128-bit) |
| Message Exchange     | Span ID          | 16-char hex (64-bit)  |
| Reply-to Reference   | Parent Span ID   | 16-char hex (64-bit)  |
| Message Envelope     | Span Event / Log | OTel LogRecord        |

Trace ID generation is deterministic from conversation IDs using a namespace UUID + SHA-256 truncated to 128 bits. The same conversation always produces the same trace ID.

import hashlib
import secrets
import uuid

AGENTVAULT_NAMESPACE = uuid.UUID("a1b2c3d4-e5f6-7890-abcd-ef1234567890")

def conversation_to_trace_id(conversation_id: str) -> str:
    """Deterministic mapping from conversation ID to OTel trace ID."""
    digest = hashlib.sha256(
        AGENTVAULT_NAMESPACE.bytes + conversation_id.encode("utf-8")
    ).digest()
    return digest[:16].hex()  # SHA-256 truncated to 128 bits -> 32-char hex

def generate_span_id() -> str:
    """Random span ID for each message exchange."""
    return secrets.token_hex(8)  # 16-char hex (64-bit), OTel compatible

Message Types

AgentVault defines eight message types. Each has a type-specific payload schema.

Message Type Reference

| Type                | Description                                                 | Priority |
| ------------------- | ----------------------------------------------------------- | -------- |
| decision_request    | Agent requests owner decision (approve/deny/defer)          | high     |
| decision_response   | Owner responds to a decision request                        | normal   |
| status_alert        | Agent reports workflow status (info/warning/error/critical) | varies   |
| artifact_share      | File, code snippet, image, or structured data exchange      | normal   |
| action_confirmation | Agent confirms action result (success/failure/rollback)     | normal   |
| heartbeat           | Agent status pulse with task progress and resource usage    | low      |
| text                | Free-form text message                                      | normal   |
| system_event        | Platform-level event (connect, disconnect, policy)          | normal   |

Payload Examples

decision_request:
{
  "decision_id": "dec_uuid",
  "title": "Deploy v2.3.1 to production?",
  "description": "All tests passing. 3 files changed.",
  "options": [
    {
      "option_id": "approve",
      "label": "Approve Deploy",
      "risk_level": "medium"
    },
    {
      "option_id": "deny",
      "label": "Deny",
      "risk_level": "low"
    },
    {
      "option_id": "defer",
      "label": "Defer 24h",
      "is_default": true,
      "risk_level": "low"
    }
  ],
  "context_refs": [
    { "ref_type": "pull_request", "uri": "https://github.com/org/repo/pull/142", "label": "PR #142" }
  ],
  "deadline": "2026-02-16T18:00:00Z",
  "auto_action": {
    "option_id": "defer",
    "trigger": "deadline_expired"
  }
}
decision_response:
{
  "decision_id": "dec_uuid",
  "selected_option_id": "approve",
  "respondent_note": "Looks good, ship it.",
  "responded_at": "2026-02-16T14:32:00Z",
  "response_method": "interactive"
}
status_alert:
{
  "alert_id": "alt_uuid",
  "severity": "info",
  "category": "workflow",
  "title": "Build pipeline completed",
  "summary": "All 847 tests passed in 3m 42s. Coverage at 91.2%.",
  "detail": {
    "format": "markdown",
    "content": "## Test Results\n- Unit: 612 passed\n- Integration: 235 passed"
  },
  "actionable": false
}
heartbeat:
{
  "agent_status": "working",
  "current_task": {
    "task_id": "tsk_uuid",
    "description": "Running integration test suite",
    "progress_pct": 67,
    "estimated_completion": "2026-02-16T14:45:00Z"
  },
  "resource_usage": {
    "tokens_consumed_24h": 14500,
    "api_calls_24h": 230
  }
}
action_confirmation:
{
  "action_id": "act_uuid",
  "action_type": "deployment",
  "title": "Production deployment completed",
  "status": "success",
  "summary": "v2.3.1 deployed to 3/3 nodes.",
  "detail": {
    "started_at": "2026-02-16T14:33:00Z",
    "completed_at": "2026-02-16T14:35:42Z",
    "duration_ms": 162000
  },
  "triggered_by": {
    "decision_id": "dec_uuid"
  },
  "rollback_available": true,
  "before_state": { "version": "2.3.0" },
  "after_state": { "version": "2.3.1" }
}

Audit Event Schema

Every message generates an audit event that maps directly to the OTel LogRecord data model.

Structure

{
  "audit_event_id": "evt_uuid_v7",
  "timestamp": "2026-02-16T14:32:00.123456789Z",
  "observed_timestamp": "2026-02-16T14:32:00.125000000Z",

  "trace_id": "a1b2c3d4e5f67890a1b2c3d4e5f67890",
  "span_id": "1234567890abcdef",
  "parent_span_id": "abcdef1234567890",
  "trace_flags": 1,

  "severity_number": 9,
  "severity_text": "INFO",

  "body": {
    "event_type": "message_delivered",
    "message_id": "msg_uuid",
    "message_type": "decision_request",
    "summary": "Decision request delivered: Deploy v2.3.1 to production?"
  },

  "resource": {
    "service.name": "agentvault",
    "service.version": "0.1.0",
    "av.tenant.id": "tnt_uuid",
    "av.environment": "production"
  },

  "attributes": {
    "av.sender.entity_id": "agt_uuid",
    "av.sender.entity_type": "agent",
    "av.sender.hub_address": "claude-coder.alice.agentvault.hub",
    "av.recipient.entity_id": "usr_uuid",
    "av.recipient.entity_type": "user",
    "av.conversation.id": "conv_uuid",
    "av.message.type": "decision_request",
    "av.message.priority": "high",
    "av.delivery.latency_ms": 42
  },

  "hash_chain": {
    "event_hash": "sha256:a1b2c3d4...",
    "previous_hash": "sha256:e5f6a7b8...",
    "sequence_number": 10847
  }
}

OTel Field Mapping

| OTel LogRecord Field | AgentVault Field   | Notes                                        |
| -------------------- | ------------------ | -------------------------------------------- |
| Timestamp            | timestamp          | When the event occurred                      |
| ObservedTimestamp    | observed_timestamp | When the event was recorded (broker latency) |
| TraceId              | trace_id           | From message envelope                        |
| SpanId               | span_id            | From message envelope                        |
| TraceFlags           | trace_flags        | 1 = sampled (always for audit)               |
| SeverityNumber       | severity_number    | OTel severity scale (1-24)                   |
| SeverityText         | severity_text      | DEBUG/INFO/WARN/ERROR/FATAL                  |
| Body                 | body               | Event description object                     |
| Resource             | resource           | Service + tenant identification              |
| Attributes           | attributes         | Event-specific key-value pairs               |

Severity Mapping

| Event Type         | OTel Severity | Number |
| ------------------ | ------------- | ------ |
| heartbeat          | DEBUG         | 5      |
| message_delivered  | INFO          | 9      |
| message_read       | INFO          | 9      |
| agent_connected    | INFO          | 9      |
| agent_disconnected | INFO          | 9      |
| decision_made      | INFO          | 10     |
| policy_evaluated   | INFO          | 10     |
| action_executed    | INFO          | 10     |
| error              | ERROR         | 17     |
| security_violation | FATAL         | 21     |
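The severity mapping reduces to a lookup applied when audit events are emitted; a minimal sketch:

```python
SEVERITY = {
    "heartbeat": (5, "DEBUG"),
    "message_delivered": (9, "INFO"),
    "message_read": (9, "INFO"),
    "agent_connected": (9, "INFO"),
    "agent_disconnected": (9, "INFO"),
    "decision_made": (10, "INFO"),
    "policy_evaluated": (10, "INFO"),
    "action_executed": (10, "INFO"),
    "error": (17, "ERROR"),
    "security_violation": (21, "FATAL"),
}

def severity_for(event_type: str) -> tuple[int, str]:
    """Map an audit event type to (severity_number, severity_text)."""
    return SEVERITY[event_type]
```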

Hash Chain

Tamper evidence without blockchain. Each audit event references the hash of the previous event, forming a per-tenant chain.

Hash Computation

The hash covers: previous hash, timestamp, trace context, event body, participants, and sequence number. It excludes: observed_timestamp, delivery metadata, and the hash_chain block itself.

import hashlib
import json

def compute_event_hash(event: dict, previous_hash: str) -> str:
    hash_input = {
        "previous_hash": previous_hash,
        "timestamp": event["timestamp"],
        "trace_id": event["trace_id"],
        "span_id": event["span_id"],
        "body": event["body"],
        "sender": event["attributes"].get("av.sender.entity_id"),
        "recipient": event["attributes"].get("av.recipient.entity_id"),
        "sequence_number": event["hash_chain"]["sequence_number"]
    }
    canonical = json.dumps(hash_input, sort_keys=True, separators=(",", ":"))
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()

GENESIS_HASH = "sha256:" + hashlib.sha256(b"agentvault_genesis_v1").hexdigest()

Chain Verification

def verify_chain(events: list[dict]) -> tuple[bool, int | None]:
    """Returns (True, None) if valid or (False, break_index) if tampered."""
    for i, event in enumerate(events):
        expected_previous = GENESIS_HASH if i == 0 else events[i-1]["hash_chain"]["event_hash"]
        if event["hash_chain"]["previous_hash"] != expected_previous:
            return (False, i)
        recomputed = compute_event_hash(event, expected_previous)
        if event["hash_chain"]["event_hash"] != recomputed:
            return (False, i)
    return (True, None)

Chain Partitioning

Chains are partitioned per tenant to avoid bottlenecks. Within a tenant, events are strictly ordered by sequence_number. Cross-tenant verification is handled via signed checkpoint hashes in the federation phase.
Tenant A: genesis -> evt_1 -> evt_2 -> evt_3 -> ...
Tenant B: genesis -> evt_1 -> evt_2 -> ...
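Appending to a tenant chain means reading that tenant's last hash and sequence number, then writing the new event atomically. An in-memory sketch (Postgres plays the role of `TenantChain` in practice, and the hash input here is simplified relative to compute_event_hash above):

```python
import hashlib
import json

GENESIS_HASH = "sha256:" + hashlib.sha256(b"agentvault_genesis_v1").hexdigest()

class TenantChain:
    """Illustrative in-memory model of one tenant's audit hash chain."""

    def __init__(self) -> None:
        self.events: list[dict] = []

    def append(self, body: dict) -> dict:
        # Link to the previous event (or the genesis hash for the first event).
        previous = self.events[-1]["event_hash"] if self.events else GENESIS_HASH
        seq = len(self.events) + 1
        canonical = json.dumps(
            {"previous_hash": previous, "body": body, "sequence_number": seq},
            sort_keys=True, separators=(",", ":"),
        )
        event = {
            "sequence_number": seq,
            "previous_hash": previous,
            "event_hash": "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest(),
            "body": body,
        }
        self.events.append(event)
        return event
```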

Storage Schema

Core Tables

CREATE TABLE audit_events (
    audit_event_id    UUID PRIMARY KEY,
    tenant_id         UUID NOT NULL REFERENCES tenants(tenant_id),
    sequence_number   BIGINT NOT NULL,
    timestamp         TIMESTAMPTZ NOT NULL,
    observed_timestamp TIMESTAMPTZ NOT NULL DEFAULT now(),
    trace_id          CHAR(32) NOT NULL,
    span_id           CHAR(16) NOT NULL,
    parent_span_id    CHAR(16),
    trace_flags       SMALLINT DEFAULT 1,
    severity_number   SMALLINT NOT NULL,
    severity_text     TEXT NOT NULL,
    body              JSONB NOT NULL,
    resource          JSONB NOT NULL,
    attributes        JSONB NOT NULL,
    event_hash        TEXT NOT NULL,
    previous_hash     TEXT NOT NULL,
    UNIQUE (tenant_id, sequence_number)
);
Indexes:
  • (tenant_id, sequence_number) — Chain traversal
  • (trace_id) — Conversation/room lookup
  • (tenant_id, timestamp) — Time-range queries
  • (tenant_id, severity_number) WHERE severity_number >= 17 — Error/critical filtering
  • GIN on body and attributes — JSONB content search

Key Query Patterns

-- Full audit trail for a conversation
SELECT * FROM audit_events
WHERE trace_id = :trace_id
ORDER BY sequence_number ASC;

-- Critical/error events for tenant in last 24h
SELECT * FROM audit_events
WHERE tenant_id = :tenant_id
  AND severity_number >= 17
  AND timestamp > now() - interval '24 hours'
ORDER BY timestamp DESC;

-- Hash chain integrity verification
SELECT
    sequence_number,
    event_hash,
    previous_hash,
    LAG(event_hash) OVER (ORDER BY sequence_number) as expected_previous
FROM audit_events
WHERE tenant_id = :tenant_id
ORDER BY sequence_number;

API Endpoints

Audit Trail

GET /v1/audit/trace/{trace_id}
GET /v1/audit/tenant?since=...&until=...&severity_min=17
GET /v1/audit/entity/{entity_id}?limit=100

Chain Verification

POST /v1/audit/verify
{
  "tenant_id": "tnt_uuid",
  "from_sequence": 1,
  "to_sequence": 10847
}
Response:
{
  "valid": true,
  "events_verified": 10847,
  "first_hash": "sha256:...",
  "last_hash": "sha256:...",
  "verified_at": "2026-02-16T15:00:00Z"
}

Pending Decisions

GET /v1/decisions/pending?user_id={user_id}

Phase Evolution

The data model is designed to grow without schema changes across five phases.

Phase 1: 1:1 Human-Agent Chat

  • trace_id = conversation_id (1:1 mapping)
  • span_id = each message exchange
  • Single chain per tenant
  • Postgres storage, direct queries
  • No external OTel infrastructure needed

Phase 2: Multi-Agent Rooms

  • trace_id = room_id
  • Multiple agents produce spans within the same trace
  • parent_span_id links replies to triggers
  • New event types: agent_joined_room, agent_left_room

Phase 3: Agent-to-Agent Direct

  • Trace context propagates across agent boundaries
  • OTel SpanLinks for delegation chains referencing original traces
  • Policy evaluation events added to audit log
  • OTel Collector deployed as export sidecar

Phase 4: Team Workspaces

  • Resource attributes expand: av.team.id, av.workspace.id
  • RBAC audit events: permission_granted, permission_denied, role_changed
  • OTel export to customer backends becomes a selling point

Phase 5: Federation

  • Cross-tenant trace context propagation (W3C Trace Context headers)
  • Signed checkpoint hashes between federated tenants
  • Full OTel Collector deployment with multi-backend export

OTel Export Path

When ready, adding OTel export requires zero schema changes because the data model is already OTel-compatible:

async def emit_audit_event(event: AuditEvent):
    await db.insert_audit_event(event)
    # Phase 3+: also forward the same event to an OTel Collector over OTLP.
    # No transformation step is needed -- fields already match the LogRecord model.
OTel Collector configuration for customer export:
# otel-collector-config.yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317

processors:
  batch:
    timeout: 5s
  filter:
    logs:
      include:
        match_type: strict
        resource_attributes:
          - key: av.tenant.id
            value: "tnt_enterprise_customer"

exporters:
  splunkhec:
    endpoint: "https://customer-splunk.example.com:8088"
  datadog:
    api:
      key: "${DD_API_KEY}"

service:
  pipelines:
    logs:
      receivers: [otlp]
      processors: [batch, filter]
      exporters: [splunkhec, datadog]
Compatible with OpenTelemetry Specification v1.29+.