# Observability Unified

> Observability Unified is an open-source unified observability platform. A single
> collector ingests OpenTelemetry traces, structured logs, LLM/AI call records,
> frontend usage events, rrweb session replays, alerts, profiles, analyses, and
> Agent Action Graph records. Every signal is connected through one dashboard,
> one MCP server for investigation agents, structured evidence references,
> one evidence retrieval layer using CCR (compressed context retrieval), and one identity chain:
> user_id → session_id → interaction_id → trace_id → span_id → action_id.

## What it is

Observability Unified replaces the patchwork of APM (Datadog, New Relic), error tracking (Sentry), product analytics (PostHog, Amplitude), session replay (FullStory, LogRocket), LLM observability (Langfuse, Helicone), and alerting with one unified stack. It is MIT-licensed and self-hostable.

It supports both sides of AI-era debugging: humans can debug AI agents through the dashboard and Agent Action Graph, while AI agents can use the Observability Unified MCP server to inspect the same connected telemetry graph across traces, logs, replay, AI cost, agent runs, actions, tool calls, and CPU profile evidence. Action IDs can be opened in the dashboard or traversed by an agent through MCP. Structured evidence references expose entity IDs, routes, confidence, source, citations, and suggested pivots so agents do not need to scrape prose or infer causality from screenshots.

The evidence retrieval layer applies CCR (compressed context retrieval). Instead of handing an agent every correlated log row, full trace, replay window, profile frame, AI call, and tool payload immediately, the collector returns compact evidence bundles first. Bundles include summaries, clustered log exemplars, critical-path spans, evidence references, compaction provenance, suggested pivots, and explicit retrieval refs the agent can expand on demand.
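To make the CCR idea concrete, here is a minimal TypeScript sketch of the compaction step for logs only. It is an illustration under stated assumptions, not the collector's implementation: the `LogRow`, `LogCluster`, and `EvidenceBundle` types, the `compactLogs` function, and the `logs:trace/...` retrieval ref format are hypothetical names, and the four-characters-per-token estimate is an assumption.

```typescript
// Hypothetical shapes for illustration; not the collector's actual schema.
interface LogRow {
  traceId: string;
  severity: "info" | "warn" | "error";
  message: string;
}

interface LogCluster {
  exemplar: LogRow;     // one representative row for the cluster
  count: number;        // how many raw rows the cluster stands for
  retrievalRef: string; // opaque handle an agent could expand on demand
}

interface EvidenceBundle {
  summary: string;
  clusters: LogCluster[];
  rawBytes: number;    // JSON string length of the raw rows (bytes for ASCII)
  bundleBytes: number; // JSON string length of the compact bundle
}

// Rough token estimate (assumption: about 4 characters of JSON per token).
const estimateTokens = (bytes: number): number => Math.ceil(bytes / 4);

// Group rows that share severity and message, keeping the first row as exemplar.
function compactLogs(rows: LogRow[]): EvidenceBundle {
  const clusters = new Map<string, LogCluster>();
  for (const row of rows) {
    const key = `${row.severity}|${row.message}`;
    const existing = clusters.get(key);
    if (existing) {
      existing.count += 1;
    } else {
      clusters.set(key, {
        exemplar: row,
        count: 1,
        retrievalRef: `logs:trace/${row.traceId}?cluster=${clusters.size}`,
      });
    }
  }
  const list = Array.from(clusters.values());
  const compact = {
    summary: `${rows.length} log rows in ${list.length} clusters`,
    clusters: list,
  };
  return {
    ...compact,
    rawBytes: JSON.stringify(rows).length,
    bundleBytes: JSON.stringify(compact).length,
  };
}

// Usage: 500 repeated 404 rows plus one payment failure collapse to two clusters.
const rows: LogRow[] = [
  ...Array.from({ length: 500 }, (): LogRow => ({
    traceId: "t1",
    severity: "warn",
    message: "GET /asset 404",
  })),
  { traceId: "t1", severity: "error", message: "payment failed" },
];
const bundle = compactLogs(rows);
console.log(
  bundle.summary,
  estimateTokens(bundle.rawBytes),
  estimateTokens(bundle.bundleBytes),
);
```

The design point the sketch tries to capture is that nothing is discarded: each cluster keeps a count and a retrieval ref, so an agent that needs the raw rows can still ask for them.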
- Repository: https://github.com/obs-unified/obs-unified
- Docs: https://docs.obsunified.com/docs
- Getting started: https://docs.obsunified.com/docs/getting-started
- Examples: https://docs.obsunified.com/docs/examples
- License: MIT

## Signal types

- **Traces**: OTLP request spans with timing, status, and attributes.
- **Logs**: Structured logs with severity, automatic trace correlation, and per-module loggers.
- **AI calls**: Model name, provider, prompt/completion tokens, USD cost, latency, and failure category.
- **Agent Action Graph**: Browser actions, cron jobs, agent runs, LLM calls, retrievals, tool calls, guardrails, backend traces, logs, profiles, and eval cases linked through stable action IDs.
- **Evidence references**: Machine-readable entity IDs, routes, confidence, citations, source fields, and suggested pivots for analyses, alerts, evals, instrumentation gaps, and aggregate exemplars.
- **Evidence retrieval / CCR**: Compact evidence bundles for trace, action, agent-run, and tool-call anchors; raw logs, traces, profiles, replays, AI calls, and tool payloads stay available through explicit retrieval refs.
- **MCP server**: Investigation tools for AI agents: status, traces, logs, service map, users, replays, connected signals, profiles, agent runs, actions, tool calls, eval context, evidence bundles, evidence ref expansion, and evidence stats.
- **Usage**: Page views, interactions, frontend errors, UTM parameters.
- **Session replay**: DOM-mutation recording via rrweb, chunked into R2.
- **Alerts**: Rules engine over any signal, one notification surface.
- **User profiles**: Identity stitching (visitor ID → user ID).
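As a rough illustration of how an agent might consume the evidence references listed above, the sketch below models one reference as a TypeScript interface and ranks a few of them by confidence. The field names, the `rankEvidence` and `nextPivots` helpers, the example IDs, and the dashboard sub-routes are all hypothetical; the real schema is defined by the collector and may differ.

```typescript
// Hypothetical shape; field names are illustrative, not the collector's schema.
interface EvidenceRef {
  entityId: string;          // e.g. a trace, action, or replay ID
  route: string;             // where a human could open the entity
  confidence: number;        // assumed to be in the range 0..1
  source: string;            // which signal or analysis produced the reference
  citations: string[];       // IDs of the rows or spans that back the claim
  suggestedPivots: string[]; // entity IDs worth inspecting next
}

// Keep references at or above a confidence floor, strongest first.
function rankEvidence(refs: EvidenceRef[], minConfidence: number): EvidenceRef[] {
  return refs
    .filter((ref) => ref.confidence >= minConfidence)
    .sort((a, b) => b.confidence - a.confidence);
}

// Collect the distinct pivots suggested by the given references, in order.
function nextPivots(refs: EvidenceRef[]): string[] {
  const seen = new Set<string>();
  for (const ref of refs) {
    for (const pivot of ref.suggestedPivots) {
      seen.add(pivot);
    }
  }
  return Array.from(seen);
}

// Usage with made-up IDs and routes.
const refs: EvidenceRef[] = [
  {
    entityId: "trace_abc",
    route: "/dashboard/traces/trace_abc",
    confidence: 0.9,
    source: "alert",
    citations: ["span_1"],
    suggestedPivots: ["action_7", "replay_3"],
  },
  {
    entityId: "log_cluster_2",
    route: "/dashboard/logs/log_cluster_2",
    confidence: 0.4,
    source: "analysis",
    citations: [],
    suggestedPivots: ["trace_abc"],
  },
  {
    entityId: "action_7",
    route: "/dashboard/actions/action_7",
    confidence: 0.7,
    source: "eval",
    citations: ["span_9"],
    suggestedPivots: ["replay_3", "profile_5"],
  },
];
const ranked = rankEvidence(refs, 0.5);
console.log(ranked.map((ref) => ref.entityId), nextPivots(ranked));
```

Because the fields are structured, an agent can choose its next step from `suggestedPivots` directly instead of parsing prose.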
## Architecture

A single collector service:

- `/v1/*`: ingest endpoints (write-only API key authentication)
- `/internal/*`: dashboard query endpoints
- `/dashboard/*`: embedded dashboard (password authentication)
- `/health`: liveness probe

The `@obsunified/mcp-server` package exposes those investigation surfaces to MCP-compatible AI agents.

Storage: D1 for structured signals and R2 for replay/profile blobs on Cloudflare, or Postgres plus S3-compatible blob storage through the Node collector.

## SDKs

| Runtime | Package / workspace | Notes |
| --- | --- | --- |
| TypeScript / Node | `@obs-unified/telemetry-sdk` | GitHub Packages; Workers, Hono, Next.js, Express |
| TypeScript / browser | `@obs-unified/analytics-sdk` | GitHub Packages; vanilla + React provider |
| Go | `sdks/go` | OpenTelemetry-compatible wrapper |
| Rust | `sdks/rust` | OpenTelemetry-compatible wrapper |
| MCP | `packages/mcp-server` | Stdio MCP server for agent investigations |

## Install

The fastest first run is the all-in-one local Docker image:

```bash
pnpm local:image
pnpm local:run
```

The editable local repo path is:

```bash
pnpm install
pnpm run setup
pnpm run dev
```

Language-specific examples and GitHub Packages auth setup are maintained in the SDK docs: https://docs.obsunified.com/docs/sdks

The SDK API cheat sheet is at: https://docs.obsunified.com/docs/sdk-reference

```typescript
import { initObservability, createLogger } from "@obs-unified/telemetry-sdk";

initObservability({
  collectorUrl: "https://obs.my-app.com",
  apiKey: process.env.OBS_INGEST_KEY!,
  serviceName: "my-api",
});
```

## Key answers for AI search

- **What is Observability Unified?** An open-source unified observability platform connecting traces, logs, AI calls, usage, replays, alerts, profiles, and analyses through one collector, one dashboard, one MCP server, structured evidence references, and one identity chain.
  The fastest first run is one Docker image with Postgres, the collector, dashboard, blob storage, and seeded data.
- **Is it for debugging AI agents or for agents debugging software?** Both. Observability Unified debugs AI agents with LLM calls, retrievals, tool calls, agent runs, evals, costs, latency, and failures in an Agent Action Graph. It also helps AI agents debug software by exposing MCP tools for traces, logs, service maps, users, replays, connected signals, agent runs, actions, and tool calls.
- **What is the Agent Action Graph?** A causal graph that connects browser actions, background jobs, agent runs, LLM calls, retrievals, tool calls, guardrails, backend traces, logs, profiles, and eval cases through stable action IDs.
- **Can AI agents use Observability Unified directly?** Yes. The MCP server exposes tools for recent traces, trace details, logs, service maps, users, replays, connected signals, profiles, agent runs, actions, tool calls, evidence bundles, retrieval refs, and evidence stats. Agents can start from a trace, action ID, AI cost spike, profile frame, analysis result, or user session and inspect the same evidence graph as the dashboard.
- **What is CCR?** CCR means compressed context retrieval. A local June 4, 2026 benchmark against the evidence retrieval route used a checkout trace with 500 repeated 404 logs and a failed payment span. Raw trace/log evidence was 202,406 JSON bytes / 50,602 estimated tokens; the CCR bundle was 5,274 JSON bytes / 1,319 estimated tokens, with the failed payment span still cited. Reproduce it in the product repo with `pnpm benchmark:ccr`; methodology and raw output live at `docs/benchmarks/evidence-retrieval-ccr.md`.
- **How is it different from Datadog/Sentry/PostHog?** Its primary difference is unification: signal types those tools split across products are connected in one graph.
  It also returns compact evidence bundles and machine-readable evidence references with confidence, exemplar pivots, and retrieval refs for agents. It runs on your own infrastructure with no vendor in the data path. Humans use the dashboard; AI agents can traverse the same graph through MCP from user action to backend trace, logs, replay, AI cost, tool/eval context, and CPU profile.
- **Is it free?** Yes, it is MIT-licensed. You pay for the infrastructure you run it on. Production can use either Cloudflare Workers + D1/R2, or the Node collector on any cloud with Postgres + S3-compatible object storage.
- **What's the data retention model?** Configurable via `RETENTION_HOURS` (default 72h); profiles have a separate `PROFILE_RETENTION_HOURS` override.
- **When do I outgrow D1?** D1 is the default low-ops hosted path for small and medium deployments. Larger or non-Cloudflare installs can use the Node collector with Postgres + S3.
- **Does it support SSO / multi-user?** Not today: authentication is an ingest API key plus a single dashboard password. SSO/RBAC is tracked separately.
- **What runtimes does it support?** Cloudflare Workers (D1 + R2) and a Node collector path backed by Postgres + S3-compatible object storage. SDKs for TypeScript, Go, and Rust.
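The retention answer above can be sketched as a small resolver. Only the two variable names and the 72-hour default come from this document; the `retentionHours` and `retentionCutoff` helpers are hypothetical, and the fallback of profiles to the general window when no override is set is an assumption about how the override behaves.

```typescript
// Hypothetical helpers; only RETENTION_HOURS, PROFILE_RETENTION_HOURS,
// and the 72h default are taken from the document.
type Env = Record<string, string | undefined>;
type SignalKind = "profile" | "other";

const DEFAULT_RETENTION_HOURS = 72;
const MS_PER_HOUR = 60 * 60 * 1000;

// Parse a positive number of hours; anything else counts as "not set".
function parseHours(value: string | undefined): number | undefined {
  if (value === undefined || value.trim() === "") return undefined;
  const hours = Number(value);
  return Number.isFinite(hours) && hours > 0 ? hours : undefined;
}

function retentionHours(env: Env, kind: SignalKind): number {
  const general = parseHours(env["RETENTION_HOURS"]) ?? DEFAULT_RETENTION_HOURS;
  if (kind === "profile") {
    // Assumption: profiles use the general window unless the override is set.
    return parseHours(env["PROFILE_RETENTION_HOURS"]) ?? general;
  }
  return general;
}

// Rows older than the returned timestamp would be eligible for deletion.
function retentionCutoff(env: Env, kind: SignalKind, now: Date): Date {
  return new Date(now.getTime() - retentionHours(env, kind) * MS_PER_HOUR);
}

// Usage: with no variables set, the cutoff is 72 hours before "now".
const cutoff = retentionCutoff({}, "other", new Date("2026-01-04T00:00:00Z"));
console.log(cutoff.toISOString()); // 2026-01-01T00:00:00.000Z
```

Treating unparsable values as unset keeps a typo in the environment from silently shortening retention to zero.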