Introducing Atatus MCP Server: Connect AI Agents to Your Observability Data

Published: Jun 30, 2026 | Updated: Jun 30, 2026 | 13 min read | Tags: observability, AI observability, distributed tracing, distributed tracing tools

AI coding assistants like Claude, Cursor, Codex, and GitHub Copilot have become standard tools in the modern engineering workflow. Developers use them to write code, generate tests, and review pull requests. But when something breaks in production, these assistants hit a wall: they have no access to your actual system state.

They can reason about logs, traces, and metrics. They just can't see yours.

This is the problem Atatus MCP Server solves. Using the Model Context Protocol (MCP), an open standard for connecting AI agents to external data sources, Atatus MCP Server gives AI assistants secure, structured access to your observability platform. Instead of switching between dashboards, you can have your AI agent query production telemetry directly and give you answers grounded in real data.

Try Atatus MCP Server Free

Connect Claude or Cursor to your observability data in under 5 minutes.

Start Free Trial →

What Is Atatus MCP Server?

Atatus MCP Server is a remote server that implements the Model Context Protocol, exposing Atatus observability data as structured tools that any MCP-compatible AI client can call.

When you connect Claude Code or Cursor to Atatus MCP Server, the AI assistant gains the ability to query your logs, search distributed traces, fetch infrastructure metrics, read active alerts, and retrieve error groups, all within a single conversation and without manually navigating the Atatus UI.

Instead of:

Opening a browser tab
Navigating to the right service
Adjusting the time range
Finding the trace
Copying the error to your IDE

You ask your AI agent directly:

Example Prompt:

"Why did the checkout API latency spike after the 3pm deployment? Show me the slow traces and any related errors."

The agent calls the relevant Atatus MCP tools, synthesizes the data, and surfaces the root cause with spans, stack traces, and query context directly in your editor.

Why Developers Need It

The biggest bottleneck in incident response is not intelligence; it is context. Your team understands distributed systems. Your AI assistant can reason about them. The gap is that AI agents have no access to what is actually happening in your production environment.

Here is what that costs in practice:

Problem	Without Atatus MCP	With Atatus MCP Server
Context switching	Toggle between IDE, dashboards, and logs	Query observability data directly from your editor
Incident investigation	Manually correlate traces, logs, and errors	AI correlates signals automatically across all data sources
Onboarding	New engineers learn dashboards before they can debug	Ask questions in natural language from day one
Alert triage	Open Atatus, find the alert, and click through to the relevant context	"Summarize today's alerts and their status" in one prompt
Post-mortems	Reconstruct timelines manually across multiple tools	AI builds the incident timeline automatically from logs and trace data

How It Works

The architecture is straightforward. Your AI client communicates with Atatus MCP Server over MCP. The server authenticates the request, calls the relevant Atatus APIs, and returns structured data the AI can reason about.

Component Breakdown

MCP Client - the AI assistant running inside Claude Code, Cursor, or VS Code. It discovers available tools by requesting the tool manifest from Atatus MCP Server and calls them as needed during a conversation.
Atatus MCP Server - implements the MCP server specification. It handles tool routing, request validation, authentication, response formatting, and result truncation to fit AI context windows. Runs as a remote SSE endpoint for cloud deployments.
Authentication layer - validates every request before it reaches Atatus APIs. Supports OAuth 2.0 tokens and API keys. RBAC policies determine which tools a given credential can call.
Atatus Platform APIs - the underlying REST APIs powering the Atatus observability platform. MCP Server sits in front of them, translating the structured tool calls the AI client derives from your natural-language prompt into Atatus API requests and returning formatted responses.
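
At the protocol level, tool discovery and tool invocation are JSON-RPC 2.0 messages. The sketch below builds the two requests an MCP client would send: `tools/list` to discover tools and `tools/call` to invoke one. The method names and envelope come from the MCP specification; the `search_logs` argument names (`query`, `service`, `since_minutes`) are assumptions made for illustration, since the real parameter schema is whatever the server's tool list advertises.

```python
import json
from itertools import count
from typing import Optional

_ids = count(1)  # JSON-RPC request ids should be unique within a session


def jsonrpc_request(method: str, params: Optional[dict] = None) -> dict:
    """Build a JSON-RPC 2.0 request envelope, the message format MCP uses."""
    message = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        message["params"] = params
    return message


def list_tools_request() -> dict:
    """Ask the server which tools it exposes."""
    return jsonrpc_request("tools/list")


def call_tool_request(name: str, arguments: dict) -> dict:
    """Invoke one tool by name with structured arguments."""
    return jsonrpc_request("tools/call", {"name": name, "arguments": arguments})


discover = list_tools_request()
invoke = call_tool_request(
    "search_logs",
    # Hypothetical argument names; consult the advertised tool schema.
    {"query": "connection refused", "service": "payments", "since_minutes": 30},
)
print(json.dumps(invoke, indent=2))
```

In practice your AI client builds and sends these messages for you; the point is that what reaches the server is already structured, not free-form text.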

Transport:

Atatus MCP Server supports both stdio (for local, process-based clients) and Server-Sent Events (SSE, for remote, cloud-hosted connections). Most production deployments use SSE with the remote endpoint URL.
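
A remote connection is usually declared as a small JSON block in the client's MCP settings. The sketch below generates one in the `mcpServers` shape that several clients read; exact keys vary by client. The endpoint URL, the `Authorization: Bearer` header scheme, and the `ATATUS_API_KEY` variable name are placeholders, not documented Atatus values, so take the real ones from the setup documentation.

```python
import json
import os


def build_mcp_config(server_url: str, api_key: str) -> dict:
    """Return a client-side MCP server entry for a remote endpoint."""
    return {
        "mcpServers": {
            "atatus": {
                "url": server_url,
                # Assumed header scheme; the real one may differ.
                "headers": {"Authorization": f"Bearer {api_key}"},
            }
        }
    }


config = build_mcp_config(
    "https://mcp.example.invalid/sse",  # placeholder URL, not a real endpoint
    os.environ.get("ATATUS_API_KEY", "<your-api-key>"),  # hypothetical variable name
)
print(json.dumps(config, indent=2))
```

Reading the key from the environment keeps it out of any file you might commit.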

Supported AI Clients

AI Client	Connection Type	MCP Support	Notes
Claude Code	SSE / stdio	✅ Native	Best-in-class MCP support; recommended for production.
Claude Desktop	stdio	✅ Native	Full tool calling; configure via `claude_desktop_config.json`.
Cursor	SSE	✅ Native	Add via Cursor Settings → MCP Servers.
OpenAI Codex	SSE	✅ Supported	MCP tool use via Codex function-calling bridge.
VS Code	SSE	✅ Extension	Requires the MCP extension from the VS Code Marketplace.
Windsurf	SSE	✅ Native	Configure in the Windsurf MCP settings panel.
Continue.dev	stdio / SSE	✅ Native	Add as a context provider in `config.json`.
Any MCP client	SSE / stdio	✅ Compatible	Works with any client implementing the MCP specification.

Authentication & Security

Every request to Atatus MCP Server is authenticated before it reaches your data. We support two authentication methods and enforce RBAC at the tool level.

Authentication Methods

Method	Recommended For	Setup
API Key	Individual developer setups, CI/CD	Generate in Atatus → Settings → API Keys
OAuth 2.0	Enterprise multi-user deployments	Configure an OAuth application in Atatus → Settings → OAuth

RBAC & Permissions

Create dedicated service accounts for your AI agents. Assign each account only the tool categories it needs. A CI-based Codex agent checking deployment status does not need access to raw log search.

Role	Tool Access	Use Case
Read-only Viewer	All read tools, no mutations	Developer debugging in IDE
APM Reader	Traces, errors, and APM metrics only	Backend service teams
Infra Reader	Infrastructure, Kubernetes, and host metrics	Platform engineering teams
Incident Responder	Alerts, incidents, logs, and traces	On-call SRE workflows
Full Read Access	All available tools	Engineering managers and staff engineers
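
To make tool-level RBAC concrete, here is a minimal sketch of the check a server could run before dispatching a tool call. The tool names are the ones listed in this article; grouping them into categories and mapping each role to categories is our reading of the table above, not the actual Atatus policy format.

```python
# Tool names are from this article; the category grouping is an assumption.
TOOLS_BY_CATEGORY = {
    "logs": {"search_logs", "get_log_patterns", "get_log_volume", "tail_logs"},
    "apm": {"get_slow_traces", "get_trace_detail", "get_endpoint_metrics",
            "find_slow_db_queries", "compare_deployments"},
    "infrastructure": {"get_host_metrics", "get_kubernetes_pods",
                       "get_container_metrics", "list_services"},
    "errors": {"get_error_groups", "get_error_detail", "get_error_trend"},
    "alerts": {"get_active_alerts", "get_alert_history",
               "get_incidents", "get_incident_timeline"},
    "dashboards": {"list_dashboards", "get_dashboard_summary"},
}
ALL_CATEGORIES = frozenset(TOOLS_BY_CATEGORY)

# Illustrative mapping derived from the role table above.
ROLE_CATEGORIES = {
    "Read-only Viewer": ALL_CATEGORIES,
    "APM Reader": frozenset({"apm", "errors"}),
    "Infra Reader": frozenset({"infrastructure"}),
    "Incident Responder": frozenset({"alerts", "logs", "apm"}),
    "Full Read Access": ALL_CATEGORIES,
}


def category_of(tool: str):
    """Return the category a tool belongs to, or None if it is unknown."""
    for category, tools in TOOLS_BY_CATEGORY.items():
        if tool in tools:
            return category
    return None


def is_allowed(role: str, tool: str) -> bool:
    """Deny by default: unknown roles and unknown tools are rejected."""
    category = category_of(tool)
    if category is None:
        return False
    return category in ROLE_CATEGORIES.get(role, frozenset())


print(is_allowed("APM Reader", "get_slow_traces"))  # True
print(is_allowed("APM Reader", "search_logs"))      # False
```

The deny-by-default fallbacks matter more than the mapping itself: a mistyped role or a newly added tool should fail closed.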

Available MCP Tools

Atatus MCP Server exposes tools across the capability domains listed below. Each tool accepts structured parameters and returns formatted data sized for AI context windows.

Logs

Tool	Description	Example Prompt
search_logs	Full-text search across log streams with optional service, severity, and time filters.	"Search logs for 'connection refused' in the payments service in the last 30 minutes."
get_log_patterns	Cluster log messages by pattern to automatically surface recurring errors.	"What are the most common error patterns in the auth service today?"
get_log_volume	Return log volume over a time range, optionally grouped by service or severity.	"Show me the error log volume for the API gateway over the last 6 hours."
tail_logs	Return the most recent log lines for a given service or query.	"Show me the last 50 log lines from the order-service."

APM & Distributed Tracing

Tool	Description	Example Prompt
get_slow_traces	Return the slowest traces for a service and time window, including span breakdown.	"Find the slowest checkout API traces from the last hour."
get_trace_detail	Fetch a complete trace with spans, service graph, and timing for a specific trace ID.	"Show me the full trace for trace ID abc123."
get_endpoint_metrics	Return P50, P95, P99 latency, request rate, and error rate for an endpoint.	"What is the P99 latency for POST /api/v2/orders over the last 24 hours?"
find_slow_db_queries	List the slowest database queries observed during a selected time window.	"Find slow database queries in the last hour."
compare_deployments	Compare APM metrics before and after a deployment event.	"Compare API performance before and after the 3 PM deployment."
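
If P50, P95, and P99 are new to you, this short sketch shows what they mean using the nearest-rank method. The latency samples are made up, and nearest-rank is only one common definition; we are not claiming it is how Atatus computes the values returned by get_endpoint_metrics.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Made-up request latencies in milliseconds, with one slow outlier.
latencies_ms = [120, 95, 110, 2300, 105, 98, 130, 101, 99, 115]

for p in (50, 95, 99):
    print(f"P{p}: {percentile(latencies_ms, p)} ms")
```

With these samples the median is 105 ms while P95 and P99 both land on the 2300 ms outlier, which is why tail percentiles are the ones to ask about when users report slowness.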

Infrastructure

Tool	Description	Example Prompt
get_host_metrics	Retrieve CPU, memory, disk I/O, and network metrics for a specific host.	"Show CPU and memory usage for the db-prod-01 host."
get_kubernetes_pods	List Kubernetes pods with status, restart count, and resource usage.	"Which pods are restarting frequently in the payments namespace?"
get_container_metrics	Return container-level CPU, memory, and network usage metrics.	"Show me memory usage for the api-gateway containers."
list_services	Return all monitored services with their current health status.	"List all services and their current health status."

Errors

Tool	Description	Example Prompt
get_error_groups	Return grouped errors by type and frequency for a specific service.	"List the top errors in the order service from the last hour."
get_error_detail	Fetch the full stack trace, affected user count, and first/last seen timestamps for an error group.	"Show me the stack trace and impact of error group #4892."
get_error_trend	View error rate trends over time with deployment correlation.	"Did error rates increase after today's deployment?"

Alerts & Incidents

Tool	Description	Example Prompt
get_active_alerts	List all currently firing alerts with their severity and triggering metric.	"What alerts are currently firing?"
get_alert_history	Retrieve historical alert events for a specified time window.	"Show me all alerts that fired today."
get_incidents	Return recent incidents with their timeline and current status.	"Summarize all incidents from today."
get_incident_timeline	Retrieve the complete event timeline for a specific incident.	"Walk me through what happened during incident INC-442."

Dashboards

Tool	Description	Example Prompt
list_dashboards	List available dashboards by name and tag.	"What dashboards do we have for the payments team?"
get_dashboard_summary	Return current widget values and key metrics from a dashboard.	"Summarize the current state of the production overview dashboard."

Example Workflows

These workflows illustrate scenarios where Atatus MCP Server can reduce mean time to resolution (MTTR) by giving your AI agent the production context it needs.

Workflow 01 · Deployment Regression

"Why did latency spike after the 3pm deployment?"

The agent calls compare_deployments to diff APM metrics before/after, then get_slow_traces to find affected endpoints, then find_slow_db_queries to pinpoint the root cause — often a missing index or an N+1 query introduced in the release.
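
The before/after comparison at the heart of this workflow can be sketched in a few lines. The metric names, the numbers, and the 20% threshold are all hypothetical; this is not the output format of compare_deployments, just the arithmetic behind a regression check.

```python
def percent_change(before: float, after: float) -> float:
    """Relative change from before to after, in percent."""
    if before == 0:
        raise ValueError("cannot compute a relative change from a zero baseline")
    return (after - before) / before * 100


def flag_regressions(before: dict, after: dict, threshold_pct: float = 20.0) -> dict:
    """Return metrics that got worse by more than threshold_pct.

    Assumes every metric is 'higher is worse' (latency, error rate).
    """
    regressions = {}
    for name, old_value in before.items():
        change = percent_change(old_value, after[name])
        if change > threshold_pct:
            regressions[name] = round(change, 1)
    return regressions


# Hypothetical snapshots taken before and after a deployment.
before = {"p95_latency_ms": 250.0, "p99_latency_ms": 400.0, "error_rate_pct": 0.5}
after = {"p95_latency_ms": 260.0, "p99_latency_ms": 900.0, "error_rate_pct": 0.5}

print(flag_regressions(before, after))  # {'p99_latency_ms': 125.0}
```

Here only the P99 latency crosses the threshold, which is the cue to pull slow traces for that endpoint next.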

Workflow 02 · Error Rate Triage

"Which services have the highest error rate right now?"

The agent calls list_services for health status, then get_error_groups on each affected service to surface the specific exceptions, their frequency, and first occurrence time relative to recent deploys.

Workflow 03 · Incident Summary

"Summarize all incidents from today with current status."

The agent calls get_incidents for the day's incident list, then get_incident_timeline on each open incident to build a structured summary with contributing signals, affected services, and resolution status.

Workflow 04 · Kubernetes Pod Health

"Which pods are restarting frequently and why?"

The agent calls get_kubernetes_pods to list restart counts, then get_container_metrics on affected pods to check for OOM conditions, then search_logs to find the exact error messages triggering the restarts.

Workflow 05 · Database Performance

"Find slow database queries affecting the order service."

The agent calls find_slow_db_queries filtered to the order service, retrieves the full query text and execution plans, then correlates with get_slow_traces to identify which API endpoints trigger them.

Workflow 06 · On-Call Handoff

"Give me a status report of the last 8 hours to prepare for handoff."

The agent compiles a structured report covering active alerts, resolved incidents, top error groups, services with degraded P99 latency, and any anomalous infrastructure metrics, in a format ready to paste into a Slack message or runbook.

Enterprise Features

Atatus MCP Server is designed for production engineering teams with compliance, auditability, and multi-tenant requirements.

Feature	Description
Audit Logging	Every tool call is logged with the authenticated identity, tool name, parameters, and response status. Logs are retained according to your data retention policy.
RBAC Enforcement	Enforce tool-level permissions by allowing or denying individual tools or entire categories for each service account.
Multi-tenant Isolation	All requests are scoped to the authenticated account, preventing cross-tenant data access.
Encrypted Transport	All MCP connections use TLS 1.2 or later. Plaintext transport is never permitted.
Read-only Design	MCP tools provide read-only access. AI agents can observe data but cannot modify any Atatus resources.
Session Tracking	Each MCP session is assigned a unique ID, with session metadata available for auditing and security reviews.
IP Allowlisting	Restrict MCP Server access to approved IP ranges or VPN egress IP addresses for additional security.
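
Two of these features are easy to picture in code: the IP allowlist check and the shape of an audit record. The sketch below is an assumption-laden illustration, not the Atatus implementation; the network ranges, field names, and identity value are all invented for the example.

```python
import ipaddress
import uuid
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone

# Hypothetical allowlist: a private range plus a documentation-only range.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("203.0.113.0/24"),
]


def ip_allowed(client_ip: str) -> bool:
    """True if the client address falls inside any allowlisted network."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in ALLOWED_NETWORKS)


def _utc_now() -> str:
    return datetime.now(timezone.utc).isoformat()


@dataclass
class AuditRecord:
    """Fields mirror the audit logging row above; the names are assumptions."""
    identity: str
    tool: str
    parameters: dict
    status: str
    session_id: str
    timestamp: str = field(default_factory=_utc_now)


session_id = str(uuid.uuid4())  # one unique id per MCP session
record = AuditRecord(
    identity="svc-ide-agent",
    tool="get_error_groups",
    parameters={"service": "order-service"},
    status="ok",
    session_id=session_id,
)
print(ip_allowed("10.1.2.3"), asdict(record)["tool"])
```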

Monitoring MCP Usage

An AI layer that calls your observability APIs at scale is itself a system worth monitoring. Atatus emits telemetry on MCP Server activity so you can observe how your AI agents are using production data.

Metric	Description	Alert When
mcp.tool.calls	Total tool invocations grouped by tool name and authenticated identity.	Unexpected spike in tool call volume.
mcp.session.count	Tracks active and historical MCP sessions.	Unusual increase in session creation rate.
mcp.response.latency	Measures tool response latency (P50, P95, and P99).	P99 latency exceeds 5 seconds for critical tools.
mcp.auth.failures	Authentication failures grouped by credential or identity.	Any sustained authentication failure rate.
mcp.errors	Counts tool-level error responses.	Error rate exceeds 5%.
mcp.tokens.consumed	Approximate context tokens returned per MCP session.	Token usage approaches model context limits.
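
The two numeric thresholds in the table translate directly into a check like the one below. The function signature and input values are hypothetical; in practice you would configure these as alert rules in Atatus rather than write them by hand.

```python
ERROR_RATE_LIMIT = 0.05     # "Error rate exceeds 5%"
P99_LATENCY_LIMIT_S = 5.0   # "P99 latency exceeds 5 seconds"


def evaluate_mcp_alerts(total_calls: int, error_calls: int, p99_latency_s: float) -> list:
    """Return the alert conditions from the table that currently hold."""
    alerts = []
    if total_calls > 0 and error_calls / total_calls > ERROR_RATE_LIMIT:
        alerts.append("mcp.errors: error rate exceeds 5%")
    if p99_latency_s > P99_LATENCY_LIMIT_S:
        alerts.append("mcp.response.latency: P99 exceeds 5 seconds")
    return alerts


# Hypothetical window: 1000 tool calls, 80 errors, P99 of 2.1 seconds.
print(evaluate_mcp_alerts(1000, 80, 2.1))
```

Guarding on `total_calls > 0` avoids a division by zero during quiet periods with no MCP traffic.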

Performance Optimization

AI context windows are finite. Returning 10 MB of raw logs to an LLM is counterproductive. Atatus MCP Server is designed to return what the agent needs and nothing more.

Response truncation - all log and trace responses are capped at configurable limits (default: 100 log lines, 50 spans) to prevent context window overflow.
Pagination - tools that return large datasets support cursor-based pagination so agents can request additional pages if needed.
Intelligent field selection - responses include only the fields most relevant to the query type. Raw log objects are stripped to essential fields (timestamp, level, message, service, trace ID).
Server-side filtering - filtering happens at the API level, not in the AI layer. Time ranges, service filters, and severity filters reduce data before it enters the context window.
Result caching - repeated identical queries within a session window are served from cache, reducing API load and improving response time.
Rate limiting - per-credential rate limits prevent AI agents from inadvertently overwhelming the Atatus API tier during recursive investigation loops.
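
Truncation, field selection, and cursor-based pagination fit together naturally, as this sketch shows. It is a simplified illustration under our own assumptions: the cursor here is just a stringified offset (real cursors are usually opaque tokens), and the response keys `logs` and `next_cursor` are invented names.

```python
from typing import Optional

# Essential fields and default cap as described in the list above.
ESSENTIAL_FIELDS = ("timestamp", "level", "message", "service", "trace_id")
DEFAULT_LOG_LIMIT = 100


def slim(record: dict) -> dict:
    """Keep only the essential fields of a raw log record."""
    return {key: record[key] for key in ESSENTIAL_FIELDS if key in record}


def page_logs(records: list, cursor: Optional[str] = None,
              limit: int = DEFAULT_LOG_LIMIT) -> dict:
    """Return one capped page of slimmed records plus a cursor for the next page."""
    start = int(cursor) if cursor is not None else 0
    chunk = records[start:start + limit]
    end = start + len(chunk)
    next_cursor = str(end) if end < len(records) else None
    return {"logs": [slim(r) for r in chunk], "next_cursor": next_cursor}


# 250 made-up raw records carrying extra fields the agent does not need.
raw = [
    {"timestamp": i, "level": "error", "message": f"msg {i}",
     "service": "payments", "trace_id": f"t{i}", "host": "h1", "pod": "p1"}
    for i in range(250)
]

first = page_logs(raw)
second = page_logs(raw, first["next_cursor"])
third = page_logs(raw, second["next_cursor"])
print(len(first["logs"]), len(second["logs"]), len(third["logs"]))  # 100 100 50
```

The agent only pays the context cost of later pages if the first page did not answer the question.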

Best Practices

Credential Management

Create one service account per AI agent type (IDE agent, CI agent, on-call bot). Never share credentials across agents.
Assign the minimum tool permissions required. An IDE debugging agent rarely needs alert management tools.
Rotate API keys on a 90-day schedule. Store them in environment variables or secrets managers, never in dotfiles committed to version control.
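
A small helper can enforce both habits: load the key from the environment and flag keys that are past the 90-day mark. The variable name `ATATUS_API_KEY` is a hypothetical choice for this sketch, and tracking the key's creation date is left to you or your secrets manager.

```python
import os
from datetime import date, timedelta

ROTATION_PERIOD = timedelta(days=90)


def load_api_key(env_var: str = "ATATUS_API_KEY", environ=None) -> str:
    """Read the API key from the environment and fail loudly if it is missing."""
    source = os.environ if environ is None else environ
    key = source.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"{env_var} is not set; export it or use a secrets manager")
    return key


def rotation_due(created_on: date, today: date) -> bool:
    """True once the key is at least 90 days old."""
    return today - created_on >= ROTATION_PERIOD


# Usage with an explicit mapping so the example does not depend on your shell.
key = load_api_key(environ={"ATATUS_API_KEY": " example-key "})
print(key, rotation_due(date(2026, 1, 1), date(2026, 4, 1)))
```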

Observability Data Hygiene

Use consistent service naming across your stack. An AI agent correlating logs to traces to metrics relies on service names matching across all three data sources.
Tag your deployments in Atatus. The compare_deployments tool is most useful when deployment events are accurately tracked.
Ensure your OpenTelemetry instrumentation is complete. Sparse tracing coverage limits the AI agent's ability to identify root causes across service boundaries.
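
Naming drift is easy to detect once you have the list of service names seen by each signal. The sketch below compares them and reports what each signal is missing; the input names are invented, and how you export those lists from your own stack is outside its scope.

```python
def missing_by_signal(names_by_signal: dict) -> dict:
    """For each signal, list service names seen elsewhere but not in that signal."""
    all_names = set().union(*names_by_signal.values())
    report = {}
    for signal, names in names_by_signal.items():
        missing = sorted(all_names - set(names))
        if missing:
            report[signal] = missing
    return report


# Made-up example: traces use an underscore where logs and metrics use a hyphen.
seen = {
    "logs": {"payments", "order-service"},
    "traces": {"payments", "order_service"},
    "metrics": {"payments", "order-service"},
}

print(missing_by_signal(seen))
```

A report like this usually points at a single misconfigured agent or SDK setting rather than a real missing service.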

Prompt Engineering

Specify time ranges explicitly: "in the last 30 minutes" is more useful than "recently."
Name specific services when you know them: "in the payments service" focuses the tool call and returns more relevant results.
Break complex investigations into steps: first identify affected services, then drill into specific errors, then find root causes in traces.
Ask for comparisons: "compare before and after deployment X" triggers the compare_deployments tool, which is purpose-built for regression detection.

Frequently Asked Questions

1) How is it different from calling the Atatus API directly?
Direct API calls require custom integration code for every AI tool you use. MCP is a universal protocol: any MCP-compatible client connects to Atatus MCP Server without bespoke glue code. The server also handles context optimization, response truncation, and token-efficient formatting automatically. With raw API access, you would need to build all of that yourself.

2) Which AI clients are supported?
Claude Code, Claude Desktop, Cursor, OpenAI Codex, VS Code (with the MCP extension), Windsurf, Continue.dev, and any other client that implements the MCP specification. If a client supports MCP, it supports Atatus MCP Server.

3) Is Atatus MCP Server secure for production use?
Yes. Every request is authenticated via OAuth 2.0 or API key before reaching Atatus APIs. RBAC limits which tools a given credential can call. All connections use TLS. The server is read-only by design: no tool can modify Atatus data. Full audit logs are maintained for every tool call.

4) Can multiple developers use Atatus MCP Server simultaneously?
Yes. Atatus MCP Server is multi-tenant and designed for concurrent access. Each developer or agent maintains an independent authenticated session. Sessions are isolated and audited separately.

5) How is MCP usage monitored?
Atatus emits telemetry on MCP tool calls, session counts, response latency, authentication failures, and token consumption. You can build Atatus dashboards on top of this data and set alerts on unusual activity patterns.

6) What Atatus products are accessible via MCP?
APM (traces, endpoint metrics), Logs, Infrastructure Monitoring (hosts, Kubernetes, containers), Error Tracking, Alerts, Dashboards, Synthetic Monitoring status, and Incidents. Coverage expands as we add new tools.

7) How do I get started?
Start a free 14-day Atatus trial. Once your account is active, navigate to Settings → API Keys, generate a new key, and add the MCP server configuration to your preferred AI client. See the full setup documentation for details.

Conclusion

AI coding assistants are no longer just code-generation tools; they are becoming the primary interface through which engineers interact with their systems. The missing piece has always been production context.

Atatus MCP Server closes that gap. With a single configuration block in your AI client, Claude, Cursor, or VS Code gains access to your full observability stack (logs, metrics, distributed traces, errors, infrastructure, alerts, and incidents) through a secure, read-only, enterprise-grade interface.

Your team stops switching tabs to investigate incidents. Your on-call engineers get answers faster. Your AI agents finally have the context they need to be genuinely useful in production.

Get Started

Connect your preferred AI assistant to Atatus in under 5 minutes.

Start Free Trial →

I write about APM and observability, sharing practical insights to help engineering, platform, and SRE teams evaluate and adopt monitoring tools.

Chennai, Tamilnadu