Introducing Atatus MCP Server: Connect AI Agents to Your Observability Data

AI coding assistants like Claude, Cursor, Codex, and GitHub Copilot have become standard tools in the modern engineering workflow. Developers use them to write code, generate tests, and review pull requests. But when something breaks in production, these assistants hit a wall: they have no access to your actual system state.

They can reason about logs, traces, and metrics. They just can't see yours.

This is the problem Atatus MCP Server solves. Using the Model Context Protocol (MCP), an open standard for connecting AI agents to external data sources, Atatus MCP Server gives AI assistants secure, structured access to your observability platform. Instead of switching between dashboards, your AI agent can query production telemetry directly and give you answers grounded in real data.

What Is Atatus MCP Server?

Atatus MCP Server is a remote server that implements the Model Context Protocol, exposing Atatus observability data as structured tools that any MCP-compatible AI client can call.

When you connect Claude Code or Cursor to Atatus MCP Server, the AI assistant gains the ability to query your logs, search distributed traces, fetch infrastructure metrics, read active alerts, and retrieve error groups within a single conversation context, without manual navigation of the Atatus UI.

Instead of:

  1. Opening a browser tab
  2. Navigating to the right service
  3. Adjusting the time range
  4. Finding the trace
  5. Copying the error to your IDE

You ask your AI agent directly:

Example Prompt:
"Why did the checkout API latency spike after the 3pm deployment? Show me the slow traces and any related errors."

The agent calls the relevant Atatus MCP tools, synthesizes the data, and surfaces the root cause with spans, stack traces, and query context directly in your editor.
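
To make "the agent calls the relevant Atatus MCP tools" concrete, here is a minimal sketch of the kind of message an MCP client sends. MCP is built on JSON-RPC 2.0, and tool invocations use the `tools/call` method. The argument names (`service`, `from`, `limit`) and the service name are assumptions for illustration; the actual parameter schema is whatever the server advertises in its tool list.

```python
import json
from itertools import count

# Monotonic JSON-RPC request ids for this client session.
_ids = count(1)


def build_tool_call(tool_name: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical arguments: the real schema comes from the server's tool list.
request = build_tool_call(
    "get_slow_traces",
    {"service": "checkout-api", "from": "2024-01-01T15:00:00Z", "limit": 10},
)
print(json.dumps(request, indent=2))
```

In practice the AI client builds and sends these messages for you; the sketch only shows what travels over the wire.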

Why Developers Need It

The biggest bottleneck in incident response is not intelligence; it is context. Your team understands distributed systems. Your AI assistant can reason about them. The gap is that AI agents have no access to what is actually happening in your production environment.

Here is what that costs in practice:

| Problem | Without Atatus MCP | With Atatus MCP Server |
|---|---|---|
| Context switching | Toggle between IDE, dashboards, and logs | Query observability data directly from your editor |
| Incident investigation | Manually correlate traces, logs, and errors | AI correlates signals automatically across all data sources |
| Onboarding | New engineers learn dashboards before they can debug | Ask questions in natural language from day one |
| Alert triage | Open Atatus, find the alert, and click through to the relevant context | "Summarize today's alerts and their status" in one prompt |
| Post-mortems | Reconstruct timelines manually across multiple tools | AI builds the incident timeline automatically from logs and trace data |

How It Works

The architecture is straightforward. Your AI client communicates with Atatus MCP Server using the MCP protocol. The server authenticates the request, calls the relevant Atatus APIs, and returns structured data the AI can reason about.

Figure: How Atatus MCP Server works

Component Breakdown

  • MCP Client - the AI assistant running inside Claude Code, Cursor, or VS Code. It discovers available tools by requesting the tool manifest from Atatus MCP Server and calls them as needed during a conversation.
  • Atatus MCP Server - implements the MCP server specification. It handles tool routing, request validation, authentication, response formatting, and result truncation to fit AI context windows. Runs as a remote SSE endpoint for cloud deployments.
  • Authentication layer - validates every request before it reaches Atatus APIs. Supports OAuth 2.0 tokens and API keys. RBAC policies determine which tools a given credential can call.
  • Atatus Platform APIs - the underlying REST APIs powering the Atatus observability platform. MCP Server sits in front of them, translating natural-language tool calls into structured API requests and returning formatted responses.
Transport:
Atatus MCP Server supports both stdio (for local process-based clients) and Server-Sent Events (SSE) (for remote cloud-hosted connections). Most production deployments use SSE with the remote endpoint URL.
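
As a rough illustration of the remote SSE setup, the sketch below builds a client configuration entry in the `mcpServers` shape that several MCP clients use. Everything specific here is an assumption: the server name `atatus`, the placeholder URL, the `Authorization: Bearer` header scheme, and the exact key names, which vary by client. Check your client's documentation and the Atatus setup guide for the real values.

```python
import json
import os


def build_mcp_config(server_url: str, api_key: str) -> dict:
    """Return a client config entry for a remote MCP server.

    The `mcpServers` / `url` / `headers` layout is an assumed shape;
    individual clients may expect different keys.
    """
    return {
        "mcpServers": {
            "atatus": {  # hypothetical server name
                "url": server_url,
                "headers": {"Authorization": f"Bearer {api_key}"},  # assumed scheme
            }
        }
    }


# Placeholder endpoint; read the key from the environment rather than hardcoding it.
config = build_mcp_config(
    "https://mcp.example.invalid/sse",
    os.environ.get("ATATUS_API_KEY", "<your-api-key>"),
)
print(json.dumps(config, indent=2))
```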

Supported AI Clients

| AI Client | Connection Type | MCP Support | Notes |
|---|---|---|---|
| Claude Code | SSE / stdio | ✅ Native | Best-in-class MCP support; recommended for production. |
| Claude Desktop | stdio | ✅ Native | Full tool calling; configure via claude_desktop_config.json. |
| Cursor | SSE | ✅ Native | Add via Cursor Settings → MCP Servers. |
| OpenAI Codex | SSE | ✅ Supported | MCP tool use via Codex function-calling bridge. |
| VS Code | SSE | ✅ Extension | Requires the MCP extension from the VS Code Marketplace. |
| Windsurf | SSE | ✅ Native | Configure in the Windsurf MCP settings panel. |
| Continue.dev | stdio / SSE | ✅ Native | Add as a context provider in config.json. |
| Any MCP client | SSE / stdio | ✅ Compatible | Works with any client implementing the MCP specification. |

Authentication & Security

Every request to Atatus MCP Server is authenticated before it reaches your data. We support two authentication methods and enforce RBAC at the tool level.

Authentication Methods

| Method | Recommended For | Setup |
|---|---|---|
| API Key | Individual developer setups, CI/CD | Generate in Atatus → Settings → API Keys |
| OAuth 2.0 | Enterprise multi-user deployments | Configure an OAuth application in Atatus → Settings → OAuth |

RBAC & Permissions

Create dedicated service accounts for your AI agents. Assign each account only the tool categories it needs. A CI-based Codex agent checking deployment status does not need access to raw log search.

| Role | Tool Access | Use Case |
|---|---|---|
| Read-only Viewer | All read tools, no mutations | Developer debugging in IDE |
| APM Reader | Traces, errors, and APM metrics only | Backend service teams |
| Infra Reader | Infrastructure, Kubernetes, and host metrics | Platform engineering teams |
| Incident Responder | Alerts, incidents, logs, and traces | On-call SRE workflows |
| Full Read Access | All available tools | Engineering managers and staff engineers |
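
Tool-level RBAC boils down to a lookup: which category does a tool belong to, and does the caller's role include that category? The sketch below shows one way such a check could look. The category names and the tool-to-category mapping are assumptions derived loosely from the roles above, not the actual Atatus policy model.

```python
# Hypothetical mapping of tools to capability categories.
TOOL_CATEGORIES = {
    "search_logs": "logs",
    "get_slow_traces": "apm",
    "get_error_groups": "errors",
    "get_host_metrics": "infrastructure",
    "get_kubernetes_pods": "infrastructure",
    "get_active_alerts": "alerts",
    "get_incidents": "alerts",
}

# Hypothetical role definitions, loosely based on the roles described above.
ROLE_CATEGORIES = {
    "APM Reader": {"apm", "errors"},
    "Infra Reader": {"infrastructure"},
    "Incident Responder": {"alerts", "logs", "apm"},
}


def is_allowed(role: str, tool: str) -> bool:
    """Deny by default: unknown roles and unknown tools are rejected."""
    category = TOOL_CATEGORIES.get(tool)
    return category is not None and category in ROLE_CATEGORIES.get(role, set())


print(is_allowed("APM Reader", "get_slow_traces"))  # allowed: "apm" is in the role
print(is_allowed("APM Reader", "search_logs"))      # denied: "logs" is not in the role
```

Denying by default means a newly added tool stays inaccessible until someone explicitly assigns it to a role, which fits the least-privilege advice above.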

Available MCP Tools

Atatus MCP Server exposes tools across seven capability domains. Each tool accepts structured parameters and returns formatted data sized for AI context windows.

Logs

| Tool | Description | Example Prompt |
|---|---|---|
| search_logs | Full-text search across log streams with optional service, severity, and time filters. | "Search logs for 'connection refused' in the payments service in the last 30 minutes." |
| get_log_patterns | Cluster log messages by pattern to automatically surface recurring errors. | "What are the most common error patterns in the auth service today?" |
| get_log_volume | Return log volume over a time range, optionally grouped by service or severity. | "Show me the error log volume for the API gateway over the last 6 hours." |
| tail_logs | Return the most recent log lines for a given service or query. | "Show me the last 50 log lines from the order-service." |

APM & Distributed Tracing

| Tool | Description | Example Prompt |
|---|---|---|
| get_slow_traces | Return the slowest traces for a service and time window, including span breakdown. | "Find the slowest checkout API traces from the last hour." |
| get_trace_detail | Fetch a complete trace with spans, service graph, and timing for a specific trace ID. | "Show me the full trace for trace ID abc123." |
| get_endpoint_metrics | Return P50, P95, P99 latency, request rate, and error rate for an endpoint. | "What is the P99 latency for POST /api/v2/orders over the last 24 hours?" |
| find_slow_db_queries | List the slowest database queries observed during a selected time window. | "Find slow database queries in the last hour." |
| compare_deployments | Compare APM metrics before and after a deployment event. | "Compare API performance before and after the 3 PM deployment." |

Infrastructure

| Tool | Description | Example Prompt |
|---|---|---|
| get_host_metrics | Retrieve CPU, memory, disk I/O, and network metrics for a specific host. | "Show CPU and memory usage for the db-prod-01 host." |
| get_kubernetes_pods | List Kubernetes pods with status, restart count, and resource usage. | "Which pods are restarting frequently in the payments namespace?" |
| get_container_metrics | Return container-level CPU, memory, and network usage metrics. | "Show me memory usage for the api-gateway containers." |
| list_services | Return all monitored services with their current health status. | "List all services and their current health status." |

Errors

| Tool | Description | Example Prompt |
|---|---|---|
| get_error_groups | Return grouped errors by type and frequency for a specific service. | "List the top errors in the order service from the last hour." |
| get_error_detail | Fetch the full stack trace, affected user count, and first/last seen timestamps for an error group. | "Show me the stack trace and impact of error group #4892." |
| get_error_trend | View error rate trends over time with deployment correlation. | "Did error rates increase after today's deployment?" |

Alerts & Incidents

| Tool | Description | Example Prompt |
|---|---|---|
| get_active_alerts | List all currently firing alerts with their severity and triggering metric. | "What alerts are currently firing?" |
| get_alert_history | Retrieve historical alert events for a specified time window. | "Show me all alerts that fired today." |
| get_incidents | Return recent incidents with their timeline and current status. | "Summarize all incidents from today." |
| get_incident_timeline | Retrieve the complete event timeline for a specific incident. | "Walk me through what happened during incident INC-442." |

Dashboards

| Tool | Description | Example Prompt |
|---|---|---|
| list_dashboards | List available dashboards by name and tag. | "What dashboards do we have for the payments team?" |
| get_dashboard_summary | Return current widget values and key metrics from a dashboard. | "Summarize the current state of the production overview dashboard." |

Example Workflows

These workflows represent real scenarios where Atatus MCP Server reduces mean-time-to-resolution by giving your AI agent the production context it needs.

Workflow 01 · Deployment Regression

"Why did latency spike after the 3pm deployment?"

The agent calls compare_deployments to diff APM metrics before/after, then get_slow_traces to find affected endpoints, then find_slow_db_queries to pinpoint the root cause — often a missing index or an N+1 query introduced in the release.
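
The three-step chain in this workflow can be sketched as plain code. In a real session the AI agent decides which tools to call and in what order; this deterministic version only illustrates the sequence. The `call_tool` callable, the argument names (`service`, `deployed_at`, `since`), and the stub responses are all hypothetical.

```python
from typing import Any, Callable

# A tool caller takes a tool name and an arguments dict and returns the tool result.
ToolCaller = Callable[[str, dict], Any]


def investigate_regression(call_tool: ToolCaller, service: str, deployed_at: str) -> dict:
    """Run the deployment-regression sequence and collect the raw results."""
    comparison = call_tool(
        "compare_deployments", {"service": service, "deployed_at": deployed_at}
    )
    slow_traces = call_tool("get_slow_traces", {"service": service, "since": deployed_at})
    slow_queries = call_tool(
        "find_slow_db_queries", {"service": service, "since": deployed_at}
    )
    return {
        "comparison": comparison,
        "slow_traces": slow_traces,
        "slow_queries": slow_queries,
    }


# Stub caller so the sketch runs without a server; it records the call order.
calls = []


def fake_call_tool(name: str, arguments: dict) -> dict:
    calls.append(name)
    return {"tool": name, "arguments": arguments}


report = investigate_regression(fake_call_tool, "checkout-api", "2024-01-01T15:00:00Z")
print(calls)
```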

Workflow 02 · Error Rate Triage

"Which services have the highest error rate right now?"

The agent calls list_services for health status, then get_error_groups on each affected service to surface the specific exceptions, their frequency, and first occurrence time relative to recent deploys.

Workflow 03 · Incident Summary

"Summarize all incidents from today with current status."

The agent calls get_incidents for the day's incident list, then get_incident_timeline on each open incident to build a structured summary with contributing signals, affected services, and resolution status.

Workflow 04 · Kubernetes Pod Health

"Which pods are restarting frequently and why?"

The agent calls get_kubernetes_pods to list restart counts, then get_container_metrics on affected pods to check for OOM conditions, then search_logs to find the exact error messages triggering the restarts.

Workflow 05 · Database Performance

"Find slow database queries affecting the order service."

The agent calls find_slow_db_queries filtered to the order service, retrieves the full query text and execution plans, then correlates with get_slow_traces to identify which API endpoints trigger them.

Workflow 06 · On-Call Handoff

"Give me a status report of the last 8 hours to prepare for handoff."

The agent compiles a structured report covering active alerts, resolved incidents, top error groups, services with degraded P99 latency, and any anomalous infrastructure metrics, in a format ready to paste into a Slack message or runbook.

Enterprise Features

Atatus MCP Server is designed for production engineering teams with compliance, auditability, and multi-tenant requirements.

| Feature | Description |
|---|---|
| Audit Logging | Every tool call is logged with the authenticated identity, tool name, parameters, and response status. Logs are retained according to your data retention policy. |
| RBAC Enforcement | Enforce tool-level permissions by allowing or denying individual tools or entire categories for each service account. |
| Multi-tenant Isolation | All requests are scoped to the authenticated account, preventing cross-tenant data access. |
| Encrypted Transport | All MCP connections use TLS 1.2 or later. Plaintext transport is never permitted. |
| Read-only Design | MCP tools provide read-only access. AI agents can observe data but cannot modify any Atatus resources. |
| Session Tracking | Each MCP session is assigned a unique ID, with session metadata available for auditing and security reviews. |
| IP Allowlisting | Restrict MCP Server access to approved IP ranges or VPN egress IP addresses for additional security. |
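
IP allowlisting is conceptually a CIDR membership test. The sketch below uses Python's standard `ipaddress` module to show the idea; the ranges are documentation and private example addresses, and how Atatus actually evaluates allowlists is not specified here.

```python
import ipaddress
from typing import Sequence


def is_ip_allowed(client_ip: str, allowed_ranges: Sequence[str]) -> bool:
    """Return True if client_ip falls inside any of the allowed CIDR ranges."""
    address = ipaddress.ip_address(client_ip)
    return any(address in ipaddress.ip_network(cidr) for cidr in allowed_ranges)


# Example ranges only: an internal VPN subnet and a single egress address.
ALLOWED = ["10.20.0.0/16", "203.0.113.7/32"]

for ip in ("10.20.5.9", "203.0.113.7", "198.51.100.1"):
    print(ip, is_ip_allowed(ip, ALLOWED))
```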

Monitoring MCP Usage

An AI layer that calls your observability APIs at scale is itself a system worth monitoring. Atatus emits telemetry on MCP Server activity so you can observe how your AI agents are using production data.

| Metric | Description | Alert When |
|---|---|---|
| mcp.tool.calls | Total tool invocations grouped by tool name and authenticated identity. | Unexpected spike in tool call volume. |
| mcp.session.count | Tracks active and historical MCP sessions. | Unusual increase in session creation rate. |
| mcp.response.latency | Measures tool response latency (P50, P95, and P99). | P99 latency exceeds 5 seconds for critical tools. |
| mcp.auth.failures | Authentication failures grouped by credential or identity. | Any sustained authentication failure rate. |
| mcp.errors | Counts tool-level error responses. | Error rate exceeds 5%. |
| mcp.tokens.consumed | Approximate context tokens returned per MCP session. | Token usage approaches model context limits. |
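
Two of these alert conditions are simple arithmetic: P99 latency above 5 seconds and an error rate above 5%. The sketch below evaluates both over a batch of samples using a nearest-rank percentile. The function names and the alert labels are illustrative assumptions; in practice you would configure these thresholds as Atatus alerts rather than compute them by hand.

```python
import math
from typing import List, Sequence

P99_LIMIT_SECONDS = 5.0
ERROR_RATE_LIMIT = 0.05


def percentile(values: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile: the smallest value with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("percentile() needs at least one sample")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def evaluate_mcp_health(latencies_s: Sequence[float], total_calls: int, error_calls: int) -> List[str]:
    """Return the names of the thresholds that were breached."""
    breached = []
    if percentile(latencies_s, 99) > P99_LIMIT_SECONDS:
        breached.append("p99_latency")
    if total_calls and error_calls / total_calls > ERROR_RATE_LIMIT:
        breached.append("error_rate")
    return breached


# 100 samples where the two slowest calls took 6 s and 7 s; 4 errors in 200 calls.
samples = [0.2] * 98 + [6.0, 7.0]
print(evaluate_mcp_health(samples, total_calls=200, error_calls=4))
```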

Performance Optimization

AI context windows are finite. Returning 10MB of raw logs to an LLM is counterproductive. Atatus MCP Server is designed to return exactly what the agent needs, nothing more.

  • Response truncation - all log and trace responses are capped at configurable limits (default: 100 log lines, 50 spans) to prevent context window overflow.
  • Pagination - tools that return large datasets support cursor-based pagination so agents can request additional pages if needed.
  • Intelligent field selection - responses include only the fields most relevant to the query type. Raw log objects are stripped to essential fields (timestamp, level, message, service, trace ID).
  • Server-side filtering - filtering happens at the API level, not in the AI layer. Time ranges, service filters, and severity filters reduce data before it enters the context window.
  • Result caching - repeated identical queries within a session window are served from cache, reducing API load and improving response time.
  • Rate limiting - per-credential rate limits prevent AI agents from inadvertently overwhelming the Atatus API tier during recursive investigation loops.
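
Truncation, field selection, and pagination from the list above can be combined in a few lines. This is a sketch under stated assumptions: the field names mirror the essential fields listed above, the limit of 100 matches the stated default, and the cursor is a plain integer offset for readability (real cursors are usually opaque tokens).

```python
from typing import List, Optional

# Essential fields named above; the exact key spelling is an assumption.
ESSENTIAL_FIELDS = ("timestamp", "level", "message", "service", "trace_id")
DEFAULT_LOG_LIMIT = 100


def page_logs(records: List[dict], cursor: int = 0, limit: int = DEFAULT_LOG_LIMIT) -> dict:
    """Return one capped page of logs, stripped to essential fields."""
    page = records[cursor:cursor + limit]
    trimmed = [{k: r[k] for k in ESSENTIAL_FIELDS if k in r} for r in page]
    next_cursor: Optional[int] = cursor + limit if cursor + limit < len(records) else None
    return {"logs": trimmed, "next_cursor": next_cursor}


# 250 synthetic records, each carrying an extra field that should be dropped.
records = [
    {"timestamp": i, "level": "error", "message": f"msg {i}", "service": "api",
     "trace_id": f"t{i}", "host": "node-1"}
    for i in range(250)
]

first = page_logs(records)
print(len(first["logs"]), first["next_cursor"])
```

An agent that needs more data passes `next_cursor` back in; a `None` cursor signals the final page.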

Best Practices

Credential Management

  • Create one service account per AI agent type (IDE agent, CI agent, on-call bot). Never share credentials across agents.
  • Assign the minimum tool permissions required. An IDE debugging agent rarely needs alert management tools.
  • Rotate API keys on a 90-day schedule. Store them in environment variables or secrets managers, never in dotfiles committed to version control.
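
The credential advice above can be sketched as two small helpers: one that reads the key from the environment and fails loudly if it is missing, and one that flags keys older than 90 days. The variable name `ATATUS_API_KEY` is an assumption; use whatever name your tooling expects.

```python
import os
from datetime import date, timedelta
from typing import Mapping, Optional

ROTATION_PERIOD = timedelta(days=90)


def load_api_key(env_var: str = "ATATUS_API_KEY",
                 environ: Optional[Mapping[str, str]] = None) -> str:
    """Read the API key from the environment instead of a committed file."""
    source = os.environ if environ is None else environ
    key = source.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"{env_var} is not set; export it or load it from a secrets manager")
    return key


def rotation_due(created_on: date, today: date) -> bool:
    """True once the key has reached the 90-day rotation age."""
    return today - created_on >= ROTATION_PERIOD


# Demo with an explicit mapping so the sketch does not depend on the real environment.
print(load_api_key(environ={"ATATUS_API_KEY": "example-key"}))
print(rotation_due(date(2024, 1, 1), date(2024, 3, 31)))
```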

Observability Data Hygiene

  • Use consistent service naming across your stack. An AI agent correlating logs to traces to metrics relies on service names matching across all three data sources.
  • Tag your deployments in Atatus. The compare_deployments tool is most useful when deployment events are accurately tracked.
  • Ensure your OpenTelemetry instrumentation is complete. Sparse tracing coverage limits the AI agent's ability to identify root causes across service boundaries.

Prompt Engineering

  • Specify time ranges explicitly: "in the last 30 minutes" is more useful than "recently."
  • Name specific services when you know them: "in the payments service" focuses the tool call and returns more relevant results.
  • Break complex investigations into steps: first identify affected services, then drill into specific errors, then find root causes in traces.
  • Ask for comparisons: "compare before and after deployment X" triggers the compare_deployments tool, which is purpose-built for regression detection.

Frequently Asked Questions


1) How is it different from calling the Atatus API directly?
Direct API calls require custom integration code for every AI tool you use. MCP is a universal protocol: any MCP-compatible client connects to Atatus MCP Server without bespoke glue code. The server also handles context optimization, response truncation, and token-efficient formatting automatically, things you would need to build yourself with raw API access.

2) Which AI clients are supported?
Claude Code, Claude Desktop, Cursor, OpenAI Codex, VS Code (with the MCP extension), Windsurf, Continue.dev, and any other client that implements the MCP specification. If a client supports MCP, it supports Atatus MCP Server.

3) Is Atatus MCP Server secure for production use?
Yes. Every request is authenticated via OAuth 2.0 or API key before reaching Atatus APIs. RBAC limits which tools a given credential can call. All connections use TLS. The server is read-only by design: no tool can modify Atatus data. Full audit logs are maintained for every tool call.

4) Can multiple developers use Atatus MCP Server simultaneously?
Yes. Atatus MCP Server is multi-tenant and designed for concurrent access. Each developer or agent maintains an independent authenticated session. Sessions are isolated and audited separately.

5) How is MCP usage monitored?
Atatus emits telemetry on MCP tool calls, session counts, response latency, authentication failures, and token consumption. You can build Atatus dashboards on top of this data and set alerts on unusual activity patterns.

6) What Atatus products are accessible via MCP?
APM (traces, endpoint metrics), Logs, Infrastructure Monitoring (hosts, Kubernetes, containers), Error Tracking, Alerts, Dashboards, Synthetic Monitoring status, and Incidents. Coverage expands as we add new tools.

7) How do I get started?
Start a free 14-day Atatus trial. Once your account is active, navigate to Settings → API Keys, generate a new key, and add the MCP server configuration to your preferred AI client. See the full setup documentation for details.

Conclusion

AI coding assistants are no longer just code-generation tools; they are becoming the primary interface through which engineers interact with their systems. The missing piece has always been production context.

Atatus MCP Server closes that gap. With a single configuration block in your AI client, Claude, Cursor, or VS Code gains access to your full observability stack (logs, metrics, distributed traces, errors, infrastructure, alerts, and incidents) through a secure, read-only, enterprise-grade interface.

Your team stops switching tabs to investigate incidents. Your on-call engineers get answers faster. Your AI agents finally have the context they need to be genuinely useful in production.

Get Started

Connect your preferred AI assistant to Atatus in under 5 minutes.

Start Free Trial →

Mohana Ayeswariya J

I write about APM and observability, sharing practical insights to help engineering, platform, and SRE teams evaluate and adopt monitoring tools.
Chennai, Tamilnadu