Sensitive Data Classifier

Stop Sensitive Data Leaks
Before They Happen

Discover, classify, and redact PII, credentials, and regulated data across logs, traces, RUM events, and cloud storage in real time, at petabyte scale.

30+

Production-Ready Detection Rules

<5ms

Scan Latency at Ingestion

100%

Stream Coverage

Faster Incident Resolution

How It Works

Real-Time Scanning at the Point of Ingestion

Data is scanned and mutated in-flight, before it ever persists to storage. Zero re-processing. Zero retroactive exposure risk.

Scanning Scope, Tag-Based Filtering, Selective Data Monitoring, Regex + Checksum Validation, False Positive Reduction, Sensitive Data Redaction, Data Masking Controls, Security Event Enrichment

1

Define Scanning Scope

Use tag-based filters to target specific services, environments, or log sources. Apply rules selectively to logs, APM spans, RUM events, and pipeline data.
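
To illustrate how tag-based scoping might behave, here is a minimal sketch. The `ScanningGroup` class, its field names, and the event shape are hypothetical and chosen only for illustration; they are not the product's actual configuration schema.

```python
from dataclasses import dataclass, field


@dataclass
class ScanningGroup:
    """Hypothetical scanning group: a name, required tags, and data sources."""
    name: str
    required_tags: dict
    sources: set = field(default_factory=lambda: {"logs", "apm", "rum"})

    def in_scope(self, event: dict) -> bool:
        # The event must come from a covered source...
        if event.get("source") not in self.sources:
            return False
        # ...and carry every required tag with the expected value.
        tags = event.get("tags", {})
        return all(tags.get(k) == v for k, v in self.required_tags.items())


# Assumed example: scan only production payment-service logs and spans.
group = ScanningGroup(
    name="prod-payments",
    required_tags={"env": "prod", "service": "payments"},
    sources={"logs", "apm"},
)
event = {
    "source": "logs",
    "tags": {"env": "prod", "service": "payments", "host": "web-1"},
    "message": "charge accepted",
}
print(group.in_scope(event))
```

Events that fail the scope check would simply bypass the rules attached to that group, which keeps scanning cost proportional to the data that actually needs it.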

2

Match via Regex + Validation

Rules run against each event using regex patterns combined with checksum validators (e.g. the Luhn algorithm for credit cards) to minimize false positives.
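
The regex-plus-checksum idea can be sketched as follows. The candidate pattern and the function names are assumptions made for this example, not the shipped rule; only the Luhn check itself is a standard algorithm.

```python
import re

# Assumed candidate pattern: 13 to 19 digits, optionally separated by spaces or hyphens.
CARD_CANDIDATE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:      # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_card_numbers(text: str) -> list:
    """Return candidates that match the pattern AND pass the checksum."""
    hits = []
    for match in CARD_CANDIDATE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            hits.append(digits)
    return hits


print(find_card_numbers("card=4111 1111 1111 1111 ok"))
print(find_card_numbers("order id 1234567890123456"))
```

The second call matches the regex but fails the checksum, which is the kind of false positive the validator is meant to filter out.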

3

Redact, Hash, or Tag

Matched values are fully redacted, hashed (SHA-256), or partially masked. The original value is never stored. Actions are configurable per rule.
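
The three actions can be sketched with the standard library. The function names, the `[REDACTED]` placeholder, and the default of keeping the last four characters are assumptions for illustration only.

```python
import hashlib


def redact(value: str, placeholder: str = "[REDACTED]") -> str:
    """Full redaction: the matched value is replaced entirely."""
    return placeholder


def hash_value(value: str) -> str:
    """One-way SHA-256 hash; equal inputs give equal digests."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def partial_mask(value: str, keep_last: int = 4, mask_char: str = "*") -> str:
    """Mask everything except the last `keep_last` characters."""
    # Never reveal the whole value, even if keep_last is too large.
    keep = max(0, min(keep_last, len(value) - 1))
    return mask_char * (len(value) - keep) + value[len(value) - keep:]


# Hypothetical per-rule action table.
ACTIONS = {"redact": redact, "hash": hash_value, "mask": partial_mask}

print(ACTIONS["mask"]("4111111111111111"))
print(ACTIONS["redact"]("4111111111111111"))
```

One design note: an unsalted hash of a low-entropy value such as a card number can be brute-forced, so a keyed hash (for example HMAC) may be preferable in practice. Whether the product salts or keys its hashes is not stated here.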

4

Enrich & Route

Matched events are enriched with risk tags and an audit record, then routed as clean telemetry to indexes, dashboards, and alerts.

Detection Coverage

What Gets Detected Out of the Box?

Built-in rules cover the most common sensitive data types across financial, healthcare, identity, and infrastructure domains.

Standard Email Addresses

Detect exposed business and personal emails to prevent phishing, data harvesting, and account takeover risks

UK National Insurance Numbers

Identify leaked NI numbers with strict format validation to reduce identity fraud exposure
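
As an illustration of strict format validation, the sketch below encodes the published National Insurance number format: two prefix letters, six digits, and a suffix letter A to D, with certain letters and prefixes never issued. The regex and function name are this example's own and are not the product's built-in rule.

```python
import re

# Prefix rules: D, F, I, Q, U, V are never used in either position; O is never
# the second letter; BG, GB, KN, NK, NT, TN, and ZZ are not allocated.
NINO = re.compile(
    r"\b(?!BG|GB|KN|NK|NT|TN|ZZ)"
    r"[A-CEGHJ-PR-TW-Z][A-CEGHJ-NPR-TW-Z]"
    r" ?\d{2} ?\d{2} ?\d{2} ?[A-D]\b"
)


def find_ninos(text: str) -> list:
    """Return NI numbers found in text, normalised without spaces."""
    return [m.group().replace(" ", "") for m in NINO.finditer(text)]


print(find_ninos("employee AB 12 34 56 C joined"))
print(find_ninos("placeholder QQ123456C"))
```

The second input is rejected because `Q` is not a permitted prefix letter, which is how format rules cut down on matches against random alphanumeric IDs.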

US Passport Numbers

Flag sensitive passport identifiers to prevent credential abuse and compliance violations

Canadian Social Insurance Numbers

Detect high-risk SIN data with checksum validation to mitigate identity theft threats

US Vehicle Identification Numbers (VIN)

Identify exposed VINs that can be exploited for fraud, cloning, or unauthorized vehicle record access
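
One way to separate real VINs from arbitrary 17-character strings is the check digit in position 9 that North American VINs carry. The sketch below assumes that check is used as a validator; the text above does not say which validation the built-in rule applies, and the names here are illustrative.

```python
import re

# VINs use digits and capital letters except I, O, and Q.
VIN_CANDIDATE = re.compile(r"\b[A-HJ-NPR-Z0-9]{17}\b")

# Standard transliteration of letters to numbers for the check-digit calculation.
_LETTER_VALUES = dict(zip(
    "ABCDEFGHJKLMNPRSTUVWXYZ",
    [1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 4, 5, 7, 9, 2, 3, 4, 5, 6, 7, 8, 9],
))
# Position weights; position 9 (the check digit itself) has weight 0.
_WEIGHTS = [8, 7, 6, 5, 4, 3, 2, 10, 0, 9, 8, 7, 6, 5, 4, 3, 2]


def vin_check_digit_valid(vin: str) -> bool:
    """Return True if the 9th character matches the computed check digit."""
    vin = vin.upper()
    if not VIN_CANDIDATE.fullmatch(vin):
        return False
    total = sum(
        (int(ch) if ch.isdigit() else _LETTER_VALUES[ch]) * weight
        for ch, weight in zip(vin, _WEIGHTS)
    )
    remainder = total % 11
    expected = "X" if remainder == 10 else str(remainder)
    return vin[8] == expected


def find_vins(text: str) -> list:
    """Return 17-character candidates that also pass the check digit."""
    return [m.group() for m in VIN_CANDIDATE.finditer(text)
            if vin_check_digit_valid(m.group())]


print(find_vins("vehicle vin=1M8GDM9AXKP042788 registered"))
```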

Basic Authentication Credentials

Detect Base64-encoded Basic Auth headers to prevent hardcoded credential exposure
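
A Basic Auth rule could confirm a match by decoding the token and checking for the `user:password` shape. The following is a sketch under that assumption; the pattern, placeholder, and function name are hypothetical.

```python
import base64
import re

# Assumed pattern: the word "Basic" followed by a Base64 token.
BASIC_AUTH = re.compile(r"\bBasic\s+([A-Za-z0-9+/]{4,}={0,2})", re.IGNORECASE)
PLACEHOLDER = "[REDACTED]"


def _replace(match: re.Match) -> str:
    token = match.group(1)
    try:
        decoded = base64.b64decode(token, validate=True).decode("utf-8")
    except ValueError:
        # Not valid Base64 or not valid UTF-8: leave the text untouched.
        return match.group(0)
    if ":" not in decoded:
        # Decodes cleanly but is not in user:password form.
        return match.group(0)
    prefix = match.group(0)[: match.start(1) - match.start(0)]
    return prefix + PLACEHOLDER


def redact_basic_auth(text: str) -> str:
    """Redact Base64 credentials that decode to user:password."""
    return BASIC_AUTH.sub(_replace, text)


print(redact_basic_auth("Authorization: Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ=="))
print(redact_basic_auth("Basic authentication failed"))
```

The second line is left unchanged because the word after `Basic` does not decode as Base64 credentials.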

JSON Web Tokens (JWT)

Identify exposed bearer tokens and signed JWTs to stop session hijacking and API abuse
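
JWT detection can be sketched as a structural match followed by a decode check. The approach below, which accepts a token only if its header segment decodes to a JSON object containing `alg`, is an assumption for illustration and not a description of the built-in rule.

```python
import base64
import json
import re

# Three base64url segments; "eyJ" is how an encoded JSON object beginning {" starts.
JWT_CANDIDATE = re.compile(r"\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]*")


def _decode_segment(segment: str):
    """Decode a base64url segment to a dict, or return None on any failure."""
    padded = segment + "=" * (-len(segment) % 4)   # restore stripped padding
    try:
        value = json.loads(base64.urlsafe_b64decode(padded))
    except ValueError:
        return None
    return value if isinstance(value, dict) else None


def find_jwts(text: str) -> list:
    """Return candidates whose header decodes to JSON with an "alg" field."""
    found = []
    for match in JWT_CANDIDATE.finditer(text):
        header = _decode_segment(match.group().split(".")[0])
        if header is not None and "alg" in header:
            found.append(match.group())
    return found


def _encode_segment(obj: dict) -> str:
    """Helper for the demo below: base64url-encode a dict without padding."""
    raw = json.dumps(obj).encode("utf-8")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


# Demo token with a dummy signature; it is not a real credential.
demo = ".".join([
    _encode_segment({"alg": "HS256", "typ": "JWT"}),
    _encode_segment({"sub": "1234567890"}),
    "c2lnbmF0dXJl",
])
print(find_jwts("Authorization: Bearer " + demo) == [demo])
```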

Credit Cards & Banking Data

Detect payment card numbers, IBAN, routing numbers, and SWIFT codes to prevent financial data breaches and PCI risk
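
For IBANs, the standard mod-97 check is a natural secondary validator. The sketch below assumes it is applied after a pattern match; the function name and the format pattern are illustrative, and country-specific length checks are omitted.

```python
import re


def iban_valid(candidate: str) -> bool:
    """Validate an IBAN with the mod-97 check (country lengths not checked)."""
    iban = candidate.replace(" ", "").upper()
    # Country code, two check digits, then 11 to 30 alphanumerics (15 to 34 total).
    if not re.fullmatch(r"[A-Z]{2}\d{2}[A-Z0-9]{11,30}", iban):
        return False
    # Move the first four characters to the end, map A..Z to 10..35,
    # and check that the resulting integer is congruent to 1 modulo 97.
    rearranged = iban[4:] + iban[:4]
    numeric = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(numeric) % 97 == 1


print(iban_valid("GB82 WEST 1234 5698 7654 32"))
print(iban_valid("GB83 WEST 1234 5698 7654 32"))
```

Card numbers in the same rule family would use the Luhn check instead; routing numbers and SWIFT codes have their own formats that are not covered by this sketch.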

Capabilities

Everything You Need for Enterprise Data Security

Purpose-built for SecOps, platform engineering, and compliance teams running cloud-native stacks.

Automatic Discovery

Scan new hosts, containers, and services the moment they spin up. No manual rule assignment. New environments inherit scanning groups via tag selectors.

Zero-touch onboarding

30+ Validated Rules Library

Out-of-the-box rules for credit cards, SSNs, passport numbers, IBAN, API keys, OAuth tokens, AWS/GCP/Azure credentials, and more. Maintained and updated continuously.

Updated continuously

Custom Regex Rules

Define your own patterns using standard PCRE regex. Set match confidence thresholds, character preservation rules, and custom tag schemas for internal data types.

PCRE support

Multiple Redaction Strategies

Full redaction, one-way SHA-256 hashing (a hash cannot be reversed to recover the original value), partial masking with configurable character preservation, or replacement with a fixed string.

4 action types

Observability Pipelines Integration

Redact before data leaves your network entirely. Route clean telemetry to Atatus while keeping sensitive payloads out of any third-party destination.

On-prem redaction

Dashboards & Alerting

Tag-driven dashboards show sensitive data volume, match trends, and risk distribution. Alert on spikes in detections per service, rule, or risk tier in real time.

Metric-based alerts

Data Flow

From Telemetry Event to Safe Storage

Scanning happens in-stream. Sensitive values never reach indexes, long-term storage, or third-party integrations.

Source

Telemetry Ingestion

Logs, APM traces, RUM events, CI pipelines, and cloud storage all flow through a unified ingestion point.

PRE-STORAGE

Filter

Scope Evaluation

Tag-based query filters route events to the correct scanning group in <1ms.

TAG MATCH

Scan

Pattern + Validation

Regex engine runs in parallel with secondary validators (Luhn, format checks) for precision.

REAL-TIME

Action

Mutate & Enrich

Values are redacted / hashed. Risk tags are appended. Audit record created.

IMMUTABLE

Output

Safe Persistence

Clean, enriched events reach indexes and dashboards. Sensitive payload gone from the wire.
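
The stages above (filter, scan, mutate, enrich) can be sketched end to end in a few lines. Everything here is hypothetical: the `Rule` class, the `process` function, the tag and audit formats, and the simplified email pattern are assumptions for illustration, not the product's internals.

```python
import re
from dataclasses import dataclass


@dataclass
class Rule:
    """Hypothetical detection rule: a name and a compiled pattern."""
    name: str
    pattern: re.Pattern


# Deliberately simple email pattern for the demo.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def process(event: dict, scope: dict, rules: list):
    """Return (clean_event, audit_records) for one telemetry event."""
    tags = event.get("tags", {})
    # Filter: events outside the scanning scope pass through untouched.
    if any(tags.get(k) != v for k, v in scope.items()):
        return event, []
    message, risk_tags, audit = event["message"], [], []
    for rule in rules:
        # Scan + mutate: replace every match before anything is stored.
        message, count = rule.pattern.subn(f"[REDACTED:{rule.name}]", message)
        if count:
            # Enrich: attach a risk tag and record what was done.
            risk_tags.append(f"sensitive_data:{rule.name}")
            audit.append({"rule": rule.name, "matches": count, "action": "redact"})
    clean = {**event, "message": message, "risk_tags": risk_tags}
    return clean, audit


clean, audit = process(
    {"message": "login failed for bob@example.com", "tags": {"env": "prod"}},
    scope={"env": "prod"},
    rules=[Rule("email", EMAIL)],
)
print(clean["message"])
print(audit)
```

Only the `clean` event would go on to storage in this sketch; the audit record notes the rule and match count, never the matched value.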

COMPLIANT

Integrations

Scans Every Data Stream in Your Stack

No agents, no sidecars, no additional infrastructure. SDS plugs directly into the Atatus unified ingestion pipeline.

Log Management: all log sources, including agents, Lambda, syslog, HTTP endpoints, and cloud forwarders

APM / Distributed Tracing: Span tags, HTTP body attributes, SQL queries, gRPC request metadata

Real User Monitoring (RUM): Browser and mobile events, session data, error payloads

Social Proof

Trusted by Security & Platform Teams

We scanned over 50TB of existing log data and found production PII that had been silently leaking for months. The out-of-box rules saved us weeks of regex development.

Andrew Mitchell

Principal Security Engineer

The HIPAA rule pack was immediately actionable. We went from zero coverage to full PHI redaction in our production logs within a single afternoon. Compliance team signed off in days.

Melissa Reynolds

Head of Platform Engineering

Being able to combine SDS with restriction queries means our customer support team can use logs for debugging without ever seeing raw PII. That's a genuine architectural win.

Daniel Carter

Staff Infrastructure Engineer

Questions we get in almost every SDC demo