Amazon SNS Data Protection Policies: Block, Mask, or Log Sensitive Data at a 99% Sample Rate

These articles are AI-generated summaries. Please check the original sources for full details.

Data Protection in Amazon SNS

Amazon SNS data protection policies identify PII/PHI in messages using machine learning and pattern matching. An Audit policy (2025 example) samples 99% of messages and logs its findings to CloudWatch Logs.

Why This Matters

Event-driven architectures prioritize speed, but messages that carry sensitive data such as PII or PHI create compliance risk. Hand-built masking or detection pipelines are error-prone and costly. SNS Data Protection automates this with predefined data identifiers (e.g., email address, date of birth) and three operations: audit, de-identify, and deny. A 2022 study estimated data breach costs at $4.2M per incident, which makes automated controls valuable.

Key Insights

  • The Audit policy example (2025) sets a 99% sample rate for findings sent to CloudWatch Logs.
  • De-identify, which masks or redacts matched data, is the example operation for PII in healthcare apps.
  • Developers can use a Lambda function to test SNS policies, as in the code example.

Working Example

{
  "Description": "Audit sensitive data without blocking delivery",
  "Version": "2021-06-01",
  "Statement": [
    {
      "DataDirection": "Inbound",
      "DataIdentifier": [
        "arn:aws:dataprotection::aws:data-identifier/EmailAddress",
        "arn:aws:dataprotection::aws:data-identifier/DateOfBirth",
        "arn:aws:dataprotection::aws:data-identifier/CreditCardNumber"
      ],
      "Operation": {
        "Audit": {
          "FindingsDestination": {
            "CloudWatchLogs": {
              "LogGroup": "/aws/vendedlogs/sns-audit/"
            }
          },
          "SampleRate": "99"
        }
      },
      "Principal": ["*"],
      "Sid": "AuditSensitiveData"
    }
  ],
  "Name": "sns-audit-policy"
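}

The Lambda handler below publishes one test message to three topics, one per policy operation:

import json
import logging
import os

import boto3

sns = boto3.client('sns')
logger = logging.getLogger()
logger.setLevel(logging.INFO)

def lambda_handler(event, context):
    # Test payload with sample PII/PHI fields
    message = {
        "patientId": "PAT123456",
        "name": "John Doe",
        "dob": "12-01-2012",
        "diagnosis": "Flu"
    }
    # Names of the environment variables that hold the topic ARNs
    topics = ["AUDIT_TOPIC_ARN", "DEIDENTIFY_TOPIC_ARN", "DENY_TOPIC_ARN"]
    results = {}
    for topic_env in topics:
        topic_arn = os.environ.get(topic_env)
        if not topic_arn:
            # Avoid calling publish with TopicArn=None when the variable is unset
            logger.error(f"[{topic_env}] Environment variable is not set")
            results[topic_env] = {"status": "failed", "error": "Topic ARN not configured"}
            continue
        try:
            response = sns.publish(
                TopicArn=topic_arn,
                Message=json.dumps(message)
            )
            results[topic_env] = {
                "status": "success",
                "messageId": response.get("MessageId")
            }
            logger.info(f"Published to {topic_env}: {response.get('MessageId')}")
        except sns.exceptions.InvalidParameterException as e:
            # Treated here as the topic's policy rejecting the message
            logger.error(f"[{topic_env}] Sensitive data detected: {str(e)}")
            results[topic_env] = {"status": "failed", "error": "Sensitive data not allowed"}
        except Exception as e:
            # Any other failure (permissions, throttling, network) is reported as-is
            logger.error(f"[{topic_env}] Error: {str(e)}")
            results[topic_env] = {"status": "failed", "error": str(e)}
    return {"status": "completed", "results": results}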
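
Before the handler above can exercise a policy, the policy has to be attached to each topic. The following is a minimal sketch of one way to do that, assuming boto3's SNS client methods put_data_protection_policy and get_data_protection_policy. The helper names attach_policy and read_policy and the topic ARN in the comments are hypothetical, not taken from the original article. The client is passed in as an argument so the helpers can be tried against a stub without AWS credentials.

```python
import json


def attach_policy(sns_client, topic_arn, policy):
    """Attach a data protection policy (given as a dict) to an SNS topic.

    sns_client is assumed to be a boto3 SNS client, or any object that
    exposes the same put_data_protection_policy method.
    """
    # The API takes the policy document as a JSON string, not a dict
    sns_client.put_data_protection_policy(
        ResourceArn=topic_arn,
        DataProtectionPolicy=json.dumps(policy),
    )


def read_policy(sns_client, topic_arn):
    """Return the topic's current policy as a dict, or None if none is set."""
    response = sns_client.get_data_protection_policy(ResourceArn=topic_arn)
    raw = response.get("DataProtectionPolicy")
    return json.loads(raw) if raw else None


# Hypothetical usage with a real client (requires AWS credentials):
#   import boto3
#   sns = boto3.client("sns")
#   attach_policy(sns, "arn:aws:sns:us-east-1:123456789012:example-topic", policy)
```

Reading the policy back after attaching it is a cheap way to confirm that the topic carries the document you intended before running the publish test.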

Practical Applications

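  • Use Case: Healthcare apps use De-identify to mask patient dates of birth before messages reach analytics systems.
  • Pitfall: Audit only records findings and does not modify the message, so using it without De-identify or Deny still delivers the sensitive data to subscribers and to any logs they write.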
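
The use case and the pitfall above can be addressed together by putting an Audit statement and a De-identify statement in the same policy. The sketch below builds such a policy document as a Python dict. It assumes the Deidentify operation with a MaskConfig block, reuses the version string, log group, and DateOfBirth identifier from the working example, and uses a hypothetical policy name and hypothetical Sid values. Treat it as an illustration to adapt, not as the article's own policy.

```python
import json

# Prefix of the AWS managed data identifiers used in the working example
DATA_ID_PREFIX = "arn:aws:dataprotection::aws:data-identifier/"


def build_healthcare_policy(log_group="/aws/vendedlogs/sns-audit/", mask_char="#"):
    """Build a policy that audits and masks dates of birth on inbound messages."""
    identifiers = [DATA_ID_PREFIX + "DateOfBirth"]
    return {
        "Name": "sns-audit-and-mask-policy",  # hypothetical name
        "Description": "Audit and mask dates of birth",
        "Version": "2021-06-01",
        "Statement": [
            {
                # Records findings in CloudWatch Logs; does not change the message
                "Sid": "AuditDateOfBirth",  # hypothetical Sid
                "DataDirection": "Inbound",
                "Principal": ["*"],
                "DataIdentifier": list(identifiers),
                "Operation": {
                    "Audit": {
                        "SampleRate": "99",
                        "FindingsDestination": {
                            "CloudWatchLogs": {"LogGroup": log_group}
                        },
                    }
                },
            },
            {
                # Replaces each matched character with mask_char
                "Sid": "MaskDateOfBirth",  # hypothetical Sid
                "DataDirection": "Inbound",
                "Principal": ["*"],
                "DataIdentifier": list(identifiers),
                "Operation": {
                    "Deidentify": {"MaskConfig": {"MaskWithCharacter": mask_char}}
                },
            },
        ],
    }


print(json.dumps(build_healthcare_policy(), indent=2))
```

Keeping both statements in one policy means the audit trail shows how often dates of birth appear, while subscribers only ever receive the masked value.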
