Do not use for preventing data exfiltration (use S3 bucket policies, VPC endpoints, and SCPs), for data classification (use Amazon Macie discovery jobs), or for network-level exfiltration detection (use VPC Flow Logs with network analysis tools).
GetObject, PutObject, CopyObject)Configure CloudTrail to capture all S3 object-level operations for forensic analysis.
# Enable S3 data events on an existing trail
aws cloudtrail put-event-selectors \
--trail-name management-trail \
--event-selectors '[{
"ReadWriteType": "All",
"IncludeManagementEvents": true,
"DataResources": [{
"Type": "AWS::S3::Object",
"Values": ["arn:aws:s3:::sensitive-data-bucket/", "arn:aws:s3:::customer-records/"]
}]
}]'
# Verify data event configuration
aws cloudtrail get-event-selectors --trail-name management-trail \
--query 'EventSelectors[*].DataResources' --output json
# Enable GuardDuty S3 Protection
aws guardduty update-detector \
--detector-id $(aws guardduty list-detectors --query 'DetectorIds[0]' --output text) \
--data-sources '{"S3Logs":{"Enable":true}}'
Analyze CloudTrail logs for bulk download activity, unusual access times, and unfamiliar source IPs.
# Athena query: Top S3 downloaders by volume in last 24 hours
cat << 'EOF'
SELECT
useridentity.arn as principal,
sourceipaddress,
COUNT(*) as request_count,
SUM(CAST(json_extract_scalar(requestparameters, '$.bytesTransferredOut') AS bigint)) as bytes_downloaded
FROM cloudtrail_logs
WHERE eventname = 'GetObject'
AND eventsource = 's3.amazonaws.com'
AND eventtime > date_add('hour', -24, now())
GROUP BY useridentity.arn, sourceipaddress
ORDER BY request_count DESC
LIMIT 50
EOF
# CloudWatch Logs Insights: S3 GetObject requests from unusual IPs
aws logs start-query \
--log-group-name cloudtrail-logs \
--start-time $(date -d "24 hours ago" +%s) \
--end-time $(date +%s) \
--query-string '
fields @timestamp, userIdentity.arn, sourceIPAddress, requestParameters.bucketName, requestParameters.key
| filter eventName = "GetObject"
| stats count() as requestCount by sourceIPAddress, userIdentity.arn
| sort requestCount desc
| limit 25
'
# Detect cross-account copies (potential exfiltration)
aws logs start-query \
--log-group-name cloudtrail-logs \
--start-time $(date -d "7 days ago" +%s) \
--end-time $(date +%s) \
--query-string '
fields @timestamp, userIdentity.arn, sourceIPAddress, requestParameters.bucketName
| filter eventName in ["CopyObject", "ReplicateObject", "UploadPart"]
| filter userIdentity.accountId != "OUR_ACCOUNT_ID"
| sort @timestamp desc
| limit 100
'
Check for GuardDuty S3-specific finding types that indicate exfiltration activity.
# List active S3 exfiltration-related findings
aws guardduty list-findings \
--detector-id $(aws guardduty list-detectors --query 'DetectorIds[0]' --output text) \
--finding-criteria '{
"Criterion": {
"type": {
"Eq": [
"Exfiltration:S3/MaliciousIPCaller",
"Exfiltration:S3/ObjectRead.Unusual",
"Discovery:S3/MaliciousIPCaller.Custom",
"Discovery:S3/BucketEnumeration.Unusual",
"UnauthorizedAccess:S3/MaliciousIPCaller.Custom",
"UnauthorizedAccess:S3/TorIPCaller",
"Impact:S3/AnomalousBehavior.Delete"
]
}
}
}' --output json
# Get detailed finding information
aws guardduty get-findings \
--detector-id $(aws guardduty list-detectors --query 'DetectorIds[0]' --output text) \
--finding-ids FINDING_IDS \
--query 'Findings[*].{Type:Type,Severity:Severity,Resource:Resource.S3BucketDetails[0].Name,Action:Service.Action}' \
--output table
Review Macie findings to correlate data sensitivity with access anomalies.
# List Macie findings for sensitive data exposure
aws macie2 list-findings \
--finding-criteria '{
"criterion": {
"category": {"eq": ["CLASSIFICATION"]},
"severity.description": {"eq": ["High", "Critical"]}
}
}' \
--sort-criteria '{"attributeName": "updatedAt", "orderBy": "DESC"}' \
--max-results 25
# Get detailed finding with data classification
aws macie2 get-findings \
--finding-ids FINDING_IDS \
--query 'findings[*].{Type:type,Severity:severity.description,Bucket:resourcesAffected.s3Bucket.name,SensitiveDataTypes:classificationDetails.result.sensitiveData[*].category}' \
--output table
# Run a sensitive data discovery job on target bucket
aws macie2 create-classification-job \
--job-type ONE_TIME \
--name "exfiltration-investigation" \
--s3-job-definition '{
"bucketDefinitions": [{
"accountId": "ACCOUNT_ID",
"buckets": ["sensitive-data-bucket"]
}]
}'
Create CloudWatch alarms and EventBridge rules for real-time exfiltration detection.
# CloudWatch metric filter for high-volume S3 downloads
aws logs put-metric-filter \
--log-group-name cloudtrail-logs \
--filter-name s3-bulk-download \
--filter-pattern '{$.eventName = "GetObject" && $.eventSource = "s3.amazonaws.com"}' \
--metric-transformations '[{
"metricName": "S3GetObjectCount",
"metricNamespace": "SecurityMetrics",
"metricValue": "1",
"defaultValue": 0
}]'
# Alarm for anomalous download volume (>1000 objects/hour)
aws cloudwatch put-metric-alarm \
--alarm-name s3-exfiltration-alert \
--metric-name S3GetObjectCount \
--namespace SecurityMetrics \
--statistic Sum \
--period 3600 \
--threshold 1000 \
--comparison-operator GreaterThanThreshold \
--evaluation-periods 1 \
--alarm-actions arn:aws:sns:us-east-1:ACCOUNT:security-alerts
# EventBridge rule for GuardDuty S3 findings
aws events put-rule \
--name guardduty-s3-exfiltration \
--event-pattern '{
"source": ["aws.guardduty"],
"detail-type": ["GuardDuty Finding"],
"detail": {
"type": [{"prefix": "Exfiltration:S3/"}]
}
}'
Deploy bucket policies and VPC endpoint policies to restrict data movement paths.
# VPC endpoint policy restricting S3 access to specific buckets
aws ec2 modify-vpc-endpoint \
--vpc-endpoint-id vpce-ENDPOINT_ID \
--policy-document '{
"Statement": [{
"Sid": "RestrictToOwnBuckets",
"Effect": "Allow",
"Principal": "*",
"Action": ["s3:GetObject", "s3:PutObject"],
"Resource": ["arn:aws:s3:::approved-bucket-1/*", "arn:aws:s3:::approved-bucket-2/*"]
}]
}'
# Bucket policy denying access from outside the VPC
aws s3api put-bucket-policy --bucket sensitive-data-bucket --policy '{
"Version": "2012-10-17",
"Statement": [{
"Sid": "DenyNonVpcAccess",
"Effect": "Deny",
"Principal": "*",
"Action": "s3:GetObject",
"Resource": "arn:aws:s3:::sensitive-data-bucket/*",
"Condition": {
"StringNotEquals": {
"aws:sourceVpce": "vpce-ENDPOINT_ID"
}
}
}]
}'
| Term | Definition |
|---|---|
| S3 Data Events | CloudTrail object-level logging that captures GetObject, PutObject, DeleteObject, and CopyObject API calls with request details |
| GuardDuty S3 Protection | Threat detection feature analyzing CloudTrail S3 data events to identify anomalous access patterns and exfiltration attempts |
| Amazon Macie | Data security service that discovers and classifies sensitive data in S3 and generates findings for data exposure risks |
| VPC Endpoint Policy | Access control policy on an S3 VPC endpoint that restricts which buckets and actions can be accessed through the endpoint |
| Data Exfiltration | Unauthorized transfer of data from an organization's S3 storage to an external location controlled by an attacker |
| Anomalous Behavior Detection | Machine learning-based identification of S3 access patterns that deviate from established baselines for a principal |
Context: GuardDuty reports an Exfiltration:S3/ObjectRead.Unusual finding indicating that a developer's access key is downloading thousands of objects from a sensitive data bucket at 3 AM from an IP address in a foreign country.
Approach:
Pitfalls: CloudTrail S3 data events can generate massive log volume. Use Athena with partitioned tables rather than CloudWatch Logs Insights for queries spanning more than 24 hours. GuardDuty baseline learning requires 7-14 days, so new accounts may generate false positives for normal access patterns.
S3 Data Exfiltration Investigation Report
============================================
Account: 123456789012
Detection Source: GuardDuty Exfiltration:S3/ObjectRead.Unusual
Investigation Date: 2026-02-23
INCIDENT TIMELINE:
2026-02-23 02:47 UTC - First anomalous GetObject from 185.x.x.x
2026-02-23 02:47-04:12 UTC - 12,847 GetObject requests
2026-02-23 04:15 UTC - GuardDuty finding generated
2026-02-23 04:20 UTC - PagerDuty alert received by SOC
2026-02-23 04:25 UTC - Access key deactivated
COMPROMISED PRINCIPAL:
ARN: arn:aws:iam::123456789012:user/developer-jane
Access Key: AKIA...WXYZ
Source IP: 185.x.x.x (Tor exit node)
DATA IMPACT ASSESSMENT:
Buckets accessed: 3
Objects downloaded: 12,847
Total data volume: 4.7 GB
Sensitive data types: PII (SSN, email), Financial (credit card)
Macie severity: CRITICAL
CONTAINMENT ACTIONS:
[x] Access key deactivated
[x] User password reset and MFA re-enrolled
[x] VPC endpoint policy applied to sensitive buckets
[x] Bucket policy restricting to VPC-only access
[x] TruffleHog scan initiated on developer repositories