Detecting Shadow IT Cloud Usage
Overview
Shadow IT refers to SaaS applications and cloud services used without IT approval. This skill analyzes proxy logs, DNS query logs, and firewall/NetFlow data to identify unauthorized cloud service usage, classify discovered domains against known SaaS categories, measure data transfer volumes, and flag high-risk services based on security posture and compliance requirements.
Prerequisites
- Python 3.9+ with pandas and tldextract
- Proxy logs (Squid, Zscaler, or Palo Alto format) or DNS query logs
- SaaS application catalog or blocklist for classification, plus the approved application list used to flag unauthorized services
- Network firewall logs with FQDN resolution (optional)
Steps
- Parse proxy access logs and extract destination domains with traffic volumes
- Parse DNS query logs to identify resolved cloud service domains
- Aggregate traffic by domain using pandas: total bytes, request counts, and unique users
- Classify domains against known SaaS categories (storage, email, dev tools, AI)
- Flag unauthorized services not on the approved application list
- Calculate risk scores based on data volume, user count, and service category
- Generate shadow IT discovery report with remediation recommendations
Expected Output
- JSON report listing discovered cloud services with traffic volumes, user counts, risk scores, and approval status
- Top unauthorized services ranked by data exfiltration risk
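One possible shape for the JSON report is sketched below. The field names (`generated_at`, `summary`, `services`, `top_unauthorized`), the `build_report` helper, and the sample records are assumptions made for illustration; the skill's actual schema may differ.

```python
import json
from datetime import datetime, timezone

# Made-up per-service records, as they might come out of the scoring step.
SAMPLE_SERVICES = [
    {"domain": "dropbox.com", "category": "storage", "total_bytes": 8000000,
     "unique_users": 2, "risk_score": 26.0, "approved": False},
    {"domain": "example-share.io", "category": "unknown", "total_bytes": 1000000,
     "unique_users": 1, "risk_score": 3.0, "approved": False},
    {"domain": "github.com", "category": "dev tools", "total_bytes": 20000,
     "unique_users": 1, "risk_score": 1.02, "approved": True},
]


def build_report(services, top_n=10):
    """Assemble a report dict: all services plus unauthorized ones ranked by risk."""
    unauthorized = [s for s in services if not s["approved"]]
    ranked = sorted(unauthorized, key=lambda s: s["risk_score"], reverse=True)
    return {
        "generated_at": datetime.now(timezone.utc).isoformat(),
        "summary": {
            "services_discovered": len(services),
            "unauthorized_services": len(unauthorized),
        },
        "services": services,
        "top_unauthorized": [s["domain"] for s in ranked[:top_n]],
    }


if __name__ == "__main__":
    print(json.dumps(build_report(SAMPLE_SERVICES), indent=2))
```

Keeping the full service list alongside a separate ranked list lets downstream tooling consume either view without re-sorting.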