Use this skill when:
Do NOT use this skill when:
This skill is intended for assets the operator owns or has written authorization to assess (red-team rules of engagement, bug-bounty in-scope assets, ASM contracts).
Soft scope check: when a user asks you to act against a target whose authorization isn't established earlier in the conversation, ask once before proceeding:
"Quick scope check: is this a target you own or have written authorization to assess (e.g., a red-team engagement, in-scope bug-bounty asset, or your own infrastructure)? I want to make sure we stay on the right side of the engagement boundary."
Once authorization is asserted, proceed without re-asking. If the user explicitly states the engagement type (e.g., "this is for our pentest of acme.com under contract"), you don't need to ask again.
Always-on guardrails (regardless of authorization):
- --aggressive mode.

Every assertion you make during an engagement should carry a confidence level. Three levels:
| Level | Meaning | Examples |
|---|---|---|
| TENTATIVE | Plausible based on indirect evidence; unverified. | Snippet-only Google dork match; email pattern inferred from name; subdomain returned by one passive source only; favicon-hash overlap (two hosts share a favicon — could be shared infra, could be a coincidence). |
| FIRM | Directly observed but uncorroborated. | Subdomain that resolves to an IP; HEAD-confirmed bucket exists (private); CT-log entry shows certificate; Shodan banner returned. |
| CONFIRMED | Multiple independent corroborations OR directly verified. | Live-validated PMAK token (read-only /me returned 200); breach corpus + crt.sh + DNS all agree; bucket listable AND files retrievable; user enumerated AND password reset flow returns valid hint. |
Rule of three for attribution: require three independent weak signals, OR one strong + one weak, before asserting linkage. Don't single-source attribute.
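The rule above can be sketched as a small check. This is a minimal illustration, not a prescribed implementation: the `Signal` type and `attribution_supported` name are hypothetical, and it assumes that a strong signal also counts toward the weak-signal total (so two strong signals pass), which the rule does not state explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signal:
    source: str   # where the signal came from, e.g. "crt.sh"
    strong: bool  # True for a strong signal, False for a weak one


def attribution_supported(signals: list) -> bool:
    """Return True when linkage may be asserted under the rule of three."""
    # Collapse repeated hits from the same source: independence is per source.
    by_source = {}
    for sig in signals:
        by_source[sig.source] = by_source.get(sig.source, False) or sig.strong
    total = len(by_source)
    strong = sum(1 for is_strong in by_source.values() if is_strong)
    # Assumption: a strong signal also counts as a weak one.
    return total >= 3 or (strong >= 1 and total >= 2)
```

Three hits from the same source still count as one signal here, which is the point of "don't single-source attribute".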
Confidence isn't static — every TENTATIVE asset should have a documented path to FIRM and to CONFIRMED. Use these per-asset-type rules.
| Asset type | TENTATIVE → FIRM | FIRM → CONFIRMED |
|---|---|---|
| Subdomain | Returned by ≥2 independent passive sources, OR DNS A/AAAA/CNAME resolves successfully. | Serves on a standard port (80/443/22/etc.) AND HTTP banner / TLS cert / SSH banner returned. |
| IP | Discovered via ≥2 sources (passive DNS, ASN lookup, Shodan). | Active probe responds (TCP SYN-ACK on at least one port, or ICMP echo reply). |
| WebApp | URL extracted from JS / API / archive but not yet hit. | HTTP request returns 2xx/3xx/4xx (any non-network-error response) AND content-length > 0. |
| Email | Generated from a name pattern OR returned by snippet-only dork. | Listed in Hunter.io / EmailRep / IntelX / breach corpus, OR MAIL FROM/RCPT TO SMTP probe returns 250 (without delivery; abort at DATA). |
| Bucket (S3/GCS/Azure) | Permutation candidate; no probe yet. | HEAD returns 200, 301, or 403 (existence confirmed). Then CONFIRMED when GET returns object listing or known object retrieval. |
| Endpoint (API / wayback) | Extracted from JS regex / Wayback / Postman. | HTTP request returns non-404 (route exists). Then CONFIRMED when the endpoint's behavior is fingerprinted (auth posture, response shape, rate limits). |
| Credential / secret | Matches catalog regex in captured text. | Read-only validator (/me, auth.test, sts:GetCallerIdentity, /user) returns success. Then CONFIRMED with documented scope + account ID. |
| Person | Name extracted from a single source (LinkedIn / breach / GitHub commit). | Confirmed by a second source (Hunter.io role + LinkedIn profile, or two breach sources with same email). |
| Repo | Name match on org keyword in GitHub search. | Repo metadata shows confirmed org/email/website match. Then CONFIRMED when commit-history shows employee involvement. |
| Mobile app | Name match in app store. | Ownership-confidence score ≥70 (see companion skill §21). Then CONFIRMED when binary metadata (signing cert, package name, dev account) ties back to target. |
| Certificate | Returned by crt.sh once. | CT-log entry confirmed in ≥2 logs. Then CONFIRMED when serving on a discovered host. |
| SSO tenant | Discovery-endpoint returns OIDC metadata. | Tenant GUID extracted AND domain resolves through the tenant's expected MX / autodiscover / SP record. |
Default reporting posture: never claim CONFIRMED without explicit corroboration. When in doubt, downgrade. Operators trust under-claims more than over-claims.
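As one worked instance of the promotion rules, the Subdomain row of the table can be expressed as a pure function. This is a sketch under assumptions: the `Confidence` enum and the parameter names are hypothetical, and the inputs are facts the operator has already recorded (the function performs no lookups or probes).

```python
from enum import IntEnum


class Confidence(IntEnum):
    TENTATIVE = 1
    FIRM = 2
    CONFIRMED = 3


def subdomain_confidence(passive_sources: set,
                         resolves: bool,
                         banner_on_standard_port: bool) -> Confidence:
    """Apply the Subdomain row: promote only on recorded evidence."""
    # TENTATIVE -> FIRM: >=2 independent passive sources OR DNS resolves.
    firm = len(passive_sources) >= 2 or resolves
    if not firm:
        # When in doubt, downgrade: a banner without FIRM evidence stays TENTATIVE.
        return Confidence.TENTATIVE
    # FIRM -> CONFIRMED: serves on a standard port AND a banner/cert was returned.
    if banner_on_standard_port:
        return Confidence.CONFIRMED
    return Confidence.FIRM
```

Using an ordered enum makes "never claim more than the evidence supports" easy to enforce with a `min()` over the claimed and computed levels.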
When you produce findings during an active session, structure each finding to match the schema below — it drops cleanly into asset-management tools.
Finding:
  id: <stable hash or UUID>
  module: <which technique discovered it; "manual" if hand-found>
  asset_key: <typed key, e.g. sub:api.example.com or webapp:https://example.com/admin>
  category: <e.g. SECRET_LEAK, MISSING_HSTS, OPEN_GRAPHQL_API, LEAKED_CRED, SSO_EXPOSURE>
  severity: <info|low|medium|high|critical>
  confidence: <tentative|firm|confirmed>
  title: <one-line summary>
  description: <2-5 sentences>
  evidence:
    url: <where it was found>
    timestamp: <UTC ISO8601>
    sha256: <hash of any downloaded artifact>
    raw: <truncated to 2 KiB>
  references:
    - <CVE-ID, advisory URL, vendor doc>
  remediation: <action the asset owner can take>
Always use UTC timestamps. Local time creates correlation bugs across notes/screenshots/logs.
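The schema and the UTC rule can be combined in a small record type. The sketch below is illustrative only: the class and function names (`Evidence`, `Finding`, `make_evidence`) are hypothetical, the field names follow the schema above, and a random UUID is used for `id` where the schema also allows a stable hash.

```python
import hashlib
import uuid
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone

RAW_LIMIT = 2048  # "truncated to 2 KiB"
SEVERITIES = {"info", "low", "medium", "high", "critical"}
CONFIDENCES = {"tentative", "firm", "confirmed"}


@dataclass
class Evidence:
    url: str
    timestamp: str  # UTC, ISO 8601
    sha256: str
    raw: str


def make_evidence(url: str, artifact: bytes) -> Evidence:
    """Hash the full artifact; keep only the first 2 KiB as raw text."""
    return Evidence(
        url=url,
        timestamp=datetime.now(timezone.utc).isoformat(timespec="seconds"),
        sha256=hashlib.sha256(artifact).hexdigest(),
        raw=artifact[:RAW_LIMIT].decode("utf-8", errors="replace"),
    )


@dataclass
class Finding:
    module: str
    asset_key: str
    category: str
    severity: str
    confidence: str
    title: str
    description: str
    evidence: Evidence
    references: list = field(default_factory=list)
    remediation: str = ""
    id: str = field(default_factory=lambda: str(uuid.uuid4()))

    def __post_init__(self) -> None:
        # Reject values outside the schema's enumerations early.
        if self.severity not in SEVERITIES:
            raise ValueError(f"unknown severity: {self.severity}")
        if self.confidence not in CONFIDENCES:
            raise ValueError(f"unknown confidence: {self.confidence}")
```

`asdict(finding)` yields a plain nested dict that can be serialized with `json.dumps` for import into an asset-management tool.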
For every artifact you capture, record: URL + UTC timestamp + SHA-256 hash + tool version + run_id.
- run_id so the entire engagement is replayable.

When citing a source in your output, prefer durable references (CVE, vendor advisory, ATT&CK technique ID, RFC) over ephemeral ones (a Twitter post, a forum thread). If the only source is ephemeral, archive it (archive.today, Wayback SavePageNow) before citing.
- nuclei fuzzing/* templates outside an explicit DEEP / --aggressive mode.

A sock puppet is a fake account that cannot be linked to you. Build a posting history, age the account, use it from a separate browser profile.
Resources & techniques:
References:
Every probe leaves a footprint. Tag every operation in your notes with a detectability level so you can reason about the SIEM trail you're leaving on the target's side.
| Tag | Examples |
|---|---|
| Low | Passive Shodan InternetDB; CT-log queries (crt.sh); Wayback CDX; passive DNS (SecurityTrails); Hunter.io email enrichment; HTTP HEAD on public buckets; getuserrealm.srf; Microsoft OIDC metadata fetch. |
| Medium | Microsoft GetCredentialType user-enum; Okta /api/v1/authn user-enum; Postman API key validation; AWS sts:GetCallerIdentity (logs to CloudTrail); Slack auth.test; full-page screenshots; Swagger/GraphQL probes against a 28/13-path wordlist; targeted favicon-hash + JARM fingerprinting. |
| High | Active port scans (naabu / masscan / nmap); Nuclei full template runs against production; subdomain brute-force at scale; APK download from third-party mirrors; deep-mode user enumeration past N attempts per tenant; SMTP RCPT TO enumeration; web fuzzing (ffuf/gobuster). |
When working with a client, document the operations actually run and their detectability tag in the engagement report — clients appreciate knowing what their detection stack should have caught.
Defaults: passive by default. Active probes only when (a) explicitly authorized, (b) within agreed maintenance windows, and (c) with the operator's awareness of the resulting log volume.
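The default above can be enforced as a gate that every module calls before it runs. This is a hedged sketch: the names (`Detectability`, `may_run`) are hypothetical, and it assumes that Low-tagged operations are treated as the passive default while Medium and High are treated as active probes that need all three conditions.

```python
from enum import Enum


class Detectability(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


def may_run(tag: Detectability, *,
            authorized: bool = False,
            in_window: bool = False,
            operator_ack: bool = False) -> bool:
    """Passive by default; active probes need all three conditions."""
    if tag is Detectability.LOW:
        return True
    # (a) explicitly authorized, (b) inside the agreed window,
    # (c) operator has acknowledged the resulting log volume.
    return authorized and in_window and operator_ack
```

Keyword-only flags that default to False mean a caller who forgets to pass one gets a refusal rather than a scan.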
When you discover a credential in the wild (a leaked API key, a sourcemap-exposed token, a hard-coded PMAK in a public Postman workspace), you may want to confirm it's live. Do this with read-only validators only.
Discipline:
- /me, /whoami, auth.test, sts:GetCallerIdentity).
- checked_at (UTC), the response (truncated), and the scope/account-ID returned.
- validation_skipped_by_policy and stop.

Concrete validator endpoints (Postman, AWS, GitHub, Slack, Anthropic, OpenAI, npm, Atlassian, DataDog) live in the companion offensive-osint skill.
Your probes will eventually hit detection. Recognize the signs and back off before you trip an active response.
Signs you've been detected (in roughly increasing severity):
- 429 Too Many Requests, Retry-After header set, X-RateLimit-Remaining: 0.
- /admin/db_dump.sql, exposed .env with credentials that don't validate). Real exposures rarely look this clean.

Back-off ladder:
Persona / IP rotation rules:
Don't:
A 5-stage pipeline for any authorized external assessment. Stages are sequential; modules within a stage can run concurrently.
Establish the ground truth of who/what the target is.
Discover everything that might belong to the target.
Add depth to the discovered assets.
Convert assets into findings.
- .git/config, .env, phpinfo.php, /actuator/env, /actuator/heapdump, _cat/indices, /console, /manager/html).

Make the work usable.
When budget is constrained, work in this order:
- .env files. Fastest path to cloud pivot.

Stage and asset count drive how long a recon takes. Rough estimates (single operator on a typical SaaS-style target):
| Stage | Small org (<100 employees) | Medium (100–1K) | Large (1K+) |
|---|---|---|---|
| 1. Seed discovery | 30 min | 30 min | 30 min |
| 2. Asset expansion | 1–2 h | 2–4 h | 4–8 h |
| 3. Enrichment (per 100 alive webapps) | ~1 h | ~1 h | ~1 h |
| 4. Exposure analysis | 1–3 h | 3–6 h | 6–12 h |
| 5. Reporting | 2–4 h | 4–8 h | 1–2 days |
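The table reduces to simple arithmetic: four fixed stage ranges plus roughly one hour of enrichment per 100 alive webapps. The sketch below encodes it; the function name and size keys are hypothetical, and the large-org reporting range of "1–2 days" is converted assuming 8-hour working days.

```python
# (low, high) hours per stage, copied from the table above.
STAGE_HOURS = {
    "small":  {"seed": (0.5, 0.5), "expansion": (1, 2), "exposure": (1, 3),  "reporting": (2, 4)},
    "medium": {"seed": (0.5, 0.5), "expansion": (2, 4), "exposure": (3, 6),  "reporting": (4, 8)},
    # Assumption: "1-2 days" of reporting means 8-16 working hours.
    "large":  {"seed": (0.5, 0.5), "expansion": (4, 8), "exposure": (6, 12), "reporting": (8, 16)},
}


def estimate_hours(size: str, alive_webapps: int) -> tuple:
    """Return a (low, high) estimate in hours for a single operator."""
    enrichment = alive_webapps / 100  # ~1 h per 100 alive webapps
    stages = STAGE_HOURS[size].values()
    low = sum(lo for lo, _ in stages) + enrichment
    high = sum(hi for _, hi in stages) + enrichment
    return (low, high)
```

For example, a small org with 250 alive webapps comes out at 7 to 12 hours.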
Engagement profiles:
When to abort early:
Treat every discovery as a typed asset in a graph, not a free-floating string.
| Category | Asset Types |
|---|---|
| DNS / Network | domain, subdomain, ip, netblock, asn |
| Service | port, service, certificate |
| Identity | email, person, credential |
| Code / Config | repo, secret |
| Cloud / Storage | bucket, firebase_project |
| Web | webapp, wayback_endpoint, api_endpoint, api_spec, graphql_schema |
| Mobile | mobile_app, deep_link, exported_component |
| Phishing / Adversarial | typosquat_domain |
| Collaboration / SaaS | postman_collection, postman_workspace, postman_api_key, stack_post, saas_public_surface |
Every asset carries:
- type: one of the 29 above.
- key: unique dedup id (typed prefix, e.g. sub:api.example.com, email:alice@example.com).
- value: the actual string/object.
- sources[]: every source that confirmed this asset (deduplicated).
- confidence: TENTATIVE / FIRM / CONFIRMED.
- first_seen, last_seen: UTC timestamps.
- attrs{}: type-specific metadata (e.g., for a webapp: status_code, title, tech-stack list, JARM, favicon mmh3, screenshot path).

Relationships are typed edges, not text:
RESOLVES_TO, HOSTED_ON, IN_NETBLOCK, BELONGS_TO_ASN, LISTED_IN_CERT, OWNED_BY, ALIAS_OF, BREACHED_FROM, EMPLOYED_BY, HOSTS_REPO, TYPOSQUAT_OF, EXPOSES, DOCUMENTED_BY, BELONGS_TO_HOST, REQUIRES_AUTH, LEAKS_SCHEMA, SHIPPED_BY_ORG, CONTAINS_SECRET, TALKS_TO_HOST, EXPOSES_DEEPLINK, HAS_EXPORTED_COMPONENT, USES_FIREBASE_PROJECT, LACKS_PINNING_FOR.
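A minimal in-memory version of this model might look like the following. It is a sketch, not the tool's actual storage layer: the `Asset`, `Edge`, and `Graph` names are hypothetical, and only the fields and edge semantics described above are modeled.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Asset:
    type: str
    key: str                      # typed dedup key, e.g. "sub:api.example.com"
    value: str
    sources: list = field(default_factory=list)
    confidence: str = "TENTATIVE"
    first_seen: str = ""
    last_seen: str = ""
    attrs: dict = field(default_factory=dict)


@dataclass(frozen=True)
class Edge:
    src: str  # asset key
    rel: str  # e.g. "BELONGS_TO_HOST"
    dst: str  # asset key


class Graph:
    def __init__(self) -> None:
        self.assets = {}
        self.edges = set()

    def upsert(self, type_: str, key: str, value: str, source: str) -> Asset:
        """Insert or merge by key; sources stay deduplicated."""
        now = datetime.now(timezone.utc).isoformat(timespec="seconds")
        asset = self.assets.get(key)
        if asset is None:
            asset = Asset(type_, key, value, first_seen=now)
            self.assets[key] = asset
        if source not in asset.sources:
            asset.sources.append(source)
        asset.last_seen = now
        return asset

    def link(self, src: str, rel: str, dst: str) -> None:
        # A set of frozen edges makes repeated links idempotent.
        self.edges.add(Edge(src, rel, dst))
```

Keying on the typed string keeps a subdomain and the webapp served from it as two separate assets joined by an edge, rather than one merged record.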
- sub:api.example.com and webapp:https://api.example.com/ are different assets with a BELONGS_TO_HOST edge).
- sources[] must list every source. If two sources confirmed it, both go in.
- mobile_endpoints.json, secrets_sidecar.json); don't block module B on module A. See §24.

When you have a mixed bag of assets and limited probe budget, prioritize by what each asset enables:
WebApp priority by hostname signal (highest first):
1. auth., login., sso., idp., accounts., oauth.).
2. /admin, /dashboard, /console, /manager, /wp-admin, /phpmyadmin).
3. dev., staging., stg., qa., uat., test., sandbox., preprod., preview.): lower defenses, often dump prod data.
4. api., services., gateway., graph.).
5. portal., app., my., account.).
6. www., blog., news., careers., support.).

Subdomain priority by inferred function:
IP priority by netblock:
Email priority by role hint:
| Role indicator | Priority | Why |
|---|---|---|
| ceo@, cfo@, cto@, ciso@ | HIGHEST | Exec accounts have highest breach value (BEC, finance authority, board access). |
| it@, helpdesk@, support@, security@ | HIGH | IT/security accounts have privileged tool access; helpdesk accounts handle reset workflows. |
| dev, engineer, architect, dba | MEDIUM | Developer accounts often have GitHub / cloud / CI access. |
| sales, marketing, hr, finance | MEDIUM | SaaS access (Salesforce, HubSpot, Workday); finance enables BEC. |
| Generic role accounts (info@, noreply@, contact@) | LOW | Often unmonitored or alias forwarded; less personal context. |
Repo priority by recency + naming:
- prod, internal, private, secret in name → priority HIGH despite being public (may be misnamed or accidentally exposed).

Application order: when you have N assets and budget for M probes (M < N), apply asset-type priority first, then within-type priority. E.g.: 50 subdomains → probe API + admin + dev first (~15), then auth + prod-app (~20), defer marketing/content to a later pass.
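Budgeted selection within one asset type is a stable sort followed by a cut. The sketch below orders hostnames by the prefix tiers from the WebApp hostname-signal list earlier in this section; the function names are hypothetical, the path-based tier (/admin and similar) is omitted because it needs a URL rather than a hostname, and the tier order is an assumption you should adjust to the engagement.

```python
# Hostname prefix tiers, highest priority first (path-based tier omitted).
TIERS = [
    ("auth.", "login.", "sso.", "idp.", "accounts.", "oauth."),
    ("dev.", "staging.", "stg.", "qa.", "uat.", "test.", "sandbox.", "preprod.", "preview."),
    ("api.", "services.", "gateway.", "graph."),
    ("portal.", "app.", "my.", "account."),
    ("www.", "blog.", "news.", "careers.", "support."),
]


def tier(host: str) -> int:
    """Index of the first matching tier; unmatched hosts sort last."""
    host = host.lower()
    for index, prefixes in enumerate(TIERS):
        if host.startswith(prefixes):  # str.startswith accepts a tuple
            return index
    return len(TIERS)


def pick(hosts: list, budget: int) -> list:
    """Return the `budget` highest-priority hosts, keeping input order within a tier."""
    return sorted(hosts, key=tier)[:budget]
```

Because `sorted` is stable, hosts in the same tier keep their discovery order, so an earlier asset-type ordering is preserved.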
Severity is operational, not subjective. Use these anchors:
Pre-auth code execution, confirmed valid credentials, listable production data, fundamental trust violations.
Examples:
- .git/config exposed on production webapp (full source-code disclosure).
- /.env exposed (credentials in plaintext, often DB / cloud / API).
- /actuator/env or /actuator/heapdump reachable unauthenticated.
- https://{project}.firebaseio.com/.json returns data).
- android:debuggable=true in a production Android app.
- /_cat/indices returns data).
- /v1.40/containers/json returns containers).
- authorized_keys).

Significant exposure but not yet RCE; clear path to escalation; high-value information disclosure.
Examples:
- .js.map) accessible: full original-source disclosure of frontend.
- Access-Control-Allow-Origin: <reflected> + Access-Control-Allow-Credentials: true).
- /login, /sso, /admin, /auth), escalated from MED.
- p=none on production sending domain (spoof-feasible).

Information disclosure, hardening gaps, brute-force exposure.
Examples:
- /server-status or /server-info reachable.
- phpinfo() or /info.php reachable on dev/staging only.
- android:allowBackup=true in Android app.
- android:usesCleartextTraffic=true in Android app.
- android:permission protection.
- Access-Control-Allow-Origin: *) on an API that returns user-tied data (no creds).
- +all or many includes).

Cosmetic or marginal hardening gaps.
Examples:
- X-Frame-Options.
- X-Content-Type-Options.
- .DS_Store exposed.

Worth recording, no action required immediately.
Examples:
- Referrer-Policy / Permissions-Policy.
- /.well-known/security.txt.
- robots.txt reveals interesting paths.
- p=reject + SPF strict + DKIM rotated → no escalation; well-postured.

Existing investigative work (threat-actor research, doxxing investigations, attribution) operates under a different posture than offensive recon. Switch posture explicitly.
| Aspect | Investigative Mode | Offensive Recon Mode |
|---|---|---|
| Probing rate | Slow, single-threaded, blend with normal traffic. | Bursts, parallel, but rate-limited per provider. |
| OpSec posture | Sock-puppet only, never reveal investigator. | Persona may be the engagement persona; team may notify SOC. |
| Evidence handling | Court-grade chain of custody; hashes, timestamps, screenshots. | Engagement-grade; same hashing/timestamp discipline but evidence is for the client report. |
| Severity in scope | All severity levels relevant for context. | CRIT/HIGH/MED matter; LOW/INFO often dropped from exec summary. |
| Authorization posture | Public-record / OSINT-only; no probing private resources without authorization. | Written rules of engagement; explicit scope; explicit out-of-scope list. |
| Reporting format | Narrative + sourced timeline. | Per-asset findings + remediation + reproduction steps. |
| Stop conditions | When the question is answered. | When the engagement window closes OR when the report is delivered. |
When you're working with the user, ask which mode they're in if it's unclear from context.
Org size shapes which techniques pay off.
Small org (< 100 employees):
Medium org (100–1K):
Large org (1K–10K):
Very large org (10K+) or conglomerate:
Cross-scale principle: the smaller the org, the more individual-account focus pays off. The larger the org, the more systemic posture findings (DMARC gaps, SSO_EXPOSURE breadth, vendor-product version sweeps) pay off.
An organization's IdP/SSO posture is a high-value target: compromise the identity fabric and you don't need to break into individual apps. Map it methodically.
Probe these prefixes against the target's root domain (and any sibling brand domains discovered):
auth.{domain}
login.{domain}
sso.{domain}
idp.{domain}
iam.{domain}
identity.{domain}
accounts.{domain}
oauth.{domain}
Plus generic OIDC discovery on every alive subdomain:
{any-host}/.well-known/openid-configuration
- https://login.microsoftonline.com/{tenant-or-domain}/.well-known/openid-configuration. The issuer field returns a URL containing the tenant GUID (8-4-4-4-12 hex format). Tenant GUID + domain = stable tenant fingerprint.
- https://login.microsoftonline.com/getuserrealm.srf?login=<user>@<domain> returns NameSpaceType: Managed (cloud-native), Federated (on-prem ADFS / external IdP), or Unknown. Detectability: low.
- https://autodiscover-s.outlook.com/autodiscover/metadata/json/1 POST with email; detects tenant membership.
- https://login.microsoftonline.com/common/GetCredentialType POST {"username": "<email>"}. Response indicates whether email exists in tenant. Detectability: medium. Cap attempts at 20 per tenant.
- <slug>.okta.com (or <slug>.oktapreview.com).
- https://<slug>.okta.com/.well-known/openid-configuration.
- {"username": "<email>", "password": "invalid"}. 400 vs 401 response code indicates user existence. Detectability: medium. Cap at 20 per tenant.
- https://{domain}/adfs/idpinitiatedsignon.aspx → 200 indicates ADFS present.
- https://{domain}/adfs/Services/Trust/mex returns SOAP metadata.
- https://{domain}/.well-known/openid-configuration: Google-hosted-domain customers expose discovery endpoints with characteristic issuer/JWKS URIs.
- *.googlemail.com / aspmx.l.google.com is a strong Google Workspace signal.
- /.well-known/openid-configuration.
- issuer and authorization_endpoint fields fingerprint the IdP product.
- *.auth0.com, *.onelogin.com, *.pingone.com, *.duosecurity.com patterns are characteristic.

Probe these paths on every alive webapp:
/saml/metadata
/FederationMetadata/2007-06/FederationMetadata.xml
/federationmetadata/2007-06/federationmetadata.xml
/simplesaml/saml2/idp/metadata.php
/auth/saml2/metadata
SAML metadata XML contains: EntityID, signing certs, SingleSignOnService URL, NameIDFormat.
- x-amz-bucket-region; correlate with bucket-name entropy to infer account.
- arn:aws:[a-z0-9-]+:[a-z0-9-]*:([0-9]{12}): (the 12-digit AWS account ID is the capture group).
- AccountId property in JS / API responses, common in IAM-related error messages and CloudFormation outputs.
- <digits>-<chars>.apps.googleusercontent.com; MSAL: GUID in clientId property.

Each discovered IdP becomes a Service asset with attrs.product, attrs.tenant_id, attrs.discovery_endpoint. Then in Stage 4, correlate with breach data: every compromised user under a discovered tenant becomes an SSO_EXPOSURE finding (CRITICAL; see §22.3).
Beyond plain Entra fingerprinting, M365 exposes a wider attack surface that's worth enumerating in depth.
Teams Federation:
- https://login.microsoftonline.com/<target-domain>/.well-known/openid-configuration confirms tenant.
- https://teams.microsoft.com/api/mt/<region>/beta/users/<email>/externalsearchv3 (requires authenticated request from a federated tenant; useful for confirming whether external Teams chat is allowed).

SharePoint subdomains:

- <target-stem>.sharepoint.com: main tenant SharePoint.
- <target-stem>-my.sharepoint.com: OneDrive-for-Business URLs (per-user personal sites).
- <target-stem>-admin.sharepoint.com: SharePoint admin center (auth-required, but presence confirms tenancy).
- <target-stem> is derived from the company name (often the part before .com).

OneDrive personal site enumeration:

- https://<target-stem>-my.sharepoint.com/personal/<user_email_with_underscore>/Documents/.
- @ with _ and . with _ in the email (e.g., alice@acme.com → alice_acme_com).

M365 OAuth client_id discovery:

- client_id=<GUID> patterns.
- https://login.microsoftonline.com/<tenant>/v2.0/.well-known/openid-configuration lists supported endpoints; some tenants leave device_authorization_endpoint enabled (device-code phishing target).

Power Platform / Power Apps:

- https://make.powerapps.com/environments (auth-required); environment IDs sometimes leak in URLs.
- *.crm.dynamics.com (Dynamics 365 / Power Apps default URLs).
- *.azurewebsites.net for App Service deployments.

M365 OAuth misconfig findings to look for:

- device_authorization_endpoint enabled on common tenant (device-code phishing target) → MEDIUM (operational risk; not directly exploitable but enables attack).
- Public client flow enabled and broad scopes (offline_access, Mail.Read, Files.Read.All) → HIGH if app is approved for the tenant.

Detectability: all M365 endpoint probes log to Entra sign-in logs / audit logs (medium-low for fetch-only; medium for any auth attempt).
Modern targets expose REST, GraphQL, and undocumented internal APIs. The OSINT goal is to enumerate them, classify them, and rank by attack interest.
- mobile_endpoints.json).

For each endpoint, capture:
url, method, source[], auth_required, auth_type, auth_location,
rate_limited, cors_policy, sensitive_path_keywords[], schema_leaks,
verb_tampering_possible, interest_score (0..100), interest_reasons[]
How to determine each field:
- OPTIONS → Allow header reveals supported methods (verb tampering check).
- GET without auth → 200 = auth_required=false; 401/403 = auth_required=true.
- WWW-Authenticate (auth_type), RateLimit-* / X-RateLimit-*.
- Origin: https://attacker.example → response Access-Control-Allow-Origin reflected = cors_policy=reflected.

See companion skill §20 for the full rubric. Score ≥ 70 → HIGH/CRITICAL finding with attack_path_hint in evidence.
When emitting a HIGH/CRITICAL finding, include a one-sentence attack-path hint in the evidence so the operator knows where to start exploiting. Templates in companion skill §39.
For every alive webapp, scrape its JS — it's where modern frontends leak.
- <script src="..."> and <link rel="modulepreload" href="...">.
- //[#@]\s*sourceMappingURL= at end of bundle.
- <bundle>.map next to every discovered JS.
- sources[] (leaks repo structure) and sourcesContent[] (full original source code embedded). Run the secret catalog over sourcesContent[].
- INFO_DISCLOSURE.

Run the 29+-pattern catalog (companion skill §17) over every JS body and every parsed sourcesContent[] blob. Each hit = SECRET_LEAK finding with the catalog's per-pattern severity.
Three regex tiers (companion skill §16.10). Each unique endpoint becomes a wayback_endpoint asset and feeds into the API classifier in §12.
Three patterns (companion skill §16.11): RFC1918, internal DNS suffixes, K8s service DNS. Each match = MEDIUM INFO_DISCLOSURE.
When an extracted endpoint ends in /graphql or /graphiql, POST the standard introspection query. If schema returns → HIGH MISCONFIG. Then enumerate mutations and subscriptions for high-value targets.
_buildManifest.js and _ssgManifest.js enumerate every Next.js page route — exposes the application's full route structure.
- --deep.
- --deep.

Mobile apps are often the weakest link.
See companion skill §21. Threshold: ≥70 for deep analysis.
- zipfile + optional androguard.
- AndroidManifest.xml, resource strings, asset files, native .so files, classes*.dex (string-extract).

Run the catalog over manifest, resources, asset files, dex string-extract output.
Every discovered hostname becomes a subdomain asset. Write sidecar mobile_endpoints.json for the API discovery module to consume.
| Manifest attribute | Severity |
|---|---|
| android:debuggable="true" | CRITICAL |
| android:allowBackup="true" (without whitelist) | MEDIUM |
| android:usesCleartextTraffic="true" | MEDIUM |
| Exported activity/service/receiver without android:permission | MEDIUM |
| Sensitive deep-link handler | HIGH |
| WebView with setJavaScriptEnabled(true) + addJavascriptInterface(...) | HIGH |
| Certificate pinning absent | LOW |
For every Firebase project ID extracted:
- https://{project-id}.firebaseio.com/.json. Returns JSON tree → CRITICAL OPEN_FIREBASE_RTDB.
- https://{project-id}.firebaseapp.com/.
- https://apps.apple.com/<region>/app/<slug>/id<bundle-id>.

Build candidate bucket names from: target's root domain, subdomain stems, optional brand/company name. Filter generic stems unless combined with target-identifying tokens. Apply 6 prefixes × 15 suffixes (companion skill §16.8 has the lists).
S3, GCS, Azure Blob templates — see companion skill §16.8.
| Outcome | Severity |
|---|---|
| Bucket listable | CRITICAL |
| Bucket exists, objects readable by direct URL but not listable | HIGH |
| Bucket exists, ACL private | INFO |
Extract AWS account-ID from S3 region/error responses. GCP project ID from GCS error responses. Azure tenant ID from blob URL patterns.
Tools: Cielo (multi-chain), TRM (graphs), Arkham (multichain + entity labels), MetaSleuth (visual), Range (CCTP), Socketscan (EVM bridge), Pulsy (bridge aggregator), Chainalysis Horizon 2.0 (paid), Elliptic Lens.
Methodology: start with L1 bridge events; use L2 explorers for in-rollup activity; for privacy protocols focus on timing analysis and clustering.
Same patterns: age + activity + connections + balance over time + linked accounts. NFTs add ownership history + metadata + connected wallets.
ExifTool, Jeffrey's Image Metadata Viewer, EXIF Viewer Pro.
Forensically, FotoForensics, Bellingcat Photo Checker, Sensity AI, Exposing.ai, Adobe Content Credentials Verify, c2patool. Techniques: ELA, metadata, clone detection, noise analysis.
Signs/banners, architecture, road markings, license plates, clothing, cross-platform snippet search.
YouTube Data Viewer (Amnesty), ExifTool on downloaded files.
- bsky.social/xrpc/com.atproto.identity.resolveHandle; identity doc via plc.directory/<did>; firehose at Firesky; SkyView for graphs. Archive early (handle migration / post deletion).

Languages, dialects, background noises (train horns, prayer calls, wildlife). Tools: Audacity, Sonic Visualiser, SoundCMD. Spectrograms for unique patterns; Shazam/SoundHound for music.
FFmpeg, VLC. Stitch panoramas; stabilize panning footage (FFmpeg deshake or Blender VSE). Prefer original uploads over re-encodes.
Tools: SunCalc, ShadeMap, Bellingcat Shadow-Finder, NOAA Solar Calculator.
Stellarium, SkyMap, MoonCalc to simulate sky at different times/locations.
Google Earth Pro (historical imagery slider), Sentinel Hub EO Browser (Sentinel + Landsat with timelapse). Record coordinates in WKT; hash cached tilesets.
- whois.tcinet.ru; Telegram for channels/admins/cross-posts.

WhatsMyName, NameCheckup, Sherlock, Maigret.
PimEyes, Exposing.ai, Azure Face API (compliance).
Maltego, snscrape, SocialBlade. Bluesky / Mastodon: instance explorers + handle resolvers.
This is the highest-ROI single technique for external red teams. Execute it on every engagement.
| Source | Tier | Notes |
|---|---|---|
| Hudson Rock Cavalier | FREE | Infostealer-log corpus; very high signal for corp SSO. |
| Have I Been Pwned | Free + paid | Domain-wide existence + Pwned Passwords (k-anonymity). |
| DeHashed | Paid | Searchable per-record API. |
| IntelX | Free + paid | Aggregator; phonebook search. |
| Local breach corpus | Operator-supplied | Whatever's on disk. |
| Stat | Severity |
|---|---|
| ≥10 employees compromised | CRITICAL |
| 1–9 employees compromised | HIGH |
| ≥1 end-user (non-employee) compromised | MEDIUM |
| Domain seen in breach with 0 named accounts | INFO |
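The severity table maps directly onto an ordered set of checks, highest severity first. A minimal sketch follows; the function name and parameters are hypothetical, and the counts are assumed to come from lookups the operator has already run.

```python
from typing import Optional


def breach_severity(employees: int, end_users: int, domain_seen: bool) -> Optional[str]:
    """Map breach-lookup counts to the severity anchors in the table above."""
    if employees >= 10:
        return "CRITICAL"
    if employees >= 1:
        return "HIGH"
    if end_users >= 1:
        return "MEDIUM"
    if domain_seen:
        return "INFO"
    return None  # nothing found: no finding to emit
```

Checking from the top down means a mix of employee and end-user hits always reports at the employee-driven severity.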
After Stage 3 has run identity-fabric mapping AND breach lookups have completed: for every discovered IdP tenant, intersect with breach corpus on the tenant's domain. Non-empty intersection → SSO_EXPOSURE finding, severity CRITICAL. Evidence: tenant ID + product + employee count + per-account source attribution.
Shodan, Censys, Onyphe, DNSDB.
crt.sh, SecurityTrails.
- __biz IDs. Expect link rot.

Hunchly, Kasm Workspaces, ArchiveBox, SingleFileZ.
When multiple OSINT tools (or modules) run, late-arriving outputs need to feed into earlier-running consumers. Three patterns:
- <scan>/mobile_endpoints.json or <scan>/secrets_sidecar.json; later modules read on start. No blocking.

In an ad-hoc engagement, a tmpdir, JSON sidecars, and a one-line manifest make operations composable.
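The sidecar handoff can be sketched with two small helpers. This is an illustration under assumptions: the helper names are hypothetical, the sidecar is a single JSON document, and an atomic rename is used so a consumer that starts mid-write never reads a partial file.

```python
import json
import os
import tempfile
from pathlib import Path


def write_sidecar(scan_dir: Path, name: str, payload) -> Path:
    """Write the sidecar atomically: temp file in the same dir, then rename."""
    scan_dir.mkdir(parents=True, exist_ok=True)
    target = scan_dir / name
    fd, tmp_path = tempfile.mkstemp(dir=str(scan_dir), suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(payload, handle)
    os.replace(tmp_path, target)  # atomic on the same filesystem
    return target


def read_sidecar(scan_dir: Path, name: str, default=None):
    """Non-blocking read: a missing sidecar yields the default, not an error."""
    path = scan_dir / name
    if not path.exists():
        return default
    with path.open(encoding="utf-8") as handle:
        return json.load(handle)
```

A consumer calls `read_sidecar(scan_dir, "mobile_endpoints.json", default=[])` at start-up and proceeds with whatever is there, which is what keeps module B from blocking on module A.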
Running a large dork corpus across multiple engines:
- (severity, category, confidence).
- run_id, every event one line, UTC timestamps, tool versions.
- run_id + tool versions + JSONL log + asset/findings DB.

Sensity AI, Hive Moderation, Reality Defender, Adobe Content Credentials Verify, CarNet (AI car-model identification for geolocation aid).
A non-exhaustive list of mistakes that come up often.
- whoami from a discovered API. Could be a honeypot.

Web targets are increasingly behind Cloudflare / Akamai / Fastly / AWS CloudFront. The CDN itself is hard to attack; the origin server is often softly defended. Six techniques to find it.
The target's domain may have pointed directly at the origin IP before the CDN was deployed. Query historical passive DNS:
- https://api.securitytrails.com/v1/history/<domain>/dns/a (paid; the most complete).

What to look for: an IP that resolved 1–5 years ago and is not in the current CDN's published IP ranges (Cloudflare / Akamai / etc.). Cross-check the IP's current banner; if it serves the same site without the CDN, you've got the origin.
Certificates often get re-issued with the same SAN list across origin and CDN. Search CT logs:
- ?q=%.<target.com>&exclude=expired: find certs with the target's domain in SAN.
- cero (go install github.com/glebarez/cero@latest) crawl IPs and pull certs; correlate IPs whose certs include the target's hostname.

If the target has a unique favicon, the origin server still serves it (CDNs proxy but don't strip the favicon). Compute the favicon's mmh3 hash and search:

- http.favicon.hash:<mmh3-hash>.
- services.http.response.favicons.hashes:<mmh3-hash>.

Returned IPs that are not in CDN ranges are origin candidates.
JARM (TLS handshake hash) works similarly: compute the target's JARM via jarm, search Shodan ssl.jarm:<jarm-hash>. Origin servers usually have a different JARM than CDNs.
If you have an origin candidate IP from steps 27.1–27.3:
curl -sk -H "Host: target.example.com" https://<candidate-IP>/
If the response matches the public site (same title, same body fingerprint) — you've found the origin. CDN-only IPs return generic CDN error pages or 403 to wrong Host.
Targets often forget to put auxiliary subdomains behind the CDN:
- mail.<target>: often points at the actual mail server, sometimes co-located with web origin.
- ftp.<target>, sftp.<target>: likewise.
- cpanel.<target>, whm.<target>, webmail.<target>: shared hosting controls; same IP as web origin.
- direct.<target>, origin.<target>, direct-connect.<target>, noproxy.<target>: ironic admin labels.
- dev.<target>, staging.<target>: dev environments often skip CDN.

Probe each. If any resolves to a non-CDN IP, that IP often hosts the prod origin too.
When the CDN throws an error (request triggers WAF, origin is down, configuration mismatch), it sometimes leaks the origin IP in the error body:
- cf-ray and sometimes the underlying upstream details.
- X-Cache: MISS from <origin-host> headers.
- X-Powered-By, Server, X-Backend-Server).

Send an email to a non-existent address at the target. The bounce often reveals origin mail server IPs in the Received: headers; these mail servers are sometimes on the same IP / netblock as the web origin. (Use a sock-puppet email; never your real engagement persona.)
When unsure, document the hypothesis in the asset attrs — don't claim origin discovery without ≥2 corroborating signals.
A Nuclei scan can return 100+ CVEs against a target. You can't validate all of them. Prioritize by exploitability.
For each CVE in your list, score:
| Signal | Weight |
|---|---|
| Listed in CISA KEV | +50 (proven exploited; treat as immediate) |
| EPSS score ≥ 0.7 | +30 |
| EPSS score 0.3–0.69 | +15 |
| Public Metasploit module exists | +25 |
| Public POC on ExploitDB / GitHub | +15 |
| Vendor-issued advisory + patch | +10 (means the vuln is real and patchable; not always exploitable) |
| Auth-required vs unauth-required | unauth +20, post-auth +0 |
| Attack vector (network / adjacent / local) | network +15, adjacent +5, local +0 |
| CVSS v3 base ≥ 9.0 | +15 |
Total score → priority tier:
| Score | Tier | Action |
|---|---|---|
| ≥ 100 | P0 | Immediate validation; surface in engagement summary now. |
| 70–99 | P1 | Validate this engagement; include in technical report. |
| 40–69 | P2 | Mention in technical report; validate if time permits. |
| < 40 | P3 | List in appendix; no validation expected. |
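The scoring and tiering above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the `CveSignals` field names are hypothetical, and it assumes the weights are additive (for example, a Metasploit module and a public POC both count), which the table implies but does not state.

```python
from dataclasses import dataclass


@dataclass
class CveSignals:
    """Hypothetical container for the per-CVE signals in the weight table."""
    in_kev: bool = False
    epss: float = 0.0
    has_metasploit_module: bool = False
    has_public_poc: bool = False
    has_vendor_advisory: bool = False
    unauthenticated: bool = False
    attack_vector: str = "local"  # "network", "adjacent", or "local"
    cvss_base: float = 0.0


VECTOR_WEIGHTS = {"network": 15, "adjacent": 5, "local": 0}


def score_cve(s: CveSignals) -> int:
    """Sum the weights from the table (assumed additive)."""
    total = 0
    if s.in_kev:
        total += 50
    if s.epss >= 0.7:
        total += 30
    elif s.epss >= 0.3:
        total += 15
    if s.has_metasploit_module:
        total += 25
    if s.has_public_poc:
        total += 15
    if s.has_vendor_advisory:
        total += 10
    if s.unauthenticated:
        total += 20
    total += VECTOR_WEIGHTS.get(s.attack_vector, 0)
    if s.cvss_base >= 9.0:
        total += 15
    return total


def tier(score: int) -> str:
    """Map a total score to the P0-P3 tiers from the tier table."""
    if score >= 100:
        return "P0"
    if score >= 70:
        return "P1"
    if score >= 40:
        return "P2"
    return "P3"
```

Sorting the Nuclei output by `score_cve` in descending order gives the validation queue; anything that lands in P0 goes into the engagement summary immediately.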
Many real-world findings (sourcemap exposure, open GraphQL introspection, public bucket) don't have a CVE. Rank them by their independent interest score instead (companion skill §20 for endpoints; §40 for severity-mapping examples). Don't gate on CVE availability.
Authorized red team engagements often include phishing. The OSINT side of phishing — building the phishing-feasibility shortlist and the pretext list — is in scope here. Crafting actual phishing payloads is out of scope (operational tradecraft, separate domain).
For an authorized engagement, the operator typically wants three lists:
A. Already-registered typosquats — these are findings (someone is squatting; client should know).
B. Available-for-registration typosquats — these are the operator's phishing-domain shortlist for the engagement.
C. Cert-SAN impersonation patterns — domains the operator could register that would make convincing certs (e.g., acme-secure.com, acme-login.com, acme-vpn-portal.com).
Generation pattern:
secure, login, vpn, portal, mail, helpdesk, it, account, verify, support, password, auth, sso).

If you found a takeover-able subdomain (companion skill §16.12), you can host phishing content on a subdomain of the actual target. This bypasses every brand-impersonation defense the user has.
Procedure:
1. <sub>.target.com CNAMEd to unclaimed <x>.herokuapp.com).
2. <x>.herokuapp.com on Heroku).
3. <sub>.target.com serves your content.

Use email security analysis (companion skill §16.14) to determine spoof feasibility:
| SPF policy | DMARC policy | Spoof feasibility |
|---|---|---|
| `~all` (softfail) or absent | `p=none` or absent | HIGH: direct spoof of <anything>@<target> likely lands. |
| `~all` | `p=quarantine` | MEDIUM: lands in spam folder, but lands. |
| `-all` (hardfail) | `p=quarantine` | LOW: most providers reject; some still deliver to spam. |
| `-all` | `p=reject` | VERY LOW: spoof rejected by major providers. Requires lookalike domain. |
If spoof is hard, fall back to lookalike (list B) or compromised-third-party (different engagement). Document the postural finding regardless.
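For the postural finding, the table lookup can be sketched as below. The function names are hypothetical, and the sketch takes the raw TXT record strings as input rather than doing DNS lookups. It deliberately returns "NOT IN TABLE" for combinations the table above does not cover (for example `-all` with `p=none`) instead of guessing a rating.

```python
from typing import Optional


def spf_policy(record: Optional[str]) -> str:
    """Return the SPF 'all' term (e.g. '~all', '-all'), 'absent', or 'other'."""
    if not record or not record.lower().startswith("v=spf1"):
        return "absent"
    for term in record.lower().split():
        if term in ("~all", "-all", "?all", "+all", "all"):
            return term
    return "other"  # e.g. a redirect= record with no explicit 'all' term


def dmarc_policy(record: Optional[str]) -> str:
    """Return the DMARC p= value ('none', 'quarantine', 'reject') or 'absent'."""
    if not record or not record.lower().startswith("v=dmarc1"):
        return "absent"
    for tag in record.split(";"):
        key, _, value = tag.strip().partition("=")
        if key.strip().lower() == "p":
            return value.strip().lower()
    return "absent"


# Rows of the feasibility table, keyed by (SPF class, DMARC class).
FEASIBILITY = {
    ("soft", "none"): "HIGH",
    ("soft", "quarantine"): "MEDIUM",
    ("hard", "quarantine"): "LOW",
    ("hard", "reject"): "VERY LOW",
}


def spoof_feasibility(spf_record: Optional[str], dmarc_record: Optional[str]) -> str:
    """Classify a domain's SPF/DMARC posture using only the table's rows."""
    spf = spf_policy(spf_record)
    dmarc = dmarc_policy(dmarc_record)
    spf_class = {"~all": "soft", "absent": "soft", "-all": "hard"}.get(spf)
    dmarc_class = "none" if dmarc in ("none", "absent") else dmarc
    return FEASIBILITY.get((spf_class, dmarc_class), "NOT IN TABLE")
```

Record the raw SPF and DMARC strings next to the rating in the finding so the client can verify the posture independently.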
Pretexts work when they tap a target's existing context. Build pretexts from harvested OSINT.
Pretext sources:
Per-role pretext templates (the ones operators use most):
Use this section when you find an issue on a bug-bounty target (HackerOne, Bugcrowd, Intigriti, YesWeHack) or on a non-program target where you choose to disclose responsibly.
| Platform | URL | Notes |
|---|---|---|
| HackerOne | hackerone.com | Largest; strong scope-tracking; CVSS-based reward calc. |
| Bugcrowd | bugcrowd.com | VRT (Vulnerability Rating Taxonomy) instead of CVSS for severity. |
| Intigriti | intigriti.com | EU-strong; flexible scope models. |
| YesWeHack | yeswehack.com | EU-headquartered; growing. |
| HackenProof | hackenproof.com | Crypto/blockchain-focused programs. |
| Open Bug Bounty | openbugbounty.org | Free for sites without official programs (non-intrusive findings only, e.g. XSS/CSRF disclosure). |
| security.txt | rfc9116 | Universal: every site should publish /.well-known/security.txt. |
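For the security.txt row, a minimal parsing sketch may help when deciding where to send a report. It assumes the file body has already been fetched as text. The function name is hypothetical, and the sketch does not verify a PGP signature or the `Expires` date, both of which RFC 9116 allows for.

```python
from typing import Dict, List


def parse_security_txt(text: str) -> Dict[str, List[str]]:
    """Collect 'Name: value' fields from a security.txt body.

    Field names are lowercased; repeated fields (e.g. several Contact
    lines) are kept in file order. Comments and blank lines are skipped.
    """
    fields: Dict[str, List[str]] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition(":")
        if not sep:
            continue  # not a field line (e.g. PGP armor)
        fields.setdefault(name.strip().lower(), []).append(value.strip())
    return fields
```

The first `contact` entry is the preferred channel, since RFC 9116 asks publishers to list contacts in order of preference.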
Title: [Severity] [Affected component] Brief description
Example: [HIGH] [api.acme.com] Unauthenticated SSRF via /v1/proxy
Summary
2-3 sentences explaining what was found and why it matters.
Steps to Reproduce
1. Numbered, copy-pasteable.
2. Include exact URLs, payloads, expected vs actual response.
3. Reproduce the issue from a fresh state where possible.
Proof of Concept
- Screenshot showing the vulnerability triggered.
- HTTP request/response (sanitize sensitive data; redact other users' data).
- Or short video/GIF for complex multi-step issues.
Impact
Quantify: what data is at risk, how many users, what business functions break.
Tie to the program's impact criteria where defined.
Severity (per program criteria)
- CVSS v3 vector: AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H
- Score: 9.8 (Critical)
- Justification: <1-2 sentence reasoning>
Remediation
- Concrete recommendation. "Validate the URL parameter against an allowlist before fetching."
- If a quick fix: WAF rule, header check.
- If a structural fix: refactor recommendation.
Affected component
- URL: https://api.acme.com/v1/proxy
- Date discovered: 2026-04-27 14:23 UTC
- Method: HTTP GET / POST / etc.
Exceptional / Critical / High / Medium / Low tiers.

If the target has no bug bounty program but you found a real vulnerability during authorized testing (e.g., a customer's external assessment surfacing a third-party vendor's bug):
- <target>/.well-known/security.txt: if present, follow its Contact: and Encryption: (PGP) instructions.
- security@<target> then abuse@<target>.
- aws-security@amazon.com. For exposed AWS keys: also notify the account owner if discoverable via WHOIS/contact.
- google-cloud-trust@google.com.
- https://msrc.microsoft.com/.
- security@github.com.
- abuse@ or security@ channel; npm specifically auto-revokes leaked tokens via their secret scanner.

Operator-facing artifacts (asset graph, JSONL log, finding DB) are not the same as client-facing artifacts (exec summary, technical report). Build deliverables intentionally.
ENGAGEMENT: <Client Name>
ASSESSMENT TYPE: External Attack Surface Assessment
ENGAGEMENT WINDOW: <start date> – <end date>
SCOPE: <one-line scope description, e.g., "All internet-facing assets of acme.com and its 3 brand domains">
LEAD: <your name / team>
----- KEY FINDINGS -----
1. [CRITICAL] <One-line title>
Business impact: <one sentence in business language>
Estimated remediation effort: <hours / days / weeks>
Recommended action: <verb + object, e.g., "Rotate the exposed AWS access key and audit CloudTrail">
2. [CRITICAL] <One-line title>
…
3. [HIGH] <One-line title>
…
(Top 3-5 findings only; full list in technical report)
----- POSTURAL OBSERVATIONS -----
- Email security (SPF/DMARC): <2-3 sentences on posture, e.g., "DMARC is set to p=none, allowing spoof of <target>.com email; tightening to p=reject would block external spoofing.">
- Identity fabric (SSO): <2-3 sentences>
- Cloud surface (S3/GCS/Azure): <2-3 sentences>
- Mobile attack surface: <2-3 sentences if applicable>
----- AGGREGATE METRICS -----
- Assets discovered: <N> (<breakdown>)
- Findings: <N CRIT, M HIGH, P MED, Q LOW, R INFO>
- Live credentials confirmed: <N>
- Detectability of our operations: <90% low / 8% medium / 2% high>
----- RECOMMENDED NEXT STEPS -----
1. Address P0 findings in next 7 days.
2. Address P1 findings in next 30 days.
3. Schedule re-test for: <date>.
4. Consider follow-on assessments: <if applicable, e.g., authenticated app testing, internal pentest>.
Each finding in the technical report uses this card:
═══════════════════════════════════════════════════════════
FINDING #<N>: <Title>
SEVERITY: <CRIT / HIGH / MED / LOW / INFO>
CONFIDENCE: <CONFIRMED / FIRM / TENTATIVE>
ASSET: <typed asset key>
DISCOVERED: <UTC timestamp>
═══════════════════════════════════════════════════════════
DESCRIPTION
<2-5 sentence technical explanation>
EVIDENCE
- URL: <where it was found>
- Tool: <how it was discovered>
- Screenshot: <attachment ref>
- Raw HTTP: <sanitized capture>
- Hash (SHA-256): <of any downloaded artifact>
REPRODUCTION
Step 1: <command or action>
Expected: <output>
Step 2: …
IMPACT
<Business-language impact statement>
Affected systems: <list>
Affected user populations: <if applicable>
REMEDIATION
Immediate (within hours):
- <action>
Short-term (within days):
- <action>
Long-term (within weeks):
- <action>
REFERENCES
- <CVE-ID, advisory URL, OWASP top-10 link, vendor doc>
ATTACK PATH HINT
<If applicable, the one-sentence hint from companion skill §39>
Engineers think in CVSS. Executives think in business outcomes. Translate.
| Technical finding | Business-language impact |
|---|---|
| Listable S3 bucket with PII | "Customer records publicly downloadable. Potential GDPR/CCPA notification trigger if accessed. Estimated cost of disclosure: 30-day notification + credit monitoring + legal review." |
| Exposed `.env` with DB credentials | "Database access to all customer data. Pivots to backups, billing systems, employee PII. If exploited: full data breach scope." |
| Live AWS access key with admin scope | "Full cloud account compromise. Attacker can spin up cryptominers, exfiltrate all data, lateral-move to connected accounts. If exploited: 6-figure cloud bill + complete environment rebuild." |
| Open GraphQL introspection on prod | "API attack surface fully mapped by attackers. Enables more precise follow-on attacks; not directly exploitable but attacker reconnaissance is now zero-effort." |
| Subdomain takeover possible | "Attackers can host content under your trusted domain. Phishing emails from this domain bypass brand-impersonation defenses; users will trust them." |
| Open Firebase Realtime Database | "Mobile app's backend database is publicly readable. All user data, possibly writable. If exploited: full data breach + potential service disruption." |
| Missing HSTS on /login | "Login pages can be downgraded to HTTP via active network attacks. Credentials potentially captured by anyone with network access (coffee shop, conference WiFi)." |
| DMARC `p=none` | "Anyone on the internet can send email appearing to be from your domain. Phishing campaigns become trivially convincing for both customers and employees." |
| ≥10 employees in breach corpus | "Stolen credentials for your staff are circulating; attackers can attempt these against your SSO. Even if SSO has MFA, password reuse against other services puts those at risk." |
| `android:debuggable=true` | "Mobile app can be reverse-engineered and modified by anyone. Trust boundary between app and server is undermined; backend assumes app integrity that doesn't exist." |
| Vendor product (Citrix/F5/Pulse) version with KEV CVE | "Network appliance has a known-exploited vulnerability. Attackers are actively scanning the internet for this exact issue. Patch immediately." |
Deliver alongside the report:
<engagement-id>-reproduction-package.zip
├── README.md # how to use the package
├── engagement-metadata.json # client, dates, scope, lead
├── tools-used.txt # tool name + version, one per line
├── run-log.jsonl # every event during engagement
├── assets.db # SQLite of all discovered assets
├── findings.db # SQLite of all findings
├── evidence/
│ ├── screenshots/ # PNG, named by finding-id
│ ├── http/ # raw HTTP captures (sanitized)
│ ├── downloads/ # any binary artifacts (with .sha256 alongside)
│ └── code/ # any extracted source (sanitized)
├── re-test-script.sh # reruns probes for the CRIT/HIGH findings
└── disclosure/ # if applicable: bounty submissions, vendor notifications
The package is the source of truth — the report is the human-readable view. Anyone with the package can reproduce the engagement and verify findings.
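The `.sha256` files that sit alongside artifacts in `evidence/downloads/` can be generated with a short script. This is a sketch under assumptions: the function name is hypothetical, and it writes one sidecar per file in the `<hex digest>  <filename>` layout that `sha256sum -c` accepts.

```python
import hashlib
from pathlib import Path
from typing import List


def write_sha256_sidecars(directory: Path) -> List[Path]:
    """Write '<name>.sha256' next to every regular file in `directory`."""
    written = []
    for path in sorted(directory.iterdir()):
        # Skip subdirectories and sidecars produced by an earlier run.
        if not path.is_file() or path.suffix == ".sha256":
            continue
        digest = hashlib.sha256()
        with path.open("rb") as fh:
            # Read in chunks so large binaries are not loaded into memory.
            for chunk in iter(lambda: fh.read(65536), b""):
                digest.update(chunk)
        sidecar = path.with_name(path.name + ".sha256")
        sidecar.write_text(f"{digest.hexdigest()}  {path.name}\n")
        written.append(sidecar)
    return written
```

Run it once when the evidence directory is frozen, before zipping the package, so the hashes match what the client receives.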
Drop these prompts into a fresh Claude session to verify the skill loads and behaves correctly. Pass criteria: expected sections referenced, no hallucinated content, scope-check invoked when needed.
.cn domain to its operating company?" → §20.4 + companion skill §14.2.