Document Object Model-based Cross-site Scripting (DOM-based XSS) is a lesser-known form of XSS. It differs from reflected and stored XSS because the exploit happens entirely on the client side and does not conceptually require a server-side vulnerability.
What is the DOM?
The Document Object Model (DOM) is the data representation of the objects that comprise the structure and content of a document on the web.
In the following examples, the source of the data is the hash component of the URL.
In modern browsers, DOM-based XSS has lost some of its relevance because the most obvious data sources (URL, referrer) are automatically escaped by the browser. For example, the hash component of the URL is automatically URL-encoded.
However, if a web page does something less trivial than simply reading a URL and inserting it into a document, it might still be possible to trick scripts into interpreting the content in an unsafe way. For example, a web application might semantically extract and decode data out of the URL and use that.
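As a rough illustration of the point above, the sketch below uses the standard URL class (available in browsers and Node.js) to show a fragment being percent-encoded when it is parsed, and decodeURIComponent undoing that encoding. The address http://example.com/ and the variable names are assumptions made for the example, and older browsers may encode fewer characters.

```javascript
// Hypothetical address; the fragment contains raw HTML markup.
const parsed = new URL('http://example.com/#<b>hi</b>');

// The URL parser percent-encodes "<" and ">" in the fragment.
const rawHash = parsed.hash;

// A script that decodes the fragment gets the original markup back.
const decodedHash = decodeURIComponent(rawHash.substring(1));

console.log(rawHash);
console.log(decodedHash);
```

The encoded form is harmless when written to the page, but the decoded form is markup again, which is what makes the decoding step risky.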
Let’s say we have the following script snippet:
<script>
  var hashData = document.location.hash.substring(1);
  var data = decodeURIComponent(hashData);
  document.write(data);
</script>
The script takes everything after the # character in the URI, decodes it, and writes the result to the document.
In that case, we can URL-encode any payload and have it executed by placing it in the hash part of the URI. For the payload, a simple alert call works. The final URI is the address of the vulnerable page, followed by # and the encoded payload.
If anyone clicks on the link and the vulnerable script snippet is in that web page, an alert dialog will pop up in their browser.
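The encoding step described above can be sketched as follows. The payload string and the page address http://example.com/page.html are assumptions chosen for illustration, not values taken from a real site, and the last step only mimics what the vulnerable snippet would compute.

```javascript
// Assumed payload and page address, chosen only for illustration.
const payload = '<script>alert(1)</script>';
const pageUrl = 'http://example.com/page.html';

// Percent-encode the payload and append it as the hash fragment.
const encodedPayload = encodeURIComponent(payload);
const attackUri = pageUrl + '#' + encodedPayload;

// What the vulnerable snippet would compute from this URI.
const recovered = decodeURIComponent(new URL(attackUri).hash.substring(1));

console.log(attackUri);
console.log(recovered === payload); // true
```

Because the fragment is already percent-encoded, the browser leaves it alone, and the page's own decodeURIComponent call restores the markup.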
As with any XSS, it is important to escape or whitelist all untrusted data.
When inserting untrusted data into the HTML body, it is important to HTML-encode the data. An encoding library should be used for this, such as he in the following snippet:
<script src="he.js"></script>
<script>
  var hashData = document.location.hash.substring(1);
  var data = he.encode(decodeURIComponent(hashData));
  document.write(data);
</script>
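To illustrate what the encoding step does, here is a minimal sketch of an HTML encoder. The escapeHtml function is a hypothetical helper written for this example; it is not the he library and it covers only the five characters that matter most in HTML, so a maintained library remains the better choice in practice.

```javascript
// Hypothetical minimal encoder; a real project should use a maintained library.
function escapeHtml(text) {
  const entities = {
    '&': '&amp;',
    '<': '&lt;',
    '>': '&gt;',
    '"': '&quot;',
    "'": '&#x27;'
  };
  // Replace each special character with its HTML entity.
  return String(text).replace(/[&<>"']/g, (ch) => entities[ch]);
}

const encoded = escapeHtml('<script>alert(1)</script>');
console.log(encoded); // &lt;script&gt;alert(1)&lt;/script&gt;
```

Once encoded, the browser renders the payload as visible text instead of parsing it as a script element.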
For other languages, there are other encoders. For Java, you can use the OWASP Java Encoder.
Modern frontend frameworks
Most modern frontend frameworks (e.g. Angular, Vue, React) have these mitigations built in. It’s still possible to introduce XSS, but the APIs that allow it are usually named to signal the danger. For example, dangerouslySetInnerHTML is a React prop that inserts raw HTML and therefore allows XSS.
Using these frameworks therefore mitigates many XSS vectors.
Note that when you add encoded data to an HTML attribute, it’s important for the encoded value to be between quotes. The following script snippet is vulnerable:
document.write('<input type="text" value=' + he.encode(data) + ' />');
If the user inputs a hash fragment that contains a space followed by an attribute definition, then another attribute will be added to the input HTML element, because the encoder leaves spaces untouched.
Therefore it is imperative to make sure that all attribute values built from user input are enclosed in double quotes.
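The difference between the unquoted and quoted forms can be sketched as below. The escapeHtmlAttr helper is a hypothetical stand-in for an encoding library, and the input strings are assumptions chosen to show the effect; only the string building is shown, not the document.write call.

```javascript
// Hypothetical stand-in for an encoding library.
function escapeHtmlAttr(text) {
  const entities = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#x27;' };
  return String(text).replace(/[&<>"']/g, (ch) => entities[ch]);
}

// Assumed input: nothing here needs encoding, yet it contains a space.
const attrData = 'x onmouseover=alert(1)';

// Unquoted: the space ends the value and starts a new attribute.
const unquoted = '<input type="text" value=' + escapeHtmlAttr(attrData) + ' />';

// Quoted: the whole string stays inside the value attribute.
const quoted = '<input type="text" value="' + escapeHtmlAttr(attrData) + '" />';

// A double quote in the input is encoded, so it cannot close the attribute.
const withQuote = '<input type="text" value="' + escapeHtmlAttr('x" onmouseover="alert(1)') + '" />';

console.log(unquoted);
console.log(quoted);
console.log(withQuote);
```

Quoting and encoding work together here: the quotes delimit the value, and the encoder keeps the input from closing them.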
Although escaping is usually preferred, it is also possible to use a regular expression to whitelist untrusted data. The problem with this approach is that it is easy to make mistakes.
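As a sketch of both the approach and one common mistake, the example below checks a value against a whitelist pattern. The rule itself (1 to 32 letters, digits, underscores or hyphens) and the names STRICT, LOOSE and isAllowed are assumptions made for illustration; a real application would pick a pattern that fits its own data.

```javascript
// Assumed rule for the example: 1 to 32 letters, digits, underscores or hyphens.
const STRICT = /^[A-Za-z0-9_-]{1,32}$/;

// Easy mistake: without ^ and $ the pattern matches any substring.
const LOOSE = /[A-Za-z0-9_-]{1,32}/;

// Accept the value only if the whole string matches the whitelist.
function isAllowed(value) {
  return STRICT.test(value);
}

console.log(isAllowed('user_42')); // true
console.log(isAllowed('<script>alert(1)</script>')); // false
console.log(LOOSE.test('<script>alert(1)</script>')); // true
```

The unanchored pattern accepts the payload because the word "script" alone satisfies it, which is the kind of slip that makes this approach error-prone.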
XSS comes in many shapes and sizes. The important thing is to be wary of all user-controlled data, even if it comes from an inconspicuous place, like the hash fragment. Any unescaped data can easily become an XSS vulnerability.