{"url":"http://public2.vulnerablecode.io/api/packages/374771?format=json","purl":"pkg:pypi/justhtml@1.12.0","type":"pypi","namespace":"","name":"justhtml","version":"1.12.0","qualifiers":{},"subpath":"","is_vulnerable":true,"next_non_vulnerable_version":"1.14.0","latest_non_vulnerable_version":"1.18.0","affected_by_vulnerabilities":[{"url":"http://public2.vulnerablecode.io/api/vulnerabilities/360022?format=json","vulnerability_id":"VCID-72kf-cb91-dkcy","summary":"JustHTML is vulnerable to XSS via code fence breakout in <pre> content\n## Summary\n\n`to_markdown()` is vulnerable when serializing attacker-controlled `<pre>` content. The `<pre>` handler emits a fixed three-backtick fenced code block, but writes decoded text content into that fence without choosing a delimiter longer than any backtick run inside the content.\n\nAn attacker can place backticks and HTML-like text inside a sanitized `<pre>` element so that the generated Markdown closes the fence early and leaves raw HTML outside the code block. When that Markdown is rendered by a CommonMark/GFM-style renderer that allows raw HTML, the HTML executes.\n\nThis is a bypass of the v1.12.0 Markdown hardening. That fix escaped HTML-significant characters for regular text nodes, but `<pre>` uses a separate serialization path and does not apply the same protection.\n\n## Details\n\nThe vulnerable `<pre>` Markdown path:\n\n- extracts decoded text from the `<pre>` subtree\n- opens a fenced block with a fixed delimiter of `` ``` ``\n- writes the decoded text directly into the output\n- closes with another fixed `` ``` ``\n\nBecause the fence length is fixed, attacker-controlled content containing a backtick run of length 3 or more can terminate the code block. 
If the content also contains decoded HTML-like text such as `&lt;img ...&gt;`, that text appears outside the fence in the resulting Markdown and is treated as raw HTML by downstream Markdown renderers.\n\nThe issue is not that HTML-like text appears inside code blocks. The issue is that the serializer allows attacker-controlled `<pre>` text to break out of the fixed fence.\n\n## Reproduction\n\n```python\nfrom justhtml import JustHTML\n\npayload = \"<pre>&#96;&#96;&#96;\\n&lt;img src=x onerror=alert(1)&gt;</pre>\"\ndoc = JustHTML(payload, fragment=True)  # default sanitize=True\n\nprint(doc.to_html(pretty=False))\n# <pre>```\n# &lt;img src=x onerror=alert(1)&gt;</pre>\n\nprint(doc.to_markdown())\n# ```\n# ```\n# <img src=x onerror=alert(1)>\n# ```\n\n```\n\nRendered as CommonMark/GFM-style Markdown, that output is interpreted as:\n\n1. Line 1 opens a fenced code block\n2. Line 2 closes it\n3. Line 3 is raw HTML outside the fence\n4. Line 4 opens a new fence\n\n## Impact\n\nApplications that treat `JustHTML(..., sanitize=True).to_markdown()` output as safe for direct rendering in Markdown contexts may be exposed to XSS, depending on the downstream Markdown renderer's raw-HTML handling.\n\n## Root Cause\n\nThe `<pre>` Markdown serializer uses a fixed fence instead of selecting a delimiter longer than the longest backtick run in the content.\n\n## Fix\n\nWhen serializing `<pre>` content to Markdown, choose a fence length longer than any backtick run present in the code block content, with a minimum length of 
3.","references":[{"reference_url":"https://github.com/EmilStenstrom/justhtml","reference_id":"","reference_type":"","scores":[{"value":"7.1","scoring_system":"cvssv4","scoring_elements":"CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:P/VC:N/VI:H/VA:N/SC:N/SI:N/SA:N"},{"value":"HIGH","scoring_system":"generic_textual","scoring_elements":""}],"url":"https://github.com/EmilStenstrom/justhtml"},{"reference_url":"https://github.com/EmilStenstrom/justhtml/commit/f35f8f723c713bd8f912d86e9ec6881275ff5af9","reference_id":"","reference_type":"","scores":[{"value":"7.1","scoring_system":"cvssv4","scoring_elements":"CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:P/VC:N/VI:H/VA:N/SC:N/SI:N/SA:N"},{"value":"HIGH","scoring_system":"generic_textual","scoring_elements":""}],"url":"https://github.com/EmilStenstrom/justhtml/commit/f35f8f723c713bd8f912d86e9ec6881275ff5af9"},{"reference_url":"https://github.com/EmilStenstrom/justhtml/releases/tag/v1.13.0","reference_id":"","reference_type":"","scores":[{"value":"7.1","scoring_system":"cvssv4","scoring_elements":"CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:P/VC:N/VI:H/VA:N/SC:N/SI:N/SA:N"},{"value":"HIGH","scoring_system":"generic_textual","scoring_elements":""}],"url":"https://github.com/EmilStenstrom/justhtml/releases/tag/v1.13.0"},{"reference_url":"https://github.com/EmilStenstrom/justhtml/security/advisories/GHSA-5vp3-3cg6-2rq3","reference_id":"","reference_type":"","scores":[{"value":"7.1","scoring_system":"cvssv4","scoring_elements":"CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:P/VC:N/VI:H/VA:N/SC:N/SI:N/SA:N"},{"value":"HIGH","scoring_system":"generic_textual","scoring_elements":""}],"url":"https://github.com/EmilStenstrom/justhtml/security/advisories/GHSA-5vp3-3cg6-2rq3"},{"reference_url":"https://github.com/advisories/GHSA-5vp3-3cg6-2rq3","reference_id":"GHSA-5vp3-3cg6-2rq3","reference_type":"","scores":[],"url":"https://github.com/advisories/GHSA-5vp3-3cg6-2rq3"}],"fixed_packages":[{"url":"http://public2.vulnerablecode.io/api/packages/374611?format=json","purl":"pkg:pypi/justhtml@
1.13.0","is_vulnerable":true,"affected_by_vulnerabilities":[{"vulnerability":"VCID-kg61-21wu-kyfd"}],"resource_url":"http://public2.vulnerablecode.io/packages/pkg:pypi/justhtml@1.13.0"}],"aliases":["GHSA-5vp3-3cg6-2rq3"],"risk_score":null,"exploitability":null,"weighted_severity":null,"resource_url":"http://public2.vulnerablecode.io/vulnerabilities/VCID-72kf-cb91-dkcy"}],"fixing_vulnerabilities":[{"url":"http://public2.vulnerablecode.io/api/vulnerabilities/360127?format=json","vulnerability_id":"VCID-8k68-7fmz-duhz","summary":"JustHTML Affected by Mutation XSS via Literal Text Serialization in Raw Text Elements (style/script)\n## Summary\n\nSanitized DOM trees can be unsafe to serialize when a custom policy allows raw-text elements such as `<style>` or `<script>`.\n\nThe issue affects DOM trees that are constructed or modified programmatically and then passed through `sanitize_dom()` with a policy that keeps these elements. Text nodes inside `<style>` and `<script>` are serialized literally, so attacker-controlled text containing the matching closing tag sequence can break out of the raw-text context and inject HTML into the serialized output.\n\nThe default sanitization policy is not affected because it drops the contents of `style` and `script`.\n\n## Details\n\nThe root cause is in HTML serialization of raw-text elements. 
In serialize.py, text children of `script` and `style` are emitted verbatim:\n\n```python\n_LITERAL_TEXT_SERIALIZATION_ELEMENTS = frozenset({\"script\", \"style\"})\n\ndef _serialize_text_for_parent(text: str | None, parent_name: str | None) -> str:\n    if not text:\n        return \"\"\n    if parent_name in _LITERAL_TEXT_SERIALIZATION_ELEMENTS:\n        return text\n    return _escape_text(text)","references":[{"reference_url":"https://github.com/EmilStenstrom/justhtml","reference_id":"","reference_type":"","scores":[{"value":"5.3","scoring_system":"cvssv4","scoring_elements":"CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:P/VC:N/VI:N/VA:N/SC:L/SI:L/SA:N"},{"value":"MODERATE","scoring_system":"generic_textual","scoring_elements":""}],"url":"https://github.com/EmilStenstrom/justhtml"},{"reference_url":"https://github.com/EmilStenstrom/justhtml/security/advisories/GHSA-qvc2-mg72-jjhx","reference_id":"","reference_type":"","scores":[{"value":"5.3","scoring_system":"cvssv4","scoring_elements":"CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:P/VC:N/VI:N/VA:N/SC:L/SI:L/SA:N"},{"value":"MODERATE","scoring_system":"generic_textual","scoring_elements":""}],"url":"https://github.com/EmilStenstrom/justhtml/security/advisories/GHSA-qvc2-mg72-jjhx"},{"reference_url":"https://github.com/advisories/GHSA-qvc2-mg72-jjhx","reference_id":"GHSA-qvc2-mg72-jjhx","reference_type":"","scores":[],"url":"https://github.com/advisories/GHSA-qvc2-mg72-jjhx"}],"fixed_packages":[{"url":"http://public2.vulnerablecode.io/api/packages/374771?format=json","purl":"pkg:pypi/justhtml@1.12.0","is_vulnerable":true,"affected_by_vulnerabilities":[{"vulnerability":"VCID-72kf-cb91-dkcy"}],"resource_url":"http://public2.vulnerablecode.io/packages/pkg:pypi/justhtml@1.12.0"}],"aliases":["GHSA-qvc2-mg72-jjhx"],"risk_score":null,"exploitability":null,"weighted_severity":null,"resource_url":"http://public2.vulnerablecode.io/vulnerabilities/VCID-8k68-7fmz-duhz"},{"url":"http://public2.vulnerablecode.io/api/vulnerabilities/360049?format=json","
vulnerability_id":"VCID-vjy5-thkw-1kaq","summary":"JustHTML has a Sanitizer Bypass (in Markdown)\n## Summary\n\n`to_markdown()` does not sufficiently escape text content that looks like HTML. As a result, untrusted input that is safe in `to_html()` can become raw HTML in Markdown output.\n\nThis is not specific to tokenizer raw-text states like `<title>`, `<noscript>`, or `<plaintext>`, although those states can trigger the behavior. The root cause is broader: Markdown text serialization leaves angle brackets unescaped in text nodes.\n\n## Details\n\nWhen converting a parsed document to Markdown, text nodes are escaped for a small set of Markdown metacharacters, but HTML-significant characters such as `<` and `>` are preserved. That means content parsed as text, including entity-decoded text or text produced by RCDATA/RAWTEXT-style parsing, can be emitted into Markdown as raw HTML.\n\nExamples of affected input include:\n\n- Text produced from entity-decoded input such as `&lt;script&gt;...&lt;/script&gt;`\n- Text inside elements like `<title>`, `<textarea>`, `<noscript>` (when parsed as raw text), and `<plaintext>`\n\nThis is distinct from actual `<script>` or `<style>` elements in the DOM. 
Those are already dropped by default in `to_markdown()` unless `html_passthrough=True`.\n\n## Proof of Concept\n\n### General case\n\n```python\nfrom justhtml import JustHTML\n\ndoc = JustHTML(\"<p>&lt;img src=x onerror=alert(1)&gt;</p>\", fragment=True)\n\nprint(doc.to_html())\nprint()\nprint(doc.to_markdown())","references":[{"reference_url":"https://github.com/EmilStenstrom/justhtml","reference_id":"","reference_type":"","scores":[{"value":"5.3","scoring_system":"cvssv4","scoring_elements":"CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:P/VC:N/VI:N/VA:N/SC:L/SI:L/SA:N"},{"value":"MODERATE","scoring_system":"generic_textual","scoring_elements":""}],"url":"https://github.com/EmilStenstrom/justhtml"},{"reference_url":"https://github.com/EmilStenstrom/justhtml/security/advisories/GHSA-3rcm-vjrc-p45j","reference_id":"","reference_type":"","scores":[{"value":"5.3","scoring_system":"cvssv4","scoring_elements":"CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:P/VC:N/VI:N/VA:N/SC:L/SI:L/SA:N"},{"value":"MODERATE","scoring_system":"generic_textual","scoring_elements":""}],"url":"https://github.com/EmilStenstrom/justhtml/security/advisories/GHSA-3rcm-vjrc-p45j"},{"reference_url":"https://github.com/advisories/GHSA-3rcm-vjrc-p45j","reference_id":"GHSA-3rcm-vjrc-p45j","reference_type":"","scores":[],"url":"https://github.com/advisories/GHSA-3rcm-vjrc-p45j"}],"fixed_packages":[{"url":"http://public2.vulnerablecode.io/api/packages/374771?format=json","purl":"pkg:pypi/justhtml@1.12.0","is_vulnerable":true,"affected_by_vulnerabilities":[{"vulnerability":"VCID-72kf-cb91-dkcy"}],"resource_url":"http://public2.vulnerablecode.io/packages/pkg:pypi/justhtml@1.12.0"}],"aliases":["GHSA-3rcm-vjrc-p45j"],"risk_score":null,"exploitability":null,"weighted_severity":null,"resource_url":"http://public2.vulnerablecode.io/vulnerabilities/VCID-vjy5-thkw-1kaq"}],"risk_score":null,"resource_url":"http://public2.vulnerablecode.io/packages/pkg:pypi/justhtml@1.12.0"}