| purl | pkg:pypi/justhtml@1.13.0 |
| Vulnerability | Summary | Fixed by |
|---|---|---|
|
VCID-7efv-ez9t-cyh9
Aliases: GHSA-4p64-v8f5-r2gx |
Multiple security fixes in justhtml

## Summary

`justhtml` `1.16.0` fixes multiple security issues in sanitization, serialization, and programmatic DOM handling. Most of these issues affected one of these advanced paths rather than ordinary parsed HTML with the default safe settings:

- programmatic DOM input to `sanitize()` or `sanitize_dom()`
- reused or mutated sanitization policy objects
- custom policies that preserve foreign namespaces such as SVG or MathML

## Affected versions

- `justhtml` `<= 1.15.0`

## Fixed version

- `justhtml` `1.16.0` released on April 12, 2026

## Impact

### Policy reuse and mutation

Nested mutation of sanitization policy internals could weaken later sanitization by leaving stale compiled sanitizers active, or by mutating exported default policy internals process-wide.

### In-memory sanitization gaps

Programmatic DOM sanitization could miss dangerous mixed-case tag names such as `ScRiPt` or `StYlE`, and custom `drop_content_tags` values such as `{"SCRIPT"}` could silently fail to drop dangerous subtrees.

### Serialization injection

Crafted programmatic doctype names could serialize into active markup before the document body.

### Foreign-namespace policy bypasses

Custom policies that preserve SVG or MathML could allow active SVG features to survive sanitization, including:

- animation elements such as `<set>` and `<animate>` that mutate already-sanitized attributes after sanitization
- presentation attributes such as `fill`, `clip-path`, `mask`, `marker-start`, and `cursor` containing external `url(...)` references
- programmatic DOM trees that claim `namespace="html"` but serialize as `<svg>` or `<math>`, bypassing foreign-content checks

### Rawtext hardening gap

Mixed-case programmatic `style` or `script` nodes could bypass rawtext hardening and preserve active stylesheet content such as remote `@import` rules.

## Default configuration

Most of these issues did **not** affect the normal `JustHTML(..., sanitize=True)` path for ordinary parsed HTML. The main exceptions were policy-mutation issues, which could weaken later sanitization if code mutated nested state on reused policy objects or exported defaults.

## Recommended action

Upgrade to `justhtml` `1.16.0`. If you cannot upgrade immediately:

- do not mutate `DEFAULT_POLICY`, `DEFAULT_DOCUMENT_POLICY`, or nested policy internals
- avoid reusing policy objects after mutating nested state
- avoid preserving SVG or MathML for untrusted input
- avoid preserving `style` or `script` in custom policies for untrusted input
- avoid serializing untrusted programmatic doctypes or DOM trees

## Credit

Discovered during an internal security review of `justhtml`. |
Affected by 2 other vulnerabilities. |
|
VCID-j2rc-dcnp-n7eh
Aliases: GHSA-c9vm-hv86-f23r |
justhtml includes multiple security fixes

## Summary

`justhtml` `1.15.0` includes multiple security fixes affecting URL sanitization helpers, HTML serialization, Markdown passthrough, and several custom sanitization-policy edge cases. These issues have different impact levels and do not all affect the default configuration in the same way.

## Affected versions

- `justhtml` `<= 1.14.0`

## Fixed version

- `justhtml` `1.15.0` released on April 9, 2026

## Impact overview

### Helper and serialization issues

These issues could affect applications using JustHTML helpers or programmatic DOM construction, even outside the default HTML sanitization path.

- `JustHTML.clean_url_value(...)` and `clean_url_in_js_string(...)` could accept URL values such as `javascript:...`, which became active `javascript:` URLs after HTML attribute parsing.
- URL sanitization could treat values like `\\evil.example/x` or `/\\evil.example/x` as safe relative URLs even though browsers could resolve them as remote requests.
- Malformed bracketed hosts such as `https://[evil.example]/x` could raise exceptions and crash sanitization when host allowlists were used.
- Programmatic element or attribute names containing markup-breaking characters could be serialized into active HTML.
- Programmatic HTML comments containing `-->` could break out of the comment and inject live markup.

### Markdown passthrough issue

- `to_markdown(html_passthrough=True)` could reintroduce active HTML from sanitized `<textarea>` content by emitting a raw closing `</textarea>` sequence.

### Custom policy issues

These issues affected custom policies more than the default safe configuration.

- `a[ping]` was handled as a single URL even though browsers interpret it as a space-separated URL list.
- `attributionsrc` was not treated as URL-bearing and could preserve attacker-controlled reporting endpoints.
- `link[imagesrcset]` was not treated as URL-bearing and could preserve attacker-controlled image candidates.
- Preserved `<meta http-equiv="refresh">` tags could keep redirect targets without URL-policy enforcement.
- Preserved `<base href>` tags could rewrite how later relative URLs resolved in the browser.
- Preserved `<style>` blocks could keep resource-loading CSS such as `@import`, `url(...)`, or `image-set(...)`.
- Mixed-case attribute names in custom transform pipelines could bypass or confuse security-related transforms such as `DropAttrs(...)`, `DropUrlAttrs(...)`, `AllowStyleAttrs(...)`, and `MergeAttrs(...)`.

## Default configuration

Most of the custom-policy issues above did **not** affect the default `JustHTML(..., sanitize=True)` behavior. The main exceptions were:

- helper APIs such as `clean_url_value(...)`
- programmatic DOM / serializer usage
- applications explicitly using `html_passthrough=True`
- applications using custom policies or custom transform pipelines

## Recommended action

Upgrade to `justhtml` `1.15.0`. If you cannot upgrade immediately:

- avoid `html_passthrough=True` for untrusted content
- avoid preserving `<style>`, `<meta http-equiv="refresh">`, and `<base href>` in custom policies
- avoid allowing `ping`, `attributionsrc`, or `imagesrcset` unless you explicitly validate them
- avoid serializing untrusted programmatic node names, attribute names, or comment data |
Affected by 3 other vulnerabilities. |
|
VCID-kg61-21wu-kyfd
Aliases: GHSA-r758-8hxw-4845 |
justhtml: Mutation XSS with custom foreign-namespace sanitization policies

## Summary

A parser-differential / mutation XSS issue was found in `justhtml` when using a **custom sanitization policy** that preserves foreign namespaces such as SVG or MathML. Under these custom settings, specially crafted input could sanitize into HTML that looked safe at first, but became unsafe when parsed again by a browser or another HTML parser.

## Impact

This issue does **not** affect the default safe configuration. You may be affected if you use a custom `SanitizationPolicy` with settings like:

- `drop_foreign_namespaces=False`
- allowlisted foreign elements such as MathML or SVG
- allowlisted raw-text containers such as `<style>`

In that case, an attacker could inject markup that survives sanitization and turns into active HTML after re-parsing.

## Affected versions

- `justhtml` `<= 1.13.0`

## Fixed version

- Fixed in `1.14.0`

## Workarounds

Until you upgrade:

- keep `drop_foreign_namespaces=True`
- avoid allowlisting foreign namespaces for untrusted input
- avoid allowlisting raw-text containers such as `<style>` in custom policies

## Notes

The default `JustHTML(..., sanitize=True)` behavior was not found to be vulnerable in this issue.

## Credit

Discovered by the JustHTML author during an LLM-based security review of `justhtml`. |
Affected by 4 other vulnerabilities. |
|
VCID-pe3n-8tcx-5bb5
Aliases: GHSA-vrx2-77f2-ww34 |
justhtml has sanitization bypass in custom policies and programmatic DOM

## Summary

`justhtml` `1.17.0` fixes multiple security issues in sanitization, serialization, and programmatic DOM handling. Most of these issues affected advanced or custom configurations rather than the default safe path.

## Affected versions

- `justhtml` `<= 1.16.0`

## Fixed version

- `justhtml` `1.17.0` released on April 19, 2026

## Impact

### Custom SVG / MathML sanitization policies

Custom policies that preserved foreign namespaces could allow dangerous content to survive sanitization, including:

- active HTML integration points such as SVG `<foreignObject>`, MathML `<annotation-xml encoding="text/html">`, SVG `<title>` / `<desc>`, and MathML text integration points
- mutation-XSS parser-differential payloads that looked inert in memory but became active HTML after reparse
- SVG `filter="url(...)"` attributes that could trigger external fetches

These issues affected:

- `JustHTML(..., sanitize=True)` with custom foreign-namespace policies
- `sanitize()` / `sanitize_dom()`
- low-level terminal `Sanitize(...)` transform execution

### Preserved `<style>` handling

Constructor-time sanitization and explicit `Sanitize(...)` transforms did not fully match `sanitize()` / `sanitize_dom()` when custom policies preserved `<style>`. That could leave resource-loading CSS such as `@import` or `background-image:url(...)` in sanitized output from HTML string input.

### Programmatic DOM serialization

Programmatic `script`, `style`, and `Comment(...)` nodes could still serialize into active markup in some edge cases. This could affect applications that build or mutate DOM trees directly before calling `to_html()` or `to_markdown(html_passthrough=True)`.

### Cache mutation and DOM cycle handling

Two lower-severity hardening fixes were included:

- compiled sanitize-pipeline caches could be mutated after warming and weaken later sanitization
- parent/child cycles in programmatic DOM trees could cause infinite loops in operations such as `to_html()` and `sanitize_dom()`

## Default configuration

Most of the issues above did **not** affect ordinary parsed HTML with the default `JustHTML(..., sanitize=True)` configuration. The main risk areas were:

- custom policies that preserve SVG or MathML
- custom policies that preserve `<style>`
- programmatic DOM construction or mutation
- low-level direct sanitizer/transform APIs

## Recommended action

Upgrade to `justhtml` `1.17.0`. If users cannot upgrade immediately:

- avoid preserving SVG or MathML for untrusted input
- avoid preserving `<style>` for untrusted input
- avoid mutating programmatic DOM trees with untrusted `script`, `style`, or comment content
- avoid mutating warmed policy internals or sanitizer caches

## Credit

Discovered during an internal security review of `justhtml`. |
Affected by 1 other vulnerability. |
|
VCID-ze6z-2zm7-rud9
Aliases: GHSA-r8cj-3554-33mr |
justhtml introduces denial-of-service hardening

## Summary

`justhtml` `1.18.0` fixes multiple low-severity denial-of-service hardening issues in CSS selector handling and linkification. These issues are availability concerns. They do not allow script execution, data disclosure, or sanitizer bypass by themselves.

## Affected versions

- `justhtml` `< 1.18.0`

## Fixed version

- `justhtml` `1.18.0` released on May 4, 2026

## Impact

### CSS selector handling

Applications that evaluate attacker-controlled selector strings, or that run selector-based transform pipelines over attacker-controlled documents, could consume disproportionate CPU or memory.

The affected selector patterns included oversized selectors, large selector lists, oversized compound selectors, long combinator chains, deeply nested functional pseudo-classes such as `:not(...)`, repeated attribute/class token matching over large values, repeated sibling or ancestor scans, repeated positional pseudo-class work, and `:contains(...)` over large descendant text.

Programmatically constructed malformed DOM graphs could also trigger non-terminating or duplicate traversal in some selector paths, including cyclic/shared child graphs, cyclic parent chains, and cyclic text traversal for `:contains(...)`.

### Linkification

Attacker-controlled text containing punctuation-heavy input or URL candidates ending in long runs of unmatched closing brackets could cause repeated rescanning and consume disproportionate CPU when linkification was enabled.

## Default configuration

Ordinary sanitization of parsed HTML with the default `JustHTML(..., sanitize=True)` configuration is not expected to expose untrusted users to selector injection, because selectors are normally supplied by application code.

The main risk areas are:

- applications that accept selector strings from untrusted users and pass them to `query(...)`, `matches(...)`, or selector-based transforms
- custom transform or sanitization pipelines that run selector matching over very large untrusted documents
- applications that construct or mutate DOM trees programmatically from untrusted structure
- applications that enable `Linkify(...)` over attacker-controlled text

## Fixes in 1.18.0

`1.18.0` adds generalized selector resource controls and removes several repeated-work hot paths:

- shared selector limits for parse and match operations
- structural caps for selector length, selector lists, compound selectors, complex selectors, and parse depth
- match-operation and string-byte budgets
- per-query matcher state for caches and cycle guards
- precomputed or cached ancestor, sibling, positional, attribute-token, text-content, `:not(...)`, `:empty`, and `:nth-child(...)` work
- consistent enforcement across public parsing, `query(...)`, tag-only query fast paths, transform selector compilation, and sanitization transform matching
- linkification hardening for punctuation-heavy inputs and trailing bracket trimming

## CWE mapping

- CWE-400: Uncontrolled Resource Consumption
- CWE-407: Inefficient Algorithmic Complexity
- CWE-835: Loop with Unreachable Exit Condition

## Recommended action

Upgrade to `justhtml` `1.18.0`. If users cannot upgrade immediately:

- do not pass untrusted selector strings to `query(...)`, `matches(...)`, or selector-based transforms
- restrict the size of untrusted documents before selector matching or linkification
- avoid constructing programmatic DOM graphs from untrusted structure
- avoid enabling `Linkify(...)` on very large attacker-controlled text

## Credit

Discovered during an internal security review of `justhtml`. |
Affected by 0 other vulnerabilities. |
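
As a rough illustration of the interim advice in the GHSA-r8cj-3554-33mr entry (do not hand untrusted selector strings to selector APIs unchecked), the sketch below applies coarse structural caps to a selector string before any matching is attempted. It is not how `justhtml` `1.18.0` implements its limits: `check_selector_budget`, `SelectorTooComplex`, and the default limit values are hypothetical names and numbers chosen for this example, and the scan deliberately ignores quoting and attribute brackets.

```python
class SelectorTooComplex(ValueError):
    """Raised when a selector string exceeds one of the illustrative caps."""


def check_selector_budget(
    selector: str,
    *,
    max_chars: int = 1024,      # assumed cap on total selector length
    max_list_items: int = 32,   # assumed cap on comma-separated selectors
    max_depth: int = 8,         # assumed cap on nested (...) such as :not(...)
) -> str:
    """Return the selector unchanged if it fits the caps, else raise."""
    if len(selector) > max_chars:
        raise SelectorTooComplex("selector is too long")

    depth = 0
    items = 1
    for ch in selector:
        if ch == "(":
            depth += 1
            if depth > max_depth:
                raise SelectorTooComplex("selector nesting is too deep")
        elif ch == ")":
            depth = max(depth - 1, 0)
        elif ch == "," and depth == 0:
            # Only top-level commas separate entries of a selector list.
            items += 1
            if items > max_list_items:
                raise SelectorTooComplex("selector list is too large")
    return selector
```

A check like this only bounds the shape of the selector. It does not bound the size of the document being matched, which the advisory lists as a separate risk area.
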
| Vulnerability | Summary | Aliases |
|---|---|---|
| VCID-72kf-cb91-dkcy | JustHTML is vulnerable to XSS via code fence breakout in `<pre>` content

## Summary

`to_markdown()` is vulnerable when serializing attacker-controlled `<pre>` content. The `<pre>` handler emits a fixed three-backtick fenced code block, but writes decoded text content into that fence without choosing a delimiter longer than any backtick run inside the content.

An attacker can place backticks and HTML-like text inside a sanitized `<pre>` element so that the generated Markdown closes the fence early and leaves raw HTML outside the code block. When that Markdown is rendered by a CommonMark/GFM-style renderer that allows raw HTML, the HTML executes.

This is a bypass of the v1.12.0 Markdown hardening. That fix escaped HTML-significant characters for regular text nodes, but `<pre>` uses a separate serialization path and does not apply the same protection.

## Details

The vulnerable `<pre>` Markdown path:

- extracts decoded text from the `<pre>` subtree
- opens a fenced block with a fixed three-backtick delimiter
- writes the decoded text directly into the output
- closes with another fixed three-backtick delimiter

Because the fence length is fixed, attacker-controlled content containing a backtick run of length 3 or more can terminate the code block. If the content also contains decoded HTML-like text such as `<img ...>`, that text appears outside the fence in the resulting Markdown and is treated as raw HTML by downstream Markdown renderers.

The issue is not that HTML-like text appears inside code blocks. The issue is that the serializer allows attacker-controlled `<pre>` text to break out of the fixed fence.

## Reproduction

```python
from justhtml import JustHTML

payload = "<pre>```\n<img src=x onerror=alert(1)></pre>"
doc = JustHTML(payload, fragment=True)  # default sanitize=True

print(doc.to_html(pretty=False))
# <pre>```
# <img src=x onerror=alert(1)></pre>

print(doc.to_markdown())
# ```
# ```
# <img src=x onerror=alert(1)>
# ```
```

Rendered as CommonMark/GFM-style Markdown, that output is interpreted as:

1. Line 1 opens a fenced code block
2. Line 2 closes it
3. Line 3 is raw HTML outside the fence
4. Line 4 opens a new fence

## Impact

Applications that treat `JustHTML(..., sanitize=True).to_markdown()` output as safe for direct rendering in Markdown contexts may be exposed to XSS, depending on the downstream Markdown renderer's raw-HTML handling.

## Root Cause

The `<pre>` Markdown serializer uses a fixed fence instead of selecting a delimiter longer than the longest backtick run in the content.

## Fix

When serializing `<pre>` content to Markdown, choose a fence length longer than any backtick run present in the code block content, with a minimum length of 3. |
GHSA-5vp3-3cg6-2rq3
|
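
The fix described for VCID-72kf-cb91-dkcy (pick a fence longer than any backtick run in the content, with a minimum length of 3) can be sketched in a few lines. This is a standalone illustration of the technique, not the `justhtml` implementation; `fence_for` and `fenced_block` are hypothetical helper names.

```python
import re


def fence_for(content: str, minimum: int = 3) -> str:
    """Return a backtick fence strictly longer than any backtick run in content."""
    longest = max((len(run) for run in re.findall(r"`+", content)), default=0)
    return "`" * max(minimum, longest + 1)


def fenced_block(content: str) -> str:
    """Wrap content in a fenced code block that the content cannot close early."""
    fence = fence_for(content)
    return f"{fence}\n{content}\n{fence}"


# Text shaped like the advisory's payload: a three-backtick run, then HTML-like text.
demo = "`" * 3 + "\n<img src=x onerror=alert(1)>"

# The chosen fence is four backticks, so the inner three-backtick line stays
# inside the code block instead of closing it.
print(fenced_block(demo))
```

This relies on the CommonMark rule that a closing fence must be at least as long as the opening fence, so a shorter backtick run inside the content is treated as ordinary code text.
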