| purl | pkg:pypi/scrapy@2.11.2 |
| Vulnerability | Summary | Fixed by |
|---|---|---|
|
VCID-1k4b-pr5k-s7e5
Aliases: GHSA-cwxj-rr6w-m6w7 |
Scrapy: Arbitrary Module Import via Referrer-Policy Header in RefererMiddleware

### Impact

Since version 1.4.0, Scrapy respects the `Referrer-Policy` response header to decide whether and how to set a `Referer` header on follow-up requests. If the header value looked like a valid Python import path, Scrapy would import the referenced object and call it, assuming it referred to a referrer policy class (for example, `scrapy.spidermiddlewares.referer.DefaultReferrerPolicy`) and attempting to instantiate it to handle the `Referer` header. A malicious site could exploit this by setting `Referrer-Policy` to a path such as `sys.exit`, causing Scrapy to import and execute it and potentially terminate the process.

### Patches

Upgrade to Scrapy 2.14.2 (or later).

### Workarounds

If you cannot upgrade to Scrapy 2.14.2, consider the following mitigations.

- **Disable the middleware:** If you don't need the `Referer` header on follow-up requests, set [`REFERER_ENABLED`](https://docs.scrapy.org/en/latest/topics/spider-middleware.html#referer-enabled) to `False`.
- **Set headers manually:** If you do need a `Referer`, disable the middleware and set the header explicitly on the requests that require it.
- **Set `referrer_policy` in request metadata:** If disabling the middleware is not viable, set the [`referrer_policy`](https://docs.scrapy.org/en/latest/topics/spider-middleware.html#referrer-policy) request meta key on all requests to prevent evaluating preceding responses' `Referrer-Policy`. For example:

  ```python
  Request(
      url,
      meta={
          "referrer_policy": "scrapy.spidermiddlewares.referer.DefaultReferrerPolicy",
      },
  )
  ```

  Instead of editing requests individually, you can:

  - implement a custom [spider middleware](https://docs.scrapy.org/en/latest/topics/spider-middleware.html) that runs before the built-in referrer policy middleware and sets the `referrer_policy` meta key; or
  - set the meta key in start requests and use the [scrapy-sticky-meta-params](https://github.com/heylouiz/scrapy-sticky-meta-params) plugin to propagate it to follow-up requests.

If you want to continue respecting legitimate `Referrer-Policy` headers while protecting against malicious ones, disable the built-in referrer policy middleware by setting it to `None` in [`SPIDER_MIDDLEWARES`](https://docs.scrapy.org/en/latest/topics/settings.html#std-setting-SPIDER_MIDDLEWARES) and replace it with the fixed implementation from Scrapy 2.14.2. If the Scrapy 2.14.2 implementation is incompatible with your project (for example, because your Scrapy version is older), copy the corresponding middleware from your Scrapy version, apply the same patch, and use that as a replacement. |
Affected by 0 other vulnerabilities. |
|
VCID-dc1m-rt7j-w3af
Aliases: CVE-2025-6176 GHSA-2qfp-q593-8484 |
Scrapy is vulnerable to a denial of service (DoS) attack due to flaws in brotli decompression implementation

Scrapy versions up to 2.13.3 are vulnerable to a denial of service (DoS) attack due to a flaw in its brotli decompression implementation. The protection mechanism against decompression bombs fails to mitigate the brotli variant, allowing remote servers to crash clients with less than 80 GB of available memory. This occurs because brotli can achieve extremely high compression ratios for zero-filled data, leading to excessive memory consumption during decompression. Mitigating this vulnerability requires the security enhancement added in brotli v1.2.0. |
Affected by 1 other vulnerability. |
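
The `Referrer-Policy` issue (GHSA-cwxj-rr6w-m6w7) listed above comes from treating a header value as an import path. One way to picture a safer approach is to treat the value purely as data and match it against the fixed set of policy tokens defined by the W3C Referrer Policy specification. The following is a minimal sketch under that assumption. `STANDARD_POLICIES` and `parse_referrer_policy` are hypothetical names introduced for illustration, and this is not the patch shipped in Scrapy 2.14.2.

```python
# Policy tokens defined by the W3C Referrer Policy specification.
# (Hypothetical constant name, for illustration only.)
STANDARD_POLICIES = frozenset({
    "no-referrer",
    "no-referrer-when-downgrade",
    "same-origin",
    "origin",
    "strict-origin",
    "origin-when-cross-origin",
    "strict-origin-when-cross-origin",
    "unsafe-url",
})


def parse_referrer_policy(header_value):
    """Return the last recognized policy token in the header, or None.

    The value is only compared against known tokens; it is never
    imported or executed, so a value such as "sys.exit" is ignored.
    """
    policy = None
    for token in header_value.split(","):
        token = token.strip().lower()
        if token in STANDARD_POLICIES:
            policy = token  # later recognized tokens override earlier ones
    return policy
```

With this sketch, `parse_referrer_policy("sys.exit")` yields `None`, so a caller could fall back to its default policy instead of acting on attacker-controlled input.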
| Vulnerability | Summary | Aliases |
|---|---|---|
| VCID-nekz-z7zw-mfgz | Scrapy allows redirect following in protocols other than HTTP

### Impact

Scrapy was following redirects regardless of the URL protocol, so redirects were working for `data://`, `file://`, `ftp://`, `s3://`, and any other scheme defined in the `DOWNLOAD_HANDLERS` setting. However, HTTP redirects should only work between URLs that use the `http://` or `https://` schemes.

A malicious actor, given write access to the start requests (e.g. ability to define `start_urls`) of a spider and read access to the spider output, could exploit this vulnerability to:

- Redirect to any local file using the `file://` scheme to read its contents.
- Redirect to an `ftp://` URL of a malicious FTP server to obtain the FTP username and password configured in the spider or project.
- Redirect to any `s3://` URL to read its content using the S3 credentials configured in the spider or project.

For `file://` and `s3://`, how the spider implements its parsing of input data into an output item determines what data would be vulnerable. A spider that always outputs the entire contents of a response would be completely vulnerable, while a spider that extracted only fragments from the response could significantly limit vulnerable data.

### Patches

Upgrade to Scrapy 2.11.2.

### Workarounds

Replace the built-in redirect middlewares (`RedirectMiddleware` and `MetaRefreshMiddleware`) with custom ones that implement the fix from Scrapy 2.11.2, and verify that they work as intended.

### References

This security issue was reported by @mvsantos at https://github.com/scrapy/scrapy/issues/457. |
GHSA-23j4-mw76-5v7h
|
| VCID-t5cn-a543-nyag | Duplicate Advisory: Scrapy leaks the authorization header on same-domain but cross-origin redirects

## Duplicate Advisory

This advisory has been withdrawn because it is a duplicate of GHSA-4qqq-9vqf-3h3f. This link is maintained to preserve external references.

## Original Description

In scrapy/scrapy, an issue was identified where the Authorization header is not removed during redirects that only change the scheme (e.g., HTTPS to HTTP) but remain within the same domain. This behavior contravenes the Fetch standard, which mandates the removal of Authorization headers in cross-origin requests when the scheme, host, or port changes. Consequently, when a redirect downgrades from HTTPS to HTTP, the Authorization header may be inadvertently exposed in plaintext, leading to potential sensitive information disclosure to unauthorized actors. The flaw is located in the _build_redirect_request function of the redirect middleware. |
GHSA-cg34-w3fm-82h3
|
| VCID-urb1-hv1z-duga | In scrapy/scrapy, an issue was identified where the Authorization header is not removed during redirects that only change the scheme (e.g., HTTPS to HTTP) but remain within the same domain. This behavior contravenes the Fetch standard, which mandates the removal of Authorization headers in cross-origin requests when the scheme, host, or port changes. Consequently, when a redirect downgrades from HTTPS to HTTP, the Authorization header may be inadvertently exposed in plaintext, leading to potential sensitive information disclosure to unauthorized actors. The flaw is located in the _build_redirect_request function of the redirect middleware. |
CVE-2024-1968
GHSA-4qqq-9vqf-3h3f PYSEC-2024-258 |
| VCID-veaw-n6vt-zfgu | Scrapy's redirects ignoring scheme-specific proxy settings

### Impact

When using system proxy settings, which are scheme-specific (i.e. specific to `http://` or `https://` URLs), Scrapy was not accounting for scheme changes during redirects.

For example, an HTTP request would use the proxy configured for HTTP and, when redirected to an HTTPS URL, the new HTTPS request would still use the proxy configured for HTTP instead of switching to the proxy configured for HTTPS. The same applies the other way around.

If you have different proxy configurations for HTTP and HTTPS in your system for security reasons (e.g., maybe you don’t want one of your proxy providers to be aware of the URLs that you visit with the other one), this would be a security issue.

### Patches

Upgrade to Scrapy 2.11.2.

### Workarounds

Replace the built-in redirect middlewares (`RedirectMiddleware` and `MetaRefreshMiddleware`) and the `HttpProxyMiddleware` middleware with custom ones that implement the fix from Scrapy 2.11.2, and verify that they work as intended.

### References

This security issue was reported by @redapple at https://github.com/scrapy/scrapy/issues/767. |
GHSA-jm3v-qxmh-hxwv
|
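
The redirect advisories in the table above share a common theme: a redirect target should be re-evaluated before it is followed. The sketch below illustrates the three checks they describe, using only the Python standard library: restricting redirects to `http`/`https`, dropping `Authorization` when the origin (scheme, host, or port) changes, and choosing the proxy by the target URL's scheme. All function names and the `proxies` mapping are hypothetical, and this is not the code from Scrapy 2.11.2.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (scheme, host, port) triple that defines a URL's origin."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    port = parts.port or DEFAULT_PORTS.get(scheme)
    return scheme, (parts.hostname or "").lower(), port


def is_followable_redirect(target_url):
    """Allow HTTP redirects only to http:// or https:// targets."""
    return urlsplit(target_url).scheme.lower() in {"http", "https"}


def headers_for_redirect(source_url, target_url, headers):
    """Copy headers, dropping Authorization when the origin changes."""
    if origin(source_url) == origin(target_url):
        return dict(headers)
    return {k: v for k, v in headers.items() if k.lower() != "authorization"}


def proxy_for(url, proxies):
    """Pick the proxy configured for the URL's own scheme, if any."""
    return proxies.get(urlsplit(url).scheme.lower())
```

For example, a redirect from `https://example.com/a` to `http://example.com/a` changes the scheme, so `headers_for_redirect` omits `Authorization`, and `proxy_for` is evaluated again for the new `http` URL rather than reusing the proxy chosen for the original request.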