Package details: pkg:pypi/vllm@0.7.0
purl pkg:pypi/vllm@0.7.0
Next non-vulnerable version 0.20.0
Latest non-vulnerable version 0.20.0
Risk
Vulnerabilities affecting this package (9)
Vulnerability Summary Fixed by
VCID-737m-tpkz-qffm
Aliases:
CVE-2025-25183
GHSA-rm76-4mrf-v9r8
PYSEC-2025-62
vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs. Maliciously constructed statements can lead to hash collisions, resulting in cache reuse, which can interfere with subsequent responses and cause unintended behavior. Prefix caching makes use of Python's built-in hash() function. As of Python 3.12, the behavior of hash(None) has changed to be a predictable constant value. This makes it more feasible for someone to try to exploit hash collisions. The impact of a collision would be reusing cached data that was generated from different content. Given knowledge of prompts in use and predictable hashing behavior, someone could intentionally populate the cache using a prompt known to collide with another prompt in use. This issue has been addressed in version 0.7.2 and all users are advised to upgrade. There are no known workarounds for this vulnerability.
0.7.2
Affected by 8 other vulnerabilities.
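The collision risk described for CVE-2025-25183 above can be illustrated with a small sketch. This is not vLLM's prefix-caching code: the function names `builtin_block_key` and `sha256_block_key` are hypothetical, and the SHA-256 variant is only one possible hardening, not necessarily what version 0.7.2 does.

```python
import hashlib
from typing import Optional, Sequence


def builtin_block_key(parent: Optional[int], tokens: Sequence[int]) -> int:
    # Hypothetical key scheme: Python's built-in hash() over (parent, tokens).
    # hash() is not collision-resistant, and per the advisory hash(None) is a
    # predictable constant on Python 3.12+, so the root block key is guessable.
    return hash((parent, tuple(tokens)))


def sha256_block_key(parent: Optional[bytes], tokens: Sequence[int]) -> bytes:
    # Alternative sketch: a cryptographic digest chained over the parent digest.
    h = hashlib.sha256()
    h.update(b"root" if parent is None else b"node" + parent)
    for token in tokens:
        # Fixed-width encoding keeps token boundaries unambiguous.
        h.update(token.to_bytes(8, "big", signed=True))
    return h.digest()
```

In CPython, hash(-1) and hash(-2) are both -2, so `builtin_block_key(0, [-1])` and `builtin_block_key(0, [-2])` return the same key even though the inputs differ. Real token IDs are not negative; the pair is only a convenient demonstration that hash() was not designed to resist deliberate collisions.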
VCID-e8w2-9rwg-u7ba
Aliases:
CVE-2025-46570
GHSA-4qjh-9fv9-r85r
PYSEC-2025-53
vLLM is an inference and serving engine for large language models (LLMs). Prior to version 0.9.0, when a new prompt is processed, if the PagedAttention mechanism finds a matching prefix chunk, the prefill process speeds up, which is reflected in the TTFT (Time to First Token). These timing differences caused by matching chunks are significant enough to be recognized and exploited. This issue has been patched in version 0.9.0.
0.9.0
Affected by 2 other vulnerabilities.
VCID-fxgs-s1vm-8bez
Aliases:
CVE-2025-32444
PYSEC-2025-42
vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs. Versions starting from 0.6.5 and prior to 0.8.5, having vLLM integration with mooncake, are vulnerable to remote code execution due to using pickle based serialization over unsecured ZeroMQ sockets. The vulnerable sockets were set to listen on all network interfaces, increasing the likelihood that an attacker is able to reach the vulnerable ZeroMQ sockets to carry out an attack. vLLM instances that do not make use of the mooncake integration are not vulnerable. This issue has been patched in version 0.8.5.
0.8.5
Affected by 7 other vulnerabilities.
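To show why pickle-based serialization over an unauthenticated socket (CVE-2025-32444 above) amounts to remote code execution, here is a minimal sketch using only the standard library. The ZeroMQ transport is omitted and `wire_bytes` stands in for bytes received from the network. The `Payload` class and `decode_untrusted_json` helper are hypothetical, and the JSON alternative is an illustration, not a description of the actual patch.

```python
import json
import pickle


class Payload:
    # pickle asks __reduce__ how to rebuild the object. The callable returned
    # here is invoked by pickle.loads() on the receiving side.
    def __reduce__(self):
        # Harmless stand-in for an attacker-chosen call such as os.system.
        return (eval, ("6 * 7",))


def decode_untrusted_json(raw: bytes) -> dict:
    # JSON decoding only produces plain data (dict, list, str, numbers, ...);
    # it never calls functions named by the sender.
    message = json.loads(raw.decode("utf-8"))
    if not isinstance(message, dict):
        raise ValueError("expected a JSON object")
    return message


wire_bytes = pickle.dumps(Payload())  # what an attacker would send
result = pickle.loads(wire_bytes)     # the receiver runs eval("6 * 7") here
```

After the last line, `result` is 42: the receiver evaluated an expression chosen by the sender. Passing the same bytes to `decode_untrusted_json` raises a ValueError instead of executing anything.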
VCID-k1qz-xe9c-2bg3
Aliases:
CVE-2025-29770
GHSA-mgrm-fgjv-mhv8
PYSEC-2025-223
vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs. The outlines library is one of the backends used by vLLM to support structured output (a.k.a. guided decoding). Outlines provides an optional cache for its compiled grammars on the local filesystem. This cache has been on by default in vLLM. Outlines is also available by default through the OpenAI compatible API server. The affected code in vLLM is vllm/model_executor/guided_decoding/outlines_logits_processors.py, which unconditionally uses the cache from outlines. A malicious user can send a stream of very short decoding requests with unique schemas, resulting in an addition to the cache for each request. This can result in a Denial of Service if the filesystem runs out of space. Note that even if vLLM was configured to use a different backend by default, it is still possible to choose outlines on a per-request basis using the guided_decoding_backend key of the extra_body field of the request. This issue applies only to the V0 engine and is fixed in 0.8.0.
0.8.0
Affected by 8 other vulnerabilities.
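The denial of service in CVE-2025-29770 above comes from a cache that grows by one entry per unique schema with no upper bound. The sketch below uses an in-memory stand-in for the on-disk grammar cache and shows how a size bound with least-recently-used eviction caps that growth. `BoundedCache` is a hypothetical name, and this is not how outlines or vLLM implement or fixed the cache.

```python
from collections import OrderedDict
from typing import Callable, Generic, Hashable, TypeVar

V = TypeVar("V")


class BoundedCache(Generic[V]):
    """Cache that never holds more than max_entries items (LRU eviction)."""

    def __init__(self, max_entries: int) -> None:
        if max_entries < 1:
            raise ValueError("max_entries must be at least 1")
        self._max_entries = max_entries
        self._data: "OrderedDict[Hashable, V]" = OrderedDict()

    def get_or_compute(self, key: Hashable, compute: Callable[[], V]) -> V:
        if key in self._data:
            # Mark the entry as most recently used.
            self._data.move_to_end(key)
            return self._data[key]
        value = compute()
        self._data[key] = value
        if len(self._data) > self._max_entries:
            # Evict the least recently used entry to stay within the bound.
            self._data.popitem(last=False)
        return value

    def __len__(self) -> int:
        return len(self._data)
```

With a bound of 3, a stream of 1000 requests that each carry a unique schema leaves only the 3 most recent entries in the cache, so storage use stays constant regardless of request volume.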
VCID-nctw-rz8h-f3af
Aliases:
CVE-2026-22773
GHSA-grg2-63fw-f2qr
PYSEC-2026-143
vLLM is an inference and serving engine for large language models (LLMs). In versions from 0.6.4 to before 0.12.0, users can crash the vLLM engine serving multimodal models that use the Idefics3 vision model implementation by sending a specially crafted 1x1 pixel image. This causes a tensor dimension mismatch that results in an unhandled runtime error, leading to complete server termination. This issue has been patched in version 0.12.0.
0.12.0
Affected by 1 other vulnerability.
VCID-svzy-7pke-2bdr
Aliases:
CVE-2025-46722
GHSA-c65p-x677-fgj6
PYSEC-2025-43
vLLM is an inference and serving engine for large language models (LLMs). In versions starting from 0.7.0 to before 0.9.0, in the file vllm/multimodal/hasher.py, the MultiModalHasher class has a security and data integrity issue in its image hashing method. Currently, it serializes PIL.Image.Image objects using only obj.tobytes(), which returns only the raw pixel data, without including metadata such as the image’s shape (width, height, mode). As a result, two images of different sizes (e.g., 30x100 and 100x30) with the same pixel byte sequence could generate the same hash value. This may lead to hash collisions, incorrect cache hits, and even data leakage or security risks. This issue has been patched in version 0.9.0.
0.9.0
Affected by 2 other vulnerabilities.
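The hashing flaw in CVE-2025-46722 above can be reproduced without an imaging library by treating an image as a mode, a size, and a pixel buffer. In this sketch the helper names are hypothetical and SHA-256 is an assumption made for illustration; the advisory does not say which hash function vLLM uses.

```python
import hashlib


def hash_pixels_only(pixels: bytes) -> str:
    # Mirrors the flawed approach: only the raw pixel bytes are hashed.
    return hashlib.sha256(pixels).hexdigest()


def hash_with_metadata(mode: str, width: int, height: int, pixels: bytes) -> str:
    # Includes mode and dimensions so differently shaped images cannot collide
    # merely because their pixel buffers are identical.
    h = hashlib.sha256()
    h.update(f"{mode}:{width}x{height}:".encode("ascii"))
    h.update(pixels)
    return h.hexdigest()


# 3000 bytes can be read as a 30x100 or a 100x30 single-channel ("L") image.
pixels = bytes(3000)
```

Both shapes share one `hash_pixels_only(pixels)` value because the shape never enters the hash, whereas `hash_with_metadata("L", 30, 100, pixels)` and `hash_with_metadata("L", 100, 30, pixels)` differ.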
VCID-u659-sd9h-tkf3
Aliases:
CVE-2025-29783
GHSA-x3m8-f7g5-qhm7
PYSEC-2025-63
vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs. When vLLM is configured to use Mooncake, unsafe deserialization exposed directly over ZMQ/TCP on all network interfaces will allow attackers to execute remote code on distributed hosts. This is a remote code execution vulnerability impacting any deployments using Mooncake to distribute KV across distributed hosts. This vulnerability is fixed in 0.8.0.
0.8.0
Affected by 8 other vulnerabilities.
VCID-ugds-eqgw-fbbz
Aliases:
CVE-2025-48887
PYSEC-2025-50
vLLM, an inference and serving engine for large language models (LLMs), has a Regular Expression Denial of Service (ReDoS) vulnerability in the file `vllm/entrypoints/openai/tool_parsers/pythonic_tool_parser.py` of versions 0.6.4 up to but excluding 0.9.0. The root cause is the use of a highly complex and nested regular expression for tool call detection, which can be exploited by an attacker to cause severe performance degradation or make the service unavailable. The pattern contains multiple nested quantifiers, optional groups, and inner repetitions which make it vulnerable to catastrophic backtracking. Version 0.9.0 contains a patch for the issue.
0.9.0
Affected by 2 other vulnerabilities.
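For the ReDoS issue in CVE-2025-48887 above, the sketch below contrasts a pattern with nested quantifiers against a flat rewrite that accepts the same strings, and adds an input-length cap. The patterns are illustrative and are not the expression used in pythonic_tool_parser.py; `MAX_INPUT_CHARS` and `matches_word_list` are hypothetical names.

```python
import re

# Nested quantifiers: a run of word characters can be split between the inner
# "+" and the outer "+" in exponentially many ways when the match fails.
NESTED = re.compile(r"(\w+\s?)+")

# Flat rewrite: each character can be consumed in only one way, so a failed
# match is abandoned after a linear amount of work.
FLAT = re.compile(r"\w+(?:\s\w+)*\s?")

MAX_INPUT_CHARS = 4096  # hypothetical cap applied before any regex runs


def matches_word_list(text: str) -> bool:
    # Reject oversized input first, then use the flat pattern.
    if len(text) > MAX_INPUT_CHARS:
        return False
    return FLAT.fullmatch(text) is not None
```

An input such as a long run of letters followed by "!" forces `NESTED.fullmatch` to try every split of the run before failing, while `matches_word_list` rejects the same input quickly. Avoid running `NESTED` on such inputs outside a controlled experiment.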
VCID-za3a-c9m1-jqgz
Aliases:
CVE-2026-34755
GHSA-pq5c-rjhq-qp7p
PYSEC-2026-144
vLLM is an inference and serving engine for large language models (LLMs). From 0.7.0 to before 0.19.0, the VideoMediaIO.load_base64() method at vllm/multimodal/media/video.py splits video/jpeg data URLs by comma to extract individual JPEG frames, but does not enforce a frame count limit. The num_frames parameter (default: 32), which is enforced by the load_bytes() code path, is completely bypassed in the video/jpeg base64 path. An attacker can send a single API request containing thousands of comma-separated base64-encoded JPEG frames, causing the server to decode all frames into memory and crash with OOM. This vulnerability is fixed in 0.19.0.
0.19.0
Affected by 1 other vulnerability.
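The missing limit described in CVE-2026-34755 above can be sketched as a check that runs before any frame is decoded. This is not the VideoMediaIO code: `split_frames` is a hypothetical helper, the default of 32 mirrors the num_frames default quoted in the advisory, and the real fix may sample or truncate frames rather than reject the request.

```python
import base64
from typing import List


def split_frames(data: str, max_frames: int = 32) -> List[bytes]:
    """Split comma-separated base64 frames, enforcing a frame-count limit."""
    # Count separators first so oversized requests are rejected before any
    # frame is decoded into memory.
    frame_count = data.count(",") + 1
    if frame_count > max_frames:
        raise ValueError(f"too many frames: {frame_count} > {max_frames}")
    # validate=True rejects characters outside the base64 alphabet.
    return [base64.b64decode(part, validate=True) for part in data.split(",")]
```

A request carrying 32 frames decodes normally, while one carrying 33 raises a ValueError, so a payload with thousands of frames fails after a single pass over the string instead of exhausting memory.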
Vulnerabilities fixed by this package (1)
Vulnerability Summary Aliases
VCID-w9kt-yaqy-47fb
vLLM is a library for LLM inference and serving. vllm/model_executor/weight_utils.py implements hf_model_weights_iterator to load the model checkpoint, which is downloaded from huggingface. It uses the torch.load function and the weights_only parameter defaults to False. When torch.load loads malicious pickle data, it will execute arbitrary code during unpickling. This vulnerability is fixed in v0.7.0.
Aliases:
CVE-2025-24357
GHSA-rh4j-5rhw-hr54
PYSEC-2025-58
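For CVE-2025-24357 above, passing weights_only=True to torch.load restricts unpickling to an allow-list of types. The same idea can be sketched with only the standard library, following the restricted-unpickler pattern from the Python pickle documentation. The allow-list contents and the names `ALLOWED_GLOBALS`, `RestrictedUnpickler`, and `restricted_loads` are assumptions for illustration and do not reflect PyTorch's internals.

```python
import io
import pickle

# Hypothetical allow-list: only these (module, name) globals may be resolved.
ALLOWED_GLOBALS = {("collections", "OrderedDict")}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module: str, name: str):
        # Called whenever the pickle stream references a global such as a
        # class or function. Anything off the allow-list is refused.
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data: bytes):
    """Like pickle.loads(), but refuses globals outside ALLOWED_GLOBALS."""
    return RestrictedUnpickler(io.BytesIO(data)).load()
```

Plain containers and numbers load as usual, while a stream that references a callable such as builtins.eval raises pickle.UnpicklingError before the callable is ever invoked.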

Date Actor Action Vulnerability Source VulnerableCode Version
2026-06-02T04:24:27.903445+00:00 Pypa Importer Affected by VCID-za3a-c9m1-jqgz https://github.com/pypa/advisory-database/blob/main/vulns/vllm/PYSEC-2026-144.yaml 38.6.0
2026-06-02T04:23:39.268624+00:00 Pypa Importer Affected by VCID-nctw-rz8h-f3af https://github.com/pypa/advisory-database/blob/main/vulns/vllm/PYSEC-2026-143.yaml 38.6.0
2026-06-02T04:23:06.692427+00:00 Pypa Importer Affected by VCID-ugds-eqgw-fbbz https://github.com/pypa/advisory-database/blob/main/vulns/vllm/PYSEC-2025-50.yaml 38.6.0
2026-06-02T04:23:06.543919+00:00 Pypa Importer Affected by VCID-e8w2-9rwg-u7ba https://github.com/pypa/advisory-database/blob/main/vulns/vllm/PYSEC-2025-53.yaml 38.6.0
2026-06-02T04:23:06.292051+00:00 Pypa Importer Affected by VCID-svzy-7pke-2bdr https://github.com/pypa/advisory-database/blob/main/vulns/vllm/PYSEC-2025-43.yaml 38.6.0
2026-06-02T04:22:59.629836+00:00 Pypa Importer Affected by VCID-fxgs-s1vm-8bez https://github.com/pypa/advisory-database/blob/main/vulns/vllm/PYSEC-2025-42.yaml 38.6.0
2026-06-02T04:22:51.959379+00:00 Pypa Importer Affected by VCID-u659-sd9h-tkf3 https://github.com/pypa/advisory-database/blob/main/vulns/vllm/PYSEC-2025-63.yaml 38.6.0
2026-06-02T04:22:51.901397+00:00 Pypa Importer Affected by VCID-k1qz-xe9c-2bg3 https://github.com/pypa/advisory-database/blob/main/vulns/vllm/PYSEC-2025-223.yaml 38.6.0
2026-06-02T04:22:47.218281+00:00 Pypa Importer Affected by VCID-737m-tpkz-qffm https://github.com/pypa/advisory-database/blob/main/vulns/vllm/PYSEC-2025-62.yaml 38.6.0
2026-06-02T04:22:46.187934+00:00 Pypa Importer Fixing VCID-w9kt-yaqy-47fb https://github.com/pypa/advisory-database/blob/main/vulns/vllm/PYSEC-2025-58.yaml 38.6.0