CVE-2026-53923: vLLM GGUF Kernels: int64_t to int truncation of tensor dimensions causes GPU buffer overflow

EPSS 0.28% · P20 · Updated Jun 27, 2026


I. Basic Information for CVE-2026-53923


Vulnerability Information


Vulnerability Title

vLLM GGUF Kernels: int64_t to int truncation of tensor dimensions causes GPU buffer overflow

Source: NVD (National Vulnerability Database)

Vulnerability Description

vLLM is an inference and serving engine for large language models (LLMs). From 0.5.5 until 0.23.1rc0, integer truncation of tensor dimensions in vLLM's GGUF dequantize kernels (csrc/quantization/gguf/gguf_kernel.cu) causes partial tensor processing. The output tensor is allocated at full size via torch::empty (uninitialized memory), but the dequantize CUDA kernel processes only a truncated number of elements. The unfilled portion of the output tensor retains whatever was previously in GPU memory. In multi-tenant inference deployments, this residual GPU memory may contain tensor data from other users' inference requests, constituting information disclosure. This vulnerability is fixed in 0.23.1rc0.
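The truncation described above can be illustrated with a small model. This is a minimal sketch, not the vLLM kernel code: it assumes the narrowing applies to a total element count and behaves like the usual two's complement wraparound, and the names truncate_to_int32 and unfilled_elements are hypothetical helpers introduced only for this illustration. The actual kernel may narrow individual dimensions rather than the product.

```python
def truncate_to_int32(value: int) -> int:
    """Mimic a C-style narrowing conversion from int64_t to a 32-bit int."""
    low_bits = value & 0xFFFFFFFF          # keep only the low 32 bits
    if low_bits >= 1 << 31:                # reinterpret as signed two's complement
        return low_bits - (1 << 32)
    return low_bits


def unfilled_elements(allocated: int) -> int:
    """Elements allocated at full size but never written (assumed model)."""
    # A negative truncated count is treated as "nothing processed".
    processed = max(truncate_to_int32(allocated), 0)
    return allocated - processed


full_size = (1 << 32) + 1024               # hypothetical element count above 2**32
print(truncate_to_int32(full_size))        # 1024
print(unfilled_elements(full_size))        # 4294967296
```

Under these assumptions, a buffer of 2^32 + 1024 elements allocated with torch::empty would have only its first 1024 elements written, and the remaining 2^32 would keep whatever was previously in GPU memory.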

Source: NVD (National Vulnerability Database)

CVSS Information

N/A

Source: NVD (National Vulnerability Database)

Vulnerability Type

Incorrect Conversion between Numeric Types

Source: NVD (National Vulnerability Database)

Affected Products

Vendor	Product	Affected Versions	CPE
vllm-project	vllm	>= 0.5.5, < 0.23.1rc0	-
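The affected range in the table can be checked against an installed version string. The following is a simplified sketch: parse_version and is_affected are hypothetical helpers, and the parser only understands versions of the form X.Y.Z with an optional rcN suffix (other suffixes such as .postN or .devN are rejected rather than guessed at).

```python
import re


def parse_version(text: str) -> tuple:
    """Parse 'X.Y.Z' or 'X.Y.ZrcN' into a sortable tuple (simplified)."""
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)(?:rc(\d+))?", text)
    if match is None:
        raise ValueError(f"unsupported version format: {text!r}")
    major, minor, patch, rc = match.groups()
    # A release candidate sorts before the final release of the same version.
    rc_key = (0, int(rc)) if rc is not None else (1, 0)
    return (int(major), int(minor), int(patch), rc_key)


def is_affected(version: str) -> bool:
    """True if version falls in the listed range >= 0.5.5, < 0.23.1rc0."""
    lower = parse_version("0.5.5")
    upper = parse_version("0.23.1rc0")
    return lower <= parse_version(version) < upper


print(is_affected("0.23.0"))     # True
print(is_affected("0.23.1rc0"))  # False
```

For real deployments, a full version parser that follows the Python packaging rules is a better choice than this hand-rolled comparison.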

II. Public POCs for CVE-2026-53923


No public POC found.

III. Intelligence Information for CVE-2026-53923


Other References for CVE-2026-53923 (3)

https://nvd.nist.gov/vuln/detail/CVE-2026-53923

Same Patch Batch · vllm-project · 2026-06-22 · 8 CVEs total

CVE-2026-48746	9.1 CRITICAL	vLLM: OpenAI auth bypass
CVE-2026-54232	8.8 HIGH	vLLM: Dependency Confusion Vulnerability in vLLM Dockerfile
CVE-2026-41523	7.5 HIGH	vLLM: Security Check Bypass via assert Statement in Activation Function Loading Allows Arb
CVE-2026-47155	6.5 MEDIUM	vLLM: Artifact Pin Decay in vLLM allows pinned deployments to load unpinned code, weights,
CVE-2026-54233	6.5 MEDIUM	vLLM: OOM Denial of Service via Audio Decompression Bomb
CVE-2026-54236	5.3 MEDIUM	vLLM: incomplete CVE-2026-22778 fix leaks PIL repr addresses via Anthropic router
CVE-2026-54235		vLLM: temperature=NaN and temperature=Infinity bypass validation and propagate to GPU kern

IV. Related Vulnerabilities

Same product: vllm

Same vendor: vllm-project

Same weakness: CWE-681

