Goal Reached Thanks to every supporter — we hit 100%!

Goal: 1000 CNY · Raised: 1000 CNY

100.0%

CVE-2025-38104— drm/amdgpu: Replace Mutex with Spinlock for RLCG register access to avoid Priority Inversion in SRIOV

EPSS 0.08% · P23
Get alerts for future matching vulnerabilitiesLog in to subscribe

I. Basic Information for CVE-2025-38104

Vulnerability Information

Have questions about the vulnerability? See if Shenlong's analysis helps!
View Shenlong Deep Dive ↗

Although we use advanced large model technology, its output may still contain inaccurate or outdated information.Shenlong tries to ensure data accuracy, but please verify and judge based on the actual situation.

Vulnerability Title
drm/amdgpu: Replace Mutex with Spinlock for RLCG register access to avoid Priority Inversion in SRIOV
Source: NVD (National Vulnerability Database)
Vulnerability Description
In the Linux kernel, the following vulnerability has been resolved: drm/amdgpu: Replace Mutex with Spinlock for RLCG register access to avoid Priority Inversion in SRIOV RLCG Register Access is a way for virtual functions to safely access GPU registers in a virtualized environment., including TLB flushes and register reads. When multiple threads or VFs try to access the same registers simultaneously, it can lead to race conditions. By using the RLCG interface, the driver can serialize access to the registers. This means that only one thread can access the registers at a time, preventing conflicts and ensuring that operations are performed correctly. Additionally, when a low-priority task holds a mutex that a high-priority task needs, ie., If a thread holding a spinlock tries to acquire a mutex, it can lead to priority inversion. register access in amdgpu_virt_rlcg_reg_rw especially in a fast code path is critical. The call stack shows that the function amdgpu_virt_rlcg_reg_rw is being called, which attempts to acquire the mutex. This function is invoked from amdgpu_sriov_wreg, which in turn is called from gmc_v11_0_flush_gpu_tlb. The [ BUG: Invalid wait context ] indicates that a thread is trying to acquire a mutex while it is in a context that does not allow it to sleep (like holding a spinlock). Fixes the below: [ 253.013423] ============================= [ 253.013434] [ BUG: Invalid wait context ] [ 253.013446] 6.12.0-amdstaging-drm-next-lol-050225 #14 Tainted: G U OE [ 253.013464] ----------------------------- [ 253.013475] kworker/0:1/10 is trying to lock: [ 253.013487] ffff9f30542e3cf8 (&adev->virt.rlcg_reg_lock){+.+.}-{3:3}, at: amdgpu_virt_rlcg_reg_rw+0xf6/0x330 [amdgpu] [ 253.013815] other info that might help us debug this: [ 253.013827] context-{4:4} [ 253.013835] 3 locks held by kworker/0:1/10: [ 253.013847] #0: ffff9f3040050f58 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x3f5/0x680 [ 253.013877] #1: ffffb789c008be40 ((work_completion)(&wfc.work)){+.+.}-{0:0}, at: process_one_work+0x1d6/0x680 [ 253.013905] #2: ffff9f3054281838 (&adev->gmc.invalidate_lock){+.+.}-{2:2}, at: gmc_v11_0_flush_gpu_tlb+0x198/0x4f0 [amdgpu] [ 253.014154] stack backtrace: [ 253.014164] CPU: 0 UID: 0 PID: 10 Comm: kworker/0:1 Tainted: G U OE 6.12.0-amdstaging-drm-next-lol-050225 #14 [ 253.014189] Tainted: [U]=USER, [O]=OOT_MODULE, [E]=UNSIGNED_MODULE [ 253.014203] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v4.1 11/18/2024 [ 253.014224] Workqueue: events work_for_cpu_fn [ 253.014241] Call Trace: [ 253.014250] <TASK> [ 253.014260] dump_stack_lvl+0x9b/0xf0 [ 253.014275] dump_stack+0x10/0x20 [ 253.014287] __lock_acquire+0xa47/0x2810 [ 253.014303] ? srso_alias_return_thunk+0x5/0xfbef5 [ 253.014321] lock_acquire+0xd1/0x300 [ 253.014333] ? amdgpu_virt_rlcg_reg_rw+0xf6/0x330 [amdgpu] [ 253.014562] ? __lock_acquire+0xa6b/0x2810 [ 253.014578] __mutex_lock+0x85/0xe20 [ 253.014591] ? amdgpu_virt_rlcg_reg_rw+0xf6/0x330 [amdgpu] [ 253.014782] ? sched_clock_noinstr+0x9/0x10 [ 253.014795] ? srso_alias_return_thunk+0x5/0xfbef5 [ 253.014808] ? local_clock_noinstr+0xe/0xc0 [ 253.014822] ? amdgpu_virt_rlcg_reg_rw+0xf6/0x330 [amdgpu] [ 253.015012] ? srso_alias_return_thunk+0x5/0xfbef5 [ 253.015029] mutex_lock_nested+0x1b/0x30 [ 253.015044] ? mutex_lock_nested+0x1b/0x30 [ 253.015057] amdgpu_virt_rlcg_reg_rw+0xf6/0x330 [amdgpu] [ 253.015249] amdgpu_sriov_wreg+0xc5/0xd0 [amdgpu] [ 253.015435] gmc_v11_0_flush_gpu_tlb+0x44b/0x4f0 [amdgpu] [ 253.015667] gfx_v11_0_hw_init+0x499/0x29c0 [amdgpu] [ 253.015901] ? __pfx_smu_v13_0_update_pcie_parameters+0x10/0x10 [amdgpu] [ 253.016159] ? srso_alias_return_thunk+0x5/0xfbef5 [ 253.016173] ? smu_hw_init+0x18d/0x300 [amdgpu] [ 253.016403] amdgpu_device_init+0x29ad/0x36a0 [amdgpu] [ 253.016614] amdgpu_driver_load_kms+0x1a/0xc0 [amdgpu] [ 253.0170 ---truncated---
Source: NVD (National Vulnerability Database)
CVSS Information
N/A
Source: NVD (National Vulnerability Database)
Vulnerability Type
N/A
Source: NVD (National Vulnerability Database)
Vulnerability Title
Linux kernel 安全漏洞
Source: CNNVD (China National Vulnerability Database)
Vulnerability Description
Linux kernel是美国Linux基金会的开源操作系统Linux所使用的内核。 Linux kernel存在安全漏洞,该漏洞源于RLCG寄存器访问使用互斥锁而非自旋锁,可能导致优先级反转。
Source: CNNVD (China National Vulnerability Database)
CVSS Information
N/A
Source: CNNVD (China National Vulnerability Database)
Vulnerability Type
N/A
Source: CNNVD (China National Vulnerability Database)

Affected Products

VendorProductAffected VersionsCPESubscribe
LinuxLinux f39a3bc42815a7016a915f6cb35e9a1448788f06 ~ dd450b513718dfeb4c637c9335d51a55ebcd4320 -
LinuxLinux 6.11 -

II. Public POCs for CVE-2025-38104

#POC DescriptionSource LinkShenlong Link
AI-Generated POCPremium

No public POC found.

Login to generate AI POC

III. Intelligence Information for CVE-2025-38104

登录查看更多情报信息。

Same Patch Batch · Linux · 2025-04-18 · 23 CVEs total

CVE-2025-38637net_sched: skbprio: Remove overly strict queue assertions
CVE-2025-37785ext4: fix OOB read when checking dotdot dir
CVE-2025-37860sfc: fix NULL dereferences in ef100_process_design_param()
CVE-2025-37925jfs: reject on-disk inodes of an unsupported type
CVE-2025-37893LoongArch: BPF: Fix off-by-one error in build_prologue()
CVE-2025-38049x86/resctrl: Fix allocation of cleanest CLOSID on platforms with no monitors
CVE-2025-38152remoteproc: core: Clear table_sz when rproc_shutdown
CVE-2025-38240drm/mediatek: dp: drm_err => dev_err in HPD path to avoid NULL ptr
CVE-2025-38479dmaengine: fsl-edma: free irq correctly in remove path
CVE-2025-38575ksmbd: use aead_request_free to match aead_request_alloc
CVE-2025-39688nfsd: allow SC_STATUS_FREEABLE when searching via nfs4_lookup_stateid()
CVE-2025-37838HSI: ssi_protocol: Fix use after free vulnerability in ssi_protocol Driver Due to Race Con
CVE-2025-39735jfs: fix slab-out-of-bounds read in ea_get()
CVE-2025-39728clk: samsung: Fix UBSAN panic in samsung_clk_init()
CVE-2025-39755staging: gpib: Fix cb7210 pcmcia Oops
CVE-2025-39778objtool, nvmet: Fix out-of-bounds stack access in nvmet_ctrl_state_show()
CVE-2025-39930ASoC: simple-card-utils: Don't use __free(device_node) at graph_util_parse_dai()
CVE-2025-39989x86/mce: use is_copy_from_user() to determine copy-from-user context
CVE-2025-40014objtool, spi: amd: Fix out-of-bounds stack access in amd_set_spi_freq()
CVE-2025-40114iio: light: Add check for array bounds in veml6075_read_int_time_ms

Showing top 20 of 23 CVEs. View all on vendor page &rarr; →

IV. Related Vulnerabilities

V. Comments for CVE-2025-38104

No comments yet


Leave a comment