CVE-2024-53169 · nvme-fabrics: fix kernel crash while shutting down controller

EPSS 0.01% · P1

I. Basic Information for CVE-2024-53169

Vulnerability Information


Vulnerability Title
nvme-fabrics: fix kernel crash while shutting down controller
Source: NVD (National Vulnerability Database)
Vulnerability Description
In the Linux kernel, the following vulnerability has been resolved:

nvme-fabrics: fix kernel crash while shutting down controller

The nvme keep-alive operation, which executes at a periodic interval, can sneak in while a fabric controller is being shut down. This may lead to a race between the fabric controller admin queue destroy code path (invoked while shutting down the controller) and the hw/hctx queue dispatcher called from the nvme keep-alive async request queuing operation. This race can lead to the kernel crash shown below:

Call Trace:
    autoremove_wake_function+0x0/0xbc (unreliable)
    __blk_mq_sched_dispatch_requests+0x114/0x24c
    blk_mq_sched_dispatch_requests+0x44/0x84
    blk_mq_run_hw_queue+0x140/0x220
    nvme_keep_alive_work+0xc8/0x19c [nvme_core]
    process_one_work+0x200/0x4e0
    worker_thread+0x340/0x504
    kthread+0x138/0x140
    start_kernel_thread+0x14/0x18

While a fabric controller is shutting down, a keep-alive request that sneaks in is flushed off. The nvme_keep_alive_end_io function is then invoked to handle the end of the keep-alive operation. It decrements admin->q_usage_counter and, assuming this is the last/only request in the admin queue, admin->q_usage_counter becomes zero. If that happens, the blk-mq destroy queue operation (blk_mq_destroy_queue()), which could be running simultaneously on another CPU (as this is the controller shutdown code path), makes forward progress and deletes the admin queue. From this point onward the admin queue resources must not be accessed. However, the nvme keep-alive thread running the hw/hctx queue dispatch operation has not yet finished its work, so it can still access the admin queue resources after the admin queue has been deleted, which causes the above crash.

The above kernel crash is a regression caused by the changes implemented in commit a54a93d0e359 ("nvme: move stopping keep-alive into nvme_uninit_ctrl()").

Ideally, keep-alive should be stopped before destroying the admin queue and freeing the admin tagset so that it cannot sneak in during the shutdown operation. However, commit a54a93d0e359 ("nvme: move stopping keep-alive into nvme_uninit_ctrl()") removed the keep-alive stop operation from the beginning of the controller shutdown code path and added it under nvme_uninit_ctrl(), which executes very late in the shutdown code path, after the admin queue is destroyed and its tagset is removed. This change created the possibility of keep-alive sneaking in, interfering with the shutdown operation, and causing the observed kernel crash.

To fix the observed crash, nvme_stop_keep_alive() is moved from nvme_uninit_ctrl() to nvme_remove_admin_tag_set(). This ensures that the shutdown path does not make forward progress and delete the admin queue until the keep-alive operation is finished (if it is in flight) or cancelled, which contains the race condition explained above and hence avoids the crash. Moving nvme_stop_keep_alive() to nvme_remove_admin_tag_set(), instead of adding nvme_stop_keep_alive() to the beginning of the controller shutdown code path in nvme_stop_ctrl() as was the case before commit a54a93d0e359 ("nvme: move stopping keep-alive into nvme_uninit_ctrl()"), saves one callsite of nvme_stop_keep_alive().
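The ordering problem described above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: every `model_*` name is hypothetical and invented for illustration, none of it is kernel code, and the real race is concurrent across CPUs while this model replays one unlucky interleaving sequentially. It shows why stopping keep-alive before destroying the admin queue leaves the keep-alive work nothing to touch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; these are NOT the kernel's types. */
struct model_queue {
    bool destroyed;
};

struct model_ctrl {
    struct model_queue admin_q;
    bool ka_pending;   /* keep-alive work is queued or in flight */
    int bad_accesses;  /* dispatches that touched a destroyed queue */
};

/* Stands in for nvme_stop_keep_alive(): once it returns,
 * no keep-alive work is pending any more. */
static void model_stop_keep_alive(struct model_ctrl *c)
{
    c->ka_pending = false;
}

/* Stands in for the keep-alive work dispatching on the admin queue. */
static void model_keep_alive_work(struct model_ctrl *c)
{
    if (!c->ka_pending)
        return;
    if (c->admin_q.destroyed)
        c->bad_accesses++; /* in the kernel, this is the crash */
    c->ka_pending = false;
}

/* Stands in for destroying the admin queue. */
static void model_destroy_admin_queue(struct model_ctrl *c)
{
    c->admin_q.destroyed = true;
}

/* Regressed ordering: the queue is destroyed first and keep-alive
 * is only stopped late in the shutdown path. */
static int model_shutdown_regressed(struct model_ctrl *c)
{
    model_destroy_admin_queue(c);
    model_keep_alive_work(c);  /* keep-alive sneaks in here */
    model_stop_keep_alive(c);  /* too late to help */
    return c->bad_accesses;
}

/* Fixed ordering: keep-alive is stopped before the queue goes away. */
static int model_shutdown_fixed(struct model_ctrl *c)
{
    model_stop_keep_alive(c);
    model_destroy_admin_queue(c);
    model_keep_alive_work(c);  /* nothing pending, so a no-op */
    return c->bad_accesses;
}

/* Runs both orderings against a controller with pending keep-alive. */
static void model_check_orderings(void)
{
    struct model_ctrl regressed = { .ka_pending = true };
    struct model_ctrl fixed = { .ka_pending = true };

    assert(model_shutdown_regressed(&regressed) == 1);
    assert(model_shutdown_fixed(&fixed) == 0);
}
```

Calling `model_check_orderings()` from a test or a `main` function exercises both paths. The model deliberately ignores the q_usage_counter handshake and real concurrency; it only captures the ordering that the description says the fix restores.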
Source: NVD (National Vulnerability Database)
CVSS Information
N/A
Source: NVD (National Vulnerability Database)
Vulnerability Type
N/A
Source: NVD (National Vulnerability Database)
Vulnerability Title
Linux kernel security vulnerability
Source: CNNVD (China National Vulnerability Database)
Vulnerability Description
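The Linux kernel is the kernel used by Linux, the open-source operating system of the US-based Linux Foundation. The Linux kernel contains a security vulnerability that stems from the net:9p module not correctly handling memory allocation when usbg handling fails, which may lead to a memory leak.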
Source: CNNVD (China National Vulnerability Database)
CVSS Information
N/A
Source: CNNVD (China National Vulnerability Database)
Vulnerability Type
N/A
Source: CNNVD (China National Vulnerability Database)

Affected Products

Vendor | Product | Affected Versions | CPE
Linux | Linux | a54a93d0e3599b05856971734e15418ac551a14c ~ 30794f4952decb2ec8efa42f704cac5304499a41 | -
Linux | Linux | 6.11 | -

II. Public POCs for CVE-2024-53169


No public POC found.


III. Intelligence Information for CVE-2024-53169


Same Patch Batch · Linux · 2024-12-27 · 221 CVEs total

CVE-2024-56607 · wifi: ath12k: fix atomic calls in ath12k_mac_op_set_bitrate_mask()
CVE-2024-56594 · drm/amdgpu: set the right AMDGPU sg segment limitation
CVE-2024-56595 · jfs: add a check to prevent array-index-out-of-bounds in dbAdjTree
CVE-2024-56596 · jfs: fix array-index-out-of-bounds in jfs_readdir
CVE-2024-56597 · jfs: fix shift-out-of-bounds in dbSplit
CVE-2024-56598 · jfs: array-index-out-of-bounds fix in dtReadFirst
CVE-2024-56599 · wifi: ath10k: avoid NULL pointer error during sdio remove
CVE-2024-56601 · net: inet: do not leave a dangling sk pointer in inet_create()
CVE-2024-56600 · net: inet6: do not leave a dangling sk pointer in inet6_create()
CVE-2024-56602 · net: ieee802154: do not leave a dangling sk pointer in ieee802154_create()
CVE-2024-56603 · net: af_can: do not leave a dangling sk pointer in can_create()
CVE-2024-56604 · Bluetooth: RFCOMM: avoid leaving dangling sk pointer in rfcomm_sock_alloc()
CVE-2024-56605 · Bluetooth: L2CAP: do not leave dangling sk pointer on error in l2cap_sock_create()
CVE-2024-56606 · af_packet: avoid erroring out after sock_init_data() in packet_create()
CVE-2024-56618 · pmdomain: imx: gpcv2: Adjust delay after power up handshake
CVE-2024-56615 · bpf: fix OOB devmap writes when deleting elements
CVE-2024-56616 · drm/dp_mst: Fix MST sideband message body length check
CVE-2024-56617 · cacheinfo: Allocate memory during CPU hotplug if not done from the primary CPU
CVE-2024-56614 · xsk: fix OOB map writes when deleting elements
CVE-2024-56619 · nilfs2: fix potential out-of-bounds memory access in nilfs_find_entry()

Showing top 20 of 221 CVEs.

IV. Related Vulnerabilities
