• 公告ID (KylinSec-SA-2024-4583)

摘要:

In the Linux kernel, the following vulnerability has been resolved:

rcu/nocb: Fix missed RCU barrier on deoffloading

Currently, running rcutorture test with torture_type=rcu fwd_progress=8
n_barrier_cbs=8 nocbs_nthreads=8 nocbs_toggle=100 onoff_interval=60
test_boost=2, will trigger the following warning:

WARNING: CPU: 19 PID: 100 at kernel/rcu/tree_nocb.h:1061 rcu_nocb_rdp_deoffload+0x292/0x2a0
RIP: 0010:rcu_nocb_rdp_deoffload+0x292/0x2a0
Call Trace:
<TASK&gt;
? __warn+0x7e/0x120
? rcu_nocb_rdp_deoffload+0x292/0x2a0
? report_bug+0x18e/0x1a0
? handle_bug+0x3d/0x70
? exc_invalid_op+0x18/0x70
? asm_exc_invalid_op+0x1a/0x20
? rcu_nocb_rdp_deoffload+0x292/0x2a0
rcu_nocb_cpu_deoffload+0x70/0xa0
rcu_nocb_toggle+0x136/0x1c0
? __pfx_rcu_nocb_toggle+0x10/0x10
kthread+0xd1/0x100
? __pfx_kthread+0x10/0x10
ret_from_fork+0x2f/0x50
? __pfx_kthread+0x10/0x10
ret_from_fork_asm+0x1a/0x30
</TASK&gt;

CPU0 CPU2 CPU3
//rcu_nocb_toggle //nocb_cb_wait //rcutorture

// deoffload CPU1 // process CPU1's rdp
rcu_barrier()
rcu_segcblist_entrain()
rcu_segcblist_add_len(1);
// len == 2
// enqueue barrier
// callback to CPU1's
// rdp-&gt;cblist
rcu_do_batch()
// invoke CPU1's rdp-&gt;cblist
// callback
rcu_barrier_callback()
rcu_barrier()
mutex_lock(&amp;rcu_state.barrier_mutex);
// still see len == 2
// enqueue barrier callback
// to CPU1's rdp-&gt;cblist
rcu_segcblist_entrain()
rcu_segcblist_add_len(1);
// len == 3
// decrement len
rcu_segcblist_add_len(-2);
kthread_parkme()

// CPU1's rdp-&gt;cblist len == 1
// Warn because there is
// still a pending barrier
// trigger warning
WARN_ON_ONCE(rcu_segcblist_n_cbs(&amp;rdp-&gt;cblist));
cpus_read_unlock();

// wait CPU1 to comes online and
// invoke barrier callback on
// CPU1 rdp's-&gt;cblist
wait_for_completion(&amp;rcu_state.barrier_completion);
// deoffload CPU4
cpus_read_lock()
rcu_barrier()
mutex_lock(&amp;rcu_state.barrier_mutex);
// block on barrier_mutex
// wait rcu_barrier() on
// CPU3 to unlock barrier_mutex
// but CPU3 unlock barrier_mutex
// need to wait CPU1 comes online
// when CPU1 going online will block on cpus_write_lock

The above scenario will not only trigger a WARN_ON_ONCE(), but also
trigger a deadlock.

Thanks to nocb locking, a second racing rcu_barrier() on an offline CPU
will either observe the decremented callback counter down to 0 and spare
the callback enqueue, or rcuo will observe the new callback and keep
rdp-&gt;nocb_cb_sleep to false.

Therefore check rdp-&gt;nocb_cb_sleep before parking to make sure no
further rcu_barrier() is waiting on the rdp.

安全等级: Low

公告ID: KylinSec-SA-2024-4583

发布日期: 2025年1月15日

关联CVE: CVE-2024-56547  

  • 详细介绍

1. 漏洞描述

   

In the Linux kernel, the following vulnerability has been resolved:

rcu/nocb: Fix missed RCU barrier on deoffloading

Currently, running rcutorture test with torture_type=rcu fwd_progress=8
n_barrier_cbs=8 nocbs_nthreads=8 nocbs_toggle=100 onoff_interval=60
test_boost=2, will trigger the following warning:

WARNING: CPU: 19 PID: 100 at kernel/rcu/tree_nocb.h:1061 rcu_nocb_rdp_deoffload+0x292/0x2a0
RIP: 0010:rcu_nocb_rdp_deoffload+0x292/0x2a0
Call Trace:
<TASK&gt;
? __warn+0x7e/0x120
? rcu_nocb_rdp_deoffload+0x292/0x2a0
? report_bug+0x18e/0x1a0
? handle_bug+0x3d/0x70
? exc_invalid_op+0x18/0x70
? asm_exc_invalid_op+0x1a/0x20
? rcu_nocb_rdp_deoffload+0x292/0x2a0
rcu_nocb_cpu_deoffload+0x70/0xa0
rcu_nocb_toggle+0x136/0x1c0
? __pfx_rcu_nocb_toggle+0x10/0x10
kthread+0xd1/0x100
? __pfx_kthread+0x10/0x10
ret_from_fork+0x2f/0x50
? __pfx_kthread+0x10/0x10
ret_from_fork_asm+0x1a/0x30
</TASK&gt;

CPU0 CPU2 CPU3
//rcu_nocb_toggle //nocb_cb_wait //rcutorture

// deoffload CPU1 // process CPU1's rdp
rcu_barrier()
rcu_segcblist_entrain()
rcu_segcblist_add_len(1);
// len == 2
// enqueue barrier
// callback to CPU1's
// rdp-&gt;cblist
rcu_do_batch()
// invoke CPU1's rdp-&gt;cblist
// callback
rcu_barrier_callback()
rcu_barrier()
mutex_lock(&amp;rcu_state.barrier_mutex);
// still see len == 2
// enqueue barrier callback
// to CPU1's rdp-&gt;cblist
rcu_segcblist_entrain()
rcu_segcblist_add_len(1);
// len == 3
// decrement len
rcu_segcblist_add_len(-2);
kthread_parkme()

// CPU1's rdp-&gt;cblist len == 1
// Warn because there is
// still a pending barrier
// trigger warning
WARN_ON_ONCE(rcu_segcblist_n_cbs(&amp;rdp-&gt;cblist));
cpus_read_unlock();

// wait CPU1 to comes online and
// invoke barrier callback on
// CPU1 rdp's-&gt;cblist
wait_for_completion(&amp;rcu_state.barrier_completion);
// deoffload CPU4
cpus_read_lock()
rcu_barrier()
mutex_lock(&amp;rcu_state.barrier_mutex);
// block on barrier_mutex
// wait rcu_barrier() on
// CPU3 to unlock barrier_mutex
// but CPU3 unlock barrier_mutex
// need to wait CPU1 comes online
// when CPU1 going online will block on cpus_write_lock

The above scenario will not only trigger a WARN_ON_ONCE(), but also
trigger a deadlock.

Thanks to nocb locking, a second racing rcu_barrier() on an offline CPU
will either observe the decremented callback counter down to 0 and spare
the callback enqueue, or rcuo will observe the new callback and keep
rdp-&gt;nocb_cb_sleep to false.

Therefore check rdp-&gt;nocb_cb_sleep before parking to make sure no
further rcu_barrier() is waiting on the rdp.

2. 影响范围

cve名称 产品 组件 是否受影响
CVE-2024-56547 KY3.4-5 kernel Unaffected
CVE-2024-56547 KY3.5.3 kernel Unaffected
CVE-2024-56547 V6 kernel Unaffected

3. 影响组件

    无

4. 修复版本

    无

5. 修复方法

   无

6. 下载链接

    无
上一篇:KylinSec-SA-2024-4582 下一篇:KylinSec-SA-2024-4584