From: Mathieu Desnoyers
Date: Fri, 3 Feb 2023 19:11:50 +0000 (-0500)
Subject: Fix: auto-resize hash table destroy deadlock
X-Git-Tag: v0.14.0~9
X-Git-Url: https://git.liburcu.org/?a=commitdiff_plain;h=b047e7a793421e3ff1f5dca2b27c72751a1f4db4;hp=b047e7a793421e3ff1f5dca2b27c72751a1f4db4;p=urcu.git

Fix: auto-resize hash table destroy deadlock

Fix a deadlock for auto-resize hash tables when cds_lfht_destroy is
called with the RCU read-side lock held.

Example stack trace of a hang:

Thread 2 (Thread 0x7f21ba876700 (LWP 26114)):
#0  syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
#1  0x00007f21beba7aa0 in futex (val3=0, uaddr2=0x0, timeout=0x0, val=-1, op=0, uaddr=0x7f21bedac308 ) at ../include/urcu/futex.h:81
#2  futex_noasync (timeout=0x0, uaddr2=0x0, val3=0, val=-1, op=0, uaddr=0x7f21bedac308 ) at ../include/urcu/futex.h:90
#3  wait_gp () at urcu.c:265
#4  wait_for_readers (input_readers=input_readers@entry=0x7f21ba8751b0, cur_snap_readers=cur_snap_readers@entry=0x0, qsreaders=qsreaders@entry=0x7f21ba8751c0) at urcu.c:357
#5  0x00007f21beba8339 in urcu_memb_synchronize_rcu () at urcu.c:498
#6  0x00007f21be99f93f in fini_table (last_order=, first_order=13, ht=0x5651cec75400) at rculfhash.c:1489
#7  _do_cds_lfht_shrink (new_size=, old_size=, ht=0x5651cec75400) at rculfhash.c:2001
#8  _do_cds_lfht_resize (ht=ht@entry=0x5651cec75400) at rculfhash.c:2023
#9  0x00007f21be99fa26 in do_resize_cb (work=0x5651e20621a0) at rculfhash.c:2063
#10 0x00007f21be99dbfd in workqueue_thread (arg=0x5651cec74a00) at workqueue.c:234
#11 0x00007f21bd7c06db in start_thread (arg=0x7f21ba876700) at pthread_create.c:463
#12 0x00007f21bd4e961f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 1 (Thread 0x7f21bf285300 (LWP 26098)):
#0  syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
#1  0x00007f21be99d8b7 in futex (val3=0, uaddr2=0x0, timeout=0x0, val=-1, op=0, uaddr=0x5651d8b38584) at ../include/urcu/futex.h:81
#2  futex_async (timeout=0x0, uaddr2=0x0, val3=0, val=-1, op=0, uaddr=0x5651d8b38584) at ../include/urcu/futex.h:113
#3  futex_wait (futex=futex@entry=0x5651d8b38584) at workqueue.c:135
#4  0x00007f21be99e2c8 in urcu_workqueue_wait_completion (completion=completion@entry=0x5651d8b38580) at workqueue.c:423
#5  0x00007f21be99e3f9 in urcu_workqueue_flush_queued_work (workqueue=0x5651cec74a00) at workqueue.c:452
#6  0x00007f21be9a0c83 in cds_lfht_destroy (ht=0x5651d8b2fcf0, attr=attr@entry=0x0) at rculfhash.c:1906

This deadlock is easy to reproduce by rapidly adding a large number of
entries to the cds_lfht, removing them, and calling cds_lfht_destroy().

The deadlock occurs if the call to cds_lfht_destroy() takes place while
a resize of the hash table is ongoing.

Fix this by moving the teardown of the lfht worker thread to the libcds
library destructor, so it does not have to wait on synchronize_rcu from
a resize callback from within a read-side critical section. As a
consequence, the atfork callbacks are left registered within each urcu
flavor for which a resizeable hash table is created until the end of
the executable lifetime.

The other part of the fix is to move the hash table destruction to the
worker thread for auto-resize hash tables. This prevents having to wait
for resize callbacks from within an RCU read-side critical section.
This is guaranteed by the fact that the worker thread serializes
previously queued resize callbacks before the destroy callback.

Signed-off-by: Mathieu Desnoyers
Change-Id: If8b1c3c8063dc7b9846dc5c3fc452efd917eab4d
---
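
Note on the ordering guarantee: the sketch below is a toy model of the second part of the fix, not the liburcu implementation. It assumes a plain FIFO job queue and is single-threaded, with toy_worker_run() standing in for the worker thread. All names in it (toy_queue, toy_table, toy_enqueue, toy_worker_run, toy_demo) are hypothetical and do not exist in liburcu. It only illustrates why queuing the destroy behind pending resizes means the caller never has to wait for them.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_JOBS 16

/* Hypothetical job kinds; the real code queues work callbacks. */
enum toy_job { TOY_RESIZE, TOY_DESTROY };

struct toy_table {
	int alive;		/* cleared by the destroy job */
	int resizes_done;	/* resize jobs that ran while still alive */
};

struct toy_queue {
	enum toy_job jobs[TOY_MAX_JOBS];
	size_t head, tail;	/* FIFO: dequeue at head, enqueue at tail */
};

/* Queue a job and return immediately; the caller never waits. */
void toy_enqueue(struct toy_queue *q, enum toy_job job)
{
	assert(q->tail < TOY_MAX_JOBS);
	q->jobs[q->tail++] = job;
}

/*
 * Stand-in for the worker thread: run queued jobs in FIFO order.
 * Returns 0 if a resize ever ran on a destroyed table, 1 otherwise.
 */
int toy_worker_run(struct toy_queue *q, struct toy_table *t)
{
	while (q->head < q->tail) {
		switch (q->jobs[q->head++]) {
		case TOY_RESIZE:
			if (!t->alive)
				return 0;	/* would be a use after destroy */
			t->resizes_done++;
			break;
		case TOY_DESTROY:
			t->alive = 0;
			break;
		}
	}
	return 1;
}

/*
 * Queue nr_resizes resize jobs (fewer than TOY_MAX_JOBS), then one
 * destroy job, then let the worker drain the queue. Returns 1 when
 * every resize ran before the destroy.
 */
int toy_demo(int nr_resizes)
{
	struct toy_queue q = { .head = 0, .tail = 0 };
	struct toy_table t = { .alive = 1, .resizes_done = 0 };
	int i, ok;

	for (i = 0; i < nr_resizes; i++)
		toy_enqueue(&q, TOY_RESIZE);
	toy_enqueue(&q, TOY_DESTROY);	/* deferred, not run inline */

	ok = toy_worker_run(&q, &t);
	return ok && !t.alive && t.resizes_done == nr_resizes;
}
```

In this model the caller of toy_enqueue(&q, TOY_DESTROY) returns without flushing the queue, which mirrors why the destroy no longer blocks inside a read-side critical section. The FIFO order alone ensures the destroy job runs after every resize queued before it.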