From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Date: Sat, 1 Mar 2014 16:33:25 +0000 (-0500)
Subject: Fix: high cpu usage in synchronize_rcu with long RCU read-side C.S.
X-Git-Tag: v0.9.0~101
X-Git-Url: https://git.liburcu.org/?p=urcu.git;a=commitdiff_plain;h=9340c38dbff1b407f35008f7f585a238fbd4de1c;hp=9340c38dbff1b407f35008f7f585a238fbd4de1c

Fix: high cpu usage in synchronize_rcu with long RCU read-side C.S.

We noticed that with this kind of scenario:
- application using urcu-mb, urcu-membarrier, urcu-signal, or urcu-bp,
- long RCU read-side critical sections, caused by e.g. long network I/O
  system calls,
- other short lived RCU critical sections running in other threads,
- very frequent invocation of call_rcu to enqueue callbacks,
lead to abnormally high CPU usage within synchronize_rcu() in the
call_rcu worker threads.

Inspection of the code gives us the answer: in urcu.c, we expect that if
we need to wait on a futex (wait_gp()), we expect to be able to end the
grace period within the next loop, having been notified by a
rcu_read_unlock(). However, this is not always the case: we can very
well be awakened by a rcu_read_unlock() executed on a thread running
short-lived RCU read-side critical sections, while the long-running RCU
read-side C.S. is still active. We end up in a situation where we
busy-wait for a very long time, because the counter is !=
RCU_QS_ACTIVE_ATTEMPTS until a 32-bit overflow happens (or more likely,
until we complete the grace period). We need to change the wait_loops ==
RCU_QS_ACTIVE_ATTEMPTS check into an inequality to use wait_gp() for
every attempts beyond RCU_QS_ACTIVE_ATTEMPTS loops.

urcu-bp.c also has this issue. Moreover, it uses usleep() rather than
poll() when dealing with long-running RCU read-side critical sections.
Turn the usleep 1000us (1ms) into a poll of 10ms. One of the advantage
of using poll() rather than usleep() is that it does not interact with
SIGALRM.

urcu-qsbr.c already checks for wait_loops >= RCU_QS_ACTIVE_ATTEMPTS, so
it is not affected by this issue.

Looking into these loops, however, shows that overflow of the loop
counter, although unlikely, would bring us back to a situation of high
cpu usage (a negative value well below RCU_QS_ACTIVE_ATTEMPTS).
Therefore, change the counter behavior so it stops incrementing when it
reaches RCU_QS_ACTIVE_ATTEMPTS, to eliminate overflow.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
---