From 98d705dd32be11b3f134b5aed87f9c201c028c10 Mon Sep 17 00:00:00 2001
From: Mathieu Desnoyers
Date: Wed, 22 Jun 2022 16:38:06 -0400
Subject: [PATCH] Fix: call_rcu: futex wait: handle spurious futex wakeups
MIME-Version: 1.0
Content-Type: text/plain; charset=utf8
Content-Transfer-Encoding: 8bit

Observed issue
==============

The urcu call_rcu() and rcu_barrier() each implement a futex wait/wakeup
scheme identical to the workqueue code, which has an issue with spurious
wakeups.

* call_rcu

A spurious wakeup on call_rcu_wait can cause call_rcu_wait to return
with a crdp->futex state of -1, which is unexpected. It would cause the
following loops in call_rcu_thread() to decrement the crdp->futex to
values below -1, thus actively using CPU time as values will be
decremented to very low negative values until the futex value underflows
back to 0.

The state is *not* restored to 0 when the callback list is found to be
non-empty, so this unexpected state will persist until the crdp->futex
state underflows back to 0, or until the call_rcu_thread is stopped.

What prevents this from having too many user-observable effects is that
the call rcu thread has a 10ms sleep between loops, to favor batching of
callbacks. Therefore, rather than being a purely 100% active busy-wait,
this scenario leads to a busy-wait which is paced by 10ms sleeps.

Therefore the observed issue will be that the call_rcu_thread will
unexpectedly wake up the CPU every 10ms after this spurious wakeup
happens.

* rcu_barrier

A spurious wakeup on call_rcu_completion_wait can cause
call_rcu_completion_wait to return with a completion->futex state of -1,
which is unexpected. It would cause the following loops in rcu_barrier()
to decrement the completion->futex to values below -1, thus actively
using CPU time as values will be decremented to very low negative values
until either the barrier count reaches 0 or until the futex value
underflows to 0.

Therefore the observed issue will be that rcu_barrier() will
unexpectedly use a lot of CPU time when this spurious wakeup happens.

These issues will cause spurious unexpected high CPU use, but will not
lead to data corruption.

Cause
=====

From futex(2):

       FUTEX_WAIT
              Returns 0 if the caller was woken up. Note that a wake-up
              can also be caused by common futex usage patterns in
              unrelated code that happened to have previously used the
              futex word's memory location (e.g., typical futex-based
              implementations of Pthreads mutexes can cause this under
              some conditions). Therefore, callers should always
              conservatively assume that a return value of 0 can mean a
              spurious wake-up, and use the futex word's value (i.e.,
              the user-space synchronization scheme) to decide whether
              to continue to block or not.

Solution
========

We therefore need to validate whether the value differs from -1 in
user-space after the call to FUTEX_WAIT returns 0.

Known drawbacks
===============

None.

Signed-off-by: Mathieu Desnoyers
Change-Id: I3e625f1689462f8eb9f1223b5b24b1a754bad324
---
 src/urcu-call-rcu-impl.h | 40 ++++++++++++++++++++++++++++------------
 1 file changed, 28 insertions(+), 12 deletions(-)

diff --git a/src/urcu-call-rcu-impl.h b/src/urcu-call-rcu-impl.h
index 4392bc6..2ad02eb 100644
--- a/src/urcu-call-rcu-impl.h
+++ b/src/urcu-call-rcu-impl.h
@@ -240,17 +240,25 @@ static void call_rcu_wait(struct call_rcu_data *crdp)
 {
 	/* Read call_rcu list before read futex */
 	cmm_smp_mb();
-	if (uatomic_read(&crdp->futex) != -1)
-		return;
-	while (futex_async(&crdp->futex, FUTEX_WAIT, -1,
-			NULL, NULL, 0)) {
+	while (uatomic_read(&crdp->futex) == -1) {
+		if (!futex_async(&crdp->futex, FUTEX_WAIT, -1, NULL, NULL, 0)) {
+			/*
+			 * Prior queued wakeups queued by unrelated code
+			 * using the same address can cause futex wait to
+			 * return 0 even through the futex value is still
+			 * -1 (spurious wakeups). Check the value again
+			 * in user-space to validate whether it really
+			 * differs from -1.
+			 */
+			continue;
+		}
 		switch (errno) {
-		case EWOULDBLOCK:
+		case EAGAIN:
 			/* Value already changed. */
 			return;
 		case EINTR:
 			/* Retry if interrupted by signal. */
-			break;	/* Get out of switch. */
+			break;	/* Get out of switch. Check again. */
 		default:
 			/* Unexpected error. */
 			urcu_die(errno);
@@ -274,17 +282,25 @@ static void call_rcu_completion_wait(struct call_rcu_completion *completion)
 {
 	/* Read completion barrier count before read futex */
 	cmm_smp_mb();
-	if (uatomic_read(&completion->futex) != -1)
-		return;
-	while (futex_async(&completion->futex, FUTEX_WAIT, -1,
-			NULL, NULL, 0)) {
+	while (uatomic_read(&completion->futex) == -1) {
+		if (!futex_async(&completion->futex, FUTEX_WAIT, -1, NULL, NULL, 0)) {
+			/*
+			 * Prior queued wakeups queued by unrelated code
+			 * using the same address can cause futex wait to
+			 * return 0 even through the futex value is still
+			 * -1 (spurious wakeups). Check the value again
+			 * in user-space to validate whether it really
+			 * differs from -1.
+			 */
+			continue;
+		}
 		switch (errno) {
-		case EWOULDBLOCK:
+		case EAGAIN:
 			/* Value already changed. */
 			return;
 		case EINTR:
 			/* Retry if interrupted by signal. */
-			break;	/* Get out of switch. */
+			break;	/* Get out of switch. Check again. */
 		default:
 			/* Unexpected error. */
 			urcu_die(errno);
-- 
2.34.1
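
Reviewer note: the control flow of the fixed wait loop can be exercised without a real futex. The sketch below is only an illustration, not liburcu code. fake_futex_wait() and wait_until_changed() are hypothetical names; the stub is assumed to return 0 once while the word is still -1 (a spurious wakeup), then fail with EINTR, then perform a genuine wakeup. A plain int32_t read stands in for uatomic_read(), and the unexpected-error path returns -1 where the patch calls urcu_die().

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Number of calls made to the stub below. */
static int fake_calls;

/*
 * Hypothetical stand-in for futex_async(addr, FUTEX_WAIT, expected, ...).
 * Call 1: spurious wakeup (returns 0, word unchanged).
 * Call 2: interrupted by a signal (returns -1, errno = EINTR).
 * Call 3: genuine wakeup (the "waker" resets the word, returns 0).
 */
static int fake_futex_wait(int32_t *addr, int32_t expected)
{
	fake_calls++;
	if (*addr != expected) {
		errno = EAGAIN;		/* Value already changed. */
		return -1;
	}
	if (fake_calls == 1)
		return 0;		/* Spurious: word is still -1. */
	if (fake_calls == 2) {
		errno = EINTR;
		return -1;
	}
	*addr = 0;			/* Genuine wakeup. */
	return 0;
}

/*
 * Same shape as the patched call_rcu_wait(): the futex word is
 * re-checked in user-space after every return from the wait call.
 * Returns the number of times the word was found to be -1, or -1 on
 * an unexpected error.
 */
static int wait_until_changed(int32_t *addr)
{
	int checks = 0;

	while (*addr == -1) {	/* Real code uses uatomic_read(). */
		checks++;
		if (!fake_futex_wait(addr, -1))
			continue;	/* Possibly spurious: check again. */
		switch (errno) {
		case EAGAIN:
			return checks;	/* Value already changed. */
		case EINTR:
			break;		/* Get out of switch. Check again. */
		default:
			return -1;	/* Unexpected error. */
		}
	}
	assert(*addr != -1);	/* Never return while the word is -1. */
	return checks;
}
```

With the pre-patch structure, the first call returning 0 would have ended the wait with the word still at -1, which is the state the commit message describes as unexpected.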