From: Mathieu Desnoyers
Date: Thu, 12 Apr 2012 01:07:13 +0000 (-0400)
Subject: Fix: work-around glibc __nptl_setxid vs clone hang
X-Git-Tag: v2.0.1~6
X-Git-Url: http://git.liburcu.org/?a=commitdiff_plain;h=8cf66a2b6551d770a8aedf1301bcec2270041785;hp=8cf66a2b6551d770a8aedf1301bcec2270041785;p=lttng-tools.git

Fix: work-around glibc __nptl_setxid vs clone hang

When a hash table resize thread exits, it ends up setting a "locked"
state within libc pthread. This deadlocks with seteuid/setegid called
from the cloned process in runas.c if runas() is called exactly when
the resize thread exits.

Temporarily fix this issue by adding a mutex across the resize
operation, which provides mutual exclusion with runas() usage.

We should investigate whether we want to properly call exec() from the
runas.c clone child before touching any non-async-signal-safe libc
call. However, given that this change is more intrusive, let's first
use this mutex-based work-around.

Before this fix, running 1000 instances of "demo-trace 300" with
sessiond running as root, and:

    lttng create
    lttng enable-event -u -a
    lttng start

would sometimes lead to a consumerd hang with the following clone child
backtrace:

setxid_mark_thread (cmdp=, t=0x7f52dd47c700) at allocatestack.c:995
995     allocatestack.c: No such file or directory.
(gdb) bt full
    at allocatestack.c:995
        ch =
    at allocatestack.c:1088
        t = 0x80
        signalled =
        result =
        runp = 0x7f52dd47c9c0
    at ../sysdeps/unix/sysv/linux/setegid.c:44
        __p = 0xfffffffffffffe00
        __cmd = {syscall_no = 119, id = {-1, 1000, -1}, cntr = 0}
        result =
        data = 0x7f52e66e1930
        writelen =
        writeleft =
        index =
        sendret = {i = 0, c = "\000\000\000"}
        ret =
        __func__ = "child_run_as"
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
No locals.
No symbol table info available.

Signed-off-by: Mathieu Desnoyers
---
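
The mutex-based work-around described above can be sketched roughly as
follows. This is a minimal illustration, not the patch itself: the names
libc_state_lock, resize_locked(), example_work() and run_as_locked() are
hypothetical, the resize is reduced to doubling a counter, and fork() is
used for brevity where runas.c uses clone. The sketch only shows the lock
discipline (resize and child creation never overlap); it does not
reproduce the glibc hang.

```c
#include <assert.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical lock name: shared by the resize path and the run-as path. */
pthread_mutex_t libc_state_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Stand-in for the hash table resize operation (hypothetical): the real
 * work is replaced by doubling a bucket count. The lock is held across
 * the whole operation.
 */
void resize_locked(unsigned long *nr_buckets)
{
	int ret = pthread_mutex_lock(&libc_state_lock);

	assert(ret == 0);
	*nr_buckets *= 2;
	ret = pthread_mutex_unlock(&libc_state_lock);
	assert(ret == 0);
}

/* Example child work (hypothetical): returns the int pointed to by data. */
int example_work(void *data)
{
	return *(int *)data;
}

/*
 * Run work(data) in a child process while holding the same lock, so the
 * child is never created in the middle of a resize. Returns the child's
 * exit status (0..255), or -1 on error.
 */
int run_as_locked(int (*work)(void *), void *data)
{
	int status, result = -1;
	pid_t pid;

	if (pthread_mutex_lock(&libc_state_lock) != 0)
		return -1;
	pid = fork();
	if (pid == 0)
		_exit(work(data) & 0xff);	/* child: report result, skip atexit handlers */
	if (pid > 0 && waitpid(pid, &status, 0) == pid && WIFEXITED(status))
		result = WEXITSTATUS(status);
	pthread_mutex_unlock(&libc_state_lock);
	return result;
}
```

Holding one process-wide lock on both paths is coarse: every run-as call
waits for any in-progress resize, and the reverse. That cost is the
trade-off for avoiding the more intrusive exec()-based change mentioned
in the message.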