From: Jonathan Rajotte
Date: Mon, 12 Jul 2021 20:44:38 +0000 (-0400)
Subject: Fix: ust: segfault on lttng start on filter bytecode copy
X-Git-Tag: v2.11.8~4
X-Git-Url: http://git.liburcu.org/?a=commitdiff_plain;h=b77e9a3d093ea10ac62d7aac4921f0508825dffc;hp=b77e9a3d093ea10ac62d7aac4921f0508825dffc;p=lttng-tools.git

Fix: ust: segfault on lttng start on filter bytecode copy

Observed issue
==============

A segmentation fault is observed for multiple UST timeout scenarios.

Backtrace:

  #0  __memmove_avx_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:384
  #1  0x0000557fe0395df9 in copy_filter_bytecode (orig_f=0x7f9c5802b790) at ust-app.c:1196
  #2  0x0000557fe0397702 in shadow_copy_event (ua_event=0x7f9c58025ff0, uevent=0x7f9c58033560) at ust-app.c:1824
  #3  0x0000557fe039ac46 in create_ust_app_event (ua_sess=0x7f9c5802ec20, ua_chan=0x7f9c58025cc0, uevent=0x7f9c58033560, app=0x7f9c5c001da0) at ust-app.c:3192
  #4  0x0000557fe03a054d in ust_app_channel_synchronize_event (ua_chan=0x7f9c58025cc0, uevent=0x7f9c58033560, ua_sess=0x7f9c5802ec20, app=0x7f9c5c001da0) at ust-app.c:5096
  #5  0x0000557fe03a0772 in ust_app_synchronize (usess=0x7f9c580074a0, app=0x7f9c5c001da0) at ust-app.c:5173
  #6  0x0000557fe03a0a70 in ust_app_global_update (usess=0x7f9c580074a0, app=0x7f9c5c001da0) at ust-app.c:5255
  #7  0x0000557fe03a00e0 in ust_app_start_trace_all (usess=0x7f9c580074a0) at ust-app.c:4987
  #8  0x0000557fe0355c6a in cmd_start_trace (session=0x7f9c5800a190) at cmd.c:2668
  #9  0x0000557fe0382e70 in process_client_msg (cmd_ctx=0x7f9c58003d70, sock=0x7f9c74bf44e0, sock_error=0x7f9c74bf44e4) at client.c:1527
  #10 0x0000557fe03848a2 in thread_manage_clients (data=0x557fe06d9440) at client.c:2200
  #11 0x0000557fe037d1cb in launch_thread (data=0x557fe06d94b0) at thread.c:75
  #12 0x00007f9c796af609 in start_thread (arg=) at pthread_create.c:477
  #13 0x00007f9c795b6293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

The scenario:

  # Start an instrumented app
  ./app
  gdb lttng-sessiond
  # put a breakpoint on ustctl_set_filter
  lttng create my_session
  lttng enable-event -u tp:tp_test
  lttng start
  lttng enable-event -u __dummy --filter 'my_field == "user34"'
  # The tracepoint should hit. Do not continue.
  kill -s SIGSTOP $(pgrep app)
  # Continue lttng-sessiond.
  # enable-event will return an error. This is a bug in itself, but let's
  # continue with the current bug.
  lttng stop
  # Start a new app that will register.
  ./app &
  sleep 1
  lttng start
  # lttng-sessiond should segfault.

Cause
=====

During the "lttng enable-event" command, the timeout error bubbles up
all the way to event_ust_enable_tracepoint and is different from
LTTNG_UST_ERR_EXIST. As a result, `trace_ust_destroy_event` is called
and frees the `uevent` object. Note that, contrary to the comment,
`uevent` has already been added to the channel event hash table at this
point.

On the next `lttng start` command, the event node is still present in
the hash table and is iterated on. lttng-sessiond segfaults on the first
access to the previously freed memory.

The problem was introduced by commit
88e3c2f5610b9ac89b0923d448fee34140fc46fb [1], which essentially moved
the call site of `add_unique_ust_event` before the `ust_app_*_event_glb`
calls.

Solution
========

Go to the `end` label to prevent freeing the `uevent` object.

Note that app synchronization should not force an error at the channel
level, since a single app can fail but the whole channel should not.

The `error` label is now obsolete.

Known drawbacks
===============

None.

References
==========

[1] https://github.com/lttng/lttng-tools/commit/88e3c2f5610b9ac89b0923d448fee34140fc46fb

Signed-off-by: Jonathan Rajotte
Change-Id: Ifaf3f4c71bb2da869c7b441aaa4b367f8f7cbdd6
Signed-off-by: Jérémie Galarneau
---
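
Illustration (not part of the patch): the ownership problem described in
the Cause and Solution sections can be sketched in a few lines of C. This
is a minimal sketch under stated assumptions, not the lttng-tools code:
`struct event`, `struct registry`, `registry_add`, `registry_count`,
`registry_destroy` and `enable_event` are hypothetical names, a singly
linked list stands in for the channel event hash table, and the `sync_ret`
parameter simulates the result of pushing the event to the applications.
Returning the error code to the caller is also an assumption of the sketch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins; these are NOT lttng-tools types. */
struct event {
	char name[64];
	struct event *next;	/* link inside the registry */
};

struct registry {
	struct event *head;	/* a list plays the role of the event hash table */
};

static void registry_add(struct registry *reg, struct event *ev)
{
	ev->next = reg->head;
	reg->head = ev;
}

static size_t registry_count(const struct registry *reg)
{
	size_t n = 0;

	for (const struct event *ev = reg->head; ev; ev = ev->next) {
		n++;
	}
	return n;
}

/* The registry owns its events: they are freed only when it is torn down. */
static void registry_destroy(struct registry *reg)
{
	while (reg->head) {
		struct event *ev = reg->head;

		reg->head = ev->next;
		free(ev);
	}
}

/*
 * Enable an event. `sync_ret` simulates the application synchronization
 * step (negative on failure, for example a timeout).
 */
static int enable_event(struct registry *reg, const char *name, int sync_ret)
{
	int ret = 0;
	struct event *ev;

	assert(reg && name);

	ev = calloc(1, sizeof(*ev));
	if (!ev) {
		return -1;
	}
	/* calloc() zeroed the buffer, so the copy stays NUL-terminated. */
	strncpy(ev->name, name, sizeof(ev->name) - 1);

	/* From this point on, the registry holds a pointer to `ev`. */
	registry_add(reg, ev);

	if (sync_ret < 0) {
		ret = sync_ret;
		/*
		 * The buggy pattern would jump to an error path that calls
		 * free(ev) here, leaving a dangling pointer in the registry.
		 * The event stays registered instead.
		 */
		goto end;
	}
end:
	return ret;
}
```

With this shape, a failed synchronization leaves the registry consistent:
a later traversal, such as the one performed on `lttng start`, only sees
live objects.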