Francis Deslauriers [Tue, 2 Nov 2021 13:33:02 +0000 (09:33 -0400)]
Fix: use <unistd.h> instead of <sys/unistd.h>
Fixes: #1330
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I07cabde5a0295de06f7c6f42dd12de803b57c907
Francis Deslauriers [Mon, 1 Nov 2021 19:31:26 +0000 (15:31 -0400)]
Fix: Tests: unchecked `close()` return value
CID 1465101 (#1 of 1): Unchecked return value (CHECKED_RETURN)
9. check_return: Calling close without checking return value (as is done
elsewhere 177 out of 185 times).
CID 1465100 (#1 of 1): Unchecked return value (CHECKED_RETURN)
4. check_return: Calling close without checking return value (as is done
elsewhere 177 out of 185 times).
CID 1465099 (#1 of 1): Unchecked return value (CHECKED_RETURN)
4. check_return: Calling close without checking return value (as is done
elsewhere 177 out of 185 times).
CID 1465098 (#1 of 1): Unchecked return value (CHECKED_RETURN)
4. check_return: Calling close without checking return value (as is done
elsewhere 177 out of 185 times).
CID 1465097 (#1 of 1): Unchecked return value (CHECKED_RETURN)
4. check_return: Calling close without checking return value (as is done
elsewhere 177 out of 185 times).
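As an illustration of the pattern Coverity asks for, here is a minimal
sketch of a checked close(). The helper name `example_close_checked` is
hypothetical and is not taken from the patch; the tests may report the
error differently.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: close `fd` and report a failure instead of
 * silently discarding the return value of close().
 *
 * Returns 0 on success, -1 if close() failed.
 */
static int example_close_checked(int fd)
{
	int ret = close(fd);

	if (ret) {
		perror("close");
	}

	return ret;
}
```

A caller can then propagate or log the failure, for example
`if (example_close_checked(fd)) { /* fail the test */ }`.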
Reported-by: Coverity Scan
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I8e2552c75ab7cec5aa3707e2c1c4d9f2484b501a
Jérémie Galarneau [Mon, 1 Nov 2021 19:43:55 +0000 (15:43 -0400)]
Fix: relayd: live: mishandled initial null trace chunk
Observed issue
==============
As reported in #1323 (https://bugs.lttng.org/issues/1323), crashes of
the relay daemon are observed when running the user space clear tests.
The crash occurs with the following stack trace:
#0 0x000055fbb861d6ae in urcu_ref_get_unless_zero (ref=0x28) at /usr/local/include/urcu/ref.h:85
#1 lttng_trace_chunk_get (chunk=0x0) at trace-chunk.c:1836
#2 0x000055fbb86051e2 in make_viewer_streams (relay_session=relay_session@entry=0x7f6ea002d540, viewer_session=<optimized out>, seek_t=seek_t@entry=LTTNG_VIEWER_SEEK_BEGINNING, nb_total=nb_total@entry=0x7f6ea9607b00, nb_unsent=nb_unsent@entry=0x7f6ea9607aec, nb_created=nb_created@entry=0x7f6ea9607ae8, closed=<optimized out>) at live.c:405
#3 0x000055fbb86061d9 in viewer_get_new_streams (conn=0x7f6e94000fc0) at live.c:1155
#4 process_control (conn=0x7f6e94000fc0, recv_hdr=0x7f6ea9607af0) at live.c:2353
#5 thread_worker (data=<optimized out>) at live.c:2515
#6 0x00007f6eae86a609 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#7 0x00007f6eae78f293 in clone () from /lib/x86_64-linux-gnu/libc.so.6
The race window during which this occurs seems very small as it can take
hours to reproduce this crash. However, a minimal reproducer could be
identified, as stated in the bug report.
Essentially, the same crash can be reproduced by attaching a live viewer
to a session that has seen events being produced, been stopped and been
cleared.
Cause
=====
The crash occurs as an attempt is made to take a reference to a viewer
session’s trace chunk as viewer streams are created. The crux of the
problem is that the code doesn’t expect a viewer session’s trace chunk
to be NULL.
The viewer session’s current trace chunk is initially set, when a viewer
attaches to the viewer session, to a copy of the corresponding
relay_session’s current trace chunk.
A live session always attempts to "catch-up" to the newest available
trace chunk. This means that when a viewer reaches the end of a trace
chunk, the viewer session may not transition to the "next" one: it jumps
to the most recent trace chunk available (the one being produced by the
relay_session). Hence, if the producer performs multiple rotations
before a viewer completes the consumption of a trace chunk, it will skip
over those "intermediary" trace chunks.
A viewer session updates its current trace chunk when:
1) new viewer streams are created,
2) a new index is requested,
3) metadata is requested.
Hence, as a general principle, the viewer session will reference the
most recent trace chunk available _even if its streams do not point to
it_. It indicates which trace chunk viewer streams should transition to
when the end of their current trace chunk is reached.
The live code properly handles transitions to a null chunk. This can be
verified by attaching a viewer to a live session, stopping the session,
clearing it (thus entering a null trace chunk), and resuming tracing.
The only issue is that the case where the first trace chunk of a viewer
session is "null" (no active trace chunk) is mishandled in two places:
1) in make_viewer_streams(), where the crash is observed,
2) in viewer_get_metadata().
Solution
========
In make_viewer_streams(), it is assumed that a viewer session will have
a non-null trace chunk whenever a rotation is not ongoing. This is
reflected by the fact that a reference is always acquired on the viewer
session’s trace chunk.
That code is one of the three places that can cause a viewer session’s
trace chunk to be updated. We still want to update the viewer session to
the most recently seen trace chunk (null, in this case). However, there
is no reference to acquire and the trace chunk to use for the creation
of the viewer stream is NULL. This is properly handled by
viewer_stream_create().
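The idea can be sketched as follows. This is a simplified model, not the
relayd code: `example_chunk`, `example_chunk_get()` and
`example_acquire_stream_chunk()` are hypothetical stand-ins for the
trace chunk object and its reference counting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a reference-counted trace chunk. */
struct example_chunk {
	unsigned int refcount;
};

/*
 * Take a reference on `chunk`, which must not be NULL. Fails if the
 * reference count already dropped to zero.
 */
static bool example_chunk_get(struct example_chunk *chunk)
{
	assert(chunk);

	if (chunk->refcount == 0) {
		return false;
	}

	chunk->refcount++;
	return true;
}

/*
 * Select the chunk to use when creating a viewer stream. A NULL
 * session chunk is a valid state (no active trace chunk): there is no
 * reference to acquire and the stream is created with a NULL chunk.
 *
 * Returns false only if a non-NULL chunk could not be referenced.
 */
static bool example_acquire_stream_chunk(
		struct example_chunk *session_chunk,
		struct example_chunk **stream_chunk)
{
	if (session_chunk && !example_chunk_get(session_chunk)) {
		*stream_chunk = NULL;
		return false;
	}

	*stream_chunk = session_chunk;
	return true;
}
```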
The second site to change is viewer_get_metadata() which doesn’t handle
a viewer metadata stream not having an active trace chunk at all.
Thankfully, the protocol allows us to express this condition by
returning the LTTNG_VIEWER_NO_NEW_METADATA status code when a viewer
metadata stream doesn’t have an open file and doesn’t have a current
trace chunk.
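A minimal sketch of that decision, assuming a simplified stream
description. The structure, the enumeration and the function name are
hypothetical; only the condition (no open file and no current trace
chunk) comes from the description above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical reply codes standing in for the viewer protocol's. */
enum example_metadata_status {
	EXAMPLE_METADATA_OK,
	EXAMPLE_METADATA_NO_NEW,
};

/* Hypothetical, reduced view of a viewer metadata stream. */
struct example_metadata_stream {
	bool has_open_file;
	bool has_trace_chunk;
};

/*
 * Report "no new metadata" when the stream has neither an open file
 * nor a current trace chunk, instead of trying to open a file in a
 * chunk that does not exist.
 */
static enum example_metadata_status example_get_metadata_status(
		const struct example_metadata_stream *stream)
{
	assert(stream);

	if (!stream->has_open_file && !stream->has_trace_chunk) {
		return EXAMPLE_METADATA_NO_NEW;
	}

	return EXAMPLE_METADATA_OK;
}
```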
Surprisingly, this bug didn’t trigger in the case where a transition to
a null chunk occurred _after_ attaching to a viewer session.
This is because viewers will typically ask for metadata as a result of an
LTTNG_VIEWER_FLAG_NEW_METADATA reply to the GET_NEXT_INDEX command. When
a session is stopped and all data was consumed, this command returns
that no new data is available, causing the viewers to wait and ask again
later.
However, when attaching, babeltrace2 (at least, and probably babeltrace 1.x)
always asks for an initial segment of metadata before asking for an
index.
Known drawbacks
===============
None.
Fixes: #1323
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I516fca60755e6897f6b7170c12d706ef57ad61a5
Jérémie Galarneau [Mon, 1 Nov 2021 19:44:04 +0000 (15:44 -0400)]
Docs: relayd: document the lifetime of viewer session trace chunks
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I167ae4e651099c824bb24d5497199d1414b6ca1a
Francis Deslauriers [Fri, 8 Oct 2021 13:41:04 +0000 (09:41 -0400)]
Tests: use babeltrace2 for all tests
- Change value of `BABELTRACE_BIN` to `babeltrace2`,
- Replace all direct calls to babeltrace with calls to `BABELTRACE_BIN`
variable,
- Add `bail_out_if_no_babeltrace` bash function to fail the test if
Babeltrace 2 is needed but not found.
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ieeffe4977660f47561732223e94a1b8f5a00ef0e
Francis Deslauriers [Thu, 7 Oct 2021 19:25:28 +0000 (15:25 -0400)]
Tests: port validate_select_poll_epoll.py to bt2 python bindings
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I6ed3177e5bd31a56f4a1e84b53bf1d9025ac3161
Francis Deslauriers [Thu, 30 Sep 2021 18:43:11 +0000 (14:43 -0400)]
Fix: configure.ac: reporting SDT uprobe as a UST feature
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I86638d6a148b04e7131e4af7ec830c5e56817fdc
Francis Deslauriers [Thu, 28 Oct 2021 19:36:49 +0000 (15:36 -0400)]
Cleanup: Tests: remove trace after test
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I28024856d22f395922996fbb69c490fb1783b2e9
Francis Deslauriers [Wed, 20 Oct 2021 14:46:39 +0000 (10:46 -0400)]
Cleanup: Remove unused `live_find_viewer_stream_by_id()` func declaration
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I78f977a94ea3609fbee1176608ff97399e314e7f
Francis Deslauriers [Thu, 7 Oct 2021 18:52:27 +0000 (14:52 -0400)]
Fix: Tests: leaking epoll fd
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I5ec4fcdb87159f35932c20e7314cda764d14967c
Francis Deslauriers [Mon, 25 Oct 2021 15:32:24 +0000 (11:32 -0400)]
Typo: occurences -> occurrences
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I719e26febd639f3b047b6aa6361fc6734088e871
Jérémie Galarneau [Mon, 18 Oct 2021 18:43:18 +0000 (14:43 -0400)]
Fix: lttng: add-context: silence coverity warning
Coverity reports that:
1464653 Uninitialized scalar field
The field will contain an arbitrary value left over from earlier
computations.
In ctx_opts::ctx_opts(): A scalar field is not initialized by the
constructor.
uninit_member: Non-static class member hide_help is not initialized
in this constructor nor in any functions that it calls.
In our case it doesn't matter since a nullptr symbol indicates the end
of the ctx_opts array. An "unknown" value is added to the context array
and used to initialize the end-of-list item to an invalid value.
Calling the ctx_opts(const char *symbol_, context_type ctx_type_, bool
hide_help_ = false) constructor causes `hide_help` to be initialized.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ie685fef5712705a1938d9e6b8a63d23ab5bc6b39
Jérémie Galarneau [Fri, 15 Oct 2021 21:03:38 +0000 (17:03 -0400)]
Build fix: Missing message in LTTNG_DEPRECATED invocation
Coverity scan build jobs fail since LTTNG_DEPRECATED expects a string
and none is provided at the lttng_metadata_regenerate use site.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I7e6701abd24c679f578b0adead771ac93b6566cd
Simon Marchi [Fri, 3 Sep 2021 21:31:28 +0000 (17:31 -0400)]
bin: compile lttng as C++
Compile the code of the lttng binary as C++ source.
- start by renaming all files under src/bin/lttng to have the .cpp
extension, adjust Makefile.am accordingly
- apply the sophisticated algorithm:
while does_not_build():
fix_error()
until completion
Fixes fall in these categories:
- add extern "C" to headers of functions implemented in C. This is
likely temporary: at some point some of these things will be
implemented in C++, at which point we'll remove the extern "C".
- rename mi_lttng_version to mi_lttng_version_data, to avoid a -Wshadow
warning about the mi_lttng_version function hiding the
mi_lttng_version's struct constructor
- we have the same warning about lttng_calibrate, but we can't rename
it, it's exposed in a public header. Add some pragmas to disable the
warning around there. We will need more macro smartness in case we
need to support a compiler that doesn't understand these pragmas.
- in filter-ast.h, add a dummy field to the empty struct, to avoid a
-Wextern-c-compat warning with clang++ (it warns us that the struct
has size 0 in C but size 1 in C++).
- in add_context.cpp, we can't initialize ctx_opts' union field like we
did in C. Fix that by adding a ctx_opts constructor for each kind of
context and implement the PERF_* macros to use them.
- need to explicitly cast void pointers to the type of the destination,
for example the return value of allocation functions, or the parameter
of "destroy" functions
- need to explicitly cast when passing an int to an enum parameter, for
example an lttng_error_code parameter
- remove use of designated array initializers, for example for
schedule_type_str in disable_rotation.cpp
- fix order of struct initializers to match order of field
declarations, for example in list_triggers.cpp, function
cmd_list_triggers
- rename some things to avoid clashing with keywords, for example in
runas.h
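A few of the categories above can be illustrated with C code that also
compiles as C++. This is only a sketch: every `example_*` name is
hypothetical and none of it is taken from the lttng sources.

```c
#include <assert.h>
#include <stdlib.h>

/* Header part: keep C linkage when included from a C++ source file. */
#ifdef __cplusplus
extern "C" {
#endif

enum example_error_code {
	EXAMPLE_OK = 10,
	EXAMPLE_ERR_UNKNOWN = 11,
};

struct example_obj {
	int value;
};

struct example_obj *example_obj_create(int value);
void example_obj_destroy(void *obj);

#ifdef __cplusplus
}
#endif

struct example_obj *example_obj_create(int value)
{
	/* Explicit cast of the void pointer: required in C++, harmless in C. */
	struct example_obj *obj = (struct example_obj *) malloc(sizeof(*obj));

	if (obj) {
		obj->value = value;
	}

	return obj;
}

void example_obj_destroy(void *obj)
{
	/* "destroy" functions taking a void pointer cast it back explicitly. */
	free((struct example_obj *) obj);
}

/* An int is not implicitly converted to an enum in C++: cast it. */
static enum example_error_code example_code_from_int(int raw)
{
	return (enum example_error_code) raw;
}
```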
Change-Id: Id743b141552a412b4104af4dda8969eef5032388
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Michael Jeanson [Wed, 6 Oct 2021 16:30:25 +0000 (12:30 -0400)]
Cleanup: always use sysconf to get the page size
Use 'sysconf(_SC_PAGE_SIZE)' across the code base, which works on all
our supported platforms.
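For reference, a minimal sketch of the run-time query. The wrapper name
`example_get_page_size` is hypothetical; the code base may call
sysconf() directly.

```c
#include <assert.h>
#include <unistd.h>

/*
 * Hypothetical wrapper: return the system page size in bytes, or -1 if
 * sysconf() fails.
 */
static long example_get_page_size(void)
{
	return sysconf(_SC_PAGE_SIZE);
}
```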
Change-Id: I4231d45e0b03301de1274c0a5a4903cd17b4a80a
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Michael Jeanson [Fri, 16 Jul 2021 17:58:10 +0000 (13:58 -0400)]
Cleanup: namespace 'align' macros
Remove the duplicate ALIGN() and ALIGN_TO() macros and replace them with
namespaced variants 'lttng_align_ceil()' and 'lttng_align_floor()'.
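To make the semantics concrete, here is a sketch of ceil/floor alignment
on power-of-two boundaries. The `example_align_*` functions are
hypothetical and written for illustration; they are not the definitions
of the lttng_align_*() macros.

```c
#include <assert.h>
#include <stddef.h>

/* Round `value` down to a multiple of `align` (a power of two). */
static size_t example_align_floor(size_t value, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return value & ~(align - 1);
}

/* Round `value` up to a multiple of `align` (a power of two). */
static size_t example_align_ceil(size_t value, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (value + align - 1) & ~(align - 1);
}
```

For example, with a 4096-byte alignment, 4097 is floored to 4096 and
ceiled to 8192, while 4096 is left unchanged by both.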
Change-Id: I683baccb4e97874e647cf557bad9653a336f4a6d
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Thu, 7 Oct 2021 20:19:41 +0000 (16:19 -0400)]
Fix: sessiond: previously created channel cannot be enabled
Observed issue
==============
A previously created channel cannot be enabled back once a session is
started.
Cause
=====
The check validating that the session was started is too early in the
`cmd_enable_channel` function.
Solution
========
Move the check to the creation code path, taken when the channel is not
found.
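The resulting control flow can be sketched as follows. This is a reduced
model with hypothetical types and names; it is not the body of
`cmd_enable_channel`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum example_status {
	EXAMPLE_OK,
	EXAMPLE_ERR_SESSION_STARTED,
};

struct example_channel {
	bool enabled;
};

struct example_session {
	bool has_been_started;
	/* NULL until the channel is created. */
	struct example_channel *channel;
	struct example_channel storage;
};

static enum example_status example_enable_channel(
		struct example_session *session)
{
	assert(session);

	if (session->channel) {
		/* Existing channel: re-enabling is allowed once started. */
		session->channel->enabled = true;
		return EXAMPLE_OK;
	}

	/* Creation path: the "session started" check only applies here. */
	if (session->has_been_started) {
		return EXAMPLE_ERR_SESSION_STARTED;
	}

	session->storage.enabled = true;
	session->channel = &session->storage;
	return EXAMPLE_OK;
}
```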
Known drawbacks
===============
None.
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Change-Id: I8e7d62b7e97246e65f1cf9022270293a6dd34cc9
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Thu, 8 Jul 2021 18:17:51 +0000 (14:17 -0400)]
Fix: ust: app stuck on recv message during UST comm timeout scenario
Observed issue
==============
The following scenario leads to the UST thread being "stuck" on recvmsg
on the notify socket.
The problem manifests itself when an application is unresponsive during
the ustctl_start_session call. Note that the default timeout for UST
communication is 5 seconds.
# Start an instrumented app
./app
gdb lttng-sessiond
# put a breakpoint on ustctl_start_session
lttng create my_session
lttng enable-event -u -a
lttng start
# The tracepoint should hit. Do not continue.
kill -s SIGSTOP $(pgrep app)
# Continue lttng-sessiond.
sleep 5 # This makes sure lttng-sessiond unregisters the app from its point of view
kill -s SIGCONT $(pgrep app)
gdb -p $(pgrep app)
thread apply all bt
App stack trace:
Thread 3 (Thread 0x7fe2c6f58700 (LWP 48172)):
#0 __libc_recvmsg (flags=0, msg=0x7fe2c6f56ac0, fd=4) at ../sysdeps/unix/sysv/linux/recvmsg.c:28
#1 __libc_recvmsg (fd=fd@entry=4, msg=msg@entry=0x7fe2c6f56ac0, flags=flags@entry=0) at ../sysdeps/unix/sysv/linux/recvmsg.c:25
#2 0x00007fe2c7a010ba in ustcomm_recv_unix_sock (sock=sock@entry=4, buf=buf@entry=0x7fe2c6f56ea0, len=len@entry=48) at lttng-ust-comm.c:308
#3 0x00007fe2c7a037c3 in ustcomm_register_channel (sock=4, session=session@entry=0x7fe2c0000ba0, session_objd=<optimized out>, channel_objd=<optimized out>, nr_ctx_fields=nr_ctx_fields@entry=0, ctx_fields=<optimized out>, chan_id=0x7fe2c6f5716c, header_type=0x7fe2c0012b18) at lttng-ust-comm.c:1544
#4 0x00007fe2c7a10787 in lttng_session_enable (session=0x7fe2c0000ba0) at lttng-events.c:444
#5 0x00007fe2c7a0b785 in lttng_session_cmd (objd=1, cmd=128, arg=140611977311672, uargs=0x7fe2c6f57800, owner=0x7fe2c7a5da00 <local_apps>) at lttng-ust-abi.c:576
#6 0x00007fe2c7a07d6d in handle_message (lum=0x7fe2c6f57590, sock=3, sock_info=0x7fe2c7a5da00 <local_apps>) at lttng-ust-comm.c:1003
#7 ust_listener_thread (arg=0x7fe2c7a5da00 <local_apps>) at lttng-ust-comm.c:1712
#8 0x00007fe2c7993609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#9 0x00007fe2c78ba293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
...
Cause
=====
When the app continues after the timeout on the lttng-sessiond side, the
actual start_session message is received on the application side, then
UST, on the app side, sends commands on the notify socket. On the
lttng-sessiond side, the command is received but no reply is sent.
This is due to the fact that the lookup against the
ust_app_ht_by_notify_sock hash table (find_app_by_notify_sock)
returns nothing since the app is unregistered at this point and the hash
table node was removed on unregistration.
Solution
========
When the app lookup fails, return an error that will trigger the cleanup
of the notify socket.
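A sketch of the error propagation, using a plain array in place of the
hash table. All `example_*` names are hypothetical; only the behaviour
(a failed lookup becomes an error that makes the caller clean up the
notify socket) reflects the description above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical registered application, keyed by its notify socket. */
struct example_app {
	int notify_sock;
};

static const struct example_app *example_find_app_by_notify_sock(
		const struct example_app *apps, size_t count, int sock)
{
	assert(apps || count == 0);

	for (size_t i = 0; i < count; i++) {
		if (apps[i].notify_sock == sock) {
			return &apps[i];
		}
	}

	return NULL;
}

/*
 * Handle a command received on a notify socket.
 *
 * Returns 0 when the command was handled, or a negative value when the
 * app is no longer registered, in which case the caller is expected to
 * close the notify socket rather than leave the app waiting for a reply.
 */
static int example_handle_notify_command(
		const struct example_app *apps, size_t count, int sock)
{
	if (!example_find_app_by_notify_sock(apps, count, sock)) {
		return -1;
	}

	/* The command would be processed and replied to here. */
	return 0;
}
```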
Known drawbacks
===============
None
Note
====
Subsequent error paths in reply_ust_register_channel,
add_event_ust_registry, and add_enum_ust_registry might lead to the same
type of problem since no reply is sent to the app. Still, for those
cases the complete application/notify socket should not be destroyed
since the error paths relate to either a session or a sub-object of a
session.
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Change-Id: Iea0dc027ca1ee772e84c7e545114f1be69fd1f63
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Wed, 23 Jun 2021 02:17:03 +0000 (22:17 -0400)]
Fix: ust: UST communication can return -EAGAIN
Observed issue
==============
The following scenario leads to an abort on event creation. The
problem manifests itself when an application is unresponsive. Note that
the default timeout for UST communication is 5 seconds.
# Start an instrumented app
./app
gdb lttng-sessiond
# put a breakpoint on ustctl_create_event.
lttng create my_session
lttng enable-event -u -a
lttng start
# The tracepoint should hit. Do not continue.
kill -s SIGSTOP $(pgrep app)
# Continue lttng-sessiond.
# lttng-sessiond will abort.
Note that for UST this is not an expected behaviour. An expected
communication failure with a single app should not invalidate the
complete channel, compromise its setup or result in an abort.
Note that similar scenarios at the following ustctl call sites also
lead to situations where the failure of a single app leads to error
reporting and/or error propagation to an upper-level object.
Problematic callsites:
ustctl_set_exclusion
ustctl_set_filter
ustctl_disable_channel
These callsites are also fixed by this patch.
Cause
=====
For an unresponsive application, EAGAIN is returned and is treated as an
"unknown" hard error.
In this particular case the abort() call was introduced by commit
88e3c2f5610b9ac89b0923d448fee34140fc46fb [1]. It is not clear if this is
a leftover from a debugging session since this is the only callsite
where an abort is issued on communication failure via ustctl.
Solution
========
Handle EAGAIN coming from ustctl_* and treat it the same way a
dying application is handled. The only minor difference is that we WARN
on communication timeout. Albeit not the most useful thing for a CLI
client, it could help users of lttng-sessiond in timeout situations.
Most call sites already handled "unknown" errors correctly. For those
call sites we simply end up providing more information about the timeout
issue instead of mentioning that "-11" was returned.
Note, the reclamation of "app" is handled by the poll loop and
ust_app_unregister since the socket is shutdown by lttng-ust internally
on error, including EAGAIN.
Note that the application will try to register itself back to the
lttng-sessiond based on its configuration.
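The classification described in this section can be sketched as
follows. The enumeration and function are hypothetical, and using
-EPIPE to represent a dying application is an assumption made for the
example; the actual call sites may check other codes as well.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

enum example_comm_result {
	EXAMPLE_COMM_OK,
	/* The app is gone or unresponsive: not an error for the channel. */
	EXAMPLE_COMM_APP_GONE,
	/* Any other failure, reported to the caller. */
	EXAMPLE_COMM_ERROR,
};

/*
 * Classify the negative errno-style return value of a communication
 * call with the application identified by `pid`.
 */
static enum example_comm_result example_classify_comm_ret(int ret, int pid)
{
	if (ret >= 0) {
		return EXAMPLE_COMM_OK;
	}

	switch (ret) {
	case -EPIPE:
		/* Assumed "application is dying" case. */
		return EXAMPLE_COMM_APP_GONE;
	case -EAGAIN:
		/* Timeout: handled like a dying app, with a warning. */
		fprintf(stderr,
			"Warning: communication with application timed out (pid %d)\n",
			pid);
		return EXAMPLE_COMM_APP_GONE;
	default:
		return EXAMPLE_COMM_ERROR;
	}
}
```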
Known drawbacks
===============
None
Note
====
Some logging call sites used the ppid of the app instead of the pid.
Those have been changed to pid.
References
==========
[1] https://github.com/lttng/lttng-tools/commit/88e3c2f5610b9ac89b0923d448fee34140fc46fb
Fixes: #1384
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Change-Id: If364b5d48e7fd2b664276a0fb1b7eec2c45ed683
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jonathan Rajotte [Mon, 12 Jul 2021 20:44:38 +0000 (16:44 -0400)]
Fix: ust: segfault on lttng start on filter bytecode copy
Observed issue
==============
A segmentation fault is observed for multiple UST timeout scenarios.
Backtrace:
#0 __memmove_avx_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:384
#1 0x0000557fe0395df9 in copy_filter_bytecode (orig_f=0x7f9c5802b790) at ust-app.c:1196
#2 0x0000557fe0397702 in shadow_copy_event (ua_event=0x7f9c58025ff0, uevent=0x7f9c58033560) at ust-app.c:1824
#3 0x0000557fe039ac46 in create_ust_app_event (ua_sess=0x7f9c5802ec20, ua_chan=0x7f9c58025cc0, uevent=0x7f9c58033560, app=0x7f9c5c001da0) at ust-app.c:3192
#4 0x0000557fe03a054d in ust_app_channel_synchronize_event (ua_chan=0x7f9c58025cc0, uevent=0x7f9c58033560, ua_sess=0x7f9c5802ec20, app=0x7f9c5c001da0) at ust-app.c:5096
#5 0x0000557fe03a0772 in ust_app_synchronize (usess=0x7f9c580074a0, app=0x7f9c5c001da0) at ust-app.c:5173
#6 0x0000557fe03a0a70 in ust_app_global_update (usess=0x7f9c580074a0, app=0x7f9c5c001da0) at ust-app.c:5255
#7 0x0000557fe03a00e0 in ust_app_start_trace_all (usess=0x7f9c580074a0) at ust-app.c:4987
#8 0x0000557fe0355c6a in cmd_start_trace (session=0x7f9c5800a190) at cmd.c:2668
#9 0x0000557fe0382e70 in process_client_msg (cmd_ctx=0x7f9c58003d70, sock=0x7f9c74bf44e0, sock_error=0x7f9c74bf44e4) at client.c:1527
#10 0x0000557fe03848a2 in thread_manage_clients (data=0x557fe06d9440) at client.c:2200
#11 0x0000557fe037d1cb in launch_thread (data=0x557fe06d94b0) at thread.c:75
#12 0x00007f9c796af609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#13 0x00007f9c795b6293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
The scenario:
# Start an instrumented app
./app
gdb lttng-sessiond
# put a breakpoint on ustctl_set_filter
lttng create my_session
lttng enable-event -u tp:tp_test
lttng start
lttng enable-event -u __dummy --filter 'my_field == "user34"'
# The tracepoint should hit. Do not continue.
kill -s SIGSTOP $(pgrep app)
# Continue lttng-sessiond.
# enable-event will return an error. This is a bug in itself, still let's
# continue with the current bug.
lttng stop
# Start a new app that will register.
./app &
sleep 1
lttng start
# lttng-sessiond should segfault.
Cause
=====
During the "lttng enable-event" command, the timeout error bubbles up
all the way to event_ust_enable_tracepoint and is different from
LTTNG_UST_ERR_EXIST. `trace_ust_destroy_event` is called and frees the
`uevent` object. Note that contrary to the comment `uevent` is added to
the channel event hash table at this point.
On the next `lttng start` command, the event node is still present in
the hash table and is iterated on. lttng-sessiond segfaults on the first
data access of the previously freed memory.
The problem was introduced by commit
88e3c2f5610b9ac89b0923d448fee34140fc46fb [1], which essentially moves
the callsite of `add_unique_ust_event` before the `ust_app_*_event_glb`
calls.
Solution
========
Go to `end` label to prevent freeing of the uevent object.
Note that app synchronization should not force an error at the channel
level, since a single app can fail but the whole channel should not.
The `error` label is now obsolete.
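The ownership rule behind the fix can be sketched with a simple linked
list in place of the hash table. The types, the function and the
`app_sync_ret` parameter are hypothetical; the point is only that an
object already published in a container must not be freed on a later
error path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical event object linked into its channel's event list. */
struct example_event {
	struct example_event *next;
	bool enabled;
};

struct example_channel {
	struct example_event *events;
};

/*
 * Create an event, publish it in the channel, then synchronize it with
 * the applications. `app_sync_ret` stands in for the result of that
 * synchronization (negative on failure, for example on a timeout).
 *
 * Returns the published event, or NULL on allocation failure.
 */
static struct example_event *example_enable_event(
		struct example_channel *chan, int app_sync_ret)
{
	struct example_event *event = calloc(1, sizeof(*event));

	assert(chan);

	if (!event) {
		goto end;
	}

	/* From this point on, the channel holds a pointer to the event. */
	event->next = chan->events;
	chan->events = event;

	if (app_sync_ret < 0) {
		/*
		 * Do not free the event here: it is still reachable from the
		 * channel and a later iteration would read freed memory.
		 */
		goto end;
	}

	event->enabled = true;
end:
	return event;
}
```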
Known drawbacks
===============
None.
References
==========
[1] https://github.com/lttng/lttng-tools/commit/88e3c2f5610b9ac89b0923d448fee34140fc46fb
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Change-Id: Ifaf3f4c71bb2da869c7b441aaa4b367f8f7cbdd6
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Francis Deslauriers [Mon, 27 Sep 2021 13:42:54 +0000 (09:42 -0400)]
Fix: notification-thread: handling event from a removed tracer event src
Issue
=====
The issue is caused by a race condition where the `lttng_poll_wait()`
returns a _REMOVE_TRACER_EVENT_SOURCE event followed by an actual
notification event on the removed event source fd.
This causes the notification thread to remove the fd from the potential
notification sources list and later fail to find that same fd in the
next iteration.
This race condition can cause the notification thread to hang
indefinitely or to fail assertions within the `fini_thread_state()`
function.
Fix
===
When removing a tracer event source, force the notification thread's
`lttng_poll_wait()` loop to restart in order to ignore events from the
removed fd.
Use the `restart_poll` flag for that purpose (see note below).
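The effect of the flag can be sketched on a single batch of poll
results. The event structure and `example_process_batch()` are
hypothetical simplifications of the notification thread's loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum example_event_type {
	EXAMPLE_EVENT_NOTIFICATION,
	EXAMPLE_EVENT_REMOVE_SOURCE,
};

/* Hypothetical entry of the batch returned by one poll wait. */
struct example_poll_event {
	int fd;
	enum example_event_type type;
};

/*
 * Process one batch of events. When an event source is removed, the
 * remaining events of the batch may refer to the removed fd, so they
 * are skipped and `*restart_poll` is set to make the caller wait again.
 *
 * Returns the number of notifications handled.
 */
static size_t example_process_batch(
		const struct example_poll_event *events, size_t count,
		bool *restart_poll)
{
	size_t handled = 0;

	assert(restart_poll);
	*restart_poll = false;

	for (size_t i = 0; i < count; i++) {
		if (events[i].type == EXAMPLE_EVENT_REMOVE_SOURCE) {
			*restart_poll = true;
			break;
		}

		handled++;
	}

	return handled;
}
```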
Reproducer
==========
It's easy to reproduce this issue by adding a `usleep(5000)` just before
the `lttng_poll_wait()` call in the notification thread.
Note
====
This is the second time that I fix this issue.
It was first fixed by this commit, which added the `restart_poll` flag:
commit 8b5240601e4ddf6127e4291b7194dd5179cb35b5
Author: Francis Deslauriers <francis.deslauriers@efficios.com>
Date: Thu Dec 10 15:41:29 2020 -0500
notification-thread: drain all tracer notification on removal
Later, this other commit refactored that code but accidentally removed
the use of `restart_poll`:
commit 34bf4f69e49d8a69331a6aa6826ef1f155e20ede
Author: Francis Deslauriers <francis.deslauriers@efficios.com>
Date: Wed May 26 16:05:16 2021 -0400
notification-thread: remove fd from pollset on LPOLLHUP and friends
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I6da0ed4374b612934adc72fb88d5c142505c5d53
Simon Marchi [Fri, 3 Sep 2021 21:31:28 +0000 (17:31 -0400)]
common/macros: don't define min and max macros in C++
Later in the series, the min macro conflicted with a C++ header file.
Since the min and max macros are not useful in C++ (std::min/std::max
are preferred), don't define them when building a C++ file. Don't define
min_t and max_t either; they too can be replaced with std::min and
std::max, which are templated / type-safe.
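A minimal sketch of such a guard, seen from the C side. The
`example_min`/`example_max` names are hypothetical and these
definitions are not the ones found in common/macros; note that, as
written, they evaluate their arguments twice.

```c
#include <assert.h>

/*
 * Only provide the macros to C translation units. C++ code is expected
 * to use std::min and std::max instead.
 */
#ifndef __cplusplus
#define example_min(a, b) ((a) < (b) ? (a) : (b))
#define example_max(a, b) ((a) > (b) ? (a) : (b))
#endif
```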
Change-Id: I3d56d325f6508c32baba674c335c3f4ab0ecc582
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Simon Marchi [Fri, 3 Sep 2021 21:31:28 +0000 (17:31 -0400)]
Add mandatory C++ compiler build dependency
This change is in preparation of converting some internal code to C++.
- Add ax_cxx_compile_stdcxx.m4, which provides AX_CXX_COMPILE_STDCXX, a
macro to find a C++ compiler matching certain characteristics.
- Find which warning flags work for the C++ compiler (certain warning
flags may be C or C++ specific).
- Do a bit of reorganizing to group all C compiler things together, all
C++ compiler things together.
Change-Id: I35a16996fa9ba1fbbb040f7fa5f826e6bc95ea29
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Simon Marchi [Tue, 21 Sep 2021 13:40:37 +0000 (09:40 -0400)]
liblttng-ctl: use export list to define exported symbols
Symbols are currently exported by default by liblttng-ctl.so (usable by
other shared libraries / programs using liblttng-ctl.so), so we must use
LTTNG_HIDDEN on all symbols that are meant to be internal to
liblttng-ctl.so. Of course, this is easy to forget, so over the years
many symbols that were not meant to be exported were exported, and must
now stay exported to avoid breaking the ABI.
As explained here [1], a better method is to make symbols hidden by
default, and mark those we want to be exported as such. I have tried to
use this, but when subsequently converting the code to C++, I have
noticed that some symbols related to the STL were exported anyway, which
is bad.
The other alternative, implemented in this patch, is to use an explicit
symbol export list [2], using libtool's -export-symbols (which uses the
linker's -version-script option). Only the symbols listed here are
exported.
So, in practice, this patch:
- Adds a liblttng-ctl.sym file with the list of exported symbols and
adjusts the Makefile to use the -export-symbols option
- Removes LTTNG_HIDDEN and all its uses
abidiff shows no changes for liblttng-ctl.so between before and after
this patch.
[1] https://gcc.gnu.org/wiki/Visibility
[2] https://www.gnu.org/software/libtool/manual/libtool.html#Link-mode
Change-Id: I5d8c558303894b0ad8113c6e52f79a053bb580e1
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Simon Marchi [Wed, 29 Sep 2021 13:37:01 +0000 (09:37 -0400)]
tests: add some diags in utils.sh
When debugging test cases and trying to reproduce manually, it helps me
a lot to see what commands are executed by the test cases.
Change utils.sh such that all invocations of the lttng binary go through
a function that logs (using `diag`) the executed command.
Also add some diagnostics when starting lttng-sessiond, to make that
easier to replicate by hand.
It's not perfect: if an argument contains spaces, the diag output
will not be quoted properly, so you won't be able to copy-paste the
command directly. But from what I saw, a lot of things in our testsuite
would not handle spaces (in session names for example) properly, so it's
not likely to happen often.
Here's an example of the result:
$ ./tests/regression/ust/multi-lib/test_multi_lib
1..55
# UST - Dynamic loading and unloading of libraries
# export LTTNG_SESSION_CONFIG_XSD_PATH=/home/smarchi/build/lttng-tools/tests/../src/common/config/
# env /home/smarchi/build/lttng-tools/tests/../src/bin/lttng-sessiond/lttng-sessiond --background --consumerd64-path=/home/smarchi/build/lttng-tools/tests/../src/bin/lttng-consumerd/lttng-consumerd 1
ok 1 - Start session daemon
# ./tests/regression/ust/multi-lib//../../../../src/bin/lttng/lttng create multi_lib -o /tmp/tmp.test_multi_lib_ust_trace_path.SRYlHj
ok 2 - Create session multi_lib in -o /tmp/tmp.test_multi_lib_ust_trace_path.SRYlHj
# dlopen 2 providers, same event name, same payload
# ./tests/regression/ust/multi-lib//../../../../src/bin/lttng/lttng enable-event multi:tp -s multi_lib -u
ok 3 - Enable ust event multi:tp for session multi_lib
# ./tests/regression/ust/multi-lib//../../../../src/bin/lttng/lttng start multi_lib
ok 4 - Start tracing for session multi_lib
# ./tests/regression/ust/multi-lib//../../../../src/bin/lttng/lttng stop multi_lib
ok 5 - Stop lttng tracing for session multi_lib
ok 6 - Trace match with 2 event multi:tp
ok 7 - Metadata match with the metadata of 1 event(s) named multi:tp
# ./tests/regression/ust/multi-lib//../../../../src/bin/lttng/lttng destroy multi_lib
ok 8 - Destroy session multi_lib
# ./tests/regression/ust/multi-lib//../../../../src/bin/lttng/lttng create multi_lib -o /tmp/tmp.test_multi_lib_ust_trace_path.dQ0kMN
ok 9 - Create session multi_lib in -o /tmp/tmp.test_multi_lib_ust_trace_path.dQ0kMN
# dlopen 2 providers, same event name, different payload
# ./tests/regression/ust/multi-lib//../../../../src/bin/lttng/lttng enable-event multi:tp -s multi_lib -u
ok 10 - Enable ust event multi:tp for session multi_lib
...
Change-Id: I312fc0890a2dfaedf199b9021baebac8d3bf632b
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Simon Marchi [Wed, 6 Oct 2021 15:41:19 +0000 (11:41 -0400)]
include: add missing "extern"
Change-Id: I37574b25adede7c639a04c508f6e4be8256339d9
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Simon Marchi [Wed, 6 Oct 2021 14:57:24 +0000 (10:57 -0400)]
include: remove spurious spaces in condition/session-rotation.h
Change-Id: Ia525d24c3b4098dff5c50fb2c5d93c16f6e08f5c
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Simon Marchi [Tue, 5 Oct 2021 20:10:18 +0000 (16:10 -0400)]
tests: fix header of regression/ust/getcpu-override/run-getcpu-override
The "SPDX-License-Identifier:" header is not in a comment, so is
interpreted as a bash command. This is harmless, but it appears in the
test output:
ok 13 - Start tracing for session sequence-cpu
# Launching app with getcpu-plugin wrapper
./tests/regression/ust/getcpu-override//run-getcpu-override: 2: SPDX-License-Identifier:: not found
ok 14 - Application with wrapper done
Fix that, and add a proper copyright notice, based on the other files
that were added at the same time as this one.
Change-Id: Icdf5e2fd5aec4080b2e5cad10cca4813bad26394
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Michael Jeanson [Tue, 5 Oct 2021 17:49:09 +0000 (13:49 -0400)]
cleanup: remove urcu_bp dependency from ust tests
lttng-ust >= 2.13 has its own internal copy of urcu and doesn't require
applications to link with urcu_bp anymore.
Change-Id: I0a11af7cf284952dbc26d0657eb490040acb7438
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Michael Jeanson [Thu, 5 Aug 2021 20:48:51 +0000 (16:48 -0400)]
fix: wrong define used for GCC version check
As far as I can tell, the __GNUC_MAJOR__ define has never existed; the
proper define for the major version is __GNUC__. See
https://gcc.gnu.org/onlinedocs/cpp/Common-Predefined-Macros.html for
more details.
Change-Id: I0d47d524e7efd204fd2f8976311c62e872eb6170
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
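To illustrate the `__GNUC__` fix above, here is a minimal sketch of a compiler version check. The `EXAMPLE_GCC_AT_LEAST` macro and the `example_has_gcc_4()` helper are hypothetical names chosen for this example; they are not taken from lttng-tools.

```c
#include <assert.h>

/*
 * Hypothetical helper macro (not from lttng-tools): true when the compiler
 * defines __GNUC__ and reports version major.minor or later.
 * Note that __GNUC_MAJOR__ does not exist; __GNUC__ holds the major version.
 */
#if defined(__GNUC__) && defined(__GNUC_MINOR__)
#define EXAMPLE_GCC_AT_LEAST(major, minor) \
	(__GNUC__ > (major) || \
	 (__GNUC__ == (major) && __GNUC_MINOR__ >= (minor)))
#else
#define EXAMPLE_GCC_AT_LEAST(major, minor) 0
#endif

/* Returns 1 when the check above holds for version 4.0, 0 otherwise. */
static int example_has_gcc_4(void)
{
#if EXAMPLE_GCC_AT_LEAST(4, 0)
	return 1;
#else
	return 0;
#endif
}
```

A check written against an undefined macro such as `__GNUC_MAJOR__` silently evaluates to 0 in `#if`, which is why this kind of mistake can go unnoticed.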
Jonathan Rajotte [Wed, 29 Sep 2021 14:38:44 +0000 (10:38 -0400)]
Cleanup: unnecessary declaration
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I4a5e45d3981064b16d0790d511a1dc7ea33904af
Jonathan Rajotte [Wed, 29 Sep 2021 14:30:31 +0000 (10:30 -0400)]
Cleanup: firing policy source files
This is a leftover of a manipulation error during a rebase.
None of these files are considered by the build system.
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I8c923d65e36c850e883552a8e46333ee854588a3
Jérémie Galarneau [Mon, 4 Oct 2021 16:41:51 +0000 (12:41 -0400)]
Fix: userspace-probe: unreported error on string copy error
Issue
=====
String copy errors, either due to the length or an allocation failure,
are not reported by
lttng_userspace_probe_location_tracepoint_create_from_payload
and don't log a clear error message.
This allowed truncation bugs like the one fixed in b45a296 to go
unnoticed.
Fix
===
Return an "invalid" status code and log a more descriptive error
message.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ia07cac7cba315ea79337262e9082dd06eb60950f
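The idea of reporting string copy errors, as described in the commit above, can be sketched as follows. Everything here is an assumption made for illustration: the `example_copy_string()` helper, its status codes, and the logged message are hypothetical and do not reproduce the lttng-tools implementation.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical status codes for this sketch. */
enum example_copy_status {
	EXAMPLE_COPY_OK = 0,
	EXAMPLE_COPY_INVALID = -1, /* source longer than the allowed maximum */
	EXAMPLE_COPY_NOMEM = -2,   /* allocation failure */
};

/*
 * Duplicate `src` into `*dst` if it holds at most `max_len` characters.
 * On error, `*dst` is left untouched and a message is logged.
 */
static enum example_copy_status example_copy_string(
		const char *src, size_t max_len, char **dst)
{
	size_t len = 0;
	char *copy;

	assert(src && dst);

	/* Measure the source without reading past index max_len. */
	while (len <= max_len && src[len] != '\0') {
		len++;
	}

	if (len > max_len) {
		fprintf(stderr, "String exceeds maximum length of %zu\n",
				max_len);
		return EXAMPLE_COPY_INVALID;
	}

	copy = malloc(len + 1);
	if (!copy) {
		fprintf(stderr, "Failed to allocate string copy\n");
		return EXAMPLE_COPY_NOMEM;
	}

	memcpy(copy, src, len);
	copy[len] = '\0';
	*dst = copy;
	return EXAMPLE_COPY_OK;
}
```

Returning a distinct status instead of a silently shortened string lets the caller reject the payload rather than continue with corrupted data.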
Francis Deslauriers [Fri, 1 Oct 2021 20:10:24 +0000 (16:10 -0400)]
Fix: userspace-probe: truncating binary path for SDT
Issue
=====
This issue was uncovered when we enabled the testing of the SDT
userspace probe instrumentation on the CI, where the paths to files are
especially long.
The reported error is:
- rule: ma-probe-sdt (type: kernel:uprobe, location type: SDT, location: /root/workspace/dev_gerrit_lttng-tools_rootbuild/arch/amd64/babeltrace_version/stable-2.0/build/std/conf/agents/liburcu_version/master/node/amd64-rootnode/test_type/base/src/lttng-tools/tests/utils/testapp/userspace-probe-sdt-binary/.libs/userspace-probe-sdt-binary:foobar:tp1)
+ rule: ma-probe-sdt (type: kernel:uprobe, location type: SDT, location: /root/workspace/dev_gerrit_lttng-tools_rootbuild/arch/amd64/babeltrace_version/stable-2.0/build/std/conf/agents/liburcu_version/master/node/amd64-rootnode/test_type/base/src/lttng-tools/tests/utils/testapp/userspace-probe-sdt-binary/.libs/userspace-probe-s:foobar:tp1)
The important part to notice is that the path to the binary is truncated
compared to what is expected by the test case.
The problem is caused by the
`lttng_userspace_probe_location_tracepoint_create_from_payload()`
function, which duplicates the path string using the wrong defined value
as its length limit.
Fix
===
Use LTTNG_PATH_MAX rather than LTTNG_SYMBOL_NAME_LEN to copy the binary
path.
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I24cbf413baba405bf4c4b534ccbc2b18f8d5d43f
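The truncation described in the commit above can be reproduced with a small bounded-duplication sketch. The constants `EXAMPLE_SYMBOL_NAME_LEN` and `EXAMPLE_PATH_MAX` are stand-ins with made-up values, and `example_bounded_dup()` is a hypothetical helper with strndup-like semantics; none of them are the lttng-tools definitions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in limits for this sketch; the values are assumptions. */
#define EXAMPLE_SYMBOL_NAME_LEN 16
#define EXAMPLE_PATH_MAX 4096

/*
 * Duplicate at most `max_len` characters of `src`. Like strndup(), a longer
 * source is silently truncated, which is what made the bug hard to notice.
 */
static char *example_bounded_dup(const char *src, size_t max_len)
{
	size_t len = 0;
	char *copy;

	assert(src);

	while (len < max_len && src[len] != '\0') {
		len++;
	}

	copy = malloc(len + 1);
	if (!copy) {
		return NULL;
	}

	memcpy(copy, src, len);
	copy[len] = '\0';
	return copy;
}
```

With the symbol-name bound, a path longer than 16 characters loses its tail; with the path bound, the same input is copied in full.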
Jérémie Galarneau [Tue, 7 Sep 2021 21:37:37 +0000 (17:37 -0400)]
Clean-up: sessiond: remove unused ust_app_{lock, unlock}_list declarations
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ic01a48881756322ffa5ba6a977dde1bc81f4a0d0
Jérémie Galarneau [Thu, 1 Jul 2021 19:50:47 +0000 (15:50 -0400)]
Fix: lttng: add-trigger: don't provide a default event rule type
There is no reason for an event rule to have a default type. The
--type parameter is required.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ic7f03453fac410c96ca6bb3b3ca0bdfb297a10d1
Francis Deslauriers [Fri, 20 Aug 2021 19:26:01 +0000 (15:26 -0400)]
Force usage of assert() condition when NDEBUG is defined
Reuse the BT2 approach to force the usage of the assertion condition
even when assert() statements are removed by the NDEBUG define.
See `BT_USE_EXPR()` macro and documentation in Babeltrace commit[0]:
commit 1778c2a4134647150b199b2b57130817144446b0
Author: Philippe Proulx <eeppeliteloop@gmail.com>
Date: Tue Apr 21 11:15:42 2020 -0400
lib: assign a unique ID to each pre/postcond. and report it on failure
0: https://github.com/efficios/babeltrace/commit/1778c2a4134647150b199b2b57130817144446b0
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I3844b6ae7e95952d90033898397ac936540b785c
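One common way to keep an assertion condition "used" without evaluating it is to place it inside `sizeof`, whose operand is not evaluated. The sketch below shows that technique under assumed names: `EXAMPLE_USE_EXPR` and `EXAMPLE_ASSERT_DISABLED` are hypothetical macros for this example and may differ from the actual `BT_USE_EXPR()` definition. A stand-in macro is used instead of `assert()` so the example behaves the same whether or not NDEBUG is defined.

```c
#include <assert.h>

/*
 * The operand of sizeof is type-checked but never evaluated, so variables
 * named in `expr` count as used while no code runs for them.
 */
#define EXAMPLE_USE_EXPR(expr) ((void) sizeof((void) (expr), 0))

/* Stand-in for assert() in a build where assertions are compiled out. */
#define EXAMPLE_ASSERT_DISABLED(cond) EXAMPLE_USE_EXPR(cond)

static int example_calls;

/* Counts its calls so the example can show it is never invoked. */
static int example_is_valid(int value)
{
	example_calls++;
	return value >= 0;
}

static int example_process(int value)
{
	/* Only referenced by the assertion; still counts as used. */
	const int lower_bound = 0;

	EXAMPLE_ASSERT_DISABLED(value >= lower_bound && example_is_valid(value));
	return value * 2;
}
```

This avoids "unused variable" warnings in NDEBUG builds for variables that exist only to be checked by an assertion.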
Francis Deslauriers [Thu, 19 Aug 2021 21:14:46 +0000 (17:14 -0400)]
Fix: statements with side-effects in assert statements
Background
==========
When building with the NDEBUG definition the `assert()` statements are
removed.
Issue
=====
Currently, a few `assert()` statements in the code base contain
statements that have side effects, and removing them changes the
behavior of the program.
Fix
===
Extract the statements with side effects out of the `assert()`
statements.
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I0b11c8e25c3380563332b4c0fad15f70b09a7335
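The pattern fixed by the commit above can be sketched as follows. The functions `example_allocate_id()` and `example_register()` are hypothetical and exist only to show the before and after forms.

```c
#include <assert.h>

static int example_next_id;

/* Has a side effect: every call consumes one identifier. */
static int example_allocate_id(void)
{
	return ++example_next_id;
}

/*
 * Problematic form, shown as a comment only:
 *
 *     assert(example_allocate_id() > 0);
 *
 * With NDEBUG defined, the whole call disappears and no identifier is
 * allocated.
 */
static int example_register(void)
{
	/* The call with the side effect always runs. */
	const int id = example_allocate_id();

	/* Only the check itself may be compiled out. */
	assert(id > 0);
	return id;
}
```

After the change, the program allocates identifiers identically in debug and NDEBUG builds.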
Jonathan Rajotte [Thu, 16 Sep 2021 15:20:07 +0000 (11:20 -0400)]
Fix: lttng_trace_archive_location_serialize is called on freed memory
Observed issue
==============
The following backtrace has been reported [1].
#0 __GI_raise (sig=sig@entry=6) at /usr/src/debug/glibc/2.31+gitAUTOINC+f84949f1c4-r0/git/sysdeps/unix/sysv/linux/raise.c:50
#1 0x0000003123025528 in __GI_abort () at /usr/src/debug/glibc/2.31+gitAUTOINC+f84949f1c4-r0/git/stdlib/abort.c:79
#2 0x0000000000419884 in lttng_trace_archive_location_serialize (location=0x7f1c9c001160, buffer=0x7f1cb961c320) at /usr/src/debug/lttng-tools/2.13.0-r0/lttng-tools-2.13.0/src/common/location.c:230
#3 0x00000000004c8f06 in lttng_evaluation_session_rotation_serialize (evaluation=0x7f1cb000a7f0, payload=0x7f1cb961c320) at /usr/src/debug/lttng-tools/2.13.0-r0/lttng-tools-2.13.0/src/common/conditions/session-rotation.c:539
#4 0x00000000004a80fa in lttng_evaluation_serialize (evaluation=0x7f1cb000a7f0, payload=0x7f1cb961c320) at /usr/src/debug/lttng-tools/2.13.0-r0/lttng-tools-2.13.0/src/common/evaluation.c:42
#5 0x00000000004bc24f in lttng_notification_serialize (notification=0x7f1cb961c310, payload=0x7f1cb961c320) at /usr/src/debug/lttng-tools/2.13.0-r0/lttng-tools-2.13.0/src/common/notification.c:63
#6 0x0000000000458b7d in notification_client_list_send_evaluation (client_list=0x7f1cb0008f90, trigger=0x7f1ca40113d0, evaluation=<optimized out>, source_object_creds=0x7f1cb000a874, client_report=0x475840 <client_handle_transmission_status>, user_data=0x7f1cb0006010) at /usr/src/debug/lttng-tools/2.13.0-r0/lttng-tools-2.13.0/src/bin/lttng-sessiond/notification-thread-events.c:4379
#7 0x0000000000476586 in action_executor_generic_handler (item=0x7f1cb0009600, work_item=0x7f1cb000a820, executor=0x7f1cb0006010) at /usr/src/debug/lttng-tools/2.13.0-r0/lttng-tools-2.13.0/src/bin/lttng-sessiond/action-executor.c:696
#8 action_work_item_execute (work_item=0x7f1cb000a820, executor=0x7f1cb0006010) at /usr/src/debug/lttng-tools/2.13.0-r0/lttng-tools-2.13.0/src/bin/lttng-sessiond/action-executor.c:715
#9 action_executor_thread (_data=0x7f1cb0006010) at /usr/src/debug/lttng-tools/2.13.0-r0/lttng-tools-2.13.0/src/bin/lttng-sessiond/action-executor.c:797
#10 0x0000000000462327 in launch_thread (data=0x7f1cb00060b0) at /usr/src/debug/lttng-tools/2.13.0-r0/lttng-tools-2.13.0/src/bin/lttng-sessiond/thread.c:66
#11 0x0000003123408ea4 in start_thread (arg=<optimized out>) at /usr/src/debug/glibc/2.31+gitAUTOINC+f84949f1c4-r0/git/nptl/pthread_create.c:477
#12 0x00000031230f8dcf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
This can be easily reproduced with the following session and trigger
configuration:
lttng create test
lttng enable-event -u -a
lttng start
# Register two similar triggers via a dummy C program since rotation
# completed condition is not exposed on the CLI for now. Yielding the
# following triggers:
lttng list-triggers
- name: trigger0
owner uid: 1000
condition: session rotation completed
session name: test
errors: none
action:notify
errors: none
- name: trigger1
owner uid: 1000
condition: session rotation completed
session name: test
errors: none
action:notify
errors: none
lttng rotate <- abort happens here.
Cause
=====
The problem lies in how the location (`lttng_trace_archive_location`)
object is assigned to the `lttng_evaluation` objects. A single location
object can end up being shared between multiple `lttng_evaluation` objects
since we iterate over all triggers and create an `lttng_evaluation` object
with the location each time as needed.
See `src/bin/lttng-sessiond/notification-thread-events.c:1956`.
The location object is then freed when the first notification is
completely serialized. The second serialization ends up having a
reference to a freed `lttng_trace_archive_location` object.
Solution
========
Implement ref counting for the lttng_trace_archive_location object.
Note
=======
This also fixes a leak that was present in `cmd_destroy_session_reply`.
The location is created by `session_get_trace_archive_location` and is
never `destroyed`/`put`.
Known drawbacks
=========
None.
References
==========
[1] https://bugs.lttng.org/issues/1325
Fixes: #1325
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Change-Id: I99dc595ee5b0288c727b193ed061f5273752bd24
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
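The reference counting solution described in the commit above can be sketched with a toy object shared by two holders. The `example_location` type and its functions are hypothetical, and the plain integer counter is an assumption made for brevity: it is not thread-safe, and the real code may rely on a dedicated reference counting facility.

```c
#include <assert.h>
#include <stdlib.h>

/* Number of objects destroyed so far, for demonstration purposes. */
static int example_destroy_count;

struct example_location {
	unsigned int refcount;
};

/* A new object starts with a single reference owned by the creator. */
static struct example_location *example_location_create(void)
{
	struct example_location *loc = calloc(1, sizeof(*loc));

	if (loc) {
		loc->refcount = 1;
	}
	return loc;
}

/* Each additional holder (e.g. a second evaluation) takes a reference. */
static void example_location_get(struct example_location *loc)
{
	assert(loc && loc->refcount > 0);
	loc->refcount++;
}

/* The object is freed only when the last holder releases it. */
static void example_location_put(struct example_location *loc)
{
	if (!loc) {
		return;
	}

	assert(loc->refcount > 0);
	if (--loc->refcount == 0) {
		example_destroy_count++;
		free(loc);
	}
}
```

With this scheme, finishing the first serialization only drops one reference, so the second holder still sees a valid object.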
Jonathan Rajotte [Mon, 13 Sep 2021 20:49:48 +0000 (16:49 -0400)]
Fix: sessiond: ust session is inactive during ust_app_global_update
Observed issue
==============
The following scenario leads to an abort of lttng-sessiond.
lttng-sessiond (with kernel tracing available)
lttng create system-trace --snapshot -U /tmp/snapshot
lttng enable-channel -k system-trace --subbuf-size=4k --num-subbuf=256
lttng enable-event -c system-trace -k 'sched_wak*' -s system-trace
lttng start system-trace
lttng enable-event -u -a
Fails as expected with:
Error: Events: The command tried to enable an event in a new domain for
a session that has already been started once. (channel channel0,
session system-trace)
Launch any ust app such as easy_ust from the lttng-ust repository.
The following backtrace is generated:
(gdb) bt
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1 0x00007ffff7af0859 in __GI_abort () at abort.c:79
#2 0x00007ffff7af0729 in __assert_fail_base (fmt=0x7ffff7c86588 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x55555564b765 "usess->active", file=0x555555649a60 "ust-app.c", line
#3 0x00007ffff7b01f36 in __GI___assert_fail (assertion=0x55555564b765 "usess->active", file=0x555555649a60 "ust-app.c", line=5123, function=0x55555564ecf0 <__PRETTY_FUNCTION__.14199> "ust_
#4 0x00005555555d1f5e in ust_app_global_update (usess=0x7fffe001fb90, app=0x7fffac000b80) at ust-app.c:5123
#5 0x00005555555b60d4 in update_ust_app (app_sock=82) at dispatch.c:71
#6 0x00005555555b7025 in thread_dispatch_ust_registration (data=0x5555556a07f0) at dispatch.c:409
#7 0x00005555555ad5ab in launch_thread (data=0x5555556a0810) at thread.c:65
#8 0x00007ffff7ce6609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#9 0x00007ffff7bed293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
This also happens for the track command. You can replace the `lttng
enable-event -u -a` with `lttng track --userspace --vuid=0` then launch
an app and the same backtrace gets generated.
Cause
=====
During `process_client_msg` the `create_ust_session` function is called
and a ust session is assigned to the "system_trace" session with a
state of `active` set to 0 (false). This is not a problem.
The problem seems to lie with a single call site for
`ust_app_global_update` in `update_ust_app`. The status of the ust
session is not checked before calling the `ust_app_global_update`. It is
important to note that all `ust_app_global_update_all` callsites guard
the call with a check against the status of the session.
Solution
========
Guard the call to `ust_app_global_update` with a check of the ust
session active state.
Known drawbacks
=========
None.
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I14d25d99d0609689247cdfa86130bd0219613581
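The guard described in the commit above amounts to checking the session state at the call site. The sketch below uses hypothetical stand-ins (`example_ust_session`, `example_global_update()`, `example_update_app()`) rather than the lttng-sessiond types and functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a user space tracing session. */
struct example_ust_session {
	int active;
};

static int example_update_count;

/* Like the function in the backtrace, this asserts on an active session. */
static void example_global_update(struct example_ust_session *usess)
{
	assert(usess->active);
	example_update_count++;
}

/* The call site checks the state instead of relying on the assertion. */
static void example_update_app(struct example_ust_session *usess)
{
	if (usess == NULL || !usess->active) {
		return;
	}

	example_global_update(usess);
}
```

An inactive session is simply skipped, which matches how the other call sites mentioned in the commit message already behave.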
Jonathan Rajotte [Tue, 14 Sep 2021 20:10:36 +0000 (16:10 -0400)]
Fix: common: error query for trigger action protocol error
Observed issue
==============
When listing a trigger with a single non-list action the CLI reports an
error in the protocol resulting in an output with no error accounting
for the action.
$ lttng list-triggers
- name: trigger0
owner uid: 1000
condition: session rotation ongoing
session name: test
errors: none
action:notify
Error: Failed to query errors of trigger 'trigger0' (owner uid: 1000): Protocol error occurred
Cause
=====
The `action_path` associated with the query has an index count of 0, as
it should, considering that the single root action element is not
a `list` object.
Inside `lttng_action_path_create_from_payload` a payload view is
initialized with a `len` of 0 since `header->index_count` is 0 as it
should.
The payload view is then validated and is considered invalid since the
validation checks for `len` > 0. The error then bubbles up.
Solution
========
Since the payload view is considered invalid when its length is equal to
zero, simply handle this special case and call
`lttng_action_path_create` directly with the appropriate parameters.
Known drawbacks
=========
None.
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I8f302c3aa78835342c665793908dc02f0a9dece4
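The special case described in the commit above can be sketched with a simplified view type that, like the one in the commit message, is only valid when its length is greater than zero. All names (`example_view`, `example_path`, `example_path_from_payload()`) are hypothetical and the layout is an assumption; the view length is counted in elements here to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct example_view {
	const uint64_t *data;
	size_t len; /* number of indexes in the view */
};

struct example_path {
	size_t index_count;
	uint64_t *indexes;
};

/* A view over zero elements is reported as invalid. */
static int example_view_is_valid(const struct example_view *view)
{
	return view->data != NULL && view->len > 0;
}

static struct example_path *example_path_create(
		const uint64_t *indexes, size_t count)
{
	struct example_path *path = calloc(1, sizeof(*path));

	if (!path) {
		return NULL;
	}

	if (count > 0) {
		path->indexes = malloc(count * sizeof(*indexes));
		if (!path->indexes) {
			free(path);
			return NULL;
		}
		memcpy(path->indexes, indexes, count * sizeof(*indexes));
	}

	path->index_count = count;
	return path;
}

static void example_path_destroy(struct example_path *path)
{
	if (path) {
		free(path->indexes);
		free(path);
	}
}

static struct example_path *example_path_from_payload(
		const uint64_t *payload, size_t index_count)
{
	const struct example_view view = { payload, index_count };

	/* An empty path is legitimate: skip the view validation entirely. */
	if (index_count == 0) {
		return example_path_create(NULL, 0);
	}

	if (!example_view_is_valid(&view)) {
		return NULL;
	}

	return example_path_create(view.data, view.len);
}
```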
Simon Marchi [Tue, 21 Sep 2021 14:31:55 +0000 (10:31 -0400)]
Fix: common: un-hide two rate policy functions
These functions are part of the liblttng-ctl API/ABI, they should not be
hidden.
Change-Id: Ic04bb4e7a0bfd0c7d661228b7ccf5d17dccfd9ba
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Simon Marchi [Tue, 21 Sep 2021 13:30:09 +0000 (09:30 -0400)]
Fix: include: remove unneeded declaration of lttng_session_descriptor_get_session_name
There is a declaration of lttng_session_descriptor_get_session_name in
both session-descriptor.h and session-descriptor-internal.h. Since this
is a function exposed by the API, the one in -internal.h is not needed,
remove it.
Since the removed declaration had LTTNG_HIDDEN, this has the effect of
making the lttng_session_descriptor_get_session_name symbol of
liblttng-ctl exported / part of the ABI. I think it was a mistake that
it wasn't previously exported.
Change-Id: I79d383f012d161a6df42240c6849b1b3af109def
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Francis Deslauriers [Wed, 8 Sep 2021 14:16:23 +0000 (10:16 -0400)]
Fix: Tests: race condition in test_ns_contexts_change
Issue
=====
The test script doesn't wait for the test application to complete before
stopping the tracing session. The race is that depending on the
scheduling the application is not always done generating events when the
session is stopped.
Fix
===
Make the test script wait for the termination of the test app before
stopping the session.
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I29d9b41d2a2ed60a6c42020509c2067442ae332c
Francis Deslauriers [Tue, 7 Sep 2021 21:10:31 +0000 (17:10 -0400)]
Fix: Tests: race condition in test_event_tracker
Background
==========
The `test_event_tracker` file contains test cases where the
event-generating app is executed in two distinct steps. Those two steps
are preparation and execution.
1. the preparation is launching the app in the background, and
2. the execution is actually generating the event that should or
should not be traced depending on the test case.
This is useful to test the tracker feature since we want to ensure that
already running apps are notified properly when changing their tracking
status.
Issue
=====
The `test_event_vpid_track_untrack` test case suffers from a race
condition that is easy to reproduce on Yocto.
The issue is that sometimes events end up in the trace when none is
expected.
This is due to the absence of a synchronization point at the launch of
the app, which leads to the app being scheduled in-between the
track-untrack calls, leading to events being recorded to the trace.
It's easy to reproduce this issue on my machine by adding a `sleep 5`
between the track and untrack calls and setting the `NR_USEC_WAIT`
variable to 1.
Fix
===
Use the testapp `--sync-before-last-event-touch` flag to make the app
create a file when all but the last event are executed. We then have the
app wait until we create a file (`--sync-before-last-event`) to generate
that last event. This way, we are sure no event will be generated when
running the track and untrack commands.
Notes
=====
- This issue affects other test cases in this file.
- This commit fixes a typo in the test header.
- This commit adds `diag` calls to help track which test the output
relates to when reading the log.
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ia2b68128dc9a805526f9748f31ec2c2d95566f31
Simon Marchi [Fri, 20 Aug 2021 19:36:50 +0000 (15:36 -0400)]
tests: tap: remove semicolons in pass / fail definitions
I wanted to write something like this in a test:
if (cond)
fail("message 1");
else
fail("message 2");
And was met with
CC test_action.o
/home/simark/src/lttng-tools/tests/unit/test_action.c: In function ‘main’:
/home/simark/src/lttng-tools/tests/unit/test_action.c:544:9: error: ‘else’ without a previous ‘if’
544 | else
| ^~~~
I then remembered that it was in our coding style to use braces:
if (cond) {
fail("message 1");
} else {
fail("message 2");
}
... which avoids the error. But still, the former form should work.
Fix this by removing the semi-colons in the pass and fail definitions; I
don't think they belong there. Doing so reveals a spot in ini_config.c
where a semi-colon is missing; add it.
Change-Id: I6ff09d496a0b12f34baa6f993cffc69eef611df0
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
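The effect of a trailing semicolon in a macro definition, as discussed in the commit above, can be sketched as follows. `EXAMPLE_FAIL` and `example_record_failure()` are hypothetical names; they are not the TAP library's `fail()` definition.

```c
#include <assert.h>
#include <string.h>

static int example_failures;
static const char *example_last_message = "";

static void example_record_failure(const char *message)
{
	example_failures++;
	example_last_message = message;
}

/*
 * Problematic definition, shown as a comment only:
 *
 *     #define EXAMPLE_FAIL(msg) example_record_failure(msg);
 *
 * Together with the caller's own semicolon it expands to two statements,
 * the second one empty. The empty statement ends the `if`, so a following
 * `else` has no matching `if` and the code does not compile.
 */
#define EXAMPLE_FAIL(msg) example_record_failure(msg)

static void example_check(int cond)
{
	if (cond)
		EXAMPLE_FAIL("message 1");
	else
		EXAMPLE_FAIL("message 2");
}
```

Without the semicolon in the definition, the macro behaves like an ordinary function call in both braced and unbraced `if`/`else` forms.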
Simon Marchi [Mon, 23 Aug 2021 18:01:17 +0000 (14:01 -0400)]
configure: add -Wno-gnu-folding-constant
When building with clang 12, I get:
CC tp.o
In file included from /home/simark/src/lttng-tools/tests/utils/testapp/gen-ust-events/tp.c:10:
In file included from /home/simark/src/lttng-tools/tests/utils/testapp/gen-ust-events/./tp.h:75:
In file included from /tmp/lttng/include/lttng/tracepoint-event.h:69:
In file included from /tmp/lttng/include/lttng/ust-tracepoint-event.h:1021:
/home/simark/src/lttng-tools/tests/utils/testapp/gen-ust-events/./tp.h:30:1: error: variable length array folded to constant array as an extension [-Werror,-Wgnu-folding-constant]
TRACEPOINT_EVENT(tp, tptest,
^
/tmp/lttng/include/lttng/tracepoint.h:791:27: note: expanded from macro 'TRACEPOINT_EVENT'
#define TRACEPOINT_EVENT LTTNG_UST_TRACEPOINT_EVENT
^
/tmp/lttng/include/lttng/ust-tracepoint-event.h:84:2: note: expanded from macro 'LTTNG_UST_TRACEPOINT_EVENT'
LTTNG_UST__TRACEPOINT_EVENT_CLASS(_provider, _name, \
^
/tmp/lttng/include/lttng/ust-tracepoint-event.h:940:10: note: expanded from macro 'LTTNG_UST__TRACEPOINT_EVENT_CLASS'
size_t __dynamic_len[__num_fields]; \
^
From what I understand, this warning simply says that the compiler did
figure out that __num_fields could be known at compile-time, so
__dynamic_len ends up as a regular static array and not a variable
length array (which is what we want). And it warns us that this
behavior is not standard C, but an extension that originated from GNU.
So I think it's fine to ignore it, as it simply warns us that the
behavior we want happens.
Change-Id: Ib7273e7f86c6b04742f8463f925cdbb1fa14041d
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Fri, 13 Aug 2021 15:15:15 +0000 (11:15 -0400)]
Fix: man: lttng-rotate: trace file count/size limitation does not apply
Reported-by: Zach Kramer <Zach.Kramer@cognex.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I337fd06a12d145bdd97c14b4b1894e3676945f63
Francis Deslauriers [Fri, 6 Aug 2021 13:40:20 +0000 (09:40 -0400)]
Fix: runas: less-than-zero comparison of an unsigned value
Fixes two defects found by Coverity related to unsigned integers being
treated as signed.
Reported by Coverity:
CID 1461333: Control flow issues (NO_EFFECT)
This less-than-zero comparison of an unsigned value is never true. "buf_size < 0UL".
CID 1461332: Integer handling issues (NEGATIVE_RETURNS)
"buf_size" is passed to a parameter that cannot be negative.
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Id6d4a71960f2ef34f14c05e66ef5d934b7a3e524
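The class of defect reported above typically appears when a signed return value that uses -1 for errors is stored in an unsigned variable before being checked. The sketch below shows one way to handle it; `example_query_size()` and `example_get_buffer_size()` are hypothetical functions, not the run-as code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical query that returns -1 on error, a size otherwise. */
static long example_query_size(int fail)
{
	return fail ? -1 : 1024;
}

/*
 * Defective form, shown as a comment only:
 *
 *     size_t buf_size = example_query_size(fail);
 *     if (buf_size < 0) { ... }
 *
 * The comparison is never true, because -1 was already converted to a
 * very large unsigned value.
 */
static int example_get_buffer_size(int fail, size_t *buf_size)
{
	/* Keep the result signed until the error check is done. */
	const long ret = example_query_size(fail);

	assert(buf_size);

	if (ret < 0) {
		return -1;
	}

	*buf_size = (size_t) ret;
	return 0;
}
```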
Jérémie Galarneau [Thu, 5 Aug 2021 21:30:28 +0000 (17:30 -0400)]
run-as: clean-up: handle_one_cmd: mark initial uid/gid as const
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ia4fdf5b52a89bc499433550c0de8aeb50d3dea91
Francis Deslauriers [Fri, 23 Jul 2021 20:27:00 +0000 (16:27 -0400)]
Fix: runas: supplementary groups are ignored on lttng save
Observed issue
==============
On `lttng save` the following is reported to the user:
$ sudo -u my_user lttng save -o /tmp/my_dir my_session_name
Error: Permission denied
Note that:
* the running lttng-sessiond is root,
* "my_user" is part of the tracing group,
* "my_user" primary group is "my_user" and is part of group "my_dummy_group"
* The "/tmp/my_dir" has the following permissions:
drwxrwx--- 2 root my_dummy_group 4096 Jul 26 16:39 /tmp/my_dir/
Cause
=====
The supplementary groups are not initialized when the run-as process
demotes itself to the user "my_user" to perform the recursive mkdir
required by the `lttng save` command.
From the point of view of the kernel, at the moment of performing the
mkdir call, the permissions look like this:
euid: uid of "my_user"
egid: primary gid of "my_user"
supplementary group list: "root"
Note that the kernel does not treat the presence of the root group in
the supplementary group list in any special way. Since "root gid" !=
"my_dummy_group gid" the directory creation is refused.
Solution
========
Use initgroups(3) to initialize the supplementary group list.
Known drawbacks
=========
None.
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I58656a3107e4f7b59a2391a4759988401cad7a2b
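A minimal sketch of dropping privileges with initgroups(3), as the solution above suggests, might look like the following. The `example_demote_to_user()` helper is hypothetical, and the use of setgid(2) and setuid(2) is an assumption made for brevity; the run-as worker may use different calls to change its effective IDs.

```c
#include <assert.h>
#include <grp.h>
#include <pwd.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Switch the calling process to `user_name`, including its supplementary
 * groups. Returns 0 on success, -1 on error (unknown user or insufficient
 * privileges).
 */
static int example_demote_to_user(const char *user_name)
{
	const struct passwd *pw;

	assert(user_name);

	pw = getpwnam(user_name);
	if (!pw) {
		return -1;
	}

	/*
	 * The supplementary group list must be replaced while the process
	 * is still privileged; otherwise it keeps the groups of the
	 * original user (e.g. root's).
	 */
	if (initgroups(pw->pw_name, pw->pw_gid) != 0) {
		return -1;
	}

	/* Change the group before the user, for the same reason. */
	if (setgid(pw->pw_gid) != 0) {
		return -1;
	}

	if (setuid(pw->pw_uid) != 0) {
		return -1;
	}

	return 0;
}
```

The order of the calls matters: once the user ID has been dropped, the process generally no longer has the privilege needed to change its groups.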
Jérémie Galarneau [Tue, 3 Aug 2021 18:27:03 +0000 (14:27 -0400)]
Docs: lttng-event-rule(7): --exclude does not exist, use --exclude-name
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I92bb8e1b362d121172368897e6a9d4f538d4c68d
Francis Deslauriers [Thu, 8 Jul 2021 19:46:02 +0000 (15:46 -0400)]
sessiond: logging typo: {triger, triggger} -> trigger
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ida8faafc4c12f9817d3ee097bb648c10bd5ff854
Simon Marchi [Mon, 2 Aug 2021 01:02:39 +0000 (21:02 -0400)]
Fix: lttng: free sessions in cmd_destroy
When doing `lttng destroy`, I get:
Direct leak of 4385 byte(s) in 1 object(s) allocated from:
#0 0x7f74ae025459 in __interceptor_calloc /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
#1 0x7f74add4129a in zmalloc /home/simark/src/lttng-tools/src/common/macros.h:45
#2 0x7f74add42b9d in recv_sessiond_optional_data /home/simark/src/lttng-tools/src/lib/lttng-ctl/lttng-ctl.c:494
#3 0x7f74add42f9a in lttng_ctl_ask_sessiond_fds_varlen /home/simark/src/lttng-tools/src/lib/lttng-ctl/lttng-ctl.c:596
#4 0x7f74add41714 in lttng_ctl_ask_sessiond_varlen_no_cmd_header /home/simark/src/lttng-tools/src/lib/lttng-ctl/lttng-ctl-helper.h:58
#5 0x7f74add41747 in lttng_ctl_ask_sessiond /home/simark/src/lttng-tools/src/lib/lttng-ctl/lttng-ctl-helper.h:78
#6 0x7f74add4a922 in lttng_list_sessions /home/simark/src/lttng-tools/src/lib/lttng-ctl/lttng-ctl.c:2105
#7 0x56472bcbdf80 in cmd_destroy /home/simark/src/lttng-tools/src/bin/lttng/commands/destroy.c:330
#8 0x56472bd00764 in handle_command /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:237
#9 0x56472bd01218 in parse_args /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:426
#10 0x56472bd0151a in main /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:475
#11 0x7f74ad963b24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24)
This is due to cmd_destroy not free'ing the result of
lttng_list_sessions. Fix that.
Change-Id: Iff2e75e6ec1cdcd0bdfdbbc3d5099422e592905b
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Simon Marchi [Mon, 2 Aug 2021 00:33:23 +0000 (20:33 -0400)]
Fix: lttng: free domains and channels in get_session_stats_str
When doing `lttng stop`, I get:
Direct leak of 656 byte(s) in 1 object(s) allocated from:
#0 0x7f970719e459 in __interceptor_calloc /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
#1 0x7f9706eba29a in zmalloc /home/simark/src/lttng-tools/src/common/macros.h:45
#2 0x7f9706ebbb9d in recv_sessiond_optional_data /home/simark/src/lttng-tools/src/lib/lttng-ctl/lttng-ctl.c:494
#3 0x7f9706ebbf9a in lttng_ctl_ask_sessiond_fds_varlen /home/simark/src/lttng-tools/src/lib/lttng-ctl/lttng-ctl.c:596
#4 0x7f9706eba714 in lttng_ctl_ask_sessiond_varlen_no_cmd_header /home/simark/src/lttng-tools/src/lib/lttng-ctl/lttng-ctl-helper.h:58
#5 0x7f9706eba747 in lttng_ctl_ask_sessiond /home/simark/src/lttng-tools/src/lib/lttng-ctl/lttng-ctl-helper.h:78
#6 0x7f9706ec4604 in lttng_list_channels /home/simark/src/lttng-tools/src/lib/lttng-ctl/lttng-ctl.c:2262
#7 0x55837235c4e7 in get_session_stats_str /home/simark/src/lttng-tools/src/bin/lttng/utils.c:499
#8 0x55837235bf73 in print_session_stats /home/simark/src/lttng-tools/src/bin/lttng/utils.c:445
#9 0x55837231cc12 in stop_tracing /home/simark/src/lttng-tools/src/bin/lttng/commands/stop.c:138
#10 0x55837231d062 in cmd_stop /home/simark/src/lttng-tools/src/bin/lttng/commands/stop.c:229
#11 0x55837235e63e in handle_command /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:237
#12 0x55837235f0f2 in parse_args /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:426
#13 0x55837235f3f4 in main /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:475
#14 0x7f9706adcb24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24)
Direct leak of 308 byte(s) in 1 object(s) allocated from:
#0 0x7f970719e459 in __interceptor_calloc /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
#1 0x7f9706eba29a in zmalloc /home/simark/src/lttng-tools/src/common/macros.h:45
#2 0x7f9706ebbb9d in recv_sessiond_optional_data /home/simark/src/lttng-tools/src/lib/lttng-ctl/lttng-ctl.c:494
#3 0x7f9706ebbf9a in lttng_ctl_ask_sessiond_fds_varlen /home/simark/src/lttng-tools/src/lib/lttng-ctl/lttng-ctl.c:596
#4 0x7f9706eba714 in lttng_ctl_ask_sessiond_varlen_no_cmd_header /home/simark/src/lttng-tools/src/lib/lttng-ctl/lttng-ctl-helper.h:58
#5 0x7f9706eba747 in lttng_ctl_ask_sessiond /home/simark/src/lttng-tools/src/lib/lttng-ctl/lttng-ctl-helper.h:78
#6 0x7f9706ec421c in lttng_list_domains /home/simark/src/lttng-tools/src/lib/lttng-ctl/lttng-ctl.c:2220
#7 0x55837235c3d3 in get_session_stats_str /home/simark/src/lttng-tools/src/bin/lttng/utils.c:484
#8 0x55837235bf73 in print_session_stats /home/simark/src/lttng-tools/src/bin/lttng/utils.c:445
#9 0x55837231cc12 in stop_tracing /home/simark/src/lttng-tools/src/bin/lttng/commands/stop.c:138
#10 0x55837231d062 in cmd_stop /home/simark/src/lttng-tools/src/bin/lttng/commands/stop.c:229
#11 0x55837235e63e in handle_command /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:237
#12 0x55837235f0f2 in parse_args /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:426
#13 0x55837235f3f4 in main /home/simark/src/lttng-tools/src/bin/lttng/lttng.c:475
#14 0x7f9706adcb24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24)
This is due to the get_session_stats_str function not free'ing the
results of lttng_list_channels and lttng_list_domains. Fix that.
Change-Id: I4c200d3df41bf09bdce8eadb000abbff7fe5a751
Signed-off-by: Simon Marchi <simon.marchi@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Mon, 19 Jul 2021 21:21:17 +0000 (17:21 -0400)]
Tests fix: unix socket: leaked socket of connection to child
The child_connection socket is only used by the parent in the
credentials passing test. The teardown assumes the reverse which causes
the socket to be leaked.
1458471 Resource leak
The system resource will not be reclaimed and reused, reducing the
future availability of the resource.
In test_creds_passing: Leak of memory or pointers to system
resources (CWE-404)
Reported-by: Coverity Scan
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I2ead9abbfc189ffbdd71a27f6376d0b001cdc2a3
Jérémie Galarneau [Mon, 19 Jul 2021 21:17:39 +0000 (17:17 -0400)]
Fix: sessiond: notification: missing unlock on client skip
Skipping a client must be performed by using the dedicated "skip_client"
label which will unlock the client's lock before continuing the loop
rather than using 'continue' directly.
Currently, a client will remain locked when a hidden trigger emits
a notification to which it is subscribed.
1458230 Missing unlock
May result in deadlock if there is another attempt to acquire the lock.
In notification_client_list_send_evaluation: Missing a release of a lock
on a path (CWE-667)
Reported-by: Coverity Scan
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I8b69395b91b0ea59ae5e0beadebd9099db623121
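The "skip_client" pattern described in the commit above can be sketched as a loop that always passes through a common unlock point. The `example_client` structure, its `skip` flag, and the `lock_depth` counter are hypothetical and only serve to make the locking visible; they do not mirror the notification thread's data structures.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct example_client {
	pthread_mutex_t lock;
	int skip;       /* hypothetical: client must not be notified */
	int notified;   /* number of notifications delivered */
	int lock_depth; /* 1 while the lock is held, for demonstration */
};

static void example_client_lock(struct example_client *client)
{
	pthread_mutex_lock(&client->lock);
	client->lock_depth++;
}

static void example_client_unlock(struct example_client *client)
{
	client->lock_depth--;
	pthread_mutex_unlock(&client->lock);
}

static void example_notify_all(struct example_client *clients, size_t count)
{
	size_t i;

	for (i = 0; i < count; i++) {
		struct example_client *client = &clients[i];

		example_client_lock(client);

		if (client->skip) {
			/* A bare `continue` here would leave the lock held. */
			goto skip_client;
		}

		client->notified++;
skip_client:
		example_client_unlock(client);
	}
}
```

Routing every early exit of the loop body through the label keeps the lock and unlock calls paired on all paths.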
Jérémie Galarneau [Fri, 16 Jul 2021 18:47:53 +0000 (14:47 -0400)]
liblttng-ctl: hide logger_thread_name
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I4eb5a86029c6220ad4f48d382ec26126fd82e443
Jérémie Galarneau [Fri, 16 Jul 2021 18:42:24 +0000 (14:42 -0400)]
liblttng-ctl: hide MI trigger command variables
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I45eec5bb0fd3353c8f1257b3c94ef08440114b21
Francis Deslauriers [Tue, 25 May 2021 19:57:59 +0000 (15:57 -0400)]
Cleanup: rename `get_domain_str()` -> `lttng_domain_type_str()`
Both functions currently exist in the code base and accomplish the same
goal. Let's keep only one of them.
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I2254b846f0b5bdc883c86d970fde7daffa9e6155
Jérémie Galarneau [Fri, 16 Jul 2021 17:29:07 +0000 (13:29 -0400)]
.gitignore: Add hidden trigger test
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Iab0fe77c0d4607d5469a7aa57d6bd784d47d8609
Jérémie Galarneau [Thu, 15 Jul 2021 00:44:44 +0000 (20:44 -0400)]
Test: unix socket: test credential passing
Since the credential passing over UNIX sockets now makes use of the pid,
the compatibility wrappers have become more complex as each platform
appears to define its own way of accessing this information.
This new test:
- creates a named unix socket,
- forks,
- gets the parent and child to connect,
- sends the child's credentials as a data payload and as credentials
verified by the kernel,
- has the parent check that the two sets of credentials are equal.
This is more of a sanity check for the compatibility wrappers used on
non-Linux platforms.
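
The idea behind the test can be sketched on Linux as follows. This is not the test added by this commit: it uses `socketpair()` instead of a named socket, relies on the Linux-specific `SO_PASSCRED` and `SCM_CREDENTIALS` mechanisms rather than the compatibility wrappers, and `struct creds_payload` and `check_creds_passing()` are hypothetical names.

```c
#define _GNU_SOURCE /* struct ucred and SCM_CREDENTIALS are Linux-specific */
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical payload: what the child claims about itself. */
struct creds_payload {
	pid_t pid;
	uid_t uid;
	gid_t gid;
};

/*
 * Returns 0 when the credentials claimed by the child in the data payload
 * match those attached by the kernel, -1 otherwise.
 */
int check_creds_passing(void)
{
	int sv[2], enable = 1, ret = -1;
	pid_t child;
	struct creds_payload claimed;
	struct ucred verified;
	union {
		char buf[CMSG_SPACE(sizeof(struct ucred))];
		struct cmsghdr align;
	} control;
	struct iovec iov = { .iov_base = &claimed, .iov_len = sizeof(claimed) };
	struct msghdr msg = { 0 };
	struct cmsghdr *cmsg;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) {
		return -1;
	}
	/* Ask the kernel to attach the sender's credentials to received data. */
	if (setsockopt(sv[0], SOL_SOCKET, SO_PASSCRED, &enable,
			sizeof(enable)) < 0) {
		goto end;
	}

	child = fork();
	if (child < 0) {
		goto end;
	}
	if (child == 0) {
		/* Child: send its own view of its credentials as plain data. */
		struct creds_payload self = { getpid(), getuid(), getgid() };
		ssize_t sent = write(sv[1], &self, sizeof(self));

		_exit(sent == (ssize_t) sizeof(self) ? 0 : 1);
	}

	/* Parent: receive the payload and the kernel-provided credentials. */
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = control.buf;
	msg.msg_controllen = sizeof(control.buf);
	if (recvmsg(sv[0], &msg, 0) != (ssize_t) sizeof(claimed)) {
		goto reap;
	}
	cmsg = CMSG_FIRSTHDR(&msg);
	if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
			cmsg->cmsg_type != SCM_CREDENTIALS) {
		goto reap;
	}
	memcpy(&verified, CMSG_DATA(cmsg), sizeof(verified));
	if (claimed.pid == child && claimed.pid == verified.pid &&
			claimed.uid == verified.uid &&
			claimed.gid == verified.gid) {
		ret = 0;
	}
reap:
	waitpid(child, NULL, 0);
end:
	close(sv[0]);
	close(sv[1]);
	return ret;
}
```

Comparing the self-reported values against the kernel-provided ones is what makes this a sanity check of the retrieval path rather than of the payload itself.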
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ic0a6213afca7cc95a00617b052e7a145fc88625c
Jérémie Galarneau [Wed, 14 Jul 2021 19:19:15 +0000 (15:19 -0400)]
Build fix: retrieve unix socket peer PID on non-unix platforms
The previous attempt at extending the credential retrieval wrapper was
broken and didn't build on FreeBSD, macOS, and cygwin.
A platform-specific way of retrieving the PID of a unix peer is
implemented for FreeBSD (getsockopt using LOCAL_PEERCRED, note that the
cr_pid field is only available from FreeBSD 13 and up),
macOS (getsockopt using LOCAL_PEERPID, macOS 10.8+), and
Solaris (getpeerucred).
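
A rough sketch of such a per-platform wrapper is shown below. `get_peer_pid()` is a hypothetical name, not the wrapper used in the tree. The Linux branch (`SO_PEERCRED`) is added here for comparison, and the non-Linux branches simply follow the calls named above and should be read as untested assumptions.

```c
#define _GNU_SOURCE /* struct ucred on Linux */
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

#if defined(__FreeBSD__)
#include <sys/ucred.h>
#elif defined(__sun__)
#include <ucred.h>
#endif

/*
 * Store the PID of the peer connected to the UNIX socket `fd` in `pid`.
 * Returns 0 on success, -1 on error or on an unsupported platform.
 */
int get_peer_pid(int fd, pid_t *pid)
{
#if defined(__linux__)
	struct ucred cred;
	socklen_t len = sizeof(cred);

	if (getsockopt(fd, SOL_SOCKET, SO_PEERCRED, &cred, &len) < 0) {
		return -1;
	}
	*pid = cred.pid;
	return 0;
#elif defined(__FreeBSD__)
	struct xucred cred;
	socklen_t len = sizeof(cred);

	/* The cr_pid field is only available from FreeBSD 13 and up. */
	if (getsockopt(fd, SOL_LOCAL, LOCAL_PEERCRED, &cred, &len) < 0) {
		return -1;
	}
	*pid = cred.cr_pid;
	return 0;
#elif defined(__APPLE__)
	socklen_t len = sizeof(*pid);

	/* LOCAL_PEERPID is available from macOS 10.8. */
	return getsockopt(fd, SOL_LOCAL, LOCAL_PEERPID, pid, &len) < 0 ? -1 : 0;
#elif defined(__sun__)
	ucred_t *cred = NULL;

	if (getpeerucred(fd, &cred) < 0) {
		return -1;
	}
	*pid = ucred_getpid(cred);
	ucred_free(cred);
	return 0;
#else
	(void) fd;
	(void) pid;
	return -1;
#endif
}
```

Keeping the platform checks inside one function lets callers stay free of `#ifdef` blocks.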
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ifcf522c70ee4c2e0799293ae0961f41aebff5056
Jérémie Galarneau [Mon, 12 Jul 2021 22:42:57 +0000 (18:42 -0400)]
Fix: sessiond: notification: find_tracer_event_source returns NULL
Due to a bad edit of the original patch (my bad!)
find_tracer_event_source_element always returns NULL.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I7febee1d803034a06d5063a2cc9179c4edef4809
Francis Deslauriers [Thu, 8 Jul 2021 16:35:58 +0000 (12:35 -0400)]
Tests: MI: add `diag` statements to test functions
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ie56e23a3d0796d1edb07e2fd7cdc259816ac0133
Francis Deslauriers [Thu, 21 Jan 2021 17:00:03 +0000 (12:00 -0500)]
Cleanup: fix comments in `duplicate_{stream,channel}_object()`
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I5089d09880d21842bf264f6c30ec7fd5e72b93df
Jérémie Galarneau [Fri, 9 Jul 2021 17:00:48 +0000 (13:00 -0400)]
Tests: add hidden trigger visibility test
Add a regression test for the previous commit that verifies that
internal triggers used by the session daemon to implement various
features (automatic session rotations based on their consumed size, in
this instance) are not visible to users of liblttng-ctl.
The test is written in C to use the library directly. This is needed
since the `lttng` client filters out anonymous triggers and thus would
not allow us to see those triggers since they are anonymous by default.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I1b8fca648953b8cba49a9888593b3486457d01b2
Jérémie Galarneau [Fri, 9 Jul 2021 17:00:56 +0000 (13:00 -0400)]
Fix: sessiond: list-triggers: don't return internal triggers
The session daemon uses triggers internally. For instance, the trigger
and notification subsystem is used to implement the automatic rotation
of sessions based on a size threshold.
Currently, a user of the C API will see those internal triggers if it is
running as the same user as the session daemon. This can be unexpected
by user code that assumes it will be alone in creating triggers.
Moreover, it is possible for external users to unregister those triggers
which would cause bugs.
As the triggers gain more capabilities, it is likely that the session
daemon will keep using them to implement features internally. Thus,
an internal "is_hidden" property is introduced in lttng_trigger.
A "hidden" trigger is a trigger that is not returned by the listings.
It is used to hide triggers that are used internally by the session
daemon so that they can't be listed nor unregistered by external
clients.
This is a property that can only be set internally by the session
daemon. As such, it is not serialized nor set by a
"create_from_buffer" constructor.
The hidden property is preserved by copies.
Note that notifications originating from a "hidden" trigger will not
be sent to clients that are not within the session daemon's process.
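
The listing behaviour can be illustrated with the following sketch. `struct trigger` and `list_visible_triggers()` are hypothetical and much simpler than `lttng_trigger`; only the idea of an internal `is_hidden` flag that is skipped by listings comes from the description above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified trigger object. */
struct trigger {
	const char *name;
	bool is_hidden; /* only ever set internally, never deserialized */
};

/*
 * Copy pointers to the visible triggers of `all` into `out`, which must
 * have room for `count` entries, and return how many were copied.
 */
size_t list_visible_triggers(const struct trigger *all, size_t count,
		const struct trigger **out)
{
	size_t i, visible = 0;

	for (i = 0; i < count; i++) {
		if (all[i].is_hidden) {
			/* Internal trigger: never exposed to external clients. */
			continue;
		}
		out[visible++] = &all[i];
	}
	return visible;
}
```

Since a hidden trigger never appears in a listing, an external client has no object with which to request its unregistration.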
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I61b7949075172fcd428289e2eb670d03c19bdf71
Jérémie Galarneau [Thu, 8 Jul 2021 21:57:45 +0000 (17:57 -0400)]
unix: receive pid on non-linux platforms
Add a `pid` to the lttng_sock_cred structure definition used on
non-Linux platforms and receive the peer's PID when receiving
credentials.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I9c92f6dda6441deca58f9cc85f846f5031cceb6e
Jérémie Galarneau [Thu, 8 Jul 2021 18:39:59 +0000 (14:39 -0400)]
Clean-up: sessiond: return an lttng_error_code from list_triggers
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I5d44b508a2a5211894c0cc7b6d51a9a03dc8b3f2
Francis Deslauriers [Wed, 26 May 2021 20:05:16 +0000 (16:05 -0400)]
notification-thread: remove fd from pollset on LPOLLHUP and friends
When an app dies, it's possible that the notification thread gets an
epoll event (`LPOLLHUP`) that the socket was closed before it gets the
_REMOVE_TRACER_SOURCE command for that source.
In such cases, the notification thread should simply remove the file
descriptor from the pollset and drain the notification on that file
descriptor. It should _not_ remove the _source_element object from the
list.
The removal from the list should only be done when it receives the
_REMOVE_TRACER_SOURCE command.
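
The split between the two events can be sketched with plain epoll on Linux. This is only an illustration: `struct source_element`, `handle_source_event()` and `remove_source()` are hypothetical names, the notification thread uses its own poll wrappers, and draining the descriptor is left out for brevity.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical stand-in for a tracer event source element. */
struct source_element {
	int fd;
	bool in_pollset; /* still monitored by epoll */
	bool in_list;    /* still tracked; cleared only by the remove command */
};

/*
 * Handle one epoll event for `source`. On hang-up or error, stop polling
 * the descriptor but leave the element in the list.
 */
int handle_source_event(int epfd, struct source_element *source,
		uint32_t revents)
{
	if (revents & (EPOLLHUP | EPOLLERR | EPOLLRDHUP)) {
		if (epoll_ctl(epfd, EPOLL_CTL_DEL, source->fd, NULL) < 0) {
			return -1;
		}
		source->in_pollset = false;
	}
	return 0;
}

/* Counterpart of the explicit "remove tracer source" command. */
void remove_source(struct source_element *source)
{
	source->in_list = false;
}
```

Keeping the two flags separate means the later remove command still finds the element it expects, whichever event arrives first.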
Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I9525315f9e92d0f6ae5e84e26b83a6b7207dce54
Jérémie Galarneau [Wed, 7 Jul 2021 18:59:02 +0000 (14:59 -0400)]
Tests: fix: list triggers: bc missing on system
`bc` is not part of the test suite's dependencies and can be replaced,
in this instance, by a use of `printf`.
This use of `bc` caused a number of failures on the CI's Lava workers.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I1a1b24a23325754c26ebedfdb6b7728378381d97
Jérémie Galarneau [Mon, 5 Jul 2021 18:25:40 +0000 (14:25 -0400)]
Clean-up: event-expr: remove unreachable code
1452699 Logically dead code
The indicated dead code may have performed some action; that action will
never occur.
In lttng_event_expr_array_field_element_create: Code can never be
reached because of a logical contradiction (CWE-561)
Reported-by: Coverity Scan
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I301e73c8e0cc7b9c4fb889e5bf7ef30d6ecf7d9f
Jérémie Galarneau [Mon, 5 Jul 2021 18:21:19 +0000 (14:21 -0400)]
Fix: lttng: remove-trigger: null dereference on MI initialization error
Failures to create an MI writer instance will result in a dereference of
the MI writer when attempting to close the command's output element.
1457842 Dereference after null check
Either the check against null is unnecessary, or there may be a null
pointer dereference.
In cmd_add_trigger: Pointer is checked against null but then
dereferenced anyway (CWE-476)
Reported-by: Coverity Scan
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I0bc71bf6c83df7d9d938cf93a12d5f6cf6d7ae36
Jérémie Galarneau [Mon, 5 Jul 2021 18:18:27 +0000 (14:18 -0400)]
Fix: lttng: list-trigger: leak of error query in query callbacks
1457841 Resource leak
The system resource will not be reclaimed and reused, reducing the
future availability of the resource.
In mi_error_query_trigger_callback: Leak of memory or pointers to system
resources (CWE-404)
Reported-by: Coverity Scan
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I4e2cde41d77e5299d1758e8c9387b0a1c63efd17
Jérémie Galarneau [Mon, 5 Jul 2021 18:16:00 +0000 (14:16 -0400)]
Fix: lttng: add-trigger: null dereference on MI initialization error
Failures to create an MI writer instance will result in a dereference of
the MI writer when attempting to close the command's output element.
1457842 Dereference after null check
Either the check against null is unnecessary, or there may be a null
pointer dereference.
In cmd_add_trigger: Pointer is checked against null but then
dereferenced anyway (CWE-476)
Reported-by: Coverity Scan
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I98b844d2f1c7abd43bd42ee472759de57b34484e
Jérémie Galarneau [Wed, 30 Jun 2021 22:41:24 +0000 (18:41 -0400)]
lttng: add-trigger: print generated trigger name
Print the generated trigger name when `add-trigger` succeeds.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Id858880260513b9a10c4ce5022a95c476e3e32aa
Jérémie Galarneau [Wed, 30 Jun 2021 22:47:52 +0000 (18:47 -0400)]
sessiond: generate trigger name: name triggers with the 'trigger' prefix
Generated trigger names currently have the form TN, where N is the
number of generated trigger names over the lifetime of the session
daemon.
The form 'triggerN' seems more in line with autogenerated names such
as channel names (e.g. 'channel0').
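
A minimal sketch of such a name generator follows. `generate_trigger_name()` is a hypothetical helper, and starting the counter at 0 is an assumption made by analogy with 'channel0'.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Write the next generated trigger name ("trigger0", "trigger1", ...) to
 * `buf` and increment `counter`, which holds the number of names generated
 * so far. Returns 0 on success, -1 if the name does not fit in `len` bytes.
 */
int generate_trigger_name(uint64_t *counter, char *buf, size_t len)
{
	int ret = snprintf(buf, len, "trigger%" PRIu64, *counter);

	if (ret < 0 || (size_t) ret >= len) {
		return -1;
	}
	(*counter)++;
	return 0;
}
```

The counter is only advanced on success, so a failed attempt does not consume a name.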
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Id61cd4716bb4c080d9242853c366e14542f60f7c
Jérémie Galarneau [Thu, 1 Jul 2021 13:12:21 +0000 (09:12 -0400)]
Revert "lttng: add-trigger: print generated trigger name"
This reverts commit
8310270a50784aced2af5b21ab23bc7bd9dee47f.
This change is still under review.
Change-Id: If75aa02e2e5daa0bfbcf30bea0a2b54c4aca1fd4
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Jérémie Galarneau [Wed, 30 Jun 2021 22:41:24 +0000 (18:41 -0400)]
lttng: add-trigger: print generated trigger name
Print the generated trigger name when `add-trigger` succeeds. Also,
no message is emitted when a trigger is successfully registered as
the command will print an error message if any error occurs.
There is also no need to parrot the trigger's name if it was specified
by the user.
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I9607fbd358298b036bd533834143eb5e9d185cd0
Jonathan Rajotte [Mon, 7 Jun 2021 22:11:01 +0000 (18:11 -0400)]
MI: xsd: bump to 4.1
No breaking changes were made to the xsd. Only objects related to
triggers, event-rules, actions, condition, and error-query were added.
They do not interfere with the current MI.
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ia057c0fbea34f8e5c48cb8d8d307f004acc95a00
Jonathan Rajotte [Mon, 7 Jun 2021 22:03:17 +0000 (18:03 -0400)]
Tests: trigger: mi: use utils.sh xsd versions for xml diff
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ic1536218b468d300ceb3d16ca160b8a8b891edfc
Jonathan Rajotte [Mon, 7 Jun 2021 21:56:37 +0000 (17:56 -0400)]
Tests: utils: regroup xml utils to utils.sh
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Idfa0f05d1bde75f4b02c903699281a86494b435f
Jonathan Rajotte [Wed, 26 May 2021 22:08:14 +0000 (18:08 -0400)]
Tests: MI: {add, list, remove}-trigger
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ica66a759d961cc122c1a1b81ce69fa54b0e78c78
Jonathan Rajotte [Thu, 27 May 2021 01:53:19 +0000 (21:53 -0400)]
MI: xsd: add objects type definition related to trigger
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: If28306f8aaf24890a6d834e9ff69bd00de3da295
Jonathan Rajotte [Thu, 27 May 2021 01:51:26 +0000 (21:51 -0400)]
MI: xsd: sort output_type
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: If2206e4a1c7a54d6d6bc4887c1925b12f035232b
Jonathan Rajotte [Thu, 27 May 2021 01:48:20 +0000 (21:48 -0400)]
MI: xsd: sort command_string_type
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I8df9d69aeaf93050c405ff876ad697efac7c4021
Jonathan Rajotte [Wed, 26 May 2021 21:17:02 +0000 (17:17 -0400)]
Add pretty_xml utils
This util reads from stdin and outputs indented/formatted XML.
It is equivalent to "xmllint --format -".
It will be used for MI trigger testing. For testing we will essentially
diff the output of the command against the expected output. While a
nicely formatted multi-line output is not necessary for a machine to
do the diff, the human that will have to debug it will surely appreciate
it.
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ie1597644941c55ce3e59f7ff16f196ac36325179
Jonathan Rajotte [Wed, 26 May 2021 20:39:12 +0000 (16:39 -0400)]
Move xml utils from mi subfolder to xml-utils folder
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I268dc544bf4f72f61a701ac3efd0b12488cc2f64
Jonathan Rajotte [Fri, 28 May 2021 18:32:37 +0000 (14:32 -0400)]
Fix: lttng_triggers count is not equal to the size of the sorted trigger array
Since anonymous triggers can be present in the original lttng_triggers
but are not added to the sorting list, the count used while iterating
on the sorted list must be the size of the list itself and not that of
lttng_triggers.
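
The bug class can be illustrated with the following sketch, in which the number of sorted entries is returned separately from the size of the input. `struct trigger` and `sort_named_triggers()` are hypothetical; representing an anonymous trigger by a NULL name is an assumption made for the example.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified trigger: a NULL name means "anonymous". */
struct trigger {
	const char *name;
};

static int compare_by_name(const void *a, const void *b)
{
	const struct trigger *const *ta = a;
	const struct trigger *const *tb = b;

	return strcmp((*ta)->name, (*tb)->name);
}

/*
 * Fill `sorted`, which must have room for `count` entries, with the named
 * triggers of `all`, sorted by name. The return value is the number of
 * entries in `sorted`: it can be smaller than `count` and is the only
 * valid bound when iterating over `sorted`.
 */
size_t sort_named_triggers(const struct trigger *all, size_t count,
		const struct trigger **sorted)
{
	size_t i, sorted_count = 0;

	for (i = 0; i < count; i++) {
		if (all[i].name) {
			sorted[sorted_count++] = &all[i];
		}
	}
	qsort(sorted, sorted_count, sizeof(*sorted), compare_by_name);
	return sorted_count;
}
```

Iterating up to `count` instead of the returned value would read uninitialized entries whenever at least one trigger is anonymous.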
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ifb1802345199cb20fbb6d401f316be918b8a6443
Jonathan Rajotte [Wed, 26 May 2021 17:04:41 +0000 (13:04 -0400)]
MI: {add, list, remove} trigger
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ie16c5c3a894b921e032a99ed3deda4ed5da17e78
Jonathan Rajotte [Fri, 7 May 2021 01:26:17 +0000 (21:26 -0400)]
MI: implement all objects related to trigger machine interface
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Idb2045135b1ba87853d6214b149afbe27bb7a1ca
Jonathan Rajotte [Thu, 11 Feb 2021 15:40:39 +0000 (10:40 -0500)]
Move event-expr-to-bytecode to event-expr
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I74a4b823ae7bbcbb062dbb9a2a0f84785bca287a
Jonathan Rajotte [Thu, 11 Feb 2021 15:18:38 +0000 (10:18 -0500)]
Move event-expr from liblttng-ctl to libcommon
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I31c65cd7f63fa4e1c918285b02ab2ab2e82549f6
Jonathan Rajotte [Thu, 4 Feb 2021 20:57:56 +0000 (15:57 -0500)]
MI: support double element
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I97411fea238d8b1275028d3d04a6f4f376624001
Jérémie Galarneau [Wed, 16 Jun 2021 19:08:21 +0000 (15:08 -0400)]
Fix: rotation client example: leak of handle on error
1452927 Resource leak
The system resource will not be reclaimed and reused, reducing the
future availability of the resource.
In setup_session: Leak of memory or pointers to system
resources (CWE-404)
CID
1452927 (#1 of 1): Resource leak (RESOURCE_LEAK) 8. leaked_storage:
Variable chan_handle going out of scope leaks the storage it points to
Reported-by: Coverity Scan
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I4c215ac4a86f9f70fd5c9d3aa13f944d3d7a2cc7
Michael Jeanson [Mon, 14 Jun 2021 15:18:19 +0000 (11:18 -0400)]
Silence warnings on GCC 4.8 with -Wmaybe-uninitialized
We still build on SLES12 with GCC 4.8 in which '-Wmaybe-uninitialized'
doesn't seem to be the sharpest tool in the shed. Add explicit
initialization of 'ret' to silence the warnings.
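
The following sketch shows the kind of code where such an initialization is added. `process()` is a hypothetical function and it is not claimed that GCC 4.8 warns on this exact snippet. It only illustrates a variable that is assigned and read under the same condition, a correlation that older compilers may fail to track.

```c
#include <stdbool.h>

/*
 * Double `input` into `output` when an input is provided. Returns -1 when
 * the provided input is negative, 0 otherwise.
 */
int process(bool have_input, int input, int *output)
{
	/* Explicit initialization, only there to silence the warning. */
	int ret = 0;

	if (have_input) {
		ret = input < 0 ? -1 : 0;
	}

	*output = have_input ? input * 2 : 0;

	/* `ret` is only read on the path where it was assigned above. */
	if (have_input && ret < 0) {
		return -1;
	}
	return 0;
}
```

The initial value is never observed, so the change does not alter behaviour.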
Change-Id: I1f9de535b6be48357735af106ff555ab9eceb730
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Philippe Proulx [Tue, 15 Jun 2021 03:07:32 +0000 (23:07 -0400)]
doc/man/common-footer.txt: add missing non-breaking space
Signed-off-by: Philippe Proulx <eeppeliteloop@gmail.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ibefd4e7448920f0f346697eea5e1b5d250a93d1f
Philippe Proulx [Tue, 15 Jun 2021 02:52:02 +0000 (22:52 -0400)]
Rename "tracing session" -> "recording session"
Starting from LTTng 2.13, _tracing_ is defined as attempting to execute
one or more actions when emitting an event, which is very close to the
trigger definition.
To highlight that a tracing session is only about event recording,
rename this concept to _recording session_.
This patch mostly changes the manual pages, although I also updated some
C source and other files which contain user-facing text to use the new
term.
I didn't update logging messages because debugging scripts could still
refer to "tracing sessions".
The lttng-concepts(7) manual page mentions that the "recording session"
term was "tracing session" before LTTng 2.13.
Signed-off-by: Philippe Proulx <eeppeliteloop@gmail.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I620d6b6be9e0f1dac14c0bc5e26094c3b3711c75
Philippe Proulx [Mon, 14 Jun 2021 17:05:37 +0000 (13:05 -0400)]
doc/man: use double quotes when referring to internal section
This patch adds double quotes to all the manual page internal section
references using their full name. Those references often have the
following AsciiDoc form:
See the <<id,Full section name>> section below.
With this patch, this would be converted to:
See the ``<<id,Full section name>>'' section below.
In the rendered manual page, before this patch:
See the Full section name section below.
¯¯¯¯ ¯¯¯¯¯¯¯ ¯¯¯¯
With this patch:
See the “Full section name” section below.
The purpose of this patch is, thanks to the change in
`doc/man/manpage.xsl`, to remove the italic style for the text of
internal links. Because there's no way to create dynamic internal links
in a manual page, this style causes internal links to look weird when
they're not a full section name, for example:
Note that the trigger doesn't need to [...]
¯¯¯¯¯¯¯
The HTML rendering of LTTng-tools manual pages can still benefit from
internal links. This patch makes it possible to add more internal links
without degrading the visual style of manual pages when rendered in a
terminal.
Signed-off-by: Philippe Proulx <eeppeliteloop@gmail.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I1a5ef7eab7ff1e66c137e16b51a9c9074e43f583