lttng-tools.git
7 years agoUniformize the printing of units in session listing
Jérémie Galarneau [Thu, 27 Jul 2017 21:55:14 +0000 (17:55 -0400)] 
Uniformize the printing of units in session listing

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoFix: lost packet accounting always lost on snapshot
Julien Desfossez [Tue, 25 Jul 2017 19:23:49 +0000 (15:23 -0400)] 
Fix: lost packet accounting always lost on snapshot

Because of the continue executed when we fail to get a subbuffer, the
lost_packet count is always reset to 0 before it can be accounted for in
the channel. Now we account for it directly before the continue.

Reported-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Julien Desfossez <jdesfossez@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoFix: report error on session listing
Jonathan Rajotte [Fri, 21 Jul 2017 15:09:14 +0000 (11:09 -0400)] 
Fix: report error on session listing

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoTests: don't assume a 4K page size in test_notification
Jérémie Galarneau [Thu, 27 Jul 2017 20:48:44 +0000 (16:48 -0400)] 
Tests: don't assume a 4K page size in test_notification

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoFix live-comm: merge TCP socket write-write sequence in a single write
Jonathan Rajotte [Mon, 24 Jul 2017 20:07:00 +0000 (16:07 -0400)] 
Fix live-comm: merge TCP socket write-write sequence in a single write

The live protocol implementation is often sending content
on TCP sockets in two separate writes. One to send a command header,
and the second one sending the command's payload. This was presumably
done under the assumption that it would not result in two separate
TCP packets being sent on the network (or that it would not matter).

Delayed ACK-induced delays were observed [1] on the second write of the
"write header, write payload" sequence and resulted in problematic
latency build-ups for live clients connected to moderately/highly
active sessions.

Fundamentally, this problem arises from the combination of Nagle's
algorithm and the delayed ACK mechanism, which makes write-write-read
sequences on TCP sockets problematic, as near-constant latency is
expected when clients can keep up with the event production rate.

In such a write-write-read sequence, the second write is held up until
the first write is acknowledged (TCP ACK). The solution implemented
by this patch bundles the writes into a single one [2].

[1] https://github.com/tbricks/wireshark-lttng-plugin
    Basic Wireshark dissector for lttng-live by Anton Smyk from Itiviti
[2] https://lists.freebsd.org/pipermail/freebsd-net/2006-January/009527.html
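
As a rough illustration of the "single write" approach described above,
the sketch below packs a header and its payload into one contiguous
buffer and hands it to the kernel with a single send(). The cmd_header
layout, the function names and the 512-byte staging buffer are
hypothetical and do not reflect the actual lttng-live wire format or the
code touched by this patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical command header; NOT the real lttng-live layout. */
struct cmd_header {
	uint32_t cmd;
	uint32_t payload_len;
};

/*
 * Copy the header followed by the payload into `out`.
 * Returns the total number of bytes packed, or 0 if `out` is too small.
 */
static size_t pack_command(uint8_t *out, size_t out_capacity,
		const struct cmd_header *hdr, const void *payload)
{
	size_t total;

	assert(out && hdr);
	assert(payload || hdr->payload_len == 0);

	total = sizeof(*hdr) + hdr->payload_len;
	if (total > out_capacity) {
		return 0;
	}
	memcpy(out, hdr, sizeof(*hdr));
	if (hdr->payload_len) {
		memcpy(out + sizeof(*hdr), payload, hdr->payload_len);
	}
	return total;
}

/*
 * Send header and payload with ONE send() call instead of two writes.
 * Returns the value of send(), or -1 if the command does not fit in the
 * (arbitrarily sized) staging buffer. Short sends are left to the caller.
 */
static ssize_t send_command(int sock, const struct cmd_header *hdr,
		const void *payload)
{
	uint8_t buf[512];
	size_t len = pack_command(buf, sizeof(buf), hdr, payload);

	if (len == 0) {
		return -1;
	}
	return send(sock, buf, len, 0);
}
```

With this shape, the peer's ACK of the header is never what gates the
payload, since both leave in the same write.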

Reported-by: Anton Smyk <anton.smyk@itiviti.com>
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoDocs: move notification thread documentation to header
Jérémie Galarneau [Wed, 26 Jul 2017 18:46:35 +0000 (14:46 -0400)] 
Docs: move notification thread documentation to header

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoDocs: grammar fix in comment
Jérémie Galarneau [Wed, 26 Jul 2017 18:46:09 +0000 (14:46 -0400)] 
Docs: grammar fix in comment

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoFix: evaluate condition/trigger on subscription
Jonathan Rajotte [Tue, 4 Jul 2017 18:58:43 +0000 (14:58 -0400)] 
Fix: evaluate condition/trigger on subscription

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoTest: Trigger conditions is evaluated on subscription
Jonathan Rajotte [Tue, 4 Jul 2017 18:58:42 +0000 (14:58 -0400)] 
Test: Trigger conditions is evaluated on subscription

It is expected that, on subscription, a trigger condition is evaluated
and the trigger fired if necessary. Currently, evaluation is performed on
channel sampling and results in an action only if the evaluation state
flips.

This test hangs if no evaluation is performed on notification client
subscription.

Ref #1102

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agosave/load: add blocking_timeout attribute to channel
Jonathan Rajotte [Thu, 6 Jul 2017 15:08:43 +0000 (11:08 -0400)] 
save/load: add blocking_timeout attribute to channel

Fixes #1119

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoIntroduce monitor_timer_interval to session configuration schema
Jonathan Rajotte [Thu, 6 Jul 2017 15:08:42 +0000 (11:08 -0400)] 
Introduce monitor_timer_interval to session configuration schema

Session configuration schema version is bumped to 2.10

Fixes #1099

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoTest: Reduce scope of variables used in multi app notification test
Jonathan Rajotte [Tue, 4 Jul 2017 18:58:41 +0000 (14:58 -0400)] 
Test: Reduce scope of variables used in multi app notification test

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoHide internal buffer-view symbols
Jérémie Galarneau [Wed, 21 Jun 2017 13:36:05 +0000 (09:36 -0400)] 
Hide internal buffer-view symbols

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoHide internal session configuration symbols
Jérémie Galarneau [Wed, 21 Jun 2017 13:35:47 +0000 (09:35 -0400)] 
Hide internal session configuration symbols

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoHide internal dynamic-buffer symbols
Jérémie Galarneau [Wed, 21 Jun 2017 13:35:29 +0000 (09:35 -0400)] 
Hide internal dynamic-buffer symbols

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoHide internal string-utils symbols
Jérémie Galarneau [Wed, 21 Jun 2017 13:35:12 +0000 (09:35 -0400)] 
Hide internal string-utils symbols

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoTypo: occured -> occurred
Michael Jeanson [Fri, 16 Jun 2017 18:09:21 +0000 (14:09 -0400)] 
Typo: occured -> occurred

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoFix: join consumer timer thread
Mathieu Desnoyers [Fri, 16 Jun 2017 21:23:13 +0000 (17:23 -0400)] 
Fix: join consumer timer thread

Detaching the timer thread has the unfortunate side-effect of letting
the health management data structures be freed by main() while the timer
thread may still be using them (if, e.g., main() exits quickly).

Overcome this situation by tearing down and joining the timer thread.
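
A minimal sketch of the "tear down, then join" pattern, assuming a
hypothetical health_state structure owned by main() and a quit flag
polled by the thread. None of these names come from lttng-tools; the
point is only that pthread_join() returns after the thread has stopped
touching the shared state, which a detached thread cannot guarantee.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical state owned by main() and used by the timer thread. */
struct health_state {
	atomic_int quit;	/* set by main() to request teardown */
	atomic_int ticks;	/* updated by the timer thread */
};

static void *timer_thread(void *arg)
{
	struct health_state *state = arg;

	/* Keep using the shared state until asked to stop. */
	while (!atomic_load(&state->quit)) {
		atomic_fetch_add(&state->ticks, 1);
		sched_yield();
	}
	return NULL;
}

/*
 * Start the thread, request its teardown, then join it.
 * Returns 0 on success or a pthread error number.
 * Once this returns 0, `state` can safely be freed by the caller.
 */
static int run_and_join(struct health_state *state)
{
	pthread_t tid;
	int ret;

	assert(state);

	ret = pthread_create(&tid, NULL, timer_thread, state);
	if (ret) {
		return ret;
	}

	/* ... main() would do its work here ... */

	atomic_store(&state->quit, 1);
	return pthread_join(tid, NULL);
}
```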

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoCleanup: use CMM accessors for consumer_quit variable
Mathieu Desnoyers [Fri, 16 Jun 2017 21:23:12 +0000 (17:23 -0400)] 
Cleanup: use CMM accessors for consumer_quit variable

Use CMM_LOAD_SHARED and CMM_STORE_SHARED, which are strictly
equivalent to a volatile variable, in line with the rest of the
lttng-tools project.

Also move its declaration to a header, rather than having multiple
declarations in C files, now following our coding style.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoFix: test_utils_expand_path passes NULL to sprintf
Jérémie Galarneau [Tue, 13 Jun 2017 18:50:05 +0000 (14:50 -0400)] 
Fix: test_utils_expand_path passes NULL to sprintf

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoFix: reject triggers if they depend on an unavailable feature
Jérémie Galarneau [Tue, 13 Jun 2017 18:49:32 +0000 (14:49 -0400)] 
Fix: reject triggers if they depend on an unavailable feature

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoFix: check lttng-modules ABI version for RING_BUFFER_SNAPSHOT_SAMPLE_POSITIONS support
Jonathan Rajotte [Mon, 15 May 2017 19:37:21 +0000 (15:37 -0400)] 
Fix: check lttng-modules ABI version for RING_BUFFER_SNAPSHOT_SAMPLE_POSITIONS support

The RING_BUFFER_SNAPSHOT_SAMPLE_POSITIONS was introduced in
lttng-modules ABI version 2.3. When interacting with a kernel tracer
with ABI versions < 2.3, pass zero as monitor_timer_interval to disable
the monitoring.

Warn during sessiond startup and channel enabling if not supported.
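
The version gate described above could look roughly like the sketch
below. The structure and function names are hypothetical; only the 2.3
threshold and the "pass zero to disable monitoring" rule come from this
commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical representation of the kernel tracer ABI version. */
struct tracer_abi_version {
	uint32_t major;
	uint32_t minor;
};

/* True if the tracer ABI is at least 2.3. */
static bool abi_supports_sample_positions(const struct tracer_abi_version *v)
{
	assert(v);
	return v->major > 2 || (v->major == 2 && v->minor >= 3);
}

/*
 * Return the monitor timer interval to pass to the kernel tracer:
 * the requested value when supported, 0 (monitoring disabled) otherwise.
 */
static uint64_t effective_monitor_timer_interval(
		const struct tracer_abi_version *v, uint64_t requested)
{
	return abi_supports_sample_positions(v) ? requested : 0;
}
```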

Fixes #1101

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoFix: Send remove channel to notification thread only when necessary
Jonathan Rajotte [Tue, 16 May 2017 20:55:56 +0000 (16:55 -0400)] 
Fix: Send remove channel to notification thread only when necessary

v2: missing "channel" in commit title.

Keep track, in the channel object, of whether the channel was published
to the notification thread. Issue the remove command only if the channel
was previously published.

Fixes #1103

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoFix: notification test: resources leak and return handling
Jonathan Rajotte [Fri, 2 Jun 2017 18:52:30 +0000 (14:52 -0400)] 
Fix: notification test: resources leak and return handling

Fixes CID #1375913 1375912 1375911 1375910 1375909 1375908

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoFix: parse monitor timer parameter as an unsigned 64-bit integer
Jérémie Galarneau [Mon, 12 Jun 2017 21:56:01 +0000 (17:56 -0400)] 
Fix: parse monitor timer parameter as an unsigned 64-bit integer

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoIntroduce "--blocking-timeout" channel parameter
Mathieu Desnoyers [Sat, 27 May 2017 06:17:39 +0000 (08:17 +0200)] 
Introduce "--blocking-timeout" channel parameter

Introduce the blocking timeout channel parameter to control blocking
behavior for lttng-ust buffers. It only affects applications launched
with the LTTNG_UST_ALLOW_BLOCKING environment variable.

The blocking timeout parameter expects:

- 0 (default) which does not block,
- a timeout value in usec,
- -1 (block forever).
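
The three accepted values can be summarized by a small classifier such
as the one below. The enum and function names are hypothetical, and
treating negative values other than -1 as invalid is an assumption of
this sketch, not something stated in this commit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical classification of a blocking timeout value. */
enum blocking_mode {
	BLOCKING_NONE,		/* 0: do not block (default) */
	BLOCKING_TIMED,		/* > 0: block for at most that many usec */
	BLOCKING_FOREVER,	/* -1: block forever */
	BLOCKING_INVALID,	/* assumption: any other negative value */
};

static enum blocking_mode classify_blocking_timeout(int64_t timeout_us)
{
	if (timeout_us == 0) {
		return BLOCKING_NONE;
	}
	if (timeout_us == -1) {
		return BLOCKING_FOREVER;
	}
	if (timeout_us > 0) {
		return BLOCKING_TIMED;
	}
	return BLOCKING_INVALID;
}
```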

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoFix: lttng list of channels should return errors
Mathieu Desnoyers [Fri, 26 May 2017 16:14:19 +0000 (18:14 +0200)] 
Fix: lttng list of channels should return errors

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoFix: discard event/lost packet counters
Mathieu Desnoyers [Fri, 26 May 2017 16:14:18 +0000 (18:14 +0200)] 
Fix: discard event/lost packet counters

For per-pid buffers, we need to sum the counters for each application.

For per-uid buffers, if no application has launched yet, it should not
be considered as an error (which stops iteration on all other channels),
but rather as values of 0.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoFix: test: proper redirection of stderr to stdout
Jonathan Rajotte [Thu, 1 Jun 2017 22:16:46 +0000 (18:16 -0400)] 
Fix: test: proper redirection of stderr to stdout

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoDocs: notification comment refers to a structure by its former name
Jérémie Galarneau [Tue, 6 Jun 2017 16:01:02 +0000 (12:01 -0400)] 
Docs: notification comment refers to a structure by its former name

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoFix: missing errno.h include in time.h compat header
Jérémie Galarneau [Fri, 2 Jun 2017 18:49:20 +0000 (14:49 -0400)] 
Fix: missing errno.h include in time.h compat header

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoCleanup: remove stale file from .gitignore
Jérémie Galarneau [Thu, 1 Jun 2017 20:47:12 +0000 (16:47 -0400)] 
Cleanup: remove stale file from .gitignore

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoDisable binaries on platforms where they are not supported
Michael Jeanson [Wed, 24 May 2017 17:38:08 +0000 (13:38 -0400)] 
Disable binaries on platforms where they are not supported

They can still be enabled with the appropriate configure flag like
--enable-bin-lttng

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoCleanup: add silent rules support for docs
Michael Jeanson [Fri, 12 May 2017 19:30:09 +0000 (15:30 -0400)] 
Cleanup: add silent rules support for docs

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoCleanup: popt library detection
Michael Jeanson [Fri, 12 May 2017 16:07:40 +0000 (12:07 -0400)] 
Cleanup: popt library detection

Simplify popt detection code and use a variable to store the detected
lib instead of using the global LIBS variable.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoCleanup: remove duplicated pthread detection code
Michael Jeanson [Thu, 11 May 2017 21:17:33 +0000 (17:17 -0400)] 
Cleanup: remove duplicated pthread detection code

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoCleanup: remove unused m4/libxml.m4
Michael Jeanson [Thu, 11 May 2017 21:12:16 +0000 (17:12 -0400)] 
Cleanup: remove unused m4/libxml.m4

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoCleanup: bison and flex detection
Michael Jeanson [Thu, 11 May 2017 20:50:56 +0000 (16:50 -0400)] 
Cleanup: bison and flex detection

The previous detection code was quite convoluted and required two
variables to be set to the same value to work. For example, the lexer in
autotools is configured through the LEX variable; even if we require a
specific implementation, we have to use it and base our detection code on
it.

The situation is now:

  - the FLEX and BISON variables are now ignored.
  - the LEX and YACC variables need to be set if required like before.

Also some minor cleanups.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoCleanup: merge two instances of AC_CHECK_FUNCS
Michael Jeanson [Thu, 11 May 2017 19:20:43 +0000 (15:20 -0400)] 
Cleanup: merge two instances of AC_CHECK_FUNCS

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoCleanup: lttng-ust library detection
Michael Jeanson [Thu, 11 May 2017 18:30:09 +0000 (14:30 -0400)] 
Cleanup: lttng-ust library detection

Simplify lttng-ust detection code.

Also remove the --with-lttng-ust-prefix configure option since we don't
offer it for other libs and it's based on user variables which the build
system shouldn't be messing with.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoCleanup: kmod library detection
Michael Jeanson [Thu, 11 May 2017 16:13:16 +0000 (12:13 -0400)] 
Cleanup: kmod library detection

Simplify kmod detection code and use a variable to store the detected
lib instead of using the global LIBS variable.

Also remove the --with-kmod-prefix configure option since we don't offer
it for other libs and it's based on user variables which the build
system shouldn't be messing with.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoCleanup: dlmopen detection
Michael Jeanson [Wed, 10 May 2017 22:00:20 +0000 (18:00 -0400)] 
Cleanup: dlmopen detection

Simplify dlmopen detection code and use a variable to store the detected
lib instead of adding conditional code to each Makefile.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoCleanup: uuid library detection
Michael Jeanson [Wed, 10 May 2017 21:01:36 +0000 (17:01 -0400)] 
Cleanup: uuid library detection

Simplify libuuid detection code and use a variable to store the detected
lib instead of adding conditional code to each Makefile.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoFix: Don't override user variables within the build system
Michael Jeanson [Tue, 2 May 2017 17:18:33 +0000 (13:18 -0400)] 
Fix: Don't override user variables within the build system

Instead, use the appropriately prefixed AM_* variables so as not to
interfere when a user variable is passed to a make command. The proper use
of flag variables is documented at:

https://www.gnu.org/software/automake/manual/automake.html#Flag-Variables-Ordering

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoFix: registry can be null on lookup
Jonathan Rajotte [Mon, 6 Feb 2017 20:28:52 +0000 (15:28 -0500)] 
Fix: registry can be null on lookup

A session teardown can be initiated by a dying application. Hence, a
session object can exist without a valid registry. As a result,
get_session_registry can return null. To prevent this, the UST
application session lock should be held, when possible, when looking up
the registry to ensure synchronization. Otherwise the presence of a
registry is not guaranteed. In such case, handling a null return value
from look-up registry function is necessary.

Core dumps, triggered by the "assert(registry)" statement found in
reply_ust_register_channel, were observed when killing instrumented
applications. In this case, obtaining the UST application lock
would result in a deadlock since the lock is already held during
ust_app_global_create. Handling the null value is simpler and is
consistent with the handling of previous look-ups done in the
function.

Handling of null value is also applied to:
add_event_ust_registry
add_enum_ust_registry
ust_app_snapshot_record

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoTest: Replace test relying on pselect6(2) man page ambiguity
Francis Deslauriers [Wed, 31 May 2017 21:08:23 +0000 (17:08 -0400)] 
Test: Replace test relying on pselect6(2) man page ambiguity

The `pselect_fd_too_big` test is checking for the case where the `nfds`
is larger than the number of open files allowed for this process
(RLIMIT_NOFILE).

According to the ERRORS section of the pselect6(2) kernel man page[1], if
`nfds` > RLIMIT_NOFILE evaluates to true, the pselect6 syscall should
return EINVAL, but the BUGS section mentions that the current
implementation ignores any FD larger than the highest numbered FD of the
current process.

This is in fact what happens. The Linux implementation of the pselect6
syscall[2] does not compare `nfds` and RLIMIT_NOFILE, but rather caps
`nfds` to the highest numbered FD of the current process, as the BUGS
section of the man page mentions.

It was observed elsewhere that there is a discrepancy between the manual
page and the implementation[3].

As a solution, replace the current testcase with one that checks the
behaviour of the syscall when an invalid FD is passed.
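
For illustration only, the behaviour exercised by such a test can be
reproduced with the sketch below, which assumes "invalid FD" means a
descriptor that has just been closed and goes through the libc pselect()
wrapper rather than the raw syscall. The function name is hypothetical
and this is not the actual test case.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

/*
 * Call pselect() on a descriptor that was just closed.
 * Returns the errno reported by pselect(), 0 if it unexpectedly
 * succeeds, or the errno of pipe() if no descriptor could be obtained.
 */
static int pselect_errno_on_closed_fd(void)
{
	int pipefd[2];
	int fd, ret;
	fd_set rfds;
	struct timespec timeout = { .tv_sec = 0, .tv_nsec = 0 };

	if (pipe(pipefd) != 0) {
		return errno;
	}
	fd = pipefd[0];
	close(pipefd[0]);
	close(pipefd[1]);
	assert(fd >= 0 && fd < FD_SETSIZE);

	/* The set references an FD number that is no longer open. */
	FD_ZERO(&rfds);
	FD_SET(fd, &rfds);

	ret = pselect(fd + 1, &rfds, NULL, NULL, &timeout, NULL);
	return ret < 0 ? errno : 0;
}
```

On Linux, this is expected to report EBADF.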

[1]:http://man7.org/linux/man-pages/man2/pselect6.2.html
[2]:https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/fs/select.c#n619
[3]:https://patchwork.kernel.org/patch/9345805/

Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Julien Desfossez <jdesfossez@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoTests: channel subbuffers must be larger or equal to PAGE_SIZE
Jérémie Galarneau [Thu, 1 Jun 2017 19:26:47 +0000 (15:26 -0400)] 
Tests: channel subbuffers must be larger or equal to PAGE_SIZE

The multi-app notification test creates a channel with 4096-byte
subbuffers. However, this is not supported on architectures
with larger pages, such as PPC64el.
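
One way to avoid hard-coding 4096 is to query the page size at run time,
as sketched below. The helper name is hypothetical, and the actual test
may well obtain the page size differently (for instance from a shell
script).

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Return a subbuffer size that is at least one page:
 * `requested` if it is already large enough, the page size otherwise.
 */
static size_t min_subbuf_size(size_t requested)
{
	long page_size = sysconf(_SC_PAGESIZE);

	assert(page_size > 0);
	if (requested < (size_t) page_size) {
		return (size_t) page_size;
	}
	return requested;
}
```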

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoTests: regression testing for notification API
Jonathan Rajotte [Wed, 31 May 2017 17:25:29 +0000 (13:25 -0400)] 
Tests: regression testing for notification API

This test suite includes tests for low and high buffer usage conditions,
triggers, and multi application client scenarios.

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoTests: add consumer testpoint to pause data consumption
Jérémie Galarneau [Thu, 25 May 2017 09:15:52 +0000 (05:15 -0400)] 
Tests: add consumer testpoint to pause data consumption

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoFix: explicitly send client credentials during handshake
Jérémie Galarneau [Sun, 28 May 2017 17:35:40 +0000 (13:35 -0400)] 
Fix: explicitly send client credentials during handshake

The notification client does not send its credentials during
the handshake. However, the session daemon will still receive
them except in very rare, and hard to reproduce, cases.

It appears that the kernel will provide the credential cmsg
regardless of whether or not the client has actually sent them.

Inspecting the kernel source (af_unix.c) seems to indicate that
the credentials will be passed on sendmsg whenever one of the
sockets involved has set the SO_PASSCRED flag. It also seems to
maintain compatibility with applications that expect write() to
pass credentials by default. This explains why the explicit
passing didn't seem needed.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoTest: utils: introduce LTTNG_SESSIOND_ENV_VARS
Jonathan Rajotte [Thu, 20 Apr 2017 21:19:35 +0000 (17:19 -0400)] 
Test: utils: introduce LTTNG_SESSIOND_ENV_VARS

If LTTNG_SESSIOND_ENV_VARS is set when calling start_lttng_sessiond_*,
its value will be passed to the "env" command while launching the
sessiond.

This allows the use of LD_PRELOAD, LTTNG_ENABLE_TESTPOINT and others.

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoTest: utils.sh: use getconf to start either 32 or 64 consumerd
Jonathan Rajotte [Thu, 20 Apr 2017 21:16:20 +0000 (17:16 -0400)] 
Test: utils.sh: use getconf to start either 32 or 64 consumerd

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoOptimization: remove unnecessary buffer resizes on partial recvs
Jérémie Galarneau [Sat, 27 May 2017 15:54:30 +0000 (11:54 -0400)] 
Optimization: remove unnecessary buffer resizes on partial recvs

Using the dynamic buffer's size to express the current offset
results in unnecessary resizes and re-zeroing of areas of the
buffer.

The reception buffer's size is now used to express the total
size of the expected incoming message. The offset can be inferred
from the "bytes_left_to_receive" variable and message size. It
also, arguably, makes the code simpler to follow.
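
The relationship between the three quantities can be sketched as
follows. The structure and function names are hypothetical; only
"bytes_left_to_receive" is a name taken from this commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reception state for one incoming message. */
struct inbound_state {
	size_t msg_size;		/* total expected size (buffer size) */
	size_t bytes_left_to_receive;	/* what has not arrived yet */
};

/* Offset in the buffer at which the next received bytes are stored. */
static size_t reception_offset(const struct inbound_state *in)
{
	assert(in);
	assert(in->bytes_left_to_receive <= in->msg_size);
	return in->msg_size - in->bytes_left_to_receive;
}

/* Account for `received` bytes of a partial recv. */
static void consume_partial_recv(struct inbound_state *in, size_t received)
{
	assert(in);
	assert(received <= in->bytes_left_to_receive);
	in->bytes_left_to_receive -= received;
}
```

The buffer is sized once for the whole message, so a partial recv only
changes the remaining byte count.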

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoDocs: improve the documentation of the dynamic buffer interface
Jérémie Galarneau [Sat, 27 May 2017 11:32:50 +0000 (07:32 -0400)] 
Docs: improve the documentation of the dynamic buffer interface

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoAdd comment to round_to_power_of_2()
Jérémie Galarneau [Sat, 27 May 2017 11:18:35 +0000 (07:18 -0400)] 
Add comment to round_to_power_of_2()

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoClean-up: simplify the implementation of dynamic buffer set_capacity
Jérémie Galarneau [Sat, 27 May 2017 10:59:18 +0000 (06:59 -0400)] 
Clean-up: simplify the implementation of dynamic buffer set_capacity

Only use realloc() to implement set_capacity's logic. In the case
where buf is NULL, realloc acts like malloc() anyhow.

Moreover, the memory does not need to be zeroed on allocation since
size increases provide this guarantee.
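
A realloc()-only set_capacity could look like the sketch below. The
dyn_buf structure and function name are hypothetical stand-ins for the
dynamic buffer interface, and the explicit zero-capacity branch is an
addition of this sketch, since realloc() with a size of 0 is
implementation-defined.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical dynamic buffer; not the lttng-tools definition. */
struct dyn_buf {
	char *data;
	size_t size;		/* bytes in use */
	size_t capacity;	/* bytes allocated */
};

/*
 * Set the buffer's capacity using realloc() only. When buf->data is
 * NULL, realloc() behaves like malloc(). The new memory is NOT zeroed
 * here. Returns 0 on success, -1 on allocation failure (buffer is left
 * untouched in that case).
 */
static int dyn_buf_set_capacity(struct dyn_buf *buf, size_t new_capacity)
{
	char *new_data;

	assert(buf);
	assert(new_capacity >= buf->size);

	if (new_capacity == 0) {
		free(buf->data);
		buf->data = NULL;
		buf->capacity = 0;
		return 0;
	}

	new_data = realloc(buf->data, new_capacity);
	if (!new_data) {
		return -1;
	}
	buf->data = new_data;
	buf->capacity = new_capacity;
	return 0;
}
```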

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoFix: space left in buffer may be uninitilized on capacity increase
Jérémie Galarneau [Sat, 27 May 2017 10:26:27 +0000 (06:26 -0400)] 
Fix: space left in buffer may be uninitilized on capacity increase

In the following case of dynamic buffer resize:

|---------|---------------------|------------------------|
          ^                     ^                        ^
 (a) original_size     (b) original_capacity     (c) new_capacity

The code (correctly) assumes that the space between b and c is
zero-initialized. However, the space between a and b will be left
uninitialized.
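
One way to cover both regions is to zero everything between the original
size and the new size whenever the size grows, regardless of where the
original capacity fell. The sketch below uses a hypothetical dyn_buf
structure and is not the actual fix.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical dynamic buffer; not the lttng-tools definition. */
struct dyn_buf {
	char *data;
	size_t size;		/* bytes in use */
	size_t capacity;	/* bytes allocated */
};

/*
 * Set the buffer's size, growing the capacity if needed.
 * On growth, every byte in [old size, new size) is zeroed, which covers
 * both the a..b and b..c regions of the diagram above.
 * Returns 0 on success, -1 on allocation failure.
 */
static int dyn_buf_set_size(struct dyn_buf *buf, size_t new_size)
{
	assert(buf);
	assert(buf->size <= buf->capacity);

	if (new_size > buf->capacity) {
		char *new_data = realloc(buf->data, new_size);

		if (!new_data) {
			return -1;
		}
		buf->data = new_data;
		buf->capacity = new_size;
	}
	if (new_size > buf->size) {
		memset(buf->data + buf->size, 0, new_size - buf->size);
	}
	buf->size = new_size;
	return 0;
}
```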

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoAssert that dynamic buffer size <= capacity
Jérémie Galarneau [Sat, 27 May 2017 10:20:04 +0000 (06:20 -0400)] 
Assert that dynamic buffer size <= capacity

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoClean-up: improve readability of dynamic buffer append condition
Jérémie Galarneau [Sat, 27 May 2017 10:19:30 +0000 (06:19 -0400)] 
Clean-up: improve readability of dynamic buffer append condition

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoFix: inbound buffer may be set too short on partial command reception
Jérémie Galarneau [Sat, 27 May 2017 10:17:48 +0000 (06:17 -0400)] 
Fix: inbound buffer may be set too short on partial command reception

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoClean-up: fix misleading code alignment
Jérémie Galarneau [Sat, 27 May 2017 10:14:39 +0000 (06:14 -0400)] 
Clean-up: fix misleading code alignment

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoClean-up: remove useless comment
Jérémie Galarneau [Sat, 27 May 2017 10:14:23 +0000 (06:14 -0400)] 
Clean-up: remove useless comment

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoFix: return LTTNG_ERR_INVALID_TRIGGER on validation failure
Jérémie Galarneau [Thu, 25 May 2017 09:17:14 +0000 (05:17 -0400)] 
Fix: return LTTNG_ERR_INVALID_TRIGGER on validation failure

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoFix: missing includes in buffer-usage.h
Jérémie Galarneau [Thu, 25 May 2017 09:16:38 +0000 (05:16 -0400)] 
Fix: missing includes in buffer-usage.h

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoUnit tests for notification api
Jonathan Rajotte [Fri, 24 Mar 2017 15:30:34 +0000 (11:30 -0400)] 
Unit tests for notification api

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoRun unit tests before regression tests
Jonathan Rajotte [Fri, 24 Mar 2017 15:29:34 +0000 (11:29 -0400)] 
Run unit tests before regression tests

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoFix: semaphore semantics are expected from notification command eventfd
Jérémie Galarneau [Tue, 23 May 2017 14:15:59 +0000 (10:15 -0400)] 
Fix: semaphore semantics are expected from notification command eventfd

The notification command queue currently expects eventfd() to
behave according to EFD_SEMAPHORE semantics. Right now, multiple
commands could be enqueued and reading the eventfd resets its
internal counter to 0. This will cause the notification thread
to never process the next command.

EFD_SEMAPHORE will ensure that poll/epoll signals that there is
info available for reading until the eventfd's internal counter
returns to 0.
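
The difference between the two modes can be observed with the Linux
eventfd() call directly. In the sketch below, the helper names are
hypothetical; each "command" is modelled as a write of 1 to the eventfd,
which is an assumption about how the queue signals its consumer.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

/* Signal `n` enqueued commands; each write adds 1 to the counter. */
static int post_commands(int efd, unsigned int n)
{
	const uint64_t one = 1;

	assert(efd >= 0);
	while (n--) {
		if (write(efd, &one, sizeof(one)) != (ssize_t) sizeof(one)) {
			return -1;
		}
	}
	return 0;
}

/*
 * Read the eventfd once. Returns the value read, or 0 if nothing is
 * pending (the descriptor is expected to be non-blocking).
 *
 * With EFD_SEMAPHORE, each read returns 1 and decrements the counter.
 * Without it, a single read returns the whole counter and resets it to 0.
 */
static uint64_t take_command(int efd)
{
	uint64_t value = 0;

	assert(efd >= 0);
	if (read(efd, &value, sizeof(value)) != (ssize_t) sizeof(value)) {
		return 0;
	}
	return value;
}
```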

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agorelay: use urcu_ref_get_unless_zero
Mathieu Desnoyers [Thu, 17 Sep 2015 16:24:32 +0000 (12:24 -0400)] 
relay: use urcu_ref_get_unless_zero

This allows removing the reflock by performing this check and increment
atomically.

The minimum version of userspace-rcu is bumped to 0.9.0 as
urcu_ref_get_unless_zero() was introduced as part of that
release.
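
To illustrate what "check and increment atomically" means, here is a
sketch of a get-unless-zero operation written with C11 atomics. It is
not the userspace-rcu implementation of urcu_ref_get_unless_zero(); the
structure and function names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical reference counter. */
struct ref {
	atomic_long count;
};

/*
 * Take a reference only if the count is not already zero.
 * The check and the increment happen in a single compare-and-swap, so
 * no extra lock is needed to keep an object that is being destroyed
 * from being revived. Returns true if a reference was taken.
 */
static bool ref_get_unless_zero(struct ref *r)
{
	long cur;

	assert(r);
	cur = atomic_load(&r->count);
	while (cur != 0) {
		/* On failure, `cur` is reloaded with the current count. */
		if (atomic_compare_exchange_weak(&r->count, &cur, cur + 1)) {
			return true;
		}
	}
	return false;
}
```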

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoFix: use "flush empty" ioctl for snapshots
Mathieu Desnoyers [Thu, 11 May 2017 21:53:58 +0000 (17:53 -0400)] 
Fix: use "flush empty" ioctl for snapshots

When the flush empty ioctl is available, use it to produce an empty
packet at the end of the snapshot, which ensures the stream intersection
feature works.

If this specific ioctl is not available, fall back on the "flush" ioctl,
which does not produce empty packets.

In that situation, two prior behaviors are possible for lttng-modules.
Earlier versions implement a "snapshot" command which does not perform
an implicit "flush_empty"; in that case, the stream intersection feature
may not be reliable. In more recent lttng-modules versions (including
the stable branch) which do not implement the flush_empty ioctl, the
snapshot ioctl implicitly performs a flush_empty, which makes the stream
intersection feature work, but has side-effects on the snapshot ioctl
performed by the live timer (it produces a stream of empty packets in
live mode).

[ Please apply to master, 2.10, 2.9, 2.8 branches. ]

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoFix: lttng-consumerd: cpu hotplug: send "streams_sent" command
Mathieu Desnoyers [Thu, 11 May 2017 20:00:56 +0000 (16:00 -0400)] 
Fix: lttng-consumerd: cpu hotplug: send "streams_sent" command

When creating a new channel, the streams being sent to the relayd are
kept invisible to the live client until the "streams_sent" command is
received. This ensures the client does not see a partial stream set.

This "streams_sent" command needs to be sent on CPU hotplug too,
otherwise the live client handling within relayd is not aware of those
streams (they are never published).

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoFix: lttng-sessiond: cpu hotplug: send channel to consumer only once
Mathieu Desnoyers [Thu, 11 May 2017 20:00:55 +0000 (16:00 -0400)] 
Fix: lttng-sessiond: cpu hotplug: send channel to consumer only once

On CPU hotplug, we currently send a duplicate of the channel key, which
allocates its own object (duplicated) within the consumerd. We want the
newly added stream to map to the pre-existing channel key, so don't send
the channel duplicate.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoFix: lttng-sessiond: cpu hotplug stream number mismatch
Mathieu Desnoyers [Thu, 11 May 2017 20:00:54 +0000 (16:00 -0400)] 
Fix: lttng-sessiond: cpu hotplug stream number mismatch

The counter should always be increasing (kept in the channel), rather
than local to the function. A function-local counter causes CPU hotplug
handling to disregard further streams that should be added to the
consumer output on CPU hotplug.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoTests: use SIGKILL to shutdown daemons in test_thread_exit and test_tp_fail
Jérémie Galarneau [Fri, 19 May 2017 15:19:16 +0000 (11:19 -0400)] 
Tests: use SIGKILL to shutdown daemons in test_thread_exit and test_tp_fail

A current design limitation of the lttng-consumerd will cause it to
hang on shutdown if the timer management thread exits as the teardown
of channels switches off the channel's timers. The timer thread is
then expected to purge timer signals and signal when it is done.

Obviously this state will never be reached as signals are no longer
being processed. This is not dramatic as this is not what this test
is meant to test; we only want to make sure the health check signals that
something went wrong.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoFix: consumer_timer_signal_thread_qs waits on LTTNG_CONSUMER_SIG_SWITCH
Jérémie Galarneau [Thu, 18 May 2017 20:15:20 +0000 (16:15 -0400)] 
Fix: consumer_timer_signal_thread_qs waits on LTTNG_CONSUMER_SIG_SWITCH

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoRevert "Fix: futex can be free'd while used by waker thread"
Jérémie Galarneau [Thu, 18 May 2017 15:40:03 +0000 (11:40 -0400)] 
Revert "Fix: futex can be free'd while used by waker thread"

This reverts commit dce89628fd85a875a8dc511d861057f218f3c1c8.

7 years agoFix: thread exit vs futex wait/wakeup race
Mathieu Desnoyers [Wed, 17 May 2017 22:36:54 +0000 (18:36 -0400)] 
Fix: thread exit vs futex wait/wakeup race

relayd_live_stop performs, in this order:

        CMM_STORE_SHARED(live_dispatch_thread_exit, 1);   [A]
        futex_nto1_wake(&viewer_conn_queue.futex);        [B]

whereas thread_dispatcher does:

   while (!CMM_LOAD_SHARED(live_dispatch_thread_exit)) {  [1]

     [...]
     futex_nto1_prepare(&viewer_conn_queue.futex);        [2]
     [...]
     futex_nto1_wait(&viewer_conn_queue.futex);           [3]

Unfortunately, on the following sequence:

[1] [A] [B] [2] [3]

thread_dispatcher will end up hanging.

We need to move the live_dispatch_thread_exit load between "prepare" and
"wait" to fix this.

There are similar scenarios with relay_thread_dispatcher, and the
session daemon thread_dispatch_ust_registration, which are also fixed
here.
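
The interleaving above can be replayed with a small single-threaded model.
This is only a sketch: it assumes that "prepare" arms the futex word by
setting it to -1, that "wake" resets the word to 0 only when it is armed,
and that "wait" sleeps for as long as the word stays armed. The struct and
the model_* helpers are hypothetical names and are not the lttng-tools
futex_nto1 API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model state, not the lttng-tools data structures. */
struct model {
	int32_t futex;	/* 0: no waiter armed, -1: waiter armed */
	int exit_flag;	/* stands in for live_dispatch_thread_exit */
};

/* Assumed semantics: prepare arms the futex word. */
static void model_prepare(struct model *m)
{
	m->futex = -1;
}

/* Assumed semantics: wake only disarms a word that is currently armed. */
static void model_wake(struct model *m)
{
	if (m->futex == -1) {
		m->futex = 0;
	}
}

/* A real wait would sleep for as long as the word stays armed. */
static bool model_wait_would_block(const struct model *m)
{
	return m->futex == -1;
}

/* Old ordering, replaying the sequence [1] [A] [B] [2] [3]. */
bool old_order_hangs(void)
{
	struct model m = { 0, 0 };

	if (m.exit_flag) {			/* [1] reads 0, keeps looping */
		return false;
	}
	m.exit_flag = 1;			/* [A] */
	model_wake(&m);				/* [B] nothing armed: no-op */
	model_prepare(&m);			/* [2] */
	return model_wait_would_block(&m);	/* [3] */
}

/* Fixed ordering: the exit flag is loaded between prepare and wait. */
bool new_order_hangs(void)
{
	struct model m = { 0, 0 };

	m.exit_flag = 1;			/* [A] */
	model_wake(&m);				/* [B] nothing armed: no-op */
	model_prepare(&m);			/* [2] */
	if (m.exit_flag) {			/* load moved here: exit seen */
		return false;
	}
	return model_wait_would_block(&m);	/* [3] */
}

/* Fixed ordering, with the stop request arriving after the load. */
bool late_stop_hangs(void)
{
	struct model m = { 0, 0 };

	model_prepare(&m);			/* [2] */
	if (m.exit_flag) {			/* reads 0 */
		return false;
	}
	m.exit_flag = 1;			/* [A] */
	model_wake(&m);				/* [B] word is armed: disarms it */
	return model_wait_would_block(&m);	/* [3] returns immediately */
}
```

In this model only old_order_hangs() reports a blocked waiter. With the load
placed after "prepare", a stop request is either observed by the load or
its wake finds the word armed and disarms it.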

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoFix: use lttng_waiter instead of futex in notification thread
Jérémie Galarneau [Wed, 17 May 2017 20:03:13 +0000 (16:03 -0400)] 
Fix: use lttng_waiter instead of futex in notification thread

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoAdd lttng_waiter utils
Jérémie Galarneau [Wed, 17 May 2017 15:16:33 +0000 (11:16 -0400)] 
Add lttng_waiter utils

These utils are adapted from userspace-rcu's urcu-wait.h.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoFix: pthread_join on thread start error
Jérémie Galarneau [Mon, 15 May 2017 19:14:07 +0000 (15:14 -0400)] 
Fix: pthread_join on thread start error

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoFix: missing check on notification thread join
Jérémie Galarneau [Mon, 15 May 2017 15:16:45 +0000 (11:16 -0400)] 
Fix: missing check on notification thread join

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoFix: status_loc argument of waitpid() is used on error
Jérémie Galarneau [Mon, 15 May 2017 14:37:18 +0000 (10:37 -0400)] 
Fix: status_loc argument of waitpid() is used on error

waitpid() may leave stat_loc uninitialized on error (depending
on errno's value, see WAIT(3)).
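
A minimal sketch of the pattern this implies: look at the status word only
after waitpid() has reported success. The helper names wait_child_exit_code()
and spawn_exiting_child() are hypothetical and do not come from lttng-tools.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: wait for `pid` and store its exit code.
 * Returns 0 on success, -1 if waitpid() failed or the child did not
 * exit normally.
 */
int wait_child_exit_code(pid_t pid, int *exit_code)
{
	int status = 0;
	pid_t ret;

	/* Retry if the call is interrupted by a signal. */
	do {
		ret = waitpid(pid, &status, 0);
	} while (ret == -1 && errno == EINTR);

	if (ret == -1) {
		/* `status` carries no information here: do not inspect it. */
		return -1;
	}
	if (!WIFEXITED(status)) {
		return -1;
	}
	*exit_code = WEXITSTATUS(status);
	return 0;
}

/* Hypothetical helper used for demonstration: fork a child that exits. */
pid_t spawn_exiting_child(int code)
{
	pid_t pid = fork();

	if (pid == 0) {
		_exit(code);
	}
	return pid;
}
```

Waiting a second time on a child that has already been reaped makes
waitpid() fail with ECHILD, which is one case where the status word must
not be read.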

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoFix: leak of deserialized trigger sent from client
Jérémie Galarneau [Thu, 11 May 2017 20:16:12 +0000 (16:16 -0400)] 
Fix: leak of deserialized trigger sent from client

Deserialized triggers may be leaked on error when
registered or unregistered by the session daemon.

Reported-by: Coverity Scan
CID 1374801 (#1 of 1): Resource leak (RESOURCE_LEAK)

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoClean-up: missing static qualifier on internal function
Jérémie Galarneau [Thu, 11 May 2017 20:13:25 +0000 (16:13 -0400)] 
Clean-up: missing static qualifier on internal function

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoFix: duplicate values used in lttng_evaluation_status enum
Jérémie Galarneau [Thu, 11 May 2017 14:02:48 +0000 (10:02 -0400)] 
Fix: duplicate values used in lttng_evaluation_status enum

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoFix: missing header inclusions in buffer-usage.h
Jérémie Galarneau [Thu, 11 May 2017 14:02:19 +0000 (10:02 -0400)] 
Fix: missing header inclusions in buffer-usage.h

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoFix: COMPAT_EPOLL_PROC_PATH is available from Linux 2.6.28
Jonathan Rajotte [Tue, 9 May 2017 19:46:35 +0000 (15:46 -0400)] 
Fix: COMPAT_EPOLL_PROC_PATH is available from Linux 2.6.28

v2: Typo in commit message "per see" -> "per se"

Failing on opening [1] is not an error per se. [1] was
introduced in Linux 2.6.28 but epoll is available since
2.5.44. Hence, goto end and set a default value without
setting error return value.

[1] /proc/sys/fs/epoll/max_user_watches
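
The fallback described above could look like the following sketch. The
function name read_poll_max_size() and the DEFAULT_POLL_SIZE value are
assumptions made for illustration; they are not taken from the lttng-tools
sources.

```c
#include <stdio.h>

/* Hypothetical default used when the proc file is unavailable. */
#define DEFAULT_POLL_SIZE 65535UL

/*
 * Read the maximum number of epoll watches from `path`.
 * A missing or unreadable file is not treated as an error: the default
 * value is returned instead.
 */
unsigned long read_poll_max_size(const char *path)
{
	unsigned long value = DEFAULT_POLL_SIZE;
	FILE *fp = fopen(path, "r");

	if (!fp) {
		/* Kernels older than 2.6.28 do not provide this file. */
		return value;
	}
	if (fscanf(fp, "%lu", &value) != 1 || value == 0) {
		value = DEFAULT_POLL_SIZE;
	}
	fclose(fp);
	return value;
}
```

Calling read_poll_max_size("/proc/sys/fs/epoll/max_user_watches") then
yields either the kernel's limit or the default, and the caller never has
to handle a failure.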

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoFix: skip empty revents in notification thread
Jérémie Galarneau [Wed, 10 May 2017 20:42:09 +0000 (16:42 -0400)] 
Fix: skip empty revents in notification thread

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoClean-up: warning caused by unused label
Jérémie Galarneau [Wed, 10 May 2017 19:49:57 +0000 (15:49 -0400)] 
Clean-up: warning caused by unused label

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoFix: futex can be free'd while used by waker thread
Jérémie Galarneau [Wed, 10 May 2017 19:36:23 +0000 (15:36 -0400)] 
Fix: futex can be free'd while used by waker thread

The futex_nto1 utils assume that the futex they operate on
has a program-long lifetime (or that it is protected by a
third party).

The notification command system uses a futex allocated on the
waiter's stack. However, the waiter may never enter the
futex() syscall (due to the opportunistic check before the futex
call). In this case, the waiter's stack-allocated futex becomes
invalid, but will be used by the waker to perform the FUTEX_WAKE
operation.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoFix: missing header causes build failure with --disable-epoll
Jérémie Galarneau [Tue, 9 May 2017 12:50:39 +0000 (08:50 -0400)] 
Fix: missing header causes build failure with --disable-epoll

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoFix: return NULL pointer on memory allocation failure
Mathieu Desnoyers [Mon, 8 May 2017 11:48:52 +0000 (07:48 -0400)] 
Fix: return NULL pointer on memory allocation failure

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoClean-up: unused variable warning in poll compat
Jérémie Galarneau [Tue, 9 May 2017 12:20:17 +0000 (08:20 -0400)] 
Clean-up: unused variable warning in poll compat

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agodoc: how to trace consumerd with valgrind
Mathieu Desnoyers [Mon, 8 May 2017 12:38:37 +0000 (08:38 -0400)] 
doc: how to trace consumerd with valgrind

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoCleanup: initialize kernel ioctl ABI structures to 0
Mathieu Desnoyers [Mon, 8 May 2017 12:34:57 +0000 (08:34 -0400)] 
Cleanup: initialize kernel ioctl ABI structures to 0

Valgrind complains that we pass uninitialized data to the kernel.
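
As a sketch of the pattern, the structure is cleared with memset() before
its fields are set, so that padding and reserved bytes do not carry stack
garbage across the ioctl boundary. struct example_abi_arg and
example_abi_arg_init() are hypothetical and do not match any real LTTng
kernel ABI structure.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical ABI structure with implicit and explicit padding. */
struct example_abi_arg {
	uint8_t type;		/* the compiler likely pads after this field */
	uint64_t value;
	char reserved[32];	/* reserved for future use */
};

/* Clear the whole structure first, then set the meaningful fields. */
void example_abi_arg_init(struct example_abi_arg *arg, uint8_t type,
		uint64_t value)
{
	memset(arg, 0, sizeof(*arg));
	arg->type = type;
	arg->value = value;
}
```

Two structures initialized this way from different stack contents should
compare equal byte for byte, which is what Valgrind's uninitialized-data
check is after.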

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoCleanup: initialize data to 0
Mathieu Desnoyers [Mon, 8 May 2017 12:15:20 +0000 (08:15 -0400)] 
Cleanup: initialize data to 0

Valgrind catches read of uninitialized data caused by the on-stack
"data" argument which ends up not being fully initialized (it contains a
union).

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoFix: consumer data lock deadlock caused by monitor timer
Jérémie Galarneau [Mon, 8 May 2017 19:06:25 +0000 (15:06 -0400)] 
Fix: consumer data lock deadlock caused by monitor timer

The execution of the monitor timer takes the consumer data lock
which causes three threads to deadlock.

The consumer_thread_data_poll_thread takes the lock during
the teardown of a channel. This teardown stops the channel's
timers and, to ensure that the timers are not fired on a free'd
channel, uses a custom SIG_TEARDOWN signal as a "bubble" inserted in
the signal processing "queue". It then waits until this signal
has been processed to release the consumer data lock.

The sessiond_poll_thread is creating a channel and waits on
the consumer data lock.

Meanwhile, the timer thread is blocked on this same lock
during the processing of the monitor timer signal which
prevents the queue from being flushed, causing the destruction
of the channel to never reach completion.

There is no need to take the consumer data lock in the monitor
timer code since the channel's existence is guaranteed by
the SIG_TEARDOWN mechanism.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoFix: assert() on null index_file in lttng_index_file_write()
Jonathan Rajotte [Mon, 24 Apr 2017 19:59:20 +0000 (15:59 -0400)] 
Fix: assert() on null index_file in lttng_index_file_write()

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoFix: fail on relayd lookup when finding a relayd is expected
Jonathan Rajotte [Mon, 24 Apr 2017 19:32:15 +0000 (15:32 -0400)] 
Fix: fail on relayd lookup when finding a relayd is expected

An actual relayd lookup error leads to using the code path of a local
handling. Since stream->index_file is NULL when expecting a relayd, using
the code path for local handling results in an invalid access.

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoClean-up: use lttng_read() wrapper instead of read()
Jérémie Galarneau [Mon, 8 May 2017 15:09:41 +0000 (11:09 -0400)] 
Clean-up: use lttng_read() wrapper instead of read()

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
7 years agoFix: NULL pointer dereference in lttng_condition_serialize
Jérémie Galarneau [Sun, 7 May 2017 19:51:42 +0000 (15:51 -0400)] 
Fix: NULL pointer dereference in lttng_condition_serialize

Reported-by: Coverity Scan
*** CID 1374823:  Null pointer dereferences  (REVERSE_INULL)

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>