Fix: relayd: rotation failure for multi-domain session
authorJérémie Galarneau <jeremie.galarneau@efficios.com>
Wed, 12 Jan 2022 20:48:00 +0000 (15:48 -0500)
committerJérémie Galarneau <jeremie.galarneau@efficios.com>
Fri, 21 Jan 2022 23:01:51 +0000 (18:01 -0500)
Observed issue
==============

Rotating a multi-domain streaming session results in the following
error:

  $ lttng rotate
  Waiting for rotation to complete...
  Error: Failed to retrieve rotation state.

Meanwhile, the relay daemon logs indicate the following:

  DBG1 - 14:56:04.213163667 [265774/265778]: lttng_trace_chunk_rename_path from .tmp_new_chunk to (null) (in lttng_trace_chunk_rename_path_no_lock() at trace-chunk.cpp:759)
  PERROR - 14:56:04.213242941 [265774/265778]: Failed to move trace chunk directory ".tmp_new_chunk" to "20220112T145604-0500-1": No such file or directory (in lttng_trace_chunk_rename_path_no_lock() at trace-chunk.cpp:799)
  DBG1 - 14:56:04.213396931 [265774/265778]: aborting session 2 (in session_abort() at session.cpp:588)
  DBG1 - 14:56:04.213512198 [265774/265778]: Control connection closed with 22 (in relay_thread_close_connection() at main.cpp:3874)

The 'abort' of session 2 here causes the kernel consumer to fail to
consume subbuffers:

  Error: Relayd send index failed. Cleaning up relayd 3.
  Error: Error consuming subbuffer: (0)
  [...]

Cause
=====

Following the flow of execution in the relay daemon shows that different
trace chunks are used by the two relay sessions that result from the
streaming of a single multi-domain session. Both trace chunks "own" the
same output directory.

When a rotation is performed, the first trace chunk to be closed will
move the directory. Then, the second trace chunk to be closed will
attempt to do the same, failing to do so as seen in the relay daemon
log.

Solution
========

Using different trace chunk instances for relay sessions belonging to a
single sessiond session goes against the intended use of the sessiond
trace chunk registry.

A sessiond trace chunk registry allows the relay daemon to share trace
chunks used by different "relay sessions" when they were created for the
same user-visible session daemon session. Tracing multiple domains (e.g.
ust and kernel) results in per-domain relay sessions being created.

Sharing trace chunks, and their output directory more specifically, is
essential to properly implement session rotations. The sharing of output
directory handles allows directory renames to be performed once and
without races that would stem from from multiple renames.

The reason why sessiond trace chunk registry returns different trace
chunk instances for two relay sessions is that the wrong session `id` is
used to publish trace chunks. The `id` that must be used to share trace
chunks accross the relay sessions that belong to the same sessiond
session is `id_sessiond`.

`id_sessiond` is optional as it is only provided by consumers v2.11+.
Otherwise, it is fine to use the relay session `id`: it is a unique id
for a given session daemon instance and those consumers will not issue a
session rotation (or clear) as the feature didn't exist.

A reference counting bug revealed by this change is also fixed in the
implementation of the sessiond trace chunk registry.

When the trace chunk is first published, two references to the published
chunks exist. One is taken by the registry while the other is being
returned to the caller. In the use case of the relay daemon, the
reference held by the registry itself is undesirable.

We want the trace chunk to be removed from the registry as soon as it is
not being used by the relay daemon (through a session or a stream). This
differs from the behaviour of the consumer daemon which relies on an
explicit command from the session daemon to release the registry's
reference.

In cases where the trace chunk had already been published, the reference
belonging to the sessiond trace chunk registry instance has already been
'put' by the firt publication. We must simply return the published trace
chunk with a reference taken on behalf of the caller.

Known drawbacks
===============

None.

Change-Id: Ic33443b114a87574a1b26ac5ccb022e47f886ddd
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
src/bin/lttng-relayd/main.c
src/bin/lttng-relayd/sessiond-trace-chunks.c
src/common/trace-chunk-registry.h
src/common/trace-chunk.c

index 4f7358e623b9dfee4899ffd1e9d46298d52ed41c..d3f9fe202a6c73f7a9ba2a7f7281fccb0a1fa7b7 100644 (file)
@@ -2655,7 +2655,10 @@ static int relay_rotate_session_streams(
                 */
                next_trace_chunk = sessiond_trace_chunk_registry_get_chunk(
                                sessiond_trace_chunk_registry,
-                               session->sessiond_uuid, session->id,
+                               session->sessiond_uuid,
+                               conn->session->id_sessiond.is_set ?
+                                       conn->session->id_sessiond.value :
+                                       conn->session->id,
                                rotate_streams.new_chunk_id.value);
                if (!next_trace_chunk) {
                        char uuid_str[LTTNG_UUID_STR_LEN];
@@ -2877,7 +2880,9 @@ static int relay_create_trace_chunk(const struct lttcomm_relayd_hdr *recv_hdr,
        published_chunk = sessiond_trace_chunk_registry_publish_chunk(
                        sessiond_trace_chunk_registry,
                        conn->session->sessiond_uuid,
-                       conn->session->id,
+                       conn->session->id_sessiond.is_set ?
+                               conn->session->id_sessiond.value :
+                               conn->session->id,
                        chunk);
        if (!published_chunk) {
                char uuid_str[LTTNG_UUID_STR_LEN];
@@ -2985,7 +2990,9 @@ static int relay_close_trace_chunk(const struct lttcomm_relayd_hdr *recv_hdr,
        chunk = sessiond_trace_chunk_registry_get_chunk(
                        sessiond_trace_chunk_registry,
                        conn->session->sessiond_uuid,
-                       conn->session->id,
+                       conn->session->id_sessiond.is_set ?
+                               conn->session->id_sessiond.value :
+                               conn->session->id,
                        chunk_id);
        if (!chunk) {
                char uuid_str[LTTNG_UUID_STR_LEN];
index 1065234739a8d6e529c1dc58ba3dde24140c2666..7005ef4965b5b99ce40e7ea931222edf3ccbf4b1 100644 (file)
@@ -360,6 +360,7 @@ struct lttng_trace_chunk *sessiond_trace_chunk_registry_publish_chunk(
        char uuid_str[LTTNG_UUID_STR_LEN];
        char chunk_id_str[MAX_INT_DEC_LEN(typeof(chunk_id))] = "-1";
        struct lttng_trace_chunk *published_chunk = NULL;
+       bool trace_chunk_already_published;
 
        lttng_uuid_to_str(sessiond_uuid, uuid_str);
        lttng_uuid_copy(key.sessiond_uuid, sessiond_uuid);
@@ -393,21 +394,30 @@ struct lttng_trace_chunk *sessiond_trace_chunk_registry_publish_chunk(
                goto end;
        }
 
-       published_chunk = lttng_trace_chunk_registry_publish_chunk(
-                       element->trace_chunk_registry, session_id, new_chunk);
+       published_chunk = lttng_trace_chunk_registry_publish_chunk_published(
+                       element->trace_chunk_registry, session_id, new_chunk,
+                       &trace_chunk_already_published);
        /*
-        * At this point, two references to the published chunks exist. One
-        * is taken by the registry while the other is being returned to the
-        * caller. In the use case of the relay daemon, the reference held
-        * by the registry itself is undesirable.
+        * When the trace chunk is first published, two references to the
+        * published chunks exist. One is taken by the registry while the other
+        * is being returned to the caller. In the use case of the relay daemon,
+        * the reference held by the registry itself is undesirable.
         *
         * We want the trace chunk to be removed from the registry as soon
         * as it is not being used by the relay daemon (through a session
         * or a stream). This differs from the behaviour of the consumer
         * daemon which relies on an explicit command from the session
         * daemon to release the registry's reference.
+        *
+        * In cases where the trace chunk had already been published,
+        * the reference belonging to the sessiond trace chunk
+        * registry instance has already been 'put'. We simply return
+        * the published trace chunk with a reference taken on behalf of the
+        * caller.
         */
-       lttng_trace_chunk_put(published_chunk);
+       if (!trace_chunk_already_published) {
+               lttng_trace_chunk_put(published_chunk);
+       }
 end:
        trace_chunk_registry_ht_element_put(element);
        return published_chunk;
index 00dd7fcf0343ebd8a38aa50f8189fdb91bdb55eb..b7905cac95d55a6d3b12f8b17c14649e44c35aa6 100644 (file)
@@ -61,6 +61,30 @@ struct lttng_trace_chunk *lttng_trace_chunk_registry_publish_chunk(
                struct lttng_trace_chunk_registry *registry,
                uint64_t session_id, struct lttng_trace_chunk *chunk);
 
+/*
+ * Adds the `previously_published` parameter which allows the caller
+ * to know if a trace chunk equivalent to `chunk` was previously published.
+ * 
+ * The registry holds a reference to the published trace chunks it contains.
+ * Trace chunks automatically unpublish themselves from their registry on
+ * destruction.
+ *
+ * This information is necessary to drop the reference of newly published
+ * chunks when a user doesn't wish to explicitly maintain all references
+ * to a given trace chunk.
+ * 
+ * For instance, the relay daemon doesn't need the registry to hold a
+ * reference since it controls the lifetime of its trace chunks.
+ * Conversely, the consumer daemons rely on the session daemon to inform
+ * them of the end of life of a trace chunk and the trace chunks don't
+ * belong to a specific top-level object: they are always retrieved from
+ * the registry by `id`.
+ */
+struct lttng_trace_chunk *lttng_trace_chunk_registry_publish_chunk_published(
+               struct lttng_trace_chunk_registry *registry,
+               uint64_t session_id, struct lttng_trace_chunk *chunk,
+               bool *previously_published);
+
 /*
  * Look-up a trace chunk by session_id and chunk_id.
  * A reference is acquired on behalf of the caller.
index 327612ce283b1db9e00de1ea6bf8ec2fb81fb77d..c41c23249dd8ffbed1d2fd6930d2207d18733645 100644 (file)
@@ -2000,7 +2000,20 @@ LTTNG_HIDDEN
 struct lttng_trace_chunk *
 lttng_trace_chunk_registry_publish_chunk(
                struct lttng_trace_chunk_registry *registry,
-               uint64_t session_id, struct lttng_trace_chunk *chunk)
+               uint64_t session_id,
+               struct lttng_trace_chunk *chunk)
+{
+       bool unused;
+
+       return lttng_trace_chunk_registry_publish_chunk_published(
+                       registry, session_id, chunk, &unused);
+}
+
+struct lttng_trace_chunk *
+lttng_trace_chunk_registry_publish_chunk_published(
+               struct lttng_trace_chunk_registry *registry,
+               uint64_t session_id, struct lttng_trace_chunk *chunk,
+               bool *previously_published)
 {
        struct lttng_trace_chunk_registry_element *element;
        unsigned long element_hash;
@@ -2035,6 +2048,7 @@ lttng_trace_chunk_registry_publish_chunk(
                        element->registry = registry;
                        /* Acquire a reference for the caller. */
                        if (lttng_trace_chunk_get(&element->chunk)) {
+                               *previously_published = false;
                                break;
                        } else {
                                /*
@@ -2061,6 +2075,7 @@ lttng_trace_chunk_registry_publish_chunk(
                if (lttng_trace_chunk_get(published_chunk)) {
                        lttng_trace_chunk_put(&element->chunk);
                        element = published_element;
+                       *previously_published = true;
                        break;
                }
                /*
This page took 0.030039 seconds and 4 git commands to generate.