From: Mathieu Desnoyers
Date: Tue, 30 Jun 2020 18:24:29 +0000 (-0400)
Subject: Add lttng-modules design document
X-Git-Tag: v2.13.0-rc1~203
X-Git-Url: http://git.liburcu.org/?p=lttng-modules.git;a=commitdiff_plain;h=c719e4809ccdd37295616cc35b9ae4f633c4ddc5

Add lttng-modules design document

Signed-off-by: Mathieu Desnoyers
---

diff --git a/doc/lttng-modules-design.txt b/doc/lttng-modules-design.txt
new file mode 100644
index 00000000..5b896d57
--- /dev/null
+++ b/doc/lttng-modules-design.txt
@@ -0,0 +1,184 @@
+LTTng modules design
+---------------------
+
+by Mathieu Desnoyers
+June 30, 2020
+
+This document covers the high-level design of lttng-modules.
+
+LTTng modules is a kernel tracer for the Linux kernel. It can be either
+loaded as a set of kernel modules, or built into a Linux kernel.
+
+Here are its key components:
+
+* LTTng modules ABI
+
+  Files:
+  - src/lttng-abi.c
+  - include/lttng/abi.h
+
+  This ABI consists of ioctls with code 0xF6. It extensively uses
+  anonymous file descriptors to represent the tracer "objects". Only
+  root is allowed to interact with those ioctls.
+
+
+* LTTng session, channels, contexts and events management
+  - src/lttng-events.c
+  - include/lttng/lttng-events.h
+
+  Current state of configured tracing sessions, channels, contexts
+  and events. The session, channel, context and event state is
+  manipulated through the LTTng modules ABI. A session contains 0 or
+  more channels, through which data is traced. A channel is associated
+  with an instance of a lib ring buffer client. Channels have 0 or more
+  events, which are associated with kernel instrumentation as event
+  sources.
+
+
+* lib ring buffer
+
+  Generic ring buffer library (kernel implementation). Note, there is
+  a very similar copy of this implementation within the lttng-ust
+  user-space tracer. The overall goal of this library is to support
+  both kernel and user-space tracing.
+
+  Files:
+  - src/lib/ringbuffer/*
+  - include/ringbuffer/*
+
+  Those include the ring buffer ABI meant for consuming the buffer data
+  from user-space. It is implemented in:
+
+  - src/lib/ringbuffer/ring_buffer_vfs.c (open, release, poll, ioctl)
+  - src/lib/ringbuffer/ring_buffer_mmap.c (mmap)
+  - src/lib/ringbuffer/ring_buffer_splice.c (splice)
+  - include/ringbuffer/vfs.h: lib ring buffer ioctl commands (code 0xF6).
+
+  The ring buffer library can be configured to be used in various
+  use-cases by creating a specialized ring buffer "client" (template).
+  include/ringbuffer/config.h details the various configurations which
+  are supported.
+
+
+* LTTng modules ring buffer clients
+
+  Files:
+  - src/lttng-ring-buffer-client-discard.c
+  - src/lttng-ring-buffer-client-mmap-discard.c
+  - src/lttng-ring-buffer-client-mmap-overwrite.c
+  - src/lttng-ring-buffer-client-overwrite.c
+  - src/lttng-ring-buffer-metadata-client.c
+  - src/lttng-ring-buffer-metadata-mmap-client.c
+  - src/lttng-ring-buffer-client.h
+  - src/lttng-ring-buffer-metadata-client.h
+
+  Those are the users of lib ring buffer, with specialized instances of
+  the ring buffer for each use-case supported by LTTng. Those are
+  hand-crafted templates in C. The fast paths are inlined within each
+  client, and the slow paths are kept in the common library to minimize
+  code memory usage.
+
+
+* LTTng filter
+
+  The filter in lttng-modules is meant to quickly discard events which
+  do not match an expression. The expression parsing is all done in
+  userspace within lttng-tools. The filter is received by lttng-modules
+  as bytecode. The frequent case for which a filter is optimized is to
+  discard most of the events. The filter operates on input arguments
+  received on the stack, before the ring buffer is touched.
+
+  Files:
+  - include/lttng/filter-bytecode.h: LTTng filter bytecode.
+  - src/lttng-filter-validator.c: Validation pass on bytecode reception.
+  - src/lttng-filter.c: Filter linker code: link a bytecode onto a given
+    event (knowing its field offsets).
+  - src/lttng-filter-specialize.c: Specialize the bytecode, transforming
+    generic instructions into
+    type-specific (faster) instructions.
+  - src/lttng-filter-interpreter.c: Bytecode interpreter, called by
+    instrumentation to filter events.
+
+* LTTng contexts
+
+  LTTng-modules supports the notion of "contexts" which can be attached either
+  to specific events or to all events in a channel. Those are additional
+  data which can be saved prior to the event payload, e.g. current
+  thread ID, process name, performance counters, and more.
+
+  Files:
+  - src/lttng-context.c: Context state associated to a channel or event,
+    and helpers.
+  - src/lttng-context-*.c: Implementation of all supported contexts:
+    callstack, cgroup-ns, cpu-id, egid, euid, gid, hostname,
+    interruptible, ipc-ns, migratable, mnt-ns, need-reschedule, net-ns,
+    nice, perf-counters, pid, pid-ns, ppid, preemptible, prio, procname,
+    sgid, suid, tid, uid, user-ns, uts-ns, vegid, veuid, vgid, vpid, vppid,
+    vsgid, vtid, vuid.
+
+
+* LTTng tracepoint instrumentation
+
+  The LTTng tracer attaches "probes" to kernel subsystems. A probe is a
+  set of tracepoint callbacks matching the tracepoint instrumentation
+  for a kernel subsystem. Each probe can be loaded separately.
+
+  Due to limitations in the kernel TRACE_EVENT macros, LTTng
+  implements its own LTTNG_TRACEPOINT_EVENT macros. It uses the
+  upstream kernel TRACE_EVENT macros only to validate the prototype
+  of its callbacks. Also, in its traces, LTTng exposes event field
+  semantics matching what is exposed to user-space through /proc,
+  which requires a different field layout implementation than what the
+  upstream kernel exposes to user-space.
+
+  Files:
+  - src/lttng-tracepoint.c: Mapping between tracepoint instrumentation and LTTng
+    events.
+  - src/lttng-probes.c: LTTng probes registry.
+  - include/instrumentation/events/*: LTTng tracepoint instrumentation
+    headers for all kernel subsystems.
+
+
+* LTTng system call instrumentation
+
+  The LTTng tracer gathers both input and output arguments from each
+  system call, for all supported architectures. This means the system
+  call probe callbacks read from user-space memory when needed.
+
+  Files:
+  - src/lttng-syscalls.c: LTTng system call instrumentation callbacks and
+    tables.
+  - include/instrumentation/syscall/*: generated and override system
+    call instrumentation headers.
+
+
+* LTTng statedump
+
+  Dump kernel state at trace start or when an explicit "statedump" is
+  requested. Useful to reconstruct the entire kernel state during
+  post-processing. Dumps: thread scheduling state, file
+  descriptor tables, interrupt handlers, network interfaces, block
+  devices, CPU topology. Also performs a "fence" on all CPUs to reach
+  a quiescent state before the start and the end of the statedump.
+
+  Files:
+  - src/lttng-statedump-impl.c
+
+
+* LTTng tracker
+
+  User ID, Group ID and Process ID trackers, for filtering of entire
+  sessions based on UID, GID, and PID.
+
+  Files:
+  - src/lttng-tracker-id.c
+
+
+* LTTng clock
+
+  Clock plugin registration. The clock used by the LTTng modules kernel
+  tracer can be overridden by a plugin module.
+
+  Files:
+  - src/lttng-clock.c
+  - include/lttng/clock.h
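
Illustration (not part of the patch): the "LTTng filter" section of the design document describes a flow where bytecode is validated when it is received, then interpreted on the event's input arguments before the ring buffer is touched. The following is a minimal user-space sketch of that flow in C. It is not taken from lttng-modules: the opcodes, the toy_* names and the counters standing in for the ring buffer are all hypothetical, and the real bytecode format is the one defined in include/lttng/filter-bytecode.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcodes, for illustration only (not the LTTng bytecode). */
enum toy_op {
	TOY_OP_LOAD_ARG,	/* reg = args[operand] */
	TOY_OP_EQ_IMM,		/* reg = (reg == operand) */
	TOY_OP_GT_IMM,		/* reg = (reg > operand) */
	TOY_OP_RETURN,		/* stop; nonzero reg means "record the event" */
};

struct toy_insn {
	enum toy_op op;
	int64_t operand;
};

/* Stand-in for a channel: counters instead of a real ring buffer. */
struct toy_channel {
	unsigned long recorded;		/* events that would be written */
	unsigned long discarded;	/* events rejected by the filter */
};

/* Validation pass, run once when the bytecode is received. */
static int toy_validate(const struct toy_insn *code, size_t len, size_t nr_args)
{
	size_t i;

	if (len == 0 || code[len - 1].op != TOY_OP_RETURN)
		return -1;	/* Program must end with RETURN. */
	for (i = 0; i < len; i++) {
		switch (code[i].op) {
		case TOY_OP_LOAD_ARG:
			if (code[i].operand < 0 ||
			    (uint64_t) code[i].operand >= nr_args)
				return -1;	/* Argument index out of range. */
			break;
		case TOY_OP_EQ_IMM:
		case TOY_OP_GT_IMM:
		case TOY_OP_RETURN:
			break;
		default:
			return -1;	/* Unknown opcode. */
		}
	}
	return 0;
}

/* Interpreter: only ever runs bytecode accepted by toy_validate(). */
static int toy_interpret(const struct toy_insn *code, const int64_t *args)
{
	int64_t reg = 0;
	size_t pc;

	for (pc = 0; ; pc++) {
		switch (code[pc].op) {
		case TOY_OP_LOAD_ARG:
			reg = args[code[pc].operand];
			break;
		case TOY_OP_EQ_IMM:
			reg = (reg == code[pc].operand);
			break;
		case TOY_OP_GT_IMM:
			reg = (reg > code[pc].operand);
			break;
		case TOY_OP_RETURN:
			return reg != 0;
		}
	}
}

/* Instrumentation callback: the filter runs before any buffer access. */
static void toy_trace_event(struct toy_channel *chan,
		const struct toy_insn *filter, const int64_t *args)
{
	assert(chan != NULL);
	if (filter && !toy_interpret(filter, args)) {
		chan->discarded++;	/* Filtered out: buffer never touched. */
		return;
	}
	chan->recorded++;	/* Stand-in for reserving and writing a record. */
}
```

In this sketch, a filter equivalent to "arg0 == 42" would be the three instructions LOAD_ARG 0, EQ_IMM 42, RETURN. Doing the bounds checks once in toy_validate() is what lets the interpreter loop skip them on every event, which fits the stated goal of making the common discard path cheap.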