LTTng modules design --------------------- by Mathieu Desnoyers June 30, 2020 This document covers the high level design of lttng-modules. LTTng modules is a kernel tracer for the Linux kernel. It can be either loaded as a set of kernel modules, or built into a Linux kernel. Here are its key components: * LTTng modules ABI Files: - src/lttng-abi.c - include/lttng/abi.h This ABI consists of ioctls with code 0xF6. It extensively uses anonymous file descriptors to represent the tracer "objects". Only root is allowed to interact with those ioctls. * LTTng session, channels, contexts and events management - src/lttng-events.c - include/lttng/lttng-events.h Current state about configured tracing sessions, channels, contexts and events. The session, channel, context and event state is manipulated through the LTTng modules ABI. A session contains 0 or more channels, through which data is traced. A channel is associated with an instance of a lib ring buffer client. Channels have 0 or more events, which are associated to kernel instrumentation as event sources. * lib ring buffer Generic ring buffer library (kernel implementation). Note, there is a very similar copy of this implementation within the lttng-ust user-space tracer. The overall goal of this library is to support both kernel and user-space tracing. Files: - src/lib/ringbuffer/* - include/ringbuffer/* Those include ring buffer ABI meant for consuming the buffer data from user-space. It is implemented in: - src/lib/ringbuffer/ring_buffer_vfs.c (open, release, poll, ioctl) - src/lib/ringbuffer/ring_buffer_mmap.c (mmap) - src/lib/ringbuffer/ring_buffer_splice.c (splice) - include/ringbuffer/vfs.h: lib ring buffer ioctl commands (code 0xF6). The ring buffer library can be configured to be used in various use-cases by creating a specialized ring buffer "client" (template). include/ringbuffer/config.h details the various configuration parameters which are supported. * LTTng modules ring buffer clients Files: - src/lttng-ring-buffer-client-discard.c - src/lttng-ring-buffer-client-mmap-discard.c - src/lttng-ring-buffer-client-mmap-overwrite.c - src/lttng-ring-buffer-client-overwrite.c - src/lttng-ring-buffer-metadata-client.c - src/lttng-ring-buffer-metadata-mmap-client.c - src/lttng-ring-buffer-client.h - src/lttng-ring-buffer-metadata-client.h Those are the users of lib ring buffer, with specialized instances of the ring buffer for each use-case supported by LTTng. Those are hand-crafted templates in C. The fast-paths are inlined within each client, and the slow paths are kept in the common library to minimize code memory usage. * LTTng filter The filter in lttng-modules is meant to quickly discard events which do not match an expression. The expression parsing is all done in userspace within lttng-tools. The filter is received by lttng-modules as a bytecode. The frequent case for which a filter is optimized is to discard most of the events. The filter operates on input arguments received on the stack, before the ring buffer is touched. Files: - include/lttng/filter-bytecode.h: LTTng filter bytecode. - src/lttng-filter-validator.c: Validation pass on bytecode reception - src/lttng-filter.c: Filter linker code: link a bytecode onto a given event (knowing its fields offsets). - src/lttng-filter-specialize.c: Specialize the bytecode, transforming generic instructions into type-specific (faster) instructions. - src/lttng-filter-interpreter.c: Bytecode interpreter, called by instrumentation to filter events. * LTTng contexts LTTng-modules supports the notion of "contexts" which can be attached either to specific events or to all events in a channel. Those are additional data which can be saved prior to the event payload, e.g. current thread ID, process name, performance counters, and more. Files: - src/lttng-context.c: Context state associated to a channel or event, and helpers. - src/lttng-context-*.c: Implementation of all supported contexts: callstack, cgroup-ns, cpu-id, egid, euid, gid, hostname, interruptible, ipc-ns, migratable, mnt-ns, need-reschedule, net-ns, nice, perf-counters, pid, pis-ns, ppid, preemptible, prio, procname, sgid, suid, tid, uid, user-ns, uts-ns, vegid, veuid, vgid, vpid, vppid, vsgid, vtid, vuid. * LTTng tracepoint instrumentation The LTTng tracer attaches "probes" to kernel subsystems. A probe is a set of tracepoint callbacks matching the tracepoint instrumentation for a kernel subsystem. Each probe can be loaded separately. Due to limitations in the kernel TRACE_EVENT macros, LTTng implements its own LTTNG_TRACEPOINT_EVENT macros. It uses the upstream kernel TRACE_EVENT macros only to validate the prototype of its callbacks. Also, LTTng exposes an event field semantic which matches what is exposed to user-space through /proc in the traces, which requires different field layout implementation than what the upstream kernel exposes to user-space. Files: src/lttng-tracepoint.c: Mapping between tracepoint instrumentation and LTTng events. src/lttng-probes.c: LTTng probes registry. include/instrumentation/events/*: LTTng tracepoint instrumentation headers for all kernel subsystems. * LTTng system call instrumentation The LTTng tracer gathers both input and output arguments from each system call, for all supported architectures. This means the system call probe callbacks read from user-space memory when needed. Files: - src/lttng-syscalls.c: LTTng system call instrumentation callbacks and tables. - include/instrumentation/syscall/*: generated and override system call instrumentation headers. * LTTng statedump Dump kernel state at trace start or when an explicit "statedump" is requested. Useful to reconstruct the entire kernel state at post-processing. Dumps: threads scheduling state, file descriptor tables, interrupt handlers, network interfaces, block devices, cpu topology. Also performs a "fence" on all CPUs to reach a quiescent state on all CPUs before start and end of statedump. Files: - src/lttng-statedump-impl.c * LTTng tracker User ID and Process ID trackers, for filtering of entire sessions based on UID, GID, and PID. Files: - src/lttng-tracker-id.c * LTTng clock Clock plugin registration. The clock used by the LTTng modules kernel tracer can be overridden by a plugin module. Files: - src/lttng-clock.c - include/lttng/clock.h