From: Mathieu Desnoyers Date: Thu, 2 Feb 2023 15:25:57 +0000 (-0500) Subject: Fix: use unaligned pointer accesses for lttng_inline_memcpy X-Git-Url: http://git.liburcu.org/?p=lttng-ust.git;a=commitdiff_plain;h=7fdb690ce0a7187b23f439d569ddd629a702871f Fix: use unaligned pointer accesses for lttng_inline_memcpy lttng_inline_memcpy receives pointers which can be unaligned. This causes issues (traps) specifically on arm 32-bit with 8-byte strings (including \0). Use unaligned pointer accesses for loads/stores within lttng_inline_memcpy instead. There is an impact on code generation on some architectures. Using the following test code on godbolt.org: void copy16_aligned(void *dest, void *src) { *(uint16_t *)dest = *(uint16_t *) src; } void copy16_unaligned(void *dest, void *src) { STORE_UNALIGNED_INT(uint16_t, dest, LOAD_UNALIGNED_INT(uint16_t, src)); } void copy32_aligned(void *dest, void *src) { *(uint32_t *)dest = *(uint32_t *) src; } void copy32_unaligned(void *dest, void *src) { STORE_UNALIGNED_INT(uint32_t, dest, LOAD_UNALIGNED_INT(uint32_t, src)); } void copy64_aligned(void *dest, void *src) { *(uint64_t *)dest = *(uint64_t *) src; } void copy64_unaligned(void *dest, void *src) { STORE_UNALIGNED_INT(uint64_t, dest, LOAD_UNALIGNED_INT(uint64_t, src)); } The resulting assembler (gcc 12.2.0 in -O2) between aligned and unaligned: - x86-32: unchanged. - x86-64: unchanged. - powerpc32: unchanged. - powerpc64: unchanged. - arm32: 16 and 32-bit copy: unchanged. Added code for 64-bit unaligned copy. - aarch64: unchanged. - mips32: added code for unaligned. - mips64: added code for unaligned. - riscv: added code for unaligned. If we want to improve the situation on mips and riscv, this would require introducing a new "lttng_inline_integer_copy" and expose additional ring buffer client APIs in addition to event_write() which take integers as inputs. Let's not introduce that complexity yet until it is justified. Signed-off-by: Mathieu Desnoyers Change-Id: I1e6471d4607ac6aff89f16ef24d5370e804b7612 --- diff --git a/src/common/ringbuffer/backend_internal.h b/src/common/ringbuffer/backend_internal.h index bd422177..a3303b72 100644 --- a/src/common/ringbuffer/backend_internal.h +++ b/src/common/ringbuffer/backend_internal.h @@ -595,6 +595,21 @@ int update_read_sb_index(const struct lttng_ust_ring_buffer_config *config, #define inline_memcpy(dest, src, n) memcpy(dest, src, n) #endif +#define LOAD_UNALIGNED_INT(type, p) \ + ({ \ + struct packed_struct { type __v; } __attribute__((packed)); \ + (((const struct packed_struct *) (p))->__v); \ + }) + +#define STORE_UNALIGNED_INT(type, p, v) \ + do { \ + struct packed_struct { type __v; } __attribute__((packed)); \ + ((struct packed_struct *) (p))->__v = (v); \ + } while (0) + +/* + * Copy from src into dest, assuming unaligned src and dest. + */ static inline void lttng_inline_memcpy(void *dest, const void *src, unsigned long len) @@ -608,13 +623,13 @@ void lttng_inline_memcpy(void *dest, const void *src, *(uint8_t *) dest = *(const uint8_t *) src; break; case 2: - *(uint16_t *) dest = *(const uint16_t *) src; + STORE_UNALIGNED_INT(uint16_t, dest, LOAD_UNALIGNED_INT(uint16_t, src)); break; case 4: - *(uint32_t *) dest = *(const uint32_t *) src; + STORE_UNALIGNED_INT(uint32_t, dest, LOAD_UNALIGNED_INT(uint32_t, src)); break; case 8: - *(uint64_t *) dest = *(const uint64_t *) src; + STORE_UNALIGNED_INT(uint64_t, dest, LOAD_UNALIGNED_INT(uint64_t, src)); break; default: inline_memcpy(dest, src, len);