X-Git-Url: https://git.liburcu.org/?p=urcu.git;a=blobdiff_plain;f=README;h=8d0b18e73e691bf282e1ad0c345ee07dfbc9f6e8;hp=d454757fceffa30e7622d64dfee07290537bb0ee;hb=92af1a30ca6a70945b167c31631c8598a626c71a;hpb=c51e75e62494c2f6aae4a88aa4499444716846d7 diff --git a/README b/README index d454757..8d0b18e 100644 --- a/README +++ b/README @@ -8,21 +8,87 @@ BUILDING ./configure make make install + ldconfig - Note: Forcing 32-bit build: - * CFLAGS=-m32 ./configure + Hints: Forcing 32-bit build: + * CFLAGS="-m32 -g -O2" ./configure Forcing 64-bit build: - * CFLAGS=-m64 ./configure + * CFLAGS="-m64 -g -O2" ./configure + + Forcing a 32-bit build with 386 backward compatibility: + * CFLAGS="-m32 -g -O2" ./configure --host=i386-pc-linux-gnu + + Forcing a 32-bit build for Sparcv9 (typical for Sparc v9) + * CFLAGS="-m32 -Wa,-Av9a -g -O2" ./configure + ARCHITECTURES SUPPORTED ----------------------- -Currently, x86 (only Pentium and +), x86 64, PowerPC 32/64 and S390 are -supported. The current use of sys_futex() makes it Linux-dependent, although -this portability limitation might go away in a near future by using the pthread -cond vars. Also, the restriction against i386 and i486 might go away if we -integrate some of glibc runtime CPU-detection tests. +Currently, Linux x86 (i386, i486, i586, i686), x86 64-bit, PowerPC 32/64, +S390, S390x, ARM, MIPS, Alpha, ia64 and Sparcv9 32/64 are supported. +Tested on Linux, FreeBSD 8.2/8.3/9.0/9.1/10.0 i386/amd64, and Cygwin. +Should also work on: Android, NetBSD 5, OpenBSD, Darwin (more testing +needed before claiming support for these OS). + +Linux ARM depends on running a Linux kernel 2.6.15 or better, GCC 4.4 or +better. + +The gcc compiler versions 3.3, 3.4, 4.0, 4.1, 4.2, 4.3, 4.4 and 4.5 are +supported, with the following exceptions: + +- gcc 3.3 and 3.4 have a bug that prevents them from generating volatile + accesses to offsets in a TLS structure on 32-bit x86. These versions are + therefore not compatible with liburcu on x86 32-bit (i386, i486, i586, i686). + The problem has been reported to the gcc community: + http://www.mail-archive.com/gcc-bugs@gcc.gnu.org/msg281255.html +- gcc 3.3 cannot match the "xchg" instruction on 32-bit x86 build. + See: http://kerneltrap.org/node/7507 +- Alpha, ia64 and ARM architectures depend on gcc 4.x with atomic builtins + support. For ARM this was introduced with gcc 4.4: + http://gcc.gnu.org/gcc-4.4/changes.html + +Clang version 3.0 (based on LLVM 3.0) is supported. + +Building on MacOS X (Darwin) requires a work-around for processor +detection: + # 32-bit + ./configure --build=i686-apple-darwin11 + # 64-bit + ./configure --build=x86_64-apple-darwin11 + +For developers using the git tree: + +This source tree is based on the autotools suite from GNU to simplify +portability. Here are some things you should have on your system in order to +compile the git repository tree : + +- GNU autotools (automake >=1.10, autoconf >=2.50, autoheader >=2.50) + (make sure your system wide "automake" points to a recent version!) +- GNU Libtool >=2.2 + (for more information, go to http://www.gnu.org/software/autoconf/) + +If you get the tree from the repository, you will need to use the "bootstrap" +script in the root of the tree. It calls all the GNU tools needed to prepare the +tree configuration. + +Test scripts provided in the tests/ directory of the source tree depend +on "bash" and the "seq" program. + + +API +--- + +See the relevant API documentation files in doc/. The APIs provided by +Userspace RCU are, by prefix: + +- rcu_ : Read-Copy Update (see doc/rcu-api.txt) +- cmm_ : Concurrent Memory Model +- caa_ : Concurrent Architecture Abstraction +- cds_ : Concurrent Data Structures (see doc/cds-api.txt) +- uatomic_: Userspace Atomic (see doc/uatomic-api.txt) + QUICK START GUIDE ----------------- @@ -35,24 +101,29 @@ Usage of all urcu libraries instead of inlines, so your application can link with the library. * Linking with one of the libraries below is always necessary even for LGPL and GPL applications. + * Define URCU_INLINE_SMALL_FUNCTIONS before including Userspace RCU + headers if you want Userspace RCU to inline small functions (10 + lines or less) into the application. It can be used by applications + distributed under any kind of license, and does *not* make the + application a derived work of Userspace RCU. -Usage of liburcu + Those small inlined functions are guaranteed to match the library + content as long as the library major version is unchanged. + Therefore, the application *must* be compiled with headers matching + the library major version number. Applications using + URCU_INLINE_SMALL_FUNCTIONS may be unable to use debugging + features of Userspace RCU without being recompiled. - * #include - * Link the application with "-lurcu". - * This is the preferred version of the library, both in terms of speed - and flexibility. Requires a signal, typically SIGUSR1. Can be - overridden with -DSIGURCU by modifying Makefile.build.inc. -Usage of liburcu-mb +Usage of liburcu * #include - * Compile any _LGPL_SOURCE code using this library with "-DURCU_MB". - * Link with "-lurcu-mb". - * This version of the urcu library does not need to - reserve a signal number. URCU_MB uses full memory barriers for - readers. This eliminates the need for signals but results in slower - reads. + * Link the application with "-lurcu". + * This is the preferred version of the library, in terms of + grace-period detection speed, read-side speed and flexibility. + Dynamically detects kernel support for sys_membarrier(). Falls back + on urcu-mb scheme if support is not present, which has slower + read-side. Usage of liburcu-qsbr @@ -64,13 +135,30 @@ Usage of liburcu-qsbr the threads are not active. It provides the fastest read-side at the expense of more intrusiveness in the application code. +Usage of liburcu-mb + + * #include + * Compile any _LGPL_SOURCE code using this library with "-DRCU_MB". + * Link with "-lurcu-mb". + * This version of the urcu library uses memory barriers on the writer + and reader sides. This results in faster grace-period detection, but + results in slower reads. + +Usage of liburcu-signal + + * #include + * Compile any _LGPL_SOURCE code using this library with "-DRCU_SIGNAL". + * Link the application with "-lurcu-signal". + * Version of the library that requires a signal, typically SIGUSR1. Can + be overridden with -DSIGRCU by modifying Makefile.build.inc. + Usage of liburcu-bp * #include * Link with "-lurcu-bp". * The BP library flavor stands for "bulletproof". It is specifically designed to help tracing library to hook on applications without - requiring to modify these applications. urcu_init(), + requiring to modify these applications. rcu_init(), rcu_register_thread() and rcu_unregister_thread() all become nops. The state is dealt with by the library internally at the expense of read-side and write-side performance. @@ -97,12 +185,36 @@ Writing Usage of liburcu-defer - * #include - * Link with "-lurcu-defer" - * Provides call_rcu() primitive to enqueue delayed callbacks. Queued + * Follow instructions for either liburcu, liburcu-qsbr, + liburcu-mb, liburcu-signal, or liburcu-bp above. + The liburcu-defer functionality is pulled into each of + those library modules. + * Provides defer_rcu() primitive to enqueue delayed callbacks. Queued callbacks are executed in batch periodically after a grace period. - Do _not_ use call_rcu() within a read-side critical section, because + Do _not_ use defer_rcu() within a read-side critical section, because it may call synchronize_rcu() if the thread queue is full. + This can lead to deadlock or worse. + * Requires that rcu_defer_barrier() must be called in library destructor + if a library queues callbacks and is expected to be unloaded with + dlclose(). + * Its API is currently experimental. It may change in future library + releases. + +Usage of urcu-call-rcu + + * Follow instructions for either liburcu, liburcu-qsbr, + liburcu-mb, liburcu-signal, or liburcu-bp above. + The urcu-call-rcu functionality is provided for each of + these library modules. + * Provides the call_rcu() primitive to enqueue delayed callbacks + in a manner similar to defer_rcu(), but without ever delaying + for a grace period. On the other hand, call_rcu()'s best-case + overhead is not quite as good as that of defer_rcu(). + * Provides call_rcu() to allow asynchronous handling of RCU + grace periods. A number of additional functions are provided + to manage the helper threads used by call_rcu(), but reasonable + defaults are used if these additional functions are not invoked. + See rcu-api.txt in userspace-rcu documentation for more details. Being careful with signals @@ -114,17 +226,76 @@ Being careful with signals signal(7). The liburcu-mb and liburcu-qsbr versions of the Userspace RCU library do not require any signal. - Read-side critical sections are allowed in a signal handler with - liburcu and liburcu-mb. Be careful, however, to disable these signals + Read-side critical sections are allowed in a signal handler, + except those setup with sigaltstack(2), with liburcu and + liburcu-mb. Be careful, however, to disable these signals between thread creation and calls to rcu_register_thread(), because a - signal handler nesting on an unregistered thread would not be allowed to - call rcu_read_lock(). + signal handler nesting on an unregistered thread would not be + allowed to call rcu_read_lock(). Read-side critical sections are _not_ allowed in a signal handler with liburcu-qsbr, unless signals are disabled explicitly around each rcu_quiescent_state() calls, when threads are put offline and around calls to synchronize_rcu(). Even then, we do not recommend it. +Interaction with mutexes + + One must be careful to do not cause deadlocks due to interaction of + synchronize_rcu() and RCU read-side with mutexes. If synchronize_rcu() + is called with a mutex held, this mutex (or any mutex which has this + mutex in its dependency chain) should not be acquired from within a RCU + read-side critical section. + + This is especially important to understand in the context of the + QSBR flavor: a registered reader thread being "online" by + default should be considered as within a RCU read-side critical + section unless explicitly put "offline". Therefore, if + synchronize_rcu() is called with a mutex held, this mutex, as + well as any mutex which has this mutex in its dependency chain + should only be taken when the RCU reader thread is "offline" + (this can be performed by calling rcu_thread_offline()). + +Interaction with fork() + + Special care must be taken for applications performing fork() without + any following exec(). This is caused by the fact that Linux only clones + the thread calling fork(), and thus never replicates any of the other + parent thread into the child process. Most liburcu implementations + require that all registrations (as reader, defer_rcu and call_rcu + threads) should be released before a fork() is performed, except for the + rather common scenario where fork() is immediately followed by exec() in + the child process. The only implementation not subject to that rule is + liburcu-bp, which is designed to handle fork() by calling + rcu_bp_before_fork, rcu_bp_after_fork_parent and + rcu_bp_after_fork_child. + + Applications that use call_rcu() and that fork() without + doing an immediate exec() must take special action. The parent + must invoke call_rcu_before_fork() before the fork() and + call_rcu_after_fork_parent() after the fork(). The child + process must invoke call_rcu_after_fork_child(). + Even though these three APIs are suitable for passing to + pthread_atfork(), use of pthread_atfork() is *STRONGLY + DISCOURAGED* for programs calling the glibc memory allocator + (malloc(), calloc(), free(), ...) within call_rcu callbacks. + This is due to limitations in the way glibc memory allocator + handles calls to the memory allocator from concurrent threads + while the pthread_atfork() handlers are executing. + Combining e.g.: + * call to free() from callbacks executed within call_rcu worker + threads, + * executing call_rcu atfork handlers within the glibc pthread + atfork mechanism, + will sometimes trigger interesting process hangs. This usually + hangs on a memory allocator lock within glibc. + +Thread Local Storage (TLS) + + Userspace RCU can fall back on pthread_getspecific() to emulate + TLS variables on systems where it is not available. This behavior + can be forced by specifying --disable-compiler-tls as configure + argument. + Usage of DEBUG_RCU DEBUG_RCU is used to add internal debugging self-checks to the @@ -136,3 +307,34 @@ Usage of DEBUG_YIELD DEBUG_YIELD is used to add random delays in the code for testing purposes. + +SMP support + + By default the library is configured to use synchronization primitives + adequate for SMP systems. On uniprocessor systems, support for SMP + systems can be disabled with: + + ./configure --disable-smp-support + + theoretically yielding slightly better performance. + +MAKE TARGETS +------------ + +In addition to the usual "make check" target, Userspace RCU features +"make regtest" and "make bench" targets. + +make check: Short tests, meant to be run when rebuilding or porting + Userspace RCU. + +make regtest: Long (many hours) test, meant to be run when modifying + Userspace RCU or porting it to a new architecture or + operating system. + +make bench: Long (many hours) benchmarks. + +CONTACTS +-------- + +You can contact the maintainers on the following mailing list: +lttng-dev@lists.lttng.org