path: root/src/thread
Age | Commit message | Author | Lines
2019-01-16 | fix unintended linking dependency of pthread_key_create on __synccall | Rich Felker | -0/+6
commit 84d061d5a31c9c773e29e1e2b1ffe8cb9557bc58 attempted to do this already, but omitted from pthread_key_create.c the weak definition of __pthread_key_delete_synccall, so that the definition provided by pthread_key_delete.c was always pulled in. based on patch by Markus Wichmann, but with a weak alias rather than weak reference for consistency/policy about dependence on tooling features.
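A minimal sketch of the weak-alias pattern this message refers to, assuming GCC or Clang attribute syntax on an ELF target. The names dummy_hook, optional_hook, and create_key_sketch are hypothetical and are not musl's identifiers.

```c
/* Hypothetical names; assumes GCC/Clang attributes on an ELF target. */
static int fallback_calls;

/* Local no-op used when nothing else defines the hook. */
static void dummy_hook(void)
{
	fallback_calls++;
}

/* Weak alias: optional_hook resolves to dummy_hook unless another object
 * file that is already part of the link provides a strong definition. */
void optional_hook(void) __attribute__((__weak__, __alias__("dummy_hook")));

/* The caller can invoke the hook unconditionally. Because a definition
 * exists locally, the linker has no reason to pull in the translation
 * unit that holds the strong definition. */
int create_key_sketch(void)
{
	optional_hook();
	return fallback_calls;
}
```

In a program with no strong definition of optional_hook, each call to create_key_sketch runs the local dummy. A weak reference (an undefined weak symbol tested against null) would achieve a similar effect but relies on different tooling behavior, which is the policy distinction the message draws.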
2018-12-19 | make sem_wait and sem_timedwait interruptible by signals | Rich Felker | -1/+1
this reverts commit c0ed5a201b2bdb6d1896064bec0020c9973db0a1, which was based on a mistaken reading of POSIX due to inconsistency between the description (which requires return upon interruption by a signal) and the errors list (which wrongly lists EINTR as "may fail"). since the previously-introduced behavior was a workaround for an old kernel bug to ensure safety of correct programs that were not hardened against the bug, an effort has been made to preserve it for programs which do not use interrupting signal handlers. the stage for this was set in commit a63c0104e496f7ba78b64be3cd299b41e8cd427f, which makes the futex __timedwait backend suppress EINTR if it's seen when no interrupting signal handlers have been installed. based loosely on a patch submitted by Orivej Desh, but with unnecessary additional changes removed.
2018-12-18 | don't fail pthread_sigmask/sigprocmask on invalid how when set is null | Rich Felker | -1/+1
the resolution of Austin Group issue #1132 changes the requirement to fail so that it only applies when the set argument (new mask) is non-null. this change was made for consistency with the description, which specified "if set is a null pointer, the value of the argument how is not significant".
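The revised rule can be sketched as a thin wrapper around sigprocmask. This is an illustration under stated assumptions, not musl's code: sigprocmask_sketch is a hypothetical name, and the wrapper substitutes a valid how when set is null so that the result does not depend on how the underlying libc treats that case.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <errno.h>
#include <signal.h>
#include <stddef.h>

/* Hypothetical wrapper: `how` is validated only when a new mask is given. */
int sigprocmask_sketch(int how, const sigset_t *set, sigset_t *old)
{
	if (set && how != SIG_BLOCK && how != SIG_UNBLOCK && how != SIG_SETMASK) {
		errno = EINVAL;
		return -1;
	}
	/* With a null set, `how` is not significant. Pass a valid value so
	 * the underlying call only reports the current mask. */
	return sigprocmask(set ? how : SIG_BLOCK, set, old);
}
```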
2018-12-18 | add __timedwait backend workaround for old kernels where futex EINTRs | Rich Felker | -0/+8
prior to linux 2.6.22, futex wait could fail with EINTR even for non-interrupting (SA_RESTART) signals. this was no problem provided the caller simply restarted the wait, but sem_[timed]wait is required by POSIX to return when interrupted by a signal. commit a113434cd68ce30642c4995b1caadcd084be6f09 introduced this behavior, and commit c0ed5a201b2bdb6d1896064bec0020c9973db0a1 reverted it based on a mistaken belief that it was not required. this belief stems from a bug in the specification: the description requires the function to return when interrupted, but the errors section marks EINTR as a "may fail" condition rather than a "shall fail" one. since there does seem to be significant value in the change made in commit c0ed5a201b2bdb6d1896064bec0020c9973db0a1, making it so that programs that call sem_wait without checking for EINTR don't silently make forward progress without obtaining the semaphore or treat it as a fatal error and abort, add a behind-the-scenes mechanism in the __timedwait backend to suppress EINTR in programs that have never installed interrupting signal handlers, and have sigaction track and report this state. this way the semaphore code is not cluttered by workarounds and can be updated (to be done in next commit) to reflect the high-level logic for conforming behavior. these changes are based loosely on a patch by Markus Wichmann, with the main changes being atomic update to flag object and moving the workaround from sem_timedwait to the __timedwait futex backend.
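The mechanism described here might look roughly like the following model. It is a sketch under assumptions: the flag and function names are hypothetical, C11 atomics stand in for musl's internal atomic primitives, and the futex wait itself is omitted so that only the filtering decision is shown.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <errno.h>
#include <signal.h>
#include <stdatomic.h>

/* Hypothetical flag: set once any handler is installed without SA_RESTART. */
static atomic_int eintr_valid_flag;

/* Example handler used only to populate a struct sigaction. */
static void noop_handler(int sig)
{
	(void)sig;
}

/* What a sigaction wrapper might record before installing a handler. */
void note_sigaction_sketch(const struct sigaction *sa)
{
	if (sa && sa->sa_handler != SIG_DFL && sa->sa_handler != SIG_IGN
	    && !(sa->sa_flags & SA_RESTART))
		atomic_store(&eintr_valid_flag, 1);
}

/* Filter for a futex wait result inside the timed-wait backend:
 * nonzero means the EINTR is spurious and the wait should be retried. */
int should_retry_wait_sketch(int r)
{
	return r == EINTR && !atomic_load(&eintr_valid_flag);
}
```

Until a program installs an interrupting handler, every EINTR is treated as the old-kernel artifact and retried. After that point EINTR is passed up so that sem_wait can return as POSIX requires.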
2018-10-12 | combine arch ABI's DTP_OFFSET into DTV pointers | Rich Felker | -2/+2
as explained in commit 6ba5517a460c6c438f64d69464fdfc3269a4c91a, some archs use an offset (typically -0x8000) with their DTPOFF relocations, which __tls_get_addr needs to invert. on affected archs, which lack direct support for large immediates, this can cost multiple extra instructions in the hot path. instead, incorporate the DTP_OFFSET into the DTV entries. this means they are no longer valid pointers, so store them as an array of uintptr_t rather than void *; this also makes it easier to access slot 0 as a valid slot count. commit e75b16cf93ebbc1ce758d3ea6b2923e8b2457c68 left behind cruft in two places, __reset_tls and __tls_get_new, from back when it was possible to have uninitialized gap slots indicated by a null pointer in the DTV. since the concept of null pointer is no longer meaningful with an offset applied, remove this cruft. presently there are no archs with both TLSDESC and nonzero DTP_OFFSET, but the dynamic TLSDESC relocation code is also updated to apply an inverted offset to its offset field, so that the offset DTV would not impose a runtime cost in TLSDESC resolver functions.
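The before/after arithmetic can be modeled as follows. This is a sketch with hypothetical function names; DTP_OFFSET_SKETCH is fixed at 0x8000 for illustration, and all address arithmetic is done on uintptr_t so that the biased intermediate values never exist as pointers.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative value; relocations on affected archs carry (offset - 0x8000). */
#define DTP_OFFSET_SKETCH 0x8000

/* Before: DTV entries are real pointers, so every lookup re-adds the bias. */
void *tls_get_addr_old_sketch(void *const *dtv, const size_t v[2])
{
	return (void *)((uintptr_t)dtv[v[0]] + v[1] + DTP_OFFSET_SKETCH);
}

/* After: the bias is folded into each DTV entry once, at setup time, so
 * the hot path is a single load and add. Slot 0 holds the slot count. */
void *tls_get_addr_new_sketch(const uintptr_t *dtv, const size_t v[2])
{
	return (void *)(dtv[v[0]] + v[1]);
}
```

Here v[0] is the module index and v[1] the biased offset, mirroring the two-word argument that __tls_get_addr receives. Both variants return the same address for the same inputs; only the place where the bias is undone differs.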
2018-09-18 | limit the configurable default stack/guard size for threads | Rich Felker | -6/+10
limit to 8MB/1MB, respectively. since the defaults cannot be reduced once increased, excessively large settings would lead to an unrecoverably broken state. this change is in preparation to allow defaults to be increased via program headers at the linker level. creation of threads that really need larger sizes needs to be done with an explicit attribute.
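A minimal model of the clamping rule, assuming hypothetical names throughout. The starting values are placeholders rather than musl's actual defaults, and the locking that protects the real defaults against concurrent changes is left out.

```c
#include <stddef.h>

#define DEFAULT_STACK_MAX_SKETCH ((size_t)8 << 20) /* 8MB cap from the message */
#define DEFAULT_GUARD_MAX_SKETCH ((size_t)1 << 20) /* 1MB cap from the message */

/* Placeholder starting values, not musl's actual defaults. */
static size_t default_stacksize_sketch = 128 * 1024;
static size_t default_guardsize_sketch = 8 * 1024;

static size_t min_size(size_t a, size_t b)
{
	return a < b ? a : b;
}

/* Defaults may only grow, and never beyond the caps. */
void raise_defaults_sketch(size_t stack, size_t guard)
{
	stack = min_size(stack, DEFAULT_STACK_MAX_SKETCH);
	guard = min_size(guard, DEFAULT_GUARD_MAX_SKETCH);
	if (stack > default_stacksize_sketch) default_stacksize_sketch = stack;
	if (guard > default_guardsize_sketch) default_guardsize_sketch = guard;
}
```

A request for a 64MB default stack therefore ends up as 8MB, and a later request for a smaller value leaves the defaults unchanged.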
2018-09-18 | remove redundant declarations of __default_stacksize, __default_guardsize | Rich Felker | -8/+0
these are now declared in pthread_impl.h.
2018-09-18 | fix benign data race in pthread_attr_init | Rich Felker | -0/+2
access to defaults should be protected against concurrent changes.
2018-09-18 | fix deletion of pthread tsd keys that still have non-null values stored | Rich Felker | -18/+101
per POSIX, deletion of a key for which some threads still have values stored is permitted, and newly created keys must initially hold the null value in all threads. these properties were not met by our implementation; if a key was deleted with values left and a new key was created in the same slot, the old values were still visible. moreover, due to lack of any synchronization in pthread_key_delete, there was a TOCTOU race whereby a concurrent pthread_exit could attempt to call a null destructor pointer for the newly orphaned value. this commit introduces a solution based on __synccall, stopping the world to zero out the values for deleted keys, but only does so lazily when all key slots have been exhausted. pthread_key_delete is split off into a separate translation unit so that static-linked programs which only create keys but never delete them will not pull in the __synccall machinery. a global rwlock is added to synchronize creation and deletion of keys with dtor execution. since the dtor execution loop now has to release and retake the lock around its call to each dtor, checks are made not to call the nodtor dummy function for keys which lack a dtor.
2018-09-15 | check for kernel support before allowing robust mutex creation | Rich Felker | -1/+17
on some archs, linux support for futex operations (including robust_list processing) that depend on kernelspace CAS is conditional on a runtime check. as of linux 4.18, this check fails unconditionally on nommu archs that perform it, and spurious failure on powerpc64 was observed but not explained. it's also possible that futex support is omitted entirely, or that the kernel is older than 2.6.17. for most futex ops, ENOSYS does not yield hard breakage; userspace will just spin at 100% cpu load. but for robust mutexes, correct behavior depends on the kernel functionality. use the get_robust_list syscall to probe for support at the first call to pthread_mutexattr_setrobust, and block creation of robust mutexes with a reportable error if they can't be supported.
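A probe of this kind could be sketched as below. This is Linux-only and assumes the libc exposes syscall() and SYS_get_robust_list; probe_robust_list_sketch is a hypothetical name, and the code is an illustration of the idea rather than the code added by the commit.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for syscall() on glibc */
#endif
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Linux-only sketch. Returns 0 if the kernel answers get_robust_list for
 * the calling thread (pid 0), otherwise the errno value, such as ENOSYS. */
int probe_robust_list_sketch(void)
{
	void *head;
	size_t len;
	if (syscall(SYS_get_robust_list, 0, &head, &len) == 0)
		return 0;
	return errno;
}
```

A caller in the position of pthread_mutexattr_setrobust could run the probe once, cache the result, and return it as the error code when a robust mutex is requested on a kernel that cannot support one.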
2018-09-12 | split internal lock API out of libc.h, creating lock.h | Rich Felker | -1/+8
this further reduces the number of source files which need to include libc.h and thereby be potentially exposed to libc global state and internals. this will also facilitate further improvements like adding an inline fast-path, if we want to do so later.
2018-09-12 | reduce spurious inclusion of libc.h | Rich Felker | -8/+1
libc.h was intended to be a header for access to global libc state and related interfaces, but ended up included all over the place because it was the way to get the weak_alias macro. most of the inclusions removed here are places where weak_alias was needed. a few were recently introduced for hidden. some go all the way back to when libc.h defined CANCELPT_BEGIN and _END, and all (wrongly implemented) cancellation points had to include it. remaining spurious users are mostly callers of the LOCK/UNLOCK macros and files that use the LFS64 macro to define the awful *64 aliases. in a few places, new inclusion of libc.h is added because several internal headers no longer implicitly include libc.h. declarations for __lockfile and __unlockfile are moved from libc.h to stdio_impl.h so that the latter does not need libc.h. putting them in libc.h made no sense at all, since the macros in stdio_impl.h are needed to use them correctly anyway.
2018-09-12 | remove unused __futex function and source file | Rich Felker | -7/+0
the direct syscall or various thin and mostly-inline wrappers around it are used instead internally. at some point a public futex function should be added, but it's not yet clear what the signature should be, and in the meantime this file is not useful.
2018-09-12 | hide __pthread_once_full symbol | Rich Felker | -1/+1
this is a special case that does not need a declaration, because it's not even a libc-internal interface between translation units. instead it's a poor hack around compilers' inability to shrink-wrap critical code paths. after vis.h was disabled, it became more of a pessimization on many archs due to the extra layer of machinery to support a call through the PLT, but now it should be efficient again.
2018-09-12 | overhaul internally-public declarations using wrapper headers | Rich Felker | -53/+6
commits leading up to this one have moved the vast majority of libc-internal interface declarations to appropriate internal headers, allowing them to be type-checked and setting the stage to limit their visibility. the ones that have not yet been moved are mostly namespace-protected aliases for standard/public interfaces, which exist to facilitate implementing plain C functions in terms of POSIX functionality, or C or POSIX functionality in terms of extensions that are not standardized. some don't quite fit this description, but are "internally public" interfaces between subsystems of libc. rather than create a number of newly-named headers to declare these functions, and having to add explicit include directives for them to every source file where they're needed, I have introduced a method of wrapping the corresponding public headers. parallel to the public headers in $(srcdir)/include, we now have wrappers in $(srcdir)/src/include that come earlier in the include path order. they include the public header they're wrapping, then add declarations for namespace-protected versions of the same interfaces and any "internally public" interfaces for the subsystem they correspond to. along these lines, the wrapper for features.h is now responsible for the definition of the hidden, weak, and weak_alias macros. this means source files will no longer need to include any special headers to access these features. over time, it is my expectation that the scope of what is "internally public" will expand, reducing the number of source files which need to include *_impl.h and related headers down to those which are actually implementing the corresponding subsystems, not just using them.
2018-09-12 | use hidden visibility for sh __unmapself backends | Rich Felker | -2/+3
2018-09-12 | make arch __set_thread_area backends hidden | Rich Felker | -0/+9
this is not a public interface, and does not even necessarily match the syscall on all archs that have a syscall by that name. on archs where it's implemented in C, no action on the source file is needed; the hidden declaration in pthread_arch.h suffices.
2018-09-12 | make arch __clone backends hidden | Rich Felker | -0/+15
these are not a public interface and are not intended to be callable from anywhere but the public clone function or other places in libc.
2018-09-12 | move declarations of tls setup/access functions to pthread_impl.h | Rich Felker | -4/+0
it's already included in all places where these are needed, and aside from __tls_get_addr, they're all implementation internals.
2018-09-12 | for c11 mtx and cnd functions, use externally consistent type names | Rich Felker | -12/+17
despite looking like undefined behavior, the affected code is correct both before and after this patch. the pairs mtx_t and pthread_mutex_t, and cnd_t and pthread_cond_t, are not mutually compatible within a single translation unit (because they are distinct untagged aggregate instances), but they are compatible with an object of either type from another translation unit (6.2.7 ¶1), and therefore a given translation unit can choose which one it wants to use. in the interest of being able to move declarations out of source files to headers that facilitate checking, use the pthread type names in declaring the namespace-safe versions of the pthread functions and cast the argument pointer types when calling them.
2018-09-12 | make inadvertently exposed __pthread_{timed,try}join_np functions static | Rich Felker | -2/+2
these exist for the sake of defining the corresponding weak public aliases (for C11 and POSIX namespace conformance reasons). they are not referenced by anything else in libc, so make them static.
2018-09-12 | fix issues from public functions defined without declaration visible | Rich Felker | -0/+1
policy is that all public functions which have a public declaration should be defined in a context where that public declaration is visible, to avoid preventable type mismatches. an audit performed using GCC's -Wmissing-declarations turned up the violations corrected here. in some cases the public header had not been included; in others, a feature test macro needed to make the declaration visible had been omitted. in the case of gethostent and getnetent, the omission seems to have been intentional, as a hack to admit a single stub definition for both functions. this kind of hack is no longer acceptable; it's UB and would not fly with LTO or advanced toolchains. the hack is undone to make exposure of the declarations possible.
2018-09-05 | define and use internal macros for hidden visibility, weak refs | Rich Felker | -26/+20
this cleans up what had become widespread direct inline use of "GNU C" style attributes directly in the source, and lowers the barrier to increased use of hidden visibility, which will be useful to recovering some of the efficiency lost when the protected visibility hack was dropped in commit dc2f368e565c37728b0d620380b849c3a1ddd78f, especially on archs where the PLT ABI is costly.
2018-09-04 | fix namespace violation for c11 mutex functions | Rich Felker | -1/+3
__pthread_mutex_timedlock is used to implement c11 mutex functions, and therefore cannot call pthread_mutex_trylock by name.
2018-09-04 | in pthread_mutex_timedlock, avoid repeatedly reading mutex type field | Rich Felker | -3/+4
compiler cannot cache immutable fields of the mutex object across external calls it can't see, much less across atomics.
2018-09-04 | in pthread_mutex_trylock, EBUSY out more directly when possible | Rich Felker | -2/+2
avoid gratuitously setting up and tearing down the robust list pending slot.
2018-08-29 | fix async thread cancellation on sh-fdpic | Rich Felker | -0/+3
if __cp_cancel was reached via __syscall_cp, r12 will necessarily still contain a GOT pointer (for libc.so or for the static-linked main program) valid for entering __cancel. however, in the case of async cancellation, r12 may contain any scratch value; it's not necessarily even a valid GOT pointer for the code that was interrupted. unlike in commit 0ec49dab6794166d67fae4764ce7fdea42ea6103 where the corresponding issue was fixed for powerpc64, there is fundamentally no way for fdpic code to recompute its GOT pointer. so a new mechanism is introduced for cancel_handler to write a GOT register value into the interrupted context on archs where it is needed.
2018-08-29 | fix async thread cancellation on powerpc64 | Rich Felker | -0/+7
entering the local entry point for __cancel from __cp_cancel is valid if __cp_cancel was reached from __syscall_cp, since both are in libc and share the same TOC pointer, but it is not valid if __cp_cancel was reached when cancel_handler rewrote the program counter for asynchronous cancellation of code outside libc. to ensure __cancel is entered with a valid TOC pointer, recompute the correct value in a PC-relative manner before jumping.
2018-08-28 | reject invalid arguments to pthread_barrierattr_setpshared | Rich Felker | -0/+1
this is a POSIX requirement.
2018-08-28 | rewrite __aeabi_read_tp in asm | Szabolcs Nagy | -12/+6
__aeabi_read_tp used to call c code, but that was incorrect as the arm runtime abi specifies special pcs for this function: it is only allowed to clobber r0, ip, lr and cpsr. maintainer's note: the old code explicitly saved and restored all general-purpose registers which are call-clobbered in the normal calling convention, so it's unlikely that any real-world compilers produced code that could break. however theoretically they could have chosen to use floating point registers, in which case the caller's values of those registers would be clobbered.
2018-08-28 | fix deadlock in async thread self-cancellation | Rich Felker | -1/+5
with async cancellation enabled, pthread_cancel(pthread_self()) deadlocked due to pthread_kill holding killlock which is needed by pthread_exit. this could be solved by making pthread_kill block signals around the critical section, at least when the target thread is itself, but the issue only arises for cancellation, and otherwise would just be imposing unnecessary cost. instead just have pthread_cancel explicitly check for async self-cancellation and call pthread_exit(PTHREAD_CANCELED) directly rather than going through the signal machinery.
2018-08-23 | fix tls access on arm targets before armv6k | Szabolcs Nagy | -1/+1
commit 610c5a8524c3d6cd3ac5a5f1231422e7648a3791 changed the thread pointer setup so tp points at the end of the pthread struct on arm, but failed to update __aeabi_read_tp so it was off by 8. this broke tls access in code that is compiled with -mtp=soft, which is the default when target arch is pre armv6k or thumb1. maintainer's note: no release versions are affected.
2018-08-18 | mips archs: fix runaway execution if start fn passed to clone returns | Segev Finer | -3/+12
Call SYS_exit on return from fn in __clone. This is the expected behavior of this function. Without this the child task will crash on return from fn, since it will return to nowhere.
2018-08-16 | fix pthread_create return value with PTHREAD_EXPLICIT_SCHED | Rich Felker | -0/+1
due to moved code, commit b8742f32602add243ee2ce74d804015463726899 inadvertently used the return value of __clone, rather than the return value of SYS_sched_setscheduler in the new thread, to check whether it needed to report failure. since a successful __clone returns the tid of the new thread, which is never zero, this caused pthread_create always to return with an invalid error number in the code path for PTHREAD_EXPLICIT_SCHED. this regression was not present in any releases.
2018-07-27 | make pthread_attr_init honor defaults set by pthread_setattr_default_np | Rich Felker | -4/+11
this fixes a major gap in the intended functionality of pthread_setattr_default_np. if application/library code creating a thread does not pass a null attribute pointer to pthread_create, but sets up an attribute object to change other properties while leaving the stack alone, the created thread will get a stack with size DEFAULT_STACK_SIZE. this makes pthread_setattr_default_np useless for working around stack overflow issues in such applications, and leaves a major risk of regression if previously-working code switches from using a null attribute pointer to an attribute object. this change aligns the behavior more closely with the glibc pthread_setattr_default_np functionality too, albeit via a different mechanism. glibc encodes "default" specially in the attribute object and reads the actual default at thread creation time. with this commit, we now copy the current default into the attribute object at pthread_attr_init time, so that applications that query the properties of the attribute object will see the right values.
2018-06-19 | add m68k port | Rich Felker | -0/+58
three ABIs are supported: the default with 68881 80-bit fpu format and results returned in floating point registers, softfloat-only with the same format, and coldfire fpu with IEEE single/double only. only the first is tested at all, and only under qemu which has fpu emulation bugs. basic functionality smoke tests have been performed for the most common arch-specific breakage via libc-test and qemu user-level emulation. some sysvipc failures remain, but are shared with other big endian archs and will be fixed separately.
2018-05-09 | make linking of thread-start with explicit scheduling conditional | Rich Felker | -28/+28
the wrapper start function that performs scheduling operations is unreachable if pthread_attr_setinheritsched is never called, so move it there rather than the pthread_create source file, saving some code size for static-linked programs.
2018-05-09 | improve design of thread-start with explicit scheduling attributes | Rich Felker | -21/+39
eliminate the awkward startlock mechanism and corresponding fields of the pthread structure that were only used at startup. instead of having pthread_create perform the scheduling operations and having the new thread wait for them to be completed, start the new thread with a wrapper start function that performs its own scheduling, sending the result code back via a futex. this way the new thread can use storage from the calling thread's stack rather than permanent fields in the pthread structure.
2018-05-05 | improve joinable/detached thread state handling | Rich Felker | -19/+22
previously, some accesses to the detached state (from pthread_join and pthread_getattr_np) were unsynchronized; they were harmless in programs with well-defined behavior, but ugly. other accesses (in pthread_exit and pthread_detach) were synchronized by a poorly named "exitlock", with an ad-hoc trylock operation on it open-coded in pthread_detach, whose only purpose was establishing protocol for which thread is responsible for deallocation of detached-thread resources. instead, use an atomic detach_state and unify it with the futex used to wait for thread exit. this eliminates 2 members from the pthread structure, gets rid of the hackish lock usage, and makes rigorous the trap added in commit 80bf5952551c002cf12d96deb145629765272db0 for catching attempts to join detached threads. it should also make attempts to detach an already-detached thread trap reliably.
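The ownership protocol that an atomic detach_state makes possible can be modeled with compare-and-swap. The sketch below uses C11 atomics and hypothetical state and function names. It shows only who becomes responsible for cleanup, and leaves out the futex wait that the message says shares the same object.

```c
#include <stdatomic.h>

/* Hypothetical state names for illustration. */
enum { DT_EXITED, DT_EXITING, DT_JOINABLE, DT_DETACHED };

struct thread_sketch {
	atomic_int detach_state;
};

/* Detach side: returns 1 if this call moved the thread from joinable to
 * detached, 0 if the thread was no longer joinable (so the caller has
 * to handle it some other way, such as reaping it). */
int detach_sketch(struct thread_sketch *t)
{
	int expected = DT_JOINABLE;
	return atomic_compare_exchange_strong(&t->detach_state, &expected,
	                                      DT_DETACHED);
}

/* Exit side: returns 1 if the exiting thread was detached and must free
 * its own resources, 0 if it is now marked exiting and a joiner (or a
 * late detach) is responsible for cleanup. */
int exit_must_self_reap_sketch(struct thread_sketch *t)
{
	int expected = DT_JOINABLE;
	if (atomic_compare_exchange_strong(&t->detach_state, &expected,
	                                   DT_EXITING))
		return 0;
	return expected == DT_DETACHED;
}
```

Because exactly one of the two compare-and-swap operations can succeed from the joinable state, the detaching thread and the exiting thread always agree on which of them deallocates, with no separate lock.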
2018-05-05 | improve pthread_exit synchronization with functions targeting tid | Rich Felker | -16/+18
if the last thread exited via pthread_exit, the logic that marked it dead did not account for the possibility of it targeting itself via atexit handlers. for example, an atexit handler calling pthread_kill(pthread_self(), SIGKILL) would return success (previously, ESRCH) rather than causing termination via the signal. move the release of killlock after the determination is made whether the exiting thread is the last thread. in the case where it's not, move the release all the way to the end of the function. this way we can clear the tid rather than spending storage on a dedicated dead-flag. clearing the tid is also preferable in that it hardens against inadvertent use of the value after the thread has terminated but before it is joined.
2018-05-04 | remove incorrect ESRCH error from pthread_kill | Rich Felker | -1/+2
posix documents in the rationale and future directions for pthread_kill that, since the lifetime of the thread id for a joinable thread lasts until it is joined, ESRCH is not a correct error for pthread_kill to produce when the target thread has exited but not yet been joined, and that conforming applications cannot attempt to detect this state. future versions of the standard may explicitly require that ESRCH not be returned for this case.
2018-05-02 | use a dedicated futex object for pthread_join instead of tid field | Rich Felker | -4/+5
the tid field in the pthread structure is not volatile, and really shouldn't be, so as not to limit the compiler's ability to reorder, merge, or split loads in code paths that may be relevant to performance (like controlling lock ownership). however, use of objects which are not volatile or atomic with futex wait is inherently broken, since the compiler is free to transform a single load into multiple loads, thereby using a different value for the controlling expression of the loop and the value passed to the futex syscall, leading the syscall to block instead of returning. reportedly glibc's pthread_join was actually affected by an equivalent issue in glibc on s390. add a separate, dedicated join_futex object for pthread_join to use.
2018-02-03 | store pthread stack guard sizes for pthread_getattr_np | William Pitcock | -1/+3
2018-01-09 | revise the definition of multiple basic locks in the code | Jens Gustedt | -3/+3
In all cases this is just a change from two volatile int to one.
2018-01-09 | consistently use the LOCK and UNLOCK macros | Jens Gustedt | -12/+12
In some places there has been a direct usage of the functions. Use the macros consistently everywhere, such that it might be easier later on to capture the fast path directly inside the macro and only have the call overhead on the slow path.
2018-01-09 | new lock algorithm with state and congestion count in one atomic int | Jens Gustedt | -8/+52
A variant of this new lock algorithm has been presented at SAC'16, see https://hal.inria.fr/hal-01304108. A full version of that paper is available at https://hal.inria.fr/hal-01236734. The main motivation of this is to improve on the safety of the basic lock implementation in musl. This is achieved by squeezing a lock flag and a congestion count (= threads inside the critical section) into a single int. Thereby an unlock operation does exactly one memory transfer (a_fetch_add) and never touches the value again, but still detects if a waiter has to be woken up. This is a fix of a use-after-free bug in pthread_detach that had temporarily been patched. Therefore this patch also reverts c1e27367a9b26b9baac0f37a12349fc36567c8b6. This is also the only place where internal knowledge of the lock algorithm is used. The main price for the improved safety is slightly larger code. Under high congestion, the scheduling behavior will be different compared to the previous algorithm. In that case, a successful put-to-sleep may appear out of order compared to the arrival in the critical section.
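One way to picture the packed encoding is with the sign bit as the lock flag and the remaining bits as the congestion count. This is a simplified sketch built on that assumption: it uses C11 atomics and hypothetical names, and leaves out the spinning and futex sleeping that a full implementation needs.

```c
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Assumed encoding: INT_MIN + n means locked with n threads counted
 * (holder plus waiters); a plain n >= 0 means unlocked with n waiters. */

/* Uncontended fast path: 0 becomes "locked, one thread inside". */
bool lock_fast_path_sketch(atomic_int *l)
{
	int expected = 0;
	return atomic_compare_exchange_strong(l, &expected, INT_MIN + 1);
}

/* A contending thread registers itself in the count before waiting. */
void enter_as_waiter_sketch(atomic_int *l)
{
	atomic_fetch_add(l, 1);
}

/* Unlock clears the flag and removes the holder from the count in one
 * atomic add, and never reads the lock word again. Returns true if the
 * old value shows that other threads were counted and need a wake. */
bool unlock_needs_wake_sketch(atomic_int *l)
{
	return atomic_fetch_add(l, -(INT_MIN + 1)) != INT_MIN + 1;
}
```

The single fetch-and-add in the unlock path is what makes the lock safe for self-synchronized destruction: the decision to wake comes from the value that the add itself returned, not from a second read of memory that may already have been freed.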
2017-10-13 | fix read-after-free type error in pthread_detach | Rich Felker | -1/+2
calling __unlock on t->exitlock is not valid because __unlock reads the waiters count after making the atomic store that could allow pthread_exit to continue and unmap the thread's stack and the object t points to. for now, inline the __unlock logic with an unconditional futex wake operation so that the waiters count is not needed. once __lock/__unlock have been made safe for self-synchronized destruction, we could switch back to using them.
2017-09-06 | fix signal masking race in pthread_create with priority attributes | Rich Felker | -2/+7
if the parent thread was able to set the new thread's priority before it reached the check for 'startlock', the new thread failed to restore its signal mask and thus ran with all signals blocked. concept for patch by Sergei, who reported the issue; unnecessary changes were removed and comments added since the whole 'startlock' thing is non-idiomatic and confusing. eventually it should be replaced with use of idiomatic synchronization primitives.
2017-08-11 | trap UB from attempts to join a detached thread | Rich Felker | -0/+1
passing to pthread_join the id of a thread which is not joinable results in undefined behavior. in principle the check to trap does not necessarily work if pthread_detach was called after thread creation, since no effort is made here to synchronize access to t->detached, but the check is well-defined and harmless for callers which did not invoke UB, and likely to help catch erroneous code that would otherwise mysteriously hang. patch by William Pitcock.
2017-07-04 | unify the use of FUTEX_PRIVATE | Jens Gustedt | -3/+3
The flag 1<<7 is used in several places for different purposes that are not always easy to distinguish. Mark those usages that correspond to the flag that is used by the kernel for futexes.