path: root/src/process
Age | Commit message | Author | Lines
2024-02-29 | posix_spawn: fix child spinning on write to a broken pipe | Alexey Izbyshev | -1/+6
A child process created by posix_spawn reports errors to its parent via a pipe, retrying infinitely on any write error to prevent falsely reporting success. If the (original) parent dies before write is attempted, there is nobody to report to, but the child will remain stuck in the write loop forever if SIGPIPE is blocked or ignored. Fix this by not retrying write if it fails with EPIPE.
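The retry policy described above can be sketched in portable C. This is an illustration only, not the libc code: `report_error` is a hypothetical helper, and it calls `write` and checks `errno` where an implementation may issue the syscall directly. It assumes SIGPIPE is blocked or ignored, which is the case the fix targets.

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/* Hypothetical helper: send an error code to the parent over a pipe.
 * Returns 0 once the code has been written, or -1 if the read end is
 * gone (EPIPE), in which case nobody is left to report to. */
static int report_error(int fd, int err)
{
    for (;;) {
        /* sizeof err is far below PIPE_BUF, so a pipe write is atomic. */
        ssize_t r = write(fd, &err, sizeof err);
        if (r == (ssize_t)sizeof err) return 0;
        if (r < 0 && errno == EPIPE) return -1;
        /* Any other failure is retried, so that success is never
         * reported by accident. */
    }
}
```

Like the behavior the commit message describes, this sketch keeps retrying on every error other than EPIPE, so it is only suitable for a descriptor known to be a valid pipe.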
2024-02-22 | add framework to support archs without a native wait4 syscall | Rich Felker | -1/+1
this commit should make no codegen change for existing archs, but is a prerequisite for new archs including riscv32. the wait4 emulation backend provides both cancellable and non-cancellable variants because waitpid is required to be a cancellation point, but all of our other uses are not, and most of them cannot be. based on patch by Stefan O'Rear.
2023-06-01 | fix public clone function to be safe and usable by applications | Rich Felker | -10/+16
the clone() function has been effectively unusable since it was added, due to producing a child process with inconsistent state. in particular, the child process's thread structure still contains the tid, thread list pointers, thread count, and robust list for the parent. this will cause malfunction in interfaces that attempt to use the tid or thread list, some of which are specified to be async-signal-safe. this patch attempts to make clone() consistent in a _Fork-like sense. as in _Fork, when the parent process is multi-threaded, the child process inherits an async-signal context where it cannot call AS-unsafe functions, but its context is now intended to be safe for calling AS-safe functions. making clone fork-like would also be a future option, if it turns out that this is what makes sense to applications, but it's not done at this time because the changes would be more invasive. in the case where the CLONE_VM flag is used, clone is only vfork-like, not _Fork-like. in particular, the child will see itself as having the parent's tid, and cannot safely call any libc functions but one of the exec family or _exit. handling of flags and variadic arguments is also changed so that arguments are only consumed with flags that indicate their presence, and so that flags which produce an inconsistent state are disallowed (reported as EINVAL). in particular, all libc functions carry a contract that they are only callable with ABI requirements met, which includes having a valid thread pointer to a thread structure that's unique within the process, and whose contents are opaque and only able to be setup internally by the implementation. the only way for an application to use flags that violate these requirements without executing any libc code is to perform the syscall from application-provided asm.
2023-06-01 | fix broken thread list unlocking after fork | Rich Felker | -1/+1
apparently Linux clears the registered exit futex address on fork. this means that, if after forking the child process becomes multithreaded and the original thread exits, the thread list will never be unlocked, and future attempts to use the thread list will deadlock. re-register the exit futex address after _Fork in the child to ensure that it's preserved.
2023-02-09 | riscv64: add vfork | Pedro Falcato | -0/+12
Implement vfork() using clone(CLONE_VM | CLONE_VFORK | ...).
2022-10-19 | fix missing synchronization of pthread TSD keys with MT-fork | Rich Felker | -0/+3
commit 167390f05564e0a4d3fcb4329377fd7743267560 seems to have overlooked the presence of a lock here, probably because it was one of the exceptions not using LOCK() but a rwlock. as such, it can't be added to the generic table of locks to take, so add an explicit atfork function for the pthread keys table. the order it is called does not particularly matter since nothing else in libc but pthread_exit interacts with keys.
2022-10-19 | fix potential deadlock between multithreaded fork and aio | Rich Felker | -2/+4
as reported by Alexey Izbyshev, there is a lock order inversion deadlock between the malloc lock and aio maplock at MT-fork time: _Fork attempts to take the aio maplock while fork already has the malloc lock, but a concurrent aio operation holding the maplock may attempt to allocate memory. move the __aio_atfork calls in the parent from _Fork to fork, and reorder the lock before most other locks, since nothing else depends on aio(*). this leaves us with the possibility that the child will not be able to obtain the read lock, if _Fork is used directly and happens concurrent with an aio operation. however, in that case, the child context is an async signal context that cannot call any further aio functions, so all we need is to ensure that close does not attempt to perform any aio cancellation. this can be achieved just by nulling out the map pointer. (*) even if other functions call close, they will only need a read lock, not a write lock, and read locks being recursive ensures they can obtain it. moreover, the number of read references held is bounded by something like twice the number of live threads, meaning that the read lock count cannot saturate.
2022-10-19 | fix potential deadlock in dlerror buffer handling at thread exit | Rich Felker | -2/+0
ever since commit 8f11e6127fe93093f81a52b15bb1537edc3fc8af introduced the thread list lock, this has been wrong. initially, it was wrong via calling free from the context with the thread list lock held. commit aa5a9d15e09851f7b4a1668e9dbde0f6234abada deferred the unsafe free but added a lock, which was also unsafe. in particular, it could deadlock if code holding freebuf_queue_lock was interrupted by a signal handler that takes the thread list lock. commit 4d5aa20a94a2d3fae3e69289dc23ecafbd0c16c4 observed that there was a lock here but failed to notice that it's invalid. there is no easy solution to this problem with locks; any attempt at solving it while still using locks would require the lock to be an AS-safe one (blocking signals on each access to the dlerror buffer list to check if there's deferred free work to be done) which would be excessively costly, and there are also lock order considerations with respect to how the lock would be handled at fork. instead, just use an atomic list.
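A minimal sketch of the kind of lock-free list the message refers to, written with C11 atomics. The names `struct node`, `queue_push`, and `queue_take_all` are hypothetical, and the use of `<stdatomic.h>` is an assumption made for portability of the example; the libc itself may use its own internal atomic primitives.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical node type; a real buffer would embed or follow the link. */
struct node { struct node *next; };

static _Atomic(struct node *) queue_head;

/* Push without taking a lock: retry the compare-and-swap until the head
 * we linked against is still the current head. */
static void queue_push(struct node *n)
{
    struct node *h = atomic_load(&queue_head);
    do n->next = h;
    while (!atomic_compare_exchange_weak(&queue_head, &h, n));
}

/* Detach the whole list in one atomic step; the caller then owns every
 * node and can free them outside any signal-sensitive context. */
static struct node *queue_take_all(void)
{
    return atomic_exchange(&queue_head, NULL);
}
```

Because neither operation holds a lock, a signal handler that interrupts one of them cannot deadlock against it, which is the property the commit is after.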
2022-08-01 | aarch64: add vfork | Szabolcs Nagy | -0/+9
The generic vfork implementation uses clone(SIGCHLD) which has fork semantics. Implement vfork as clone(SIGCHLD|CLONE_VM|CLONE_VFORK, 0) instead which has vfork semantics. (stack == 0 means sp is unchanged in the child.) Some users rely on vfork semantics when memory overcommit is disabled or when the vfork child runs code that synchronizes with the parent process (non-conforming).
2021-03-15 | use internal malloc for posix_spawn file actions objects | Rich Felker | -0/+5
this makes it possible to perform actions on file actions objects with a libc-internal lock held without creating lock order relationships that are silently imposed on an application-provided malloc.
2021-01-30 | fail posix_spawn file_actions operations with negative fds | Rich Felker | -0/+4
these functions are specified to fail with EBADF on negative fd arguments. apart from close, they are also specified to fail if the value exceeds OPEN_MAX, but as written it is not clear that this imposes any requirement when OPEN_MAX is not defined, and it's undesirable to impose a dynamic limit (via setrlimit) here since the limit at the time of posix_spawn may be different from the limit at the time of setting up the file actions. this may require revisiting later.
2020-11-11 | lift child restrictions after multi-threaded fork | Rich Felker | -0/+70
as the outcome of Austin Group tracker issue #62, future editions of POSIX have dropped the requirement that fork be AS-safe. this allows but does not require implementations to synchronize fork with internal locks and give forked children of multithreaded parents a partly or fully unrestricted execution environment where they can continue to use the standard library (per POSIX, they can only portably use AS-safe functions). up until recently, taking this allowance did not seem desirable. however, commit 8ed2bd8bfcb4ea6448afb55a941f4b5b2b0398c0 exposed the extent to which applications and libraries are depending on the ability to use malloc and other non-AS-safe interfaces in MT-forked children, by converting latent very-low-probability catastrophic state corruption into predictable deadlock. dealing with the fallout has been a huge burden for users/distros. while it looks like most of the non-portable usage in applications could be fixed given sufficient effort, at least some of it seems to occur in language runtimes which are exposing the ability to run unrestricted code in the child as part of the contract with the programmer. any attempt at fixing such contracts is not just a technical problem but a social one, and is probably not tractable. this patch extends the fork function to take locks for all libc singletons in the parent, and release or reset those locks in the child, so that when the underlying fork operation takes place, the state protected by these locks is consistent and ready for the child to use. locking is skipped in the case where the parent is single-threaded so as not to interfere with legacy AS-safety property of fork in single-threaded programs. lock order is mostly arbitrary, but the malloc locks (including bump allocator in case it's used) must be taken after the locks on any subsystems that might use malloc, and non-AS-safe locks cannot be taken while the thread list lock is held, imposing a requirement that it be taken last.
2020-10-26 | fix reintroduction of errno clobbering by atfork handlers | Rich Felker | -0/+3
commit bd153422f28634bb6e53f13f80beb8289d405267 reintroduced the bug fixed in c21051e90cd27a0b26be0ac66950b7396a156ba1 by refactoring the __syscall_ret into _Fork where it once again runs before the atfork handlers are called. since _Fork is a public interface that sets errno, this can't be fixed the way it was fixed last time without making new internal interfaces. instead, just save errno, and restore it only on error to ensure that a value of 0 is never restored.
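The save-and-conditionally-restore idea can be sketched as follows. This is not the libc source: `finish_fork` and `clobbering_handler` are hypothetical names, and `ret` stands in for the result of the underlying fork operation, with errno assumed to be already set when it is negative.

```c
#include <errno.h>

/* Hypothetical stand-in for an application atfork handler that
 * overwrites errno as a side effect. */
static void clobbering_handler(void) { errno = EINVAL; }

/* Run the handler after the fork operation, then put errno back only
 * if the operation failed. On success the saved value is never
 * restored, so a saved 0 cannot be written into errno. */
static long finish_fork(long ret, void (*handler)(void))
{
    int errno_save = errno;
    handler();
    if (ret < 0) errno = errno_save;
    return ret;
}
```

On the failure path the caller sees the error from the fork operation rather than whatever the handler left behind; on the success path errno is simply left alone.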
2020-10-14 | move aio implementation details to a proper internal header | Rich Felker | -0/+1
also fix the lack of declaration (and thus hidden visibility) in __stdio_close's use of __aio_close.
2020-10-14 | fix posix_spawn interaction with fork and abort by taking lock | Rich Felker | -3/+13
this change prevents the child created concurrently with abort from seeing the SIGABRT disposition change from SIG_IGN to SIG_DFL (other changes are not visible anyway) and prevents leaking the write end of the child pipe to children created by fork in another thread, which may block return of posix_spawn indefinitely if the forked child does not exit or exec. along with other changes, this suggests that __abort_lock should perhaps eventually be renamed to reflect that it's becoming a broader lock on related "process lifetime" state.
2020-10-14 | implement _Fork and refactor fork using it | Rich Felker | -9/+15
the _Fork interface is defined for future issue of POSIX as the outcome of Austin Group issue 62, which drops the AS-safety requirement for fork, and provides an AS-safe replacement that does not run the registered atfork handlers.
2020-10-14 | rename fork source file | Rich Felker | -0/+0
this is in preparation for implementing _Fork from POSIX-future, factored as a separate commit to improve readability of history.
2020-10-14 | fix missing synchronization of fork with abort | Rich Felker | -0/+3
if the multithreaded parent forked while another thread was calling sigaction for SIGABRT or calling abort, the child could inherit a lock state in which future calls to abort will deadlock, or in which the disposition for SIGABRT has already been reset to SIG_DFL. this is nonconforming since abort is AS-safe and permitted to be called concurrently with fork or in the MT-forked child.
2020-09-28 | fix fork of processes with active async io contexts | Rich Felker | -0/+3
previously, if a file descriptor had aio operations pending in the parent before fork, attempting to close it in the child would attempt to cancel a thread belonging to the parent. this could deadlock, fail, or crash the whole process if the cancellation signal handler was not yet installed in the parent. in addition, further use of aio from the child could malfunction or deadlock. POSIX specifies that async io operations are not inherited by the child on fork, so clear the entire aio fd map in the child, and take the aio map lock (with signals blocked) across the fork so that the lock is kept in a consistent state.
2020-06-21 | clear need_locks in child after fork | Rich Felker | -0/+1
the child is single-threaded, but may still need to synchronize with last changes made to memory by another thread in the parent, so set need_locks to -1 whereby the next lock-taker will drop to 0 and prevent further barriers/locking.
2019-08-30 | add posix_spawn [f]chdir file actions | Rich Felker | -0/+45
these are presently extensions, thus named with _np to match glibc and other implementations that provide them; however they are likely to be standardized in the future without the _np suffix as a result of Austin Group issue 1208. if so, both names will be kept as aliases.
2019-07-08 | prevent dup2 action for posix_spawn internal pipe fd | Rich Felker | -0/+4
as reported by Tavian Barnes, a dup2 file action for the internal pipe fd used by posix_spawn could cause it to remain open after execve and allow the child to write an artificial error into it, confusing the parent. POSIX allows internal use of file descriptors by the implementation, with undefined behavior for poking at them, so this is not a conformance problem, but it seems preferable to diagnose and prevent the error when we can do so easily. catch attempts to apply a dup2 action to the internal pipe fd and emulate EBADF for it instead.
2019-07-01 | fix deadlock in synccall after threaded fork | Samuel Holland | -0/+1
synccall may be called by AS-safe functions such as setuid/setgid after fork. although fork() resets libc.threads_minus_one, causing synccall to take the single-threaded path, synccall still takes the thread list lock. This lock may be held by another thread if for example fork() races with pthread_create(). After fork(), the value of the lock is meaningless, so clear it. maintainer's note: commit 8f11e6127fe93093f81a52b15bb1537edc3fc8af and e4235d70672d9751d7718ddc2b52d0b426430768 introduced this regression. the state protected by this lock is the linked list, which is entirely replaced in the child path of fork (next=prev=self), so resetting it is semantically sound.
2019-04-02 | use __strchrnul instead of strchr and strlen in execvpe | Frediano Ziglio | -2/+1
The result is the same but takes less code. Note that __execvpe calls getenv which calls __strchrnul so even using static output the size of the executable won't grow.
2019-02-15 | track all live threads in an AS-safe, fully-consistent linked list | Rich Felker | -0/+1
the hard problem here is unlinking threads from a list when they exit without creating a window of inconsistency where the kernel task for a thread still exists and is still executing instructions in userspace, but is not reflected in the list. the magic solution here is getting rid of per-thread exit futex addresses (set_tid_address), and instead using the exit futex to unlock the global thread list. since pthread_join can no longer see the thread enter a detach_state of EXITED (which depended on the exit futex address pointing to the detach_state), it must now observe the unlocking of the thread list lock before it can unmap the joined thread and return. it doesn't actually have to take the lock. for this, a __tl_sync primitive is offered, with a signature that will allow it to be enhanced for quick return even under contention on the lock, if needed. for now, the exiting thread always performs a futex wake on its detach_state. a future change could optimize this out except when there is already a joiner waiting. initial/dynamic variants of detached state no longer need to be tracked separately, since the futex address is always set to the global list lock, not a thread-local address that could become invalid on detached thread exit. all detached threads, however, must perform a second sigprocmask syscall to block implementation-internal signals, since locking the thread list with them already blocked is not permissible. the arch-independent C version of __unmapself no longer needs to take a lock or setup its own futex address to release the lock, since it must necessarily be called with the thread list lock already held, guaranteeing exclusive access to the temporary stack. changes to libc.threads_minus_1 no longer need to be atomic, since they are guarded by the thread list lock. it is largely vestigial at this point, and can be replaced with a cheaper boolean indicating whether the process is multithreaded at some point in the future.
2018-09-12 | reduce spurious inclusion of libc.h | Rich Felker | -5/+0
libc.h was intended to be a header for access to global libc state and related interfaces, but ended up included all over the place because it was the way to get the weak_alias macro. most of the inclusions removed here are places where weak_alias was needed. a few were recently introduced for hidden. some go all the way back to when libc.h defined CANCELPT_BEGIN and _END, and all (wrongly implemented) cancellation points had to include it. remaining spurious users are mostly callers of the LOCK/UNLOCK macros and files that use the LFS64 macro to define the awful *64 aliases. in a few places, new inclusion of libc.h is added because several internal headers no longer implicitly include libc.h. declarations for __lockfile and __unlockfile are moved from libc.h to stdio_impl.h so that the latter does not need libc.h. putting them in libc.h made no sense at all, since the macros in stdio_impl.h are needed to use them correctly anyway.
2018-09-12 | remove __vfork alias | Rich Felker | -28/+7
this was added so that posix_spawn and possibly other functionality could be implemented in terms of vfork, but that turned out to be unsafe. any such usage needs __clone with proper handling of stack lifetime.
2018-09-12 | overhaul internally-public declarations using wrapper headers | Rich Felker | -4/+0
commits leading up to this one have moved the vast majority of libc-internal interface declarations to appropriate internal headers, allowing them to be type-checked and setting the stage to limit their visibility. the ones that have not yet been moved are mostly namespace-protected aliases for standard/public interfaces, which exist to facilitate implementing plain C functions in terms of POSIX functionality, or C or POSIX functionality in terms of extensions that are not standardized. some don't quite fit this description, but are "internally public" interfaces between subsystems of libc. rather than create a number of newly-named headers to declare these functions, and having to add explicit include directives for them to every source file where they're needed, I have introduced a method of wrapping the corresponding public headers. parallel to the public headers in $(srcdir)/include, we now have wrappers in $(srcdir)/src/include that come earlier in the include path order. they include the public header they're wrapping, then add declarations for namespace-protected versions of the same interfaces and any "internally public" interfaces for the subsystem they correspond to. along these lines, the wrapper for features.h is now responsible for the definition of the hidden, weak, and weak_alias macros. this means source files will no longer need to include any special headers to access these features. over time, it is my expectation that the scope of what is "internally public" will expand, reducing the number of source files which need to include *_impl.h and related headers down to those which are actually implementing the corresponding subsystems, not just using them.
2018-09-12 | rework mechanism for posix_spawnp calling posix_spawn | Rich Felker | -19/+9
previously, a common __posix_spawnx backend was used that accepted an additional argument for the execve variant to call in the child. this moderately bloated up the posix_spawn function, shuffling arguments between stack and/or registers to call a 7-argument function from a 6-argument one. instead, tuck the exec function pointer in an unused part of the (large) posix_spawnattr_t structure, and have posix_spawnp duplicate the attributes and fill in a pointer to __execvpe. the net code size change is minimal, but the weight is shifted to the "heavier" function which already pulls in more dependencies. as a bonus, we get rid of an external symbol (__posix_spawnx) that had no really good place for a declaration because it shouldn't have existed to begin with.
2018-09-12 | declare __syscall_ret as hidden in vfork asm | Rich Felker | -0/+4
without this, it's plausible that assembler or linker could complain about an unsatisfiable relocation.
2018-09-12 | add arm asm for vfork | Patrick Oppenlander | -0/+13
2018-09-12 | move and deduplicate declarations of __procfdname to make it checkable | Rich Felker | -2/+0
syscall.h was chosen as the header to declare it, since its intended usage is alongside syscalls as a fallback for operations the direct syscall does not support.
2018-09-04 | implement fexecve in terms of execveat when it exists | Joseph C. Sible | -0/+5
This lets fexecve work even when /proc isn't mounted.
2018-08-28 | fix return value of system on failure to spawn child process | Rich Felker | -1/+1
the value 0x7f00 (as if by _exit(127)) is specified only for the case where the child is created but then fails to exec the shell, since traditional fork+exec implementations do not admit reporting an error via errno in this case without additional machinery. it's unclear whether an implementation not subject to this failure mode needs to emulate it; one could read the standard as requiring that. if so, additional code will need to be added to map posix_spawn errors into the form system is expected to return. but for now, returning -1 to indicate an error is significantly better behavior than always reporting failures as if the shell failed to exec after fork.
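From the caller's side, the distinction drawn above can be sketched like this. The enum and `classify_system_status` are hypothetical names introduced for illustration; only the `<sys/wait.h>` macros are standard.

```c
#include <sys/wait.h>

/* Hypothetical classification of a value returned by system(). */
enum outcome { SPAWN_FAILED, SHELL_NOT_EXECUTED, COMMAND_RAN };

static enum outcome classify_system_status(int status)
{
    /* -1: no child process could be created at all. */
    if (status == -1) return SPAWN_FAILED;
    /* 0x7f00 decodes as a normal exit with code 127, the value
     * specified for a child that failed to exec the shell. */
    if (WIFEXITED(status) && WEXITSTATUS(status) == 127)
        return SHELL_NOT_EXECUTED;
    return COMMAND_RAN;
}
```

Exit code 127 is ambiguous by nature, since a command run by the shell can also exit with 127 on its own, so the middle case is a best guess rather than a certainty.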
2018-02-21 | convert execvp error handling to switch statement | Rich Felker | -2/+9
this is more extensible if we need to consider additional errors, and more efficient as long as the compiler does not know it can cache the result of __errno_location (a surprisingly complex issue detailed in commit a603a75a72bb469c6be4963ed1b55fabe675fe15).
2018-02-21 | fix execvp failing on not-dir entries in PATH. | Przemyslaw Pawelczyk | -1/+1
It's better to make execvp continue PATH search on ENOTDIR rather than issuing an error. Bogus entries should not render rest of PATH invalid. Maintainer's note: POSIX seems to require the search to continue like this as part of XBD 8.3 Other Environment Variables. Only errors that conclusively determine non-existence are candidates for continuing; otherwise for consistency we have to report the error.
2017-11-10 | prevent fork's errno from being clobbered by atfork handlers | Bobby Bingham | -3/+3
If the syscall fails, errno must be set correctly for the caller. There's no guarantee that the handlers registered with pthread_atfork won't clobber errno, so we need to ensure it gets set after they are called.
2017-11-05 | adjust posix_spawn dup2 action behavior to match future requirements | Rich Felker | -8/+12
the resolution to Austin Group issue #411 defined new semantics for the posix_spawn dup2 file action in the (previously useless) case where src and dest fd are equal. future issues will require the dup2 file action to remove the close-on-exec flag. without this change, passing fds to a child with posix_spawn while avoiding fd-leak races in a multithreaded parent required a complex dance with temporary fds. based on patch by Petr Skocik. changes were made to preserve the 80-column formatting of the function and to remove code that became unreachable as a result of the new functionality.
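The new semantics for the equal-fd case can be sketched with ordinary fcntl calls. `apply_dup2_action` is a hypothetical helper written for this illustration; it is not the code that runs in the posix_spawn child.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper mirroring the dup2 file action. When src and dst
 * are equal, dup2 itself would do nothing, so the close-on-exec flag is
 * cleared explicitly to let the fd survive exec. Returns 0 or -1. */
static int apply_dup2_action(int src, int dst)
{
    if (src == dst) {
        int flags = fcntl(src, F_GETFD);
        if (flags < 0) return -1;
        return fcntl(src, F_SETFD, flags & ~FD_CLOEXEC) < 0 ? -1 : 0;
    }
    /* dup2 never sets FD_CLOEXEC on the new descriptor. */
    return dup2(src, dst) < 0 ? -1 : 0;
}
```

This is what lets a multithreaded parent keep every descriptor close-on-exec and still hand selected ones to a spawned child, without the temporary-fd dance the message mentions.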
2017-10-19 | posix_spawn: use larger stack to cover worst-case in execvpe | Will Dietz | -1/+1
execvpe stack-allocates a buffer used to hold the full path (combination of a PATH entry and the program name) while searching through $PATH, so at least NAME_MAX+PATH_MAX is needed. The stack size can be made conditionally smaller (the current 1024 appears appropriate) should this larger size be burdensome in those situations.
2017-04-22 | have posix_spawnattr_setflags check for supported flags | Rich Felker | -0/+11
per POSIX, EINVAL is not a mandatory error, only an optional one. but reporting unsupported flags allows an application to fallback gracefully when a requested feature is not supported. this is not helpful now, but it may be in the future if additional flags are added. had this checking been present before, applications would have been able to check for the newly-added POSIX_SPAWN_SETSID feature (added in commit bb439bb17108b67f3df9c9af824d3a607b5b059d) at runtime.
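The runtime check this enables for applications might look like the sketch below. `spawn_flags_supported` is a hypothetical helper; it relies only on the standard `posix_spawnattr_*` calls and on the implementation rejecting unknown flag bits with EINVAL, which POSIX allows but does not require.

```c
#include <spawn.h>

/* Hypothetical probe: returns 1 if the implementation accepts the given
 * flag set, 0 if it rejects it or the attribute object cannot be
 * initialized. */
static int spawn_flags_supported(short flags)
{
    posix_spawnattr_t attr;
    if (posix_spawnattr_init(&attr) != 0) return 0;
    int rc = posix_spawnattr_setflags(&attr, flags);
    posix_spawnattr_destroy(&attr);
    return rc == 0;
}
```

An application could call this once with a newer flag and fall back to a fork-based path when it returns 0. On an implementation that silently accepts unknown bits, the probe always reports support, so it is only as reliable as the flag checking underneath it.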
2017-04-22 | implement new posix_spawn flag POSIX_SPAWN_SETSID | Rich Felker | -0/+4
this functionality has been adopted for inclusion in the next issue of POSIX as the result of Austin Group issue #1044. based on patch by Daurnimator.
2016-11-11 | add s390x port | Bobby Bingham | -0/+8
2015-06-16 | switch to using trap number 31 for syscalls on sh | Rich Felker | -1/+1
nominally the low bits of the trap number on sh are the number of syscall arguments, but they have never been used by the kernel, and some code making syscalls does not even know the number of arguments and needs to pass an arbitrary high number anyway. sh3/sh4 traditionally used the trap range 16-31 for syscalls, but part of this range overlapped with hardware exceptions/interrupts on sh2 hardware, so an incompatible range 32-47 was chosen for sh2. using trap number 31 everywhere, since it's in the existing sh3/sh4 range and does not conflict with sh2 hardware, is a proposed unification of the kernel syscall convention that will allow binaries to be shared between sh2 and sh3/sh4. if this is not accepted into the kernel, we can refit the sh2 target with runtime selection mechanisms for the trap number, but doing so would be invasive and would entail non-trivial overhead.
2015-06-11 | add sh asm for vfork | Rich Felker | -0/+23
2015-04-13 | remove remnants of support for running in no-thread-pointer mode | Rich Felker | -1/+1
since 1.1.0, musl has nominally required a thread pointer to be setup. most of the remaining code that was checking for its availability was doing so for the sake of being usable by the dynamic linker. as of commit 71f099cb7db821c51d8f39dfac622c61e54d794c, this is no longer necessary; the thread pointer is now valid before any libc code (outside of dynamic linker bootstrap functions) runs. this commit essentially concludes "phase 3" of the "transition path for removing lazy init of thread pointer" project that began during the 1.1.0 release cycle.
2015-04-10 | optimize out setting up robust list with kernel when not needed | Rich Felker | -1/+2
as a result of commit 12e1e324683a1d381b7f15dd36c99b37dd44d940, kernel processing of the robust list is only needed for process-shared mutexes. previously the first attempt to lock any owner-tracked mutex resulted in robust list initialization and a set_robust_list syscall. this is no longer necessary, and since the kernel's record of the robust list must now be cleared at thread exit time for detached threads, optimizing it out is more worthwhile than before too.
2015-02-03 | make execvp continue PATH search on EACCES rather than issuing an error | Rich Felker | -1/+4
the specification for execvp itself is unclear as to whether encountering a file that cannot be executed due to EACCES during the PATH search is a mandatory error condition; however, XBD 8.3's specification of the PATH environment variable clarifies that the search continues until a file with "appropriate execution permissions" is found. since it seems undesirable/erroneous to report ENOENT rather than EACCES when an early path element has a non-executable file and all later path elements lack any file by the requested name, the new code stores a flag indicating that EACCES was seen and sets errno back to EACCES in this case.
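The search policy described here can be sketched with a stand-in for execve. Everything named below is hypothetical: `fake_exec` simulates exec outcomes from the path prefix so the example can run without touching the filesystem, and `search_path` is a simplified loop, not the libc function. ENOENT and ENOTDIR are treated as "keep searching" alongside EACCES, on the assumption that they conclusively indicate the file is absent from that entry.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for execve: only /ok/prog "executes". */
static int fake_exec(const char *p)
{
    if (!strcmp(p, "/ok/prog")) return 0;
    if (!strncmp(p, "/noperm/", 8)) errno = EACCES;
    else if (!strncmp(p, "/file/", 6)) errno = ENOTDIR;
    else errno = ENOENT;
    return -1;
}

/* Try file under each colon-separated entry of path. Returns 0 on the
 * first success, otherwise -1 with errno set. */
static int search_path(const char *path, const char *file,
                       int (*try_exec)(const char *))
{
    int seen_eacces = 0;
    const char *p = path;
    errno = ENOENT;
    for (;;) {
        const char *z = strchr(p, ':');
        size_t n = z ? (size_t)(z - p) : strlen(p);
        char buf[4096];
        if (n + strlen(file) + 2 <= sizeof buf) {
            snprintf(buf, sizeof buf, "%.*s/%s", (int)n, p, file);
            if (try_exec(buf) == 0) return 0;
            switch (errno) {
            case EACCES:
                seen_eacces = 1; /* remember it, keep searching */
                /* fallthrough */
            case ENOENT:
            case ENOTDIR:
                break;
            default:
                return -1; /* inconclusive error: report it as is */
            }
        }
        if (!z) break;
        p = z + 1;
    }
    /* Prefer EACCES over a later ENOENT if any candidate was denied. */
    if (seen_eacces) errno = EACCES;
    return -1;
}
```

With `"/noperm:/missing"` the search ends with errno set to EACCES rather than ENOENT, matching the rationale given above.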
2014-12-05 | use direct syscall rather than write function in posix_spawn child | Rich Felker | -1/+1
the write function is a cancellation point and accesses thread-local state belonging to the calling thread in the parent process. since cancellation is blocked for the duration of posix_spawn, this is probably safe, but it's fragile and unnecessary. making the syscall directly is just as easy and clearly safe.
2014-12-05 | don't fail posix_spawn on failed close | Rich Felker | -2/+1
the resolution of austin group issue #370 removes the requirement that posix_spawn fail when the close file action is performed on an already-closed fd. since there are no other meaningful errors for close, just ignoring the return value completely is the simplest fix.
2014-07-05 | eliminate use of cached pid from thread structure | Rich Felker | -1/+1
the main motivation for this change is to remove the assumption that the tid of the main thread is also the pid of the process. (the value returned by the set_tid_address syscall was used to fill both fields despite it semantically being the tid.) this is historically and presently true on linux and unlikely to change, but it conceivably could be false on other systems that otherwise reproduce the linux syscall api/abi. only a few parts of the code were actually still using the cached pid. in a couple places (aio and synccall) it was a minor optimization to avoid a syscall. caching could be reintroduced, but lazily as part of the public getpid function rather than at program startup, if it's deemed important for performance later. in other places (cancellation and pthread_kill) the pid was completely unnecessary; the tkill syscall can be used instead of tgkill. this is actually a rather subtle issue, since tgkill is supposedly a solution to race conditions that can affect use of tkill. however, as documented in the commit message for commit 7779dbd2663269b465951189b4f43e70839bc073, tgkill does not actually solve this race; it just limits it to happening within one process rather than between processes. we use a lock that avoids the race in pthread_kill, and the use in the cancellation signal handler is self-targeted and thus not subject to tid reuse races, so both are safe regardless of which syscall (tgkill or tkill) is used.