summaryrefslogtreecommitdiff
path: root/src/aio/aio.c
AgeCommit message (Collapse)AuthorLines
2022-10-19fix AS-safety of close when aio is in use and fd map is expandedRich Felker-0/+6
the aio operations that lead to calling __aio_get_queue with the possibility to expand the fd map are not AS-safe, but if they are interrupted by a signal handler, the signal handler may call close, which is required to be AS-safe. due to __aio_get_queue taking the write lock without blocking signals, such a call to close from a signal handler could deadlock. change __aio_get_queue to block signals if it needs to obtain a write lock, and restore when finished.
2022-10-19fix potential deadlock between multithreaded fork and aioRich Felker-2/+17
as reported by Alexey Izbyshev, there is a lock order inversion deadlock between the malloc lock and aio maplock at MT-fork time: _Fork attempts to take the aio maplock while fork already has the malloc lock, but a concurrent aio operation holding the maplock may attempt to allocate memory. move the __aio_atfork calls in the parent from _Fork to fork, and reorder the lock before most other locks, since nothing else depends on aio(*). this leaves us with the possibility that the child will not be able to obtain the read lock, if _Fork is used directly and happens concurrent with an aio operation. however, in that case, the child context is an async signal context that cannot call any further aio functions, so all we need is to ensure that close does not attempt to perform any aio cancellation. this can be achieved just by nulling out the map pointer. (*) even if other functions call close, they will only need a read lock, not a write lock, and read locks being recursive ensures they can obtain it. moreover, the number of read references held is bounded by something like twice the number of live threads, meaning that the read lock count cannot saturate.
2022-10-19remove LFS64 symbol aliases; replace with dynamic linker remappingRich Felker-7/+0
originally the namespace-infringing "large file support" interfaces were included as part of glibc-ABI-compat, with the intent that they not be used for linking, since our off_t is and always has been unconditionally 64-bit and since we usually do not aim to support nonstandard interfaces when there is an equivalent standard interface. unfortunately, having the symbols present and available for linking caused configure scripts to detect them and attempt to use them without declarations, producing all the expected ill effects that entails. as a result, commit 2dd8d5e1b8ba1118ff1782e96545cb8a2318592c was made to prevent this, using macros to redirect the LFS64 names to the standard names, conditional on _GNU_SOURCE or _LARGEFILE64_SOURCE. however, this has turned out to be a source of further problems, especially since g++ defines _GNU_SOURCE by default. in particular, the presence of these names as macros breaks a lot of valid code. this commit removes all the LFS64 symbols and replaces them with a mechanism in the dynamic linker symbol lookup failure path to retry with the spurious "64" removed from the symbol name. in the future, if/when the rest of glibc-ABI-compat is moved out of libc, this can be removed.
2020-12-08drop use of pthread_once for aio thread stack size initRich Felker-10/+8
pthread_once is not compatible with MT-fork constraints (commit 167390f05564e0a4d3fcb4329377fd7743267560) and is not needed here anyway; we already have a lock suitable for initialization. while changing this, fix a corner case where AT_MINSIGSTKSZ gives a value that's more than MINSIGSTKSZ but by a margin of less than 2048, thereby causing the size to be reduced. it shouldn't matter but the intent was to be the larger of a 2048-byte margin over the legacy fixed minimum stack requirement or a 512-byte margin over the minimum the kernel reports at runtime.
2020-11-11convert malloc use under libc-internal locks to use internal allocatorRich Felker-0/+5
this change lifts undocumented restrictions on calls by replacement mallocs to libc functions that might take these locks, and sets the stage for lifting restrictions on the child execution environment after multithreaded fork. care is taken to #define macros to replace all four functions (malloc, calloc, realloc, free) even if not all of them will be used, using an undefined symbol name for the ones intended not to be used so that any inadvertent future use will be caught at compile time rather than directed to the wrong implementation.
2020-10-14move aio implementation details to a proper internal headerRich Felker-0/+1
also fix the lack of declaration (and thus hidden visibility) in __stdio_close's use of __aio_close.
2020-09-28fix fork of processes with active async io contextsRich Felker-0/+14
previously, if a file descriptor had aio operations pending in the parent before fork, attempting to close it in the child would attempt to cancel a thread belonging to the parent. this could deadlock, fail, or crash the whole process of the cancellation signal handler was not yet installed in the parent. in addition, further use of aio from the child could malfunction or deadlock. POSIX specifies that async io operations are not inherited by the child on fork, so clear the entire aio fd map in the child, and take the aio map lock (with signals blocked) across the fork so that the lock is kept in a consistent state.
2018-12-11on failed aio submission, set aiocb error and return valueRich Felker-2/+4
it's not clear whether this is required, but it seems arguable that it should happen. for example aio_suspend is supposed to return immediately if any of the operations has "completed", which includes ending with an error status asynchonously and might also be interpreted to include doing so synchronously.
2018-12-11don't create aio queue/map structures for invalid file descriptorsRich Felker-4/+8
the map structures in particular are permanent once created, and thus a large number of aio function calls with invalid file descriptors could exhaust memory, whereas, assuming normal resource limits, only a very small number of entries ever need to be allocated. check validity of the fd before allocating anything new, so that allocation of large amounts of memory is only possible when resource limits have been increased and a large number of files are actually open. this change also improves error reporting for bad file descriptors to happen at the time the aio submission call is made, as opposed to asynchronously.
2018-12-11move aio queue allocation from io thread to submitting threadRich Felker-16/+21
since commit c9f415d7ea2dace5bf77f6518b6afc36bb7a5732, it has been possible that the allocator is application-provided code, which cannot necessarily run safely on io thread stacks, and which should not be able to see the existence of io threads, since they are an implementation detail. instead of having the io thread request and possibly allocate its queue (and the map structures leading to it), make the submitting thread responsible for this, and pass the queue pointer into the io thread via its args structure. this eliminates the only early error case in io threads, making it no longer necessary to pass an error status back to the submitting thread via the args structure.
2018-12-09fix and future-proof against stack overflow in aio io threadsRich Felker-1/+12
aio threads not using SIGEV_THREAD notification are created with small stacks and no guard page, which is possible since they only run the code for the requested io operation, not any application code. the motivation is not creating a lot of VMAs. however, the io thread needs to be able to receive a cancellation signal in case aio_cancel (implemented via pthread_cancel) is called. this requires sufficient stack space for a signal frame, which PTHREAD_STACK_MIN does not necessarily include. in principle MINSIGSTKSZ from signal.h should give us sufficient space for a signal frame, but the value is incorrect on some existing archs due to kernel addition of new vector register support without consideration for impact on ABI. some powerpc models exceed MINSIGSTKSZ by about 0.5k, and x86[_64] with AVX-512 can exceed it by up to about 1.5k. so use MINSIGSTKSZ+2048 to allow for the discrepancy plus some working space. unfortunately, it's possible that signal frame sizes could continue to grow, and some archs (aarch64) explicitly specify that they may. passing of a runtime value for MINSIGSTKSZ via AT_MINSIGSTKSZ in the aux vector was added to aarch64 linux, and presumably other archs will use this mechanism to report if they further increase the signal frame size. when AT_MINSIGSTKSZ is present, assume it's correct, so that we only need a small amount of working space in addition to it; in this case just add 512.
2018-09-12remove spurious inclusion of libc.h for LFS64 ABI aliasesRich Felker-6/+6
the LFS64 macro was not self-documenting and barely saved any characters. simply use weak_alias directly so that it's clear what's being done, and doesn't depend on a header to provide a strange macro.
2018-09-12reduce spurious inclusion of libc.hRich Felker-1/+0
libc.h was intended to be a header for access to global libc state and related interfaces, but ended up included all over the place because it was the way to get the weak_alias macro. most of the inclusions removed here are places where weak_alias was needed. a few were recently introduced for hidden. some go all the way back to when libc.h defined CANCELPT_BEGIN and _END, and all (wrongly implemented) cancellation points had to include it. remaining spurious users are mostly callers of the LOCK/UNLOCK macros and files that use the LFS64 macro to define the awful *64 aliases. in a few places, new inclusion of libc.h is added because several internal headers no longer implicitly include libc.h. declarations for __lockfile and __unlockfile are moved from libc.h to stdio_impl.h so that the latter does not need libc.h. putting them in libc.h made no sense at all, since the macros in stdio_impl.h are needed to use them correctly anyway.
2015-03-03make all objects used with atomic operations volatileRich Felker-1/+2
the memory model we use internally for atomics permits plain loads of values which may be subject to concurrent modification without requiring that a special load function be used. since a compiler is free to make transformations that alter the number of loads or the way in which loads are performed, the compiler is theoretically free to break this usage. the most obvious concern is with atomic cas constructs: something of the form tmp=*p;a_cas(p,tmp,f(tmp)); could be transformed to a_cas(p,*p,f(*p)); where the latter is intended to show multiple loads of *p whose resulting values might fail to be equal; this would break the atomicity of the whole operation. but even more fundamental breakage is possible. with the changes being made now, objects that may be modified by atomics are modeled as volatile, and the atomic operations performed on them by other threads are modeled as asynchronous stores by hardware which happens to be acting on the request of another thread. such modeling of course does not itself address memory synchronization between cores/cpus, but that aspect was already handled. this all seems less than ideal, but it's the best we can do without mandating a C11 compiler and using the C11 model for atomics. in the case of pthread_once_t, the ABI type of the underlying object is not volatile-qualified. so we are assuming that accessing the object through a volatile-qualified lvalue via casts yields volatile access semantics. the language of the C standard is somewhat unclear on this matter, but this is an assumption the linux kernel also makes, and seems to be the correct interpretation of the standard.
2015-02-14fix type error (arch-dependent) in new aio codeRich Felker-1/+1
a_store is only valid for int, but ssize_t may be defined as long or another type. since there is no valid way for another thread to acess the return value without first checking the error/completion status of the aiocb anyway, an atomic store is not necessary.
2015-02-13overhaul aio implementation for correctnessRich Felker-0/+378
previously, aio operations were not tracked by file descriptor; each operation was completely independent. this resulted in non-conforming behavior for non-seekable/append-mode writes (which are required to be ordered) and made it impossible to implement aio_cancel, which in turn made closing file descriptors with outstanding aio operations unsafe. the new implementation is significantly heavier (roughly twice the size, and seems to be slightly slower) and presently aims mainly at correctness, not performance. most of the public interfaces have been moved into a single file, aio.c, because there is little benefit to be had from splitting them. whenever any aio functions are used, aio_cancel and the internal queue lifetime management and fd-to-queue mapping code must be linked, and these functions make up the bulk of the code size. the close function's interaction with aio is implemented with weak alias magic, to avoid pulling in heavy aio cancellation code in programs that don't use aio, and the expensive cancellation path (which includes signal blocking) is optimized out when there are no active aio queues.