path: root/src/linux
Age | Commit message | Author | Lines
2024-02-25 | riscv: fall back to syscall __riscv_flush_icache | Stefan O'Rear | -0/+1
Matches glibc behavior and fixes a case where we could fall off the function without returning a value.
2024-02-24 | add statx interface using syscall, fallback to fstatat | Duncan Bellamy | -0/+42
2024-02-22 | add framework to support archs without a native wait4 syscall | Rich Felker | -1/+1
this commit should make no codegen change for existing archs, but is a prerequisite for new archs including riscv32. the wait4 emulation backend provides both cancellable and non-cancellable variants because waitpid is required to be a cancellation point, but all of our other uses are not, and most of them cannot be. based on patch by Stefan O'Rear.
2024-02-03 | riscv: correct symbol version of __vdso_flush_icache | gns | -1/+1
Previously, __riscv_flush_icache would not work correctly as __vdso_flush_icache had a wrong symbol version. Fix this by correcting symbol version. Fixes: 0a48860c27a8 ("add riscv64 architecture support")
2024-01-25 | add preadv2 and pwritev2 syscall wrappers, flag value macros | Rich Felker | -0/+34
2024-01-21 | move ppoll from src/linux to src/select reflecting future standardization | Rich Felker | -26/+0
the ppoll function has been accepted as a future part of the standard as the outcome of Austin Group tracker issue 1263. move the source file to reflect this.
2023-06-01 | fix public clone function to be safe and usable by applications | Rich Felker | -6/+50
the clone() function has been effectively unusable since it was added, due to producing a child process with inconsistent state. in particular, the child process's thread structure still contains the tid, thread list pointers, thread count, and robust list for the parent. this will cause malfunction in interfaces that attempt to use the tid or thread list, some of which are specified to be async-signal-safe. this patch attempts to make clone() consistent in a _Fork-like sense. as in _Fork, when the parent process is multi-threaded, the child process inherits an async-signal context where it cannot call AS-unsafe functions, but its context is now intended to be safe for calling AS-safe functions. making clone fork-like would also be a future option, if it turns out that this is what makes sense to applications, but it's not done at this time because the changes would be more invasive. in the case where the CLONE_VM flag is used, clone is only vfork-like, not _Fork-like. in particular, the child will see itself as having the parent's tid, and cannot safely call any libc functions but one of the exec family or _exit. handling of flags and variadic arguments is also changed so that arguments are only consumed with flags that indicate their presence, and so that flags which produce an inconsistent state are disallowed (reported as EINVAL). in particular, all libc functions carry a contract that they are only callable with ABI requirements met, which includes having a valid thread pointer to a thread structure that's unique within the process, and whose contents are opaque and only able to be setup internally by the implementation. the only way for an application to use flags that violate these requirements without executing any libc code is to perform the syscall from application-provided asm.
2023-04-11 | wait4: fix missing rusage on x32 due to wrong success condition | Alexey Izbyshev | -1/+1
Resource usage data is filled by the kernel only when wait4 returns a pid, i.e. a positive value. Commit 5850546e9669f793aab61dfc7c4f2c1ff35c4b29 introduced this bug, possibly because of copy-pasting from getrusage.
2022-10-19 | remove LFS64 symbol aliases; replace with dynamic linker remapping | Rich Felker | -10/+0
originally the namespace-infringing "large file support" interfaces were included as part of glibc-ABI-compat, with the intent that they not be used for linking, since our off_t is and always has been unconditionally 64-bit and since we usually do not aim to support nonstandard interfaces when there is an equivalent standard interface. unfortunately, having the symbols present and available for linking caused configure scripts to detect them and attempt to use them without declarations, producing all the expected ill effects that entails. as a result, commit 2dd8d5e1b8ba1118ff1782e96545cb8a2318592c was made to prevent this, using macros to redirect the LFS64 names to the standard names, conditional on _GNU_SOURCE or _LARGEFILE64_SOURCE. however, this has turned out to be a source of further problems, especially since g++ defines _GNU_SOURCE by default. in particular, the presence of these names as macros breaks a lot of valid code. this commit removes all the LFS64 symbols and replaces them with a mechanism in the dynamic linker symbol lookup failure path to retry with the spurious "64" removed from the symbol name. in the future, if/when the rest of glibc-ABI-compat is moved out of libc, this can be removed.
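The retry step described above can be sketched as a plain string operation. This is a minimal illustration only: `strip_lfs64` is a hypothetical name, and it simply removes the last "64" found in the symbol name, whereas the real lookup failure path is presumably more selective about which names qualify.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not musl's code): copy `name` into `out` with the
 * last occurrence of "64" removed. Returns 1 if a "64" was removed, or 0
 * if there was none or the result would not fit in `outsz` bytes. */
int strip_lfs64(const char *name, char *out, size_t outsz)
{
	const char *p = NULL;
	size_t len = strlen(name);

	/* Remember the last position where "64" occurs. */
	for (const char *q = name; (q = strstr(q, "64")); q++)
		p = q;

	/* The stripped name needs len - 2 characters plus the NUL. */
	if (!p || len - 2 + 1 > outsz)
		return 0;

	size_t head = (size_t)(p - name);
	memcpy(out, name, head);
	/* Copy everything after "64", including the terminating NUL. */
	memcpy(out + head, p + 2, len - head - 2 + 1);
	return 1;
}
```

A caller would look up the original name first and only try the stripped name after that lookup fails, so `fstatat64` would be retried as `fstatat`.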
2022-08-24 | epoll_create: fail with EINVAL if size is non-positive | Kristina Martsenko | -0/+1
This is a part of the interface contract defined in the Linux man page (official for a Linux-specific interface) and asserted by test cases in the Linux Test Project (LTP).
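A minimal sketch of the check, assuming the wrapper is built on `epoll_create1`. The name `sketch_epoll_create` is hypothetical and this is not the committed code; it only shows the size validation happening before any kernel call.

```c
#include <errno.h>
#include <sys/epoll.h>

/* Hypothetical stand-in for the wrapper: reject a non-positive size
 * up front, as the Linux man page requires, then create the instance. */
int sketch_epoll_create(int size)
{
	if (size <= 0) {
		errno = EINVAL;
		return -1;
	}
	/* The size hint is otherwise unused by modern kernels. */
	return epoll_create1(0);
}
```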
2022-08-20 | use alt signal stack when present for implementation-internal signals | Rich Felker | -1/+1
a request for this behavior has been open for a long time. the motivation is that application code, particularly under some language runtimes designed around very-low-footprint coroutine type constructs, may be operating with extremely small stack sizes unsuitable for receiving signals, using a separate signal stack for any signals it might handle. progress on this was blocked at one point trying to determine whether the implementation is actually entitled to clobber the alt stack, but the phrasing "available to the implementation" in the POSIX spec for sigaltstack seems to make it clear that the application cannot rely on the contents of this memory to be preserved in the absence of signal delivery (on the abstract machine, excluding implementation-internal signals) and that we can therefore use it for delivery of signals that "don't exist" on the abstract machine. no change is made for SIGTIMER since it is always blocked when used, and accepted via sigwaitinfo rather than execution of the signal handler.
2021-04-03 | make epoll_[p]wait a cancellation point | Rich Felker | -2/+2
this is a Linux-specific function and not covered by POSIX's requirements for which interfaces are cancellation points, but glibc makes it one and existing software relies on it being one. at some point a review for similar functions that should be made cancellation points should be done.
2020-10-27 | fix setgroups behavior in multithreaded process | Rich Felker | -1/+29
this function is outside the scope of the standards, but logically should behave like the set*id functions whose effects are process-global.
2020-10-14 | remove unused weak definition of __tl_sync in membarrier.c | Rich Felker | -5/+0
2020-08-17 | add gettid function | Rich Felker | -0/+8
this is a prerequisite for addition of other interfaces that use kernel tids, including futex and SIGEV_THREAD_ID. there is some ambiguity as to whether the semantic return type should be int or pid_t. either way, futex API imposes a contract that the values fit in int (excluding some upper reserved bits). glibc used pid_t, so in the interest of not having gratuitous mismatch (the underlying types are the same anyway), pid_t is used here as well. while conceptually this is a syscall, the copy stored in the thread structure is always valid in all contexts where it's valid to call libc functions, so it's used to avoid the syscall.
2020-06-02 | reformat clock_adjtime with always-true condition removed | Rich Felker | -48/+46
2020-06-02 | always use time64 syscall first for clock_adjtime | Rich Felker | -2/+1
clock_adjtime always returns the current clock setting in struct timex, so it's always possible that the time64 version is needed.
2020-06-02 | fix broken time64 clock_adjtime | Rich Felker | -1/+1
the 64-bit time code path used the wrong (time32) syscall. fortunately this code path is not yet taken unless attempting to set a post-Y2038 time.
2019-10-20 | clock_adjtime: generalize time64 not to assume old struct layout match | Rich Felker | -11/+46
commit 2b4fd6f75b4fa66d28cddcf165ad48e8fda486d1 added time64 for this function, but did so with a hidden assumption that the new time64 version of struct timex will be layout-compatible with the old one. however, there is little benefit to doing it that way, and the cost is permanent special-casing of 32-bit archs with 64-bit time_t in the public interface definitions. instead, do a full translation of the structure going in and out. this commit is actually a revision to an earlier uncommitted version of the code.
2019-10-19 | wait4, getrusage: add time64/x32 variant | Rich Felker | -2/+32
presently the kernel does not actually define time64 versions of these syscalls, and they're not really needed except to represent extreme cpu time usage. however, x32's versions of the syscalls already behave as time64 ones, meaning the functions were broken on x32 if the caller used any part of the rusage result other than ru_utime and ru_stime. commit 7e8171143124f7f510db555dc6f6327a965a3e84 made it possible to fix this by treating x32's syscalls as time64 versions. in the non-time64-syscall case, make the syscall with the rusage destination pointer adjusted so that all members but the timevals line up between the libc and kernel structures. on 64-bit archs, or present 32-bit archs with 32-bit time_t, the timevals will line up too and no further work is needed. for future 32-bit archs with 64-bit time_t, the timevals are copied into place, contingent on time_t being larger than long.
2019-08-23 | add copy_file_range system call wrapper | Árni Dagur | -0/+8
2019-08-02 | clock_adjtime: add time64 support, decouple 32-bit time_t, fix x32 | Rich Felker | -0/+110
the 64-bit/time64 version of the syscall is not API-compatible with the userspace timex structure definition; fields specified as long have type long long. so when using the time64 syscall, we have to convert the entire structure. this was always the case for x32 as well, but went unnoticed, meaning that clock_adjtime just passed junk to the kernel on x32. it should be fixed now. for the fallback case, we avoid encoding any assumptions about the new location of the time member or naming of the legacy slots by accessing them through a union of the kernel type and the new userspace type. the only assumption is that the non-time members live at the same offsets as in the (non-time64, long-based) kernel timex struct. this property saves us from having to convert the whole thing, and avoids a lot of additional work in compat shims. the new code is statically unreachable for now except on x32, where it fixes major brokenness. it is permanently unreachable on 64-bit.
2019-07-29 | timerfd: add time64 syscall support, decouple 32-bit time_t | Rich Felker | -0/+42
the changes here are semantically and structurally identical to those made to timer_settime and timer_gettime for time64 support.
2019-07-28 | pselect, ppoll: add time64 syscall support, decouple 32-bit time_t | Rich Felker | -1/+17
time64 syscall is used only if it's the only one defined for the arch, or if the requested timeout length does not fit in 32 bits. on current 32-bit archs where time_t is a 32-bit type, this makes it statically unreachable. on 64-bit archs, there are only superficial changes to the code after preprocessing. both before and after these changes, these functions copied their timeout arguments to avoid letting the kernel clobber the caller's copies. now, the copying also serves to change the type from userspace timespec to a pair of longs, which makes a difference only in the 32-bit fallback case, not on 64-bit.
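The selection rule in the first sentence can be written as a small predicate. The sketch below is illustrative only: `fits_in_32`, `want_time64`, and the `have_time32` parameter are hypothetical names standing in for the preprocessor-level checks in the real code.

```c
#include <stdint.h>

/* Returns 1 if x is representable as a signed 32-bit integer. */
int fits_in_32(long long x)
{
	return x >= INT32_MIN && x <= INT32_MAX;
}

/* Hypothetical decision helper. have_time32 is 0 on archs that define
 * only the time64 syscall; otherwise the time64 variant is needed only
 * when the timeout seconds do not fit in 32 bits. */
int want_time64(long long tv_sec, int have_time32)
{
	return !have_time32 || !fits_in_32(tv_sec);
}
```

With a 32-bit time_t the second condition can never be true, which is why the commit calls the time64 path statically unreachable on such archs.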
2019-07-27 | implement settimeofday in terms of clock_settime, not old syscall | Rich Felker | -1/+6
this is yet another place where special handling of time syscalls can and should be avoided by implementing legacy functions in terms of their modern replacements. in theory a fallback to SYS_settimeofday could be added to clock_settime, but SYS_clock_settime has been available since Linux 2.6.0 or earlier, i.e. all the way back to the minimum supported version.
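A sketch of what implementing the legacy function on top of its modern replacement might look like. The names `timeval_to_timespec` and `sketch_settimeofday` are hypothetical, and the handling of a null pointer and of out-of-range microseconds is an assumption rather than something the commit message states.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sys/time.h>
#include <time.h>

/* Convert microsecond resolution to nanosecond resolution.
 * Returns 0 on success, or -1 with errno set to EINVAL if tv_usec
 * is outside [0, 999999]. */
int timeval_to_timespec(const struct timeval *tv, struct timespec *ts)
{
	if (tv->tv_usec < 0 || tv->tv_usec >= 1000000) {
		errno = EINVAL;
		return -1;
	}
	ts->tv_sec = tv->tv_sec;
	ts->tv_nsec = tv->tv_usec * 1000;
	return 0;
}

/* Hypothetical wrapper: no time-specific syscall is issued here, only
 * a call to the modern interface. Assumes a null tv is a no-op. */
int sketch_settimeofday(const struct timeval *tv)
{
	struct timespec ts;
	if (!tv)
		return 0;
	if (timeval_to_timespec(tv, &ts) < 0)
		return -1;
	return clock_settime(CLOCK_REALTIME, &ts);
}
```

Keeping the wrapper free of direct syscalls means any time64 handling lives in one place, inside clock_settime.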
2019-07-20 | refactor adjtime function using adjtimex function instead of syscall | Rich Felker | -1/+1
this removes the assumption that userspace struct timex matches the syscall type and sets the stage for 64-bit time_t on 32-bit archs.
2019-07-20 | refactor adjtimex in terms of clock_adjtime | Rich Felker | -2/+4
this sets the stage for having the conversion logic for 64-bit time_t all in one file, and as a bonus makes clock_adjtime for CLOCK_REALTIME work even on kernels too old to have the clock_adjtime syscall.
2019-06-28 | cap getdents length argument to INT_MAX | Rich Felker | -0/+2
the linux syscall treats this argument as having type int, so passing extremely long buffer sizes would be misinterpreted by the kernel. since "short reads" are always acceptable, just cap it down. patch based on report and suggested change by Florian Weimer.
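The cap itself is a one-line clamp. The helper name `cap_getdents_len` below is hypothetical; in the real function the clamped value would be passed straight to the syscall.

```c
#include <limits.h>
#include <stddef.h>

/* Clamp a buffer length so it survives conversion to the kernel's int
 * argument. A shorter read is always acceptable to the caller. */
size_t cap_getdents_len(size_t len)
{
	if (len > INT_MAX)
		len = INT_MAX;
	return len;
}
```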
2019-06-14 | add riscv64 architecture support | Rich Felker | -0/+33
Author: Alex Suykov <alex.suykov@gmail.com>
Author: Aric Belsito <lluixhi@gmail.com>
Author: Drew DeVault <sir@cmpwn.com>
Author: Michael Clark <mjc@sifive.com>
Author: Michael Forney <mforney@mforney.org>
Author: Stefan O'Rear <sorear2@gmail.com>
This port has involved the work of many people over several years. I have tried to ensure that everyone with substantial contributions has been credited above; if any omissions are found they will be noted later in an update to the authors/contributors list in the COPYRIGHT file.
The version committed here comes from the riscv/riscv-musl repo's commit 3fe7e2c75df78eef42dcdc352a55757729f451e2, with minor changes by me for issues found during final review:
- a_ll/a_sc atomics are removed (according to the ISA spec, lr/sc are not safe to use in separate inline asm fragments)
- a_cas[_p] is fixed to be a memory barrier
- the call from the _start assembly into the C part of crt1/ldso is changed to allow for the possibility that the linker does not place them nearby each other.
- DTP_OFFSET is defined correctly so that local-dynamic TLS works
- reloc.h LDSO_ARCH logic is simplified and made explicit.
- unused, non-functional crti/n asm files are removed.
- an empty .sdata section is added to crt1 so that the __global_pointer reference is resolvable.
- indentation style errors in some asm files are fixed.
2019-04-09 | in membarrier fallback, allow for possibility that sigaction fails | Rich Felker | -8/+9
this is a workaround to avoid a crashing regression on qemu-user when dynamic TLS is installed at dlopen time. the sigaction syscall should not be able to fail, but it does fail for implementation-internal signals under qemu user-level emulation if the host libc qemu is running under reserves the same signals for implementation-internal use, since qemu makes no provision to redirect/emulate them. after sigaction fails, the subsequent tkill would terminate the process abnormally as the default action. no provision to account for membarrier failing is made in the dynamic linker code that installs new TLS. at the formal level, the missing barrier in this case is incorrect, and perhaps we should fail the dlopen operation, but in practice all the archs we support (and probably all real-world archs except alpha, which isn't yet supported) should give the right behavior with no barrier at all as a consequence of consume-order properties. in the long term, this workaround should be supplemented or replaced by something better -- a different fallback approach to ensuring memory consistency, or dynamic allocation of implementation-internal signals. the latter is appealing in that it would allow cancellation to work under qemu-user too, and would even allow many levels of nested emulation.
2019-02-22 | add membarrier syscall wrapper, refactor dynamic tls install to use it | Rich Felker | -0/+76
the motivation for this change is twofold. first, it gets the fallback logic out of the dynamic linker, improving code readability and organization. second, it provides application code that wants to use the membarrier syscall, which depends on preregistration of intent before the process becomes multithreaded unless unbounded latency is acceptable, with a symbol that, when linked, ensures that this registration happens.
2018-09-12 | wireup linux/name_to_handle_at and name_to_handle_at syscalls | Khem Raj | -0/+18
2018-09-12 | remove spurious inclusion of libc.h for LFS64 ABI aliases | Rich Felker | -8/+4
the LFS64 macro was not self-documenting and barely saved any characters. simply use weak_alias directly so that it's clear what's being done, and doesn't depend on a header to provide a strange macro.
2018-09-12 | reduce spurious inclusion of libc.h | Rich Felker | -5/+2
libc.h was intended to be a header for access to global libc state and related interfaces, but ended up included all over the place because it was the way to get the weak_alias macro. most of the inclusions removed here are places where weak_alias was needed. a few were recently introduced for hidden. some go all the way back to when libc.h defined CANCELPT_BEGIN and _END, and all (wrongly implemented) cancellation points had to include it. remaining spurious users are mostly callers of the LOCK/UNLOCK macros and files that use the LFS64 macro to define the awful *64 aliases. in a few places, new inclusion of libc.h is added because several internal headers no longer implicitly include libc.h. declarations for __lockfile and __unlockfile are moved from libc.h to stdio_impl.h so that the latter does not need libc.h. putting them in libc.h made no sense at all, since the macros in stdio_impl.h are needed to use them correctly anyway.
2018-09-12 | remove unused __getdents, rename and move file | Rich Felker | -0/+9
the __-prefixed filename does not make sense when the only purpose of this file is implementing a public function that's not used as a backend for implementing the standard dirent functions.
2018-09-12 | overhaul internally-public declarations using wrapper headers | Rich Felker | -2/+0
commits leading up to this one have moved the vast majority of libc-internal interface declarations to appropriate internal headers, allowing them to be type-checked and setting the stage to limit their visibility. the ones that have not yet been moved are mostly namespace-protected aliases for standard/public interfaces, which exist to facilitate implementing plain C functions in terms of POSIX functionality, or C or POSIX functionality in terms of extensions that are not standardized. some don't quite fit this description, but are "internally public" interfaces between subsystems of libc. rather than create a number of newly-named headers to declare these functions, and having to add explicit include directives for them to every source file where they're needed, I have introduced a method of wrapping the corresponding public headers. parallel to the public headers in $(srcdir)/include, we now have wrappers in $(srcdir)/src/include that come earlier in the include path order. they include the public header they're wrapping, then add declarations for namespace-protected versions of the same interfaces and any "internally public" interfaces for the subsystem they correspond to. along these lines, the wrapper for features.h is now responsible for the definition of the hidden, weak, and weak_alias macros. this means source files will no longer need to include any special headers to access these features. over time, it is my expectation that the scope of what is "internally public" will expand, reducing the number of source files which need to include *_impl.h and related headers down to those which are actually implementing the corresponding subsystems, not just using them.
2018-09-12 | fix issues from public functions defined without declaration visible | Rich Felker | -0/+6
policy is that all public functions which have a public declaration should be defined in a context where that public declaration is visible, to avoid preventable type mismatches. an audit performed using GCC's -Wmissing-declarations turned up the violations corrected here. in some cases the public header had not been included; in others, a feature test macro needed to make the declaration visible had been omitted. in the case of gethostent and getnetent, the omission seems to have been intentional, as a hack to admit a single stub definition for both functions. this kind of hack is no longer acceptable; it's UB and would not fly with LTO or advanced toolchains. the hack is undone to make exposure of the declarations possible.
2018-06-20 | add memfd_create syscall wrapper | Szabolcs Nagy | -0/+8
memfd_create was added in linux v3.17 and glibc has api for it.
2018-06-20 | add mlock2 linux syscall wrapper | Szabolcs Nagy | -0/+10
mlock2 syscall was added in linux v4.4 and glibc has api for it. It falls back to mlock in case of flags==0, so that case works even on older kernels. MLOCK_ONFAULT is moved under _GNU_SOURCE following glibc.
2018-02-22 | add getrandom syscall wrapper | Hauke Mehrtens | -0/+7
This syscall is available since Linux 3.17 and was also implemented in glibc in version 2.25 using the same interfaces.
2017-07-04 | fix undefined behavior in ptrace | Alexander Monakov | -2/+6
2016-01-22 | move x32 sysinfo impl and syscall fixup code out of arch/x32/src | Rich Felker | -1/+50
all such arch-specific translation units are being moved to appropriate arch dirs under the main src tree.
2015-07-09 | fix incorrect void return type for syncfs function | Rich Felker | -2/+2
being nonstandard, the closest thing to a specification for this function is its man page, which documents it as returning int. it can fail with EBADF if the file descriptor passed is invalid.
2014-06-14 | fix missing argument to syscall in fanotify_mark | Clément Vasseur | -1/+1
2014-05-30 | fix breakage from recent syscall commits due to missing errno macros | Rich Felker | -0/+3
2014-05-30 | fix for broken kernel side RLIM_INFINITY on mips | Szabolcs Nagy | -1/+16
On 32-bit mips the kernel uses -1UL/2 to mark RLIM_INFINITY (and this is the definition in the userspace api), but since it is in the middle of the valid range of limits and limits are often compared with relational operators, various kernel side logic is broken if limits larger than -1UL/2 are used. So we truncate the limits to -1UL/2 in get/setrlimit and prlimit.
Even if the kernel side logic consistently treated -1UL/2 as greater than any other limit value, there wouldn't be any clean workaround that allowed using large limits:
* using -1UL/2 as RLIM_INFINITY in userspace would mean different infinity value for get/setrlimit and prlimit (where infinity is always -1ULL) and userspace logic could break easily (just like the kernel is broken now) and more special case code would be needed for mips.
* translating -1UL/2 kernel side value to -1ULL in userspace would mean that -1UL/2 limit cannot be set (eg. -1UL/2+1 had to be passed to the kernel instead).
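The truncation can be pictured as a simple clamp. This sketch follows the wording of the message rather than the committed code: `clamp_mips_rlim` and `MIPS32_KERNEL_RLIM_INFINITY` are hypothetical names, and the constant is just -1UL/2 evaluated for a 32-bit unsigned long.

```c
/* -1UL/2 when unsigned long is 32 bits wide: 0xffffffff / 2. */
#define MIPS32_KERNEL_RLIM_INFINITY 0x7fffffffULL

/* Hypothetical helper: limits at or above the kernel's infinity marker
 * are truncated to that marker, so no larger value is ever exchanged
 * with the kernel. */
unsigned long long clamp_mips_rlim(unsigned long long lim)
{
	if (lim >= MIPS32_KERNEL_RLIM_INFINITY)
		lim = MIPS32_KERNEL_RLIM_INFINITY;
	return lim;
}
```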
2014-05-29 | support linux kernel apis (new archs) with old syscalls removed | Rich Felker | -8/+29
such archs are expected to omit definitions of the SYS_* macros for syscalls their kernels lack from arch/$ARCH/bits/syscall.h. the preprocessor is then able to select an appropriate implementation for affected functions. two basic strategies are used on a case-by-case basis: where the old syscalls correspond to deprecated library-level functions, the deprecated functions have been converted to wrappers for the modern function, and the modern function has fallback code (omitted at the preprocessor level on new archs) to make use of the old syscalls if the new syscall fails with ENOSYS. this also improves functionality on older kernels and eliminates the incentive to program with deprecated library-level functions for the sake of compatibility with older kernels. in other situations where the old syscalls correspond to library-level functions which are not deprecated but merely lack some new features, such as the *at functions, the old syscalls are still used on archs which support them. this may change at some point in the future if or when fallback code is added to the new functions to make them usable (possibly with reduced functionality) on old kernels.
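The first strategy, a modern call with a preprocessor-guarded ENOSYS fallback, has roughly the following shape. Everything here is a stand-in: `HAVE_LEGACY_CALL` plays the role of an `#ifdef SYS_...` test, and `modern_call` and `legacy_call` are stubs that simulate an old kernel instead of issuing real syscalls.

```c
#include <errno.h>

/* Hypothetical: stands in for "the arch defines the old SYS_* macro". */
#define HAVE_LEGACY_CALL 1

/* Stub for the new syscall: pretend the running kernel lacks it. */
long modern_call(int x)
{
	(void)x;
	return -ENOSYS;
}

/* Stub for the old syscall, which this simulated kernel does have. */
long legacy_call(int x)
{
	return x + 1;
}

/* The modern function tries the new syscall first. The fallback is
 * compiled in only where the old syscall exists at all. */
long do_call(int x)
{
	long r = modern_call(x);
#ifdef HAVE_LEGACY_CALL
	if (r == -ENOSYS)
		r = legacy_call(x);
#endif
	return r;
}
```

On a new arch the guarded lines vanish entirely, so no reference to the missing syscall number remains in the object code.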
2014-04-15 | add namespace-protected name for sysinfo function | Rich Felker | -6/+5
it will be needed to implement some things in sysconf, and the syscall can't easily be used directly because the x32 syscall uses the wrong structure layout. the l (uncreative, for "linux") prefix is used since the symbol name __sysinfo is already taken for AT_SYSINFO from the aux vector. the way the x32 override of this function works is also changed to be simpler and avoid the useless jump instruction.
2014-03-06 | x32: fix sysinfo() | rofl0r | -0/+5
the kernel uses long longs in the struct, but the documentation says they're long. so we need to fixup the mismatch between the userspace and kernelspace structs. since the struct offers a mem_unit member, we can avoid truncation by adjusting that value.
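The mem_unit adjustment can be sketched portably as follows. The two reduced structs and the `scale_mem` function are hypothetical and cover only two memory fields; the real struct has more members, and the real code may compute the shift differently.

```c
#include <stdint.h>

/* Hypothetical reduced views: 64-bit fields on the kernel side,
 * 32-bit fields on the userspace side. */
struct kmem { uint64_t totalram, freeram; uint32_t mem_unit; };
struct umem { uint32_t totalram, freeram; uint32_t mem_unit; };

void scale_mem(const struct kmem *k, struct umem *u)
{
	/* The OR has the same highest set bit as the largest field. */
	uint64_t max = k->totalram | k->freeram;
	unsigned shift = 0;

	/* Count how many low bits must go so every field fits in 32 bits. */
	while ((max >> shift) > UINT32_MAX)
		shift++;

	u->totalram = (uint32_t)(k->totalram >> shift);
	u->freeram = (uint32_t)(k->freeram >> shift);
	/* Grow the unit to compensate, so field * mem_unit is preserved
	 * up to rounding down by the dropped bits. */
	u->mem_unit = (k->mem_unit ? k->mem_unit : 1) << shift;
}
```

The sketch does not guard against mem_unit itself overflowing for very large shifts.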
2014-02-09 | clone: make clone a wrapper around __clone | Bobby Bingham | -0/+19
The architecture-specific assembly versions of clone did not set errno on failure, which is inconsistent with glibc. __clone still returns the error via its return value, and clone is now a wrapper that sets errno as needed. The public clone has also been moved to src/linux, as it's not directly related to the pthreads API. __clone is called by pthread_create, which does not report errors via errno. Though not strictly necessary, it's nice to avoid clobbering errno here.
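The errno-setting step the wrapper adds can be sketched as a conversion from the raw kernel convention (a negative error code in the return value) to the libc convention (-1 plus errno). `sketch_syscall_ret` is a hypothetical name for that helper, and the 4095 bound is the usual Linux range for error returns.

```c
#include <errno.h>

/* Raw Linux syscall results in the range [-4095, -1] denote errors.
 * Convert such a result to -1 with errno set; pass anything else
 * through unchanged. */
long sketch_syscall_ret(unsigned long r)
{
	if (r > -4096UL) {
		errno = (int)-r;
		return -1;
	}
	return (long)r;
}
```

A public wrapper would pass the internal function's return value through this conversion, while an internal caller such as pthread_create can inspect the raw value directly and leave errno alone.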