path: root/src/linux
Age | Commit message | Author | Lines
2021-04-03make epoll_[p]wait a cancellation pointRich Felker-2/+2
this is a Linux-specific function and not covered by POSIX's requirements for which interfaces are cancellation points, but glibc makes it one and existing software relies on it being one. at some point a review for similar functions that should be made cancellation points should be done.
2020-10-27fix setgroups behavior in multithreaded processRich Felker-1/+29
this function is outside the scope of the standards, but logically should behave like the set*id functions whose effects are process-global.
2020-10-14remove unused weak definition of __tl_sync in membarrier.cRich Felker-5/+0
2020-08-17add gettid functionRich Felker-0/+8
this is a prerequisite for addition of other interfaces that use kernel tids, including futex and SIGEV_THREAD_ID. there is some ambiguity as to whether the semantic return type should be int or pid_t. either way, futex API imposes a contract that the values fit in int (excluding some upper reserved bits). glibc used pid_t, so in the interest of not having gratuitous mismatch (the underlying types are the same anyway), pid_t is used here as well. while conceptually this is a syscall, the copy stored in the thread structure is always valid in all contexts where it's valid to call libc functions, so it's used to avoid the syscall.
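As a rough illustration of the interface described above, the sketch below fetches the kernel tid with a raw syscall. The name `raw_gettid` is hypothetical; the libc function in this commit returns the copy cached in the thread structure instead of making the syscall.

```c
#include <assert.h>
#include <limits.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not the libc implementation): ask the kernel for
 * the calling thread's ID. */
pid_t raw_gettid(void)
{
	long tid = syscall(SYS_gettid);
	/* The futex API relies on tids fitting in int. */
	assert(tid > 0 && tid <= INT_MAX);
	return (pid_t)tid;
}
```

In the initial thread of a process the value equals `getpid()`; in other threads it differs.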
2020-06-02reformat clock_adjtime with always-true condition removedRich Felker-48/+46
2020-06-02always use time64 syscall first for clock_adjtimeRich Felker-2/+1
clock_adjtime always returns the current clock setting in struct timex, so it's always possible that the time64 version is needed.
2020-06-02fix broken time64 clock_adjtimeRich Felker-1/+1
the 64-bit time code path used the wrong (time32) syscall. fortunately this code path is not yet taken unless attempting to set a post-Y2038 time.
2019-10-20clock_adjtime: generalize time64 not to assume old struct layout matchRich Felker-11/+46
commit 2b4fd6f75b4fa66d28cddcf165ad48e8fda486d1 added time64 for this function, but did so with a hidden assumption that the new time64 version of struct timex will be layout-compatible with the old one. however, there is little benefit to doing it that way, and the cost is permanent special-casing of 32-bit archs with 64-bit time_t in the public interface definitions. instead, do a full translation of the structure going in and out. this commit is actually a revision to an earlier uncommitted version of the code.
2019-10-19wait4, getrusage: add time64/x32 variantRich Felker-2/+32
presently the kernel does not actually define time64 versions of these syscalls, and they're not really needed except to represent extreme cpu time usage. however, x32's versions of the syscalls already behave as time64 ones, meaning the functions were broken on x32 if the caller used any part of the rusage result other than ru_utime and ru_stime. commit 7e8171143124f7f510db555dc6f6327a965a3e84 made it possible to fix this by treating x32's syscalls as time64 versions. in the non-time64-syscall case, make the syscall with the rusage destination pointer adjusted so that all members but the timevals line up between the libc and kernel structures. on 64-bit archs, or present 32-bit archs with 32-bit time_t, the timevals will line up too and no further work is needed. for future 32-bit archs with 64-bit time_t, the timevals are copied into place, contingent on time_t being larger than long.
2019-08-23add copy_file_range system call wrapperÁrni Dagur-0/+8
2019-08-02clock_adjtime: add time64 support, decouple 32-bit time_t, fix x32Rich Felker-0/+110
the 64-bit/time64 version of the syscall is not API-compatible with the userspace timex structure definition; fields specified as long have type long long. so when using the time64 syscall, we have to convert the entire structure. this was always the case for x32 as well, but went unnoticed, meaning that clock_adjtime just passed junk to the kernel on x32. it should be fixed now. for the fallback case, we avoid encoding any assumptions about the new location of the time member or naming of the legacy slots by accessing them through a union of the kernel type and the new userspace type. the only assumption is that the non-time members live at the same offsets as in the (non-time64, long-based) kernel timex struct. this property saves us from having to convert the whole thing, and avoids a lot of additional work in compat shims. the new code is statically unreachable for now except on x32, where it fixes major brokenness. it is permanently unreachable on 64-bit.
2019-07-29timerfd: add time64 syscall support, decouple 32-bit time_tRich Felker-0/+42
the changes here are semantically and structurally identical to those made to timer_settime and timer_gettime for time64 support.
2019-07-28pselect, ppoll: add time64 syscall support, decouple 32-bit time_tRich Felker-1/+17
time64 syscall is used only if it's the only one defined for the arch, or if the requested timeout length does not fit in 32 bits. on current 32-bit archs where time_t is a 32-bit type, this makes it statically unreachable. on 64-bit archs, there are only superficial changes to the code after preprocessing. both before and after these changes, these functions copied their timeout arguments to avoid letting the kernel clobber the caller's copies. now, the copying also serves to change the type from userspace timespec to a pair of longs, which makes a difference only in the 32-bit fallback case, not on 64-bit.
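The selection rule and the copy step described above might be sketched as follows. The names `fits_in_time32`, `legacy_ts`, and `to_legacy_ts` are hypothetical and only illustrate the idea; they are not taken from the tree.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical predicate: nonzero if a seconds value is representable in
 * a signed 32-bit field, so the legacy syscall can carry it. */
int fits_in_time32(long long sec)
{
	return sec >= INT32_MIN && sec <= INT32_MAX;
}

/* Hypothetical stand-in for the kernel's legacy timeout: a pair of longs. */
struct legacy_ts { long sec; long nsec; };

/* Copy the caller's timeout into the legacy layout so the kernel never
 * writes to the caller's object. Returns 0 on success, or -1 when the
 * value does not fit and a time64 syscall would be needed instead. */
int to_legacy_ts(long long sec, long nsec, struct legacy_ts *out)
{
	assert(out);
	if (!fits_in_time32(sec)) return -1;
	out->sec = (long)sec;
	out->nsec = nsec;
	return 0;
}
```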
2019-07-27implement settimeofday in terms of clock_settime, not old syscallRich Felker-1/+6
this is yet another place where special handling of time syscalls can and should be avoided by implementing legacy functions in terms of their modern replacements. in theory a fallback to SYS_settimeofday could be added to clock_settime, but SYS_clock_settime has been available since Linux 2.6.0 or earlier, i.e. all the way back to the minimum supported version.
2019-07-20refactor adjtime function using adjtimex function instead of syscallRich Felker-1/+1
this removes the assumption that userspace struct timex matches the syscall type and sets the stage for 64-bit time_t on 32-bit archs.
2019-07-20refactor adjtimex in terms of clock_adjtimeRich Felker-2/+4
this sets the stage for having the conversion logic for 64-bit time_t all in one file, and as a bonus makes clock_adjtime for CLOCK_REALTIME work even on kernels too old to have the clock_adjtime syscall.
2019-06-28cap getdents length argument to INT_MAXRich Felker-0/+2
the linux syscall treats this argument as having type int, so passing extremely long buffer sizes would be misinterpreted by the kernel. since "short reads" are always acceptable, just cap it down. patch based on report and suggested change by Florian Weimer.
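The capping described above amounts to a one-line clamp. The helper name `cap_getdents_len` is hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: clamp a buffer length to what the kernel, which
 * reads the argument as int, can interpret correctly. A shorter read is
 * always acceptable to the caller. */
size_t cap_getdents_len(size_t len)
{
	if (len > INT_MAX) len = INT_MAX;
	return len;
}
```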
2019-06-14add riscv64 architecture supportRich Felker-0/+33
Author: Alex Suykov <>
Author: Aric Belsito <>
Author: Drew DeVault <>
Author: Michael Clark <>
Author: Michael Forney <>
Author: Stefan O'Rear <>
This port has involved the work of many people over several years. I have tried to ensure that everyone with substantial contributions has been credited above; if any omissions are found they will be noted later in an update to the authors/contributors list in the COPYRIGHT file.
The version committed here comes from the riscv/riscv-musl repo's commit 3fe7e2c75df78eef42dcdc352a55757729f451e2, with minor changes by me for issues found during final review:
- a_ll/a_sc atomics are removed (according to the ISA spec, lr/sc are not safe to use in separate inline asm fragments)
- a_cas[_p] is fixed to be a memory barrier
- the call from the _start assembly into the C part of crt1/ldso is changed to allow for the possibility that the linker does not place them nearby each other.
- DTP_OFFSET is defined correctly so that local-dynamic TLS works
- reloc.h LDSO_ARCH logic is simplified and made explicit.
- unused, non-functional crti/n asm files are removed.
- an empty .sdata section is added to crt1 so that the __global_pointer reference is resolvable.
- indentation style errors in some asm files are fixed.

2019-04-09in membarrier fallback, allow for possibility that sigaction failsRich Felker-8/+9
this is a workaround to avoid a crashing regression on qemu-user when dynamic TLS is installed at dlopen time. the sigaction syscall should not be able to fail, but it does fail for implementation-internal signals under qemu user-level emulation if the host libc qemu is running under reserves the same signals for implementation-internal use, since qemu makes no provision to redirect/emulate them. after sigaction fails, the subsequent tkill would terminate the process abnormally as the default action. no provision to account for membarrier failing is made in the dynamic linker code that installs new TLS. at the formal level, the missing barrier in this case is incorrect, and perhaps we should fail the dlopen operation, but in practice all the archs we support (and probably all real-world archs except alpha, which isn't yet supported) should give the right behavior with no barrier at all as a consequence of consume-order properties. in the long term, this workaround should be supplemented or replaced by something better -- a different fallback approach to ensuring memory consistency, or dynamic allocation of implementation-internal signals. the latter is appealing in that it would allow cancellation to work under qemu-user too, and would even allow many levels of nested emulation.
2019-02-22add membarrier syscall wrapper, refactor dynamic tls install to use itRich Felker-0/+76
the motivation for this change is twofold. first, it gets the fallback logic out of the dynamic linker, improving code readability and organization. second, it provides application code that wants to use the membarrier syscall, which depends on preregistration of intent before the process becomes multithreaded unless unbounded latency is acceptable, with a symbol that, when linked, ensures that this registration happens.
2018-09-12wire up linux/name_to_handle_at and open_by_handle_at syscallsKhem Raj-0/+18
2018-09-12remove spurious inclusion of libc.h for LFS64 ABI aliasesRich Felker-8/+4
the LFS64 macro was not self-documenting and barely saved any characters. simply use weak_alias directly so that it's clear what's being done, and doesn't depend on a header to provide a strange macro.
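For readers unfamiliar with the idiom, here is a sketch of using a weak alias directly. The macro body is the common GCC/Clang form for ELF targets, written out here as an assumption, and `demo_fallocate` is a hypothetical stand-in for a function that needs a *64 alias.

```c
#include <assert.h>

/* Common GCC/Clang idiom (assumed here): declare `new` as a weak symbol
 * that resolves to the same definition as `old`. */
#define weak_alias(old, new) \
	extern __typeof(old) new __attribute__((__weak__, __alias__(#old)))

/* Hypothetical function standing in for a real interface. */
int demo_fallocate(int fd, long long off, long long len)
{
	(void)fd;
	return off >= 0 && len > 0 ? 0 : -1;
}

/* The alias is stated at the definition site, with no helper macro. */
weak_alias(demo_fallocate, demo_fallocate64);
```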
2018-09-12reduce spurious inclusion of libc.hRich Felker-5/+2
libc.h was intended to be a header for access to global libc state and related interfaces, but ended up included all over the place because it was the way to get the weak_alias macro. most of the inclusions removed here are places where weak_alias was needed. a few were recently introduced for hidden. some go all the way back to when libc.h defined CANCELPT_BEGIN and _END, and all (wrongly implemented) cancellation points had to include it. remaining spurious users are mostly callers of the LOCK/UNLOCK macros and files that use the LFS64 macro to define the awful *64 aliases. in a few places, new inclusion of libc.h is added because several internal headers no longer implicitly include libc.h. declarations for __lockfile and __unlockfile are moved from libc.h to stdio_impl.h so that the latter does not need libc.h. putting them in libc.h made no sense at all, since the macros in stdio_impl.h are needed to use them correctly anyway.
2018-09-12remove unused __getdents, rename and move fileRich Felker-0/+9
the __-prefixed filename does not make sense when the only purpose of this file is implementing a public function that's not used as a backend for implementing the standard dirent functions.
2018-09-12overhaul internally-public declarations using wrapper headersRich Felker-2/+0
commits leading up to this one have moved the vast majority of libc-internal interface declarations to appropriate internal headers, allowing them to be type-checked and setting the stage to limit their visibility. the ones that have not yet been moved are mostly namespace-protected aliases for standard/public interfaces, which exist to facilitate implementing plain C functions in terms of POSIX functionality, or C or POSIX functionality in terms of extensions that are not standardized. some don't quite fit this description, but are "internally public" interfaces between subsystems of libc. rather than create a number of newly-named headers to declare these functions, and having to add explicit include directives for them to every source file where they're needed, I have introduced a method of wrapping the corresponding public headers. parallel to the public headers in $(srcdir)/include, we now have wrappers in $(srcdir)/src/include that come earlier in the include path order. they include the public header they're wrapping, then add declarations for namespace-protected versions of the same interfaces and any "internally public" interfaces for the subsystem they correspond to. along these lines, the wrapper for features.h is now responsible for the definition of the hidden, weak, and weak_alias macros. this means source files will no longer need to include any special headers to access these features. over time, it is my expectation that the scope of what is "internally public" will expand, reducing the number of source files which need to include *_impl.h and related headers down to those which are actually implementing the corresponding subsystems, not just using them.
2018-09-12fix issues from public functions defined without declaration visibleRich Felker-0/+6
policy is that all public functions which have a public declaration should be defined in a context where that public declaration is visible, to avoid preventable type mismatches. an audit performed using GCC's -Wmissing-declarations turned up the violations corrected here. in some cases the public header had not been included; in others, a feature test macro needed to make the declaration visible had been omitted. in the case of gethostent and getnetent, the omission seems to have been intentional, as a hack to admit a single stub definition for both functions. this kind of hack is no longer acceptable; it's UB and would not fly with LTO or advanced toolchains. the hack is undone to make exposure of the declarations possible.
2018-06-20add memfd_create syscall wrapperSzabolcs Nagy-0/+8
memfd_create was added in linux v3.17 and glibc has api for it.
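A small usage sketch, assuming a kernel with memfd_create (Linux 3.17 or later) and going through the raw syscall so that it does not depend on a particular libc version. `memfd_roundtrip` is a hypothetical name.

```c
#include <assert.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical demo: create an anonymous memory-backed file, write a
 * message to it, and read it back. Returns 0 on success, -1 on failure. */
int memfd_roundtrip(const char *msg)
{
	char buf[64];
	size_t n = strlen(msg);
	if (n > sizeof buf) return -1;

	int fd = (int)syscall(SYS_memfd_create, "demo", 0u);
	if (fd < 0) return -1;

	int ok = write(fd, msg, n) == (ssize_t)n
	      && pread(fd, buf, n, 0) == (ssize_t)n
	      && memcmp(buf, msg, n) == 0;
	close(fd);
	return ok ? 0 : -1;
}
```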
2018-06-20add mlock2 linux syscall wrapperSzabolcs Nagy-0/+10
mlock2 syscall was added in linux v4.4 and glibc has api for it. It falls back to mlock in case of flags==0, so that case works even on older kernels. MLOCK_ONFAULT is moved under _GNU_SOURCE following glibc.
2018-02-22add getrandom syscall wrapperHauke Mehrtens-0/+7
This syscall is available since Linux 3.17 and was also implemented in glibc in version 2.25 using the same interfaces.
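A usage sketch, assuming a kernel that provides the syscall. It calls the raw syscall so that it builds against older libc headers too; `fill_random` is a hypothetical helper name.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: fill buf with n random bytes, retrying on short
 * reads and EINTR. Returns 0 on success, -1 on any other error. */
int fill_random(void *buf, size_t n)
{
	unsigned char *p = buf;
	assert(buf || !n);
	while (n) {
		long r = syscall(SYS_getrandom, p, n, 0u);
		if (r < 0) {
			if (errno == EINTR) continue;
			return -1;
		}
		p += r;
		n -= (size_t)r;
	}
	return 0;
}
```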
2017-07-04fix undefined behavior in ptraceAlexander Monakov-2/+6
2016-01-22move x32 sysinfo impl and syscall fixup code out of arch/x32/srcRich Felker-1/+50
all such arch-specific translation units are being moved to appropriate arch dirs under the main src tree.
2015-07-09fix incorrect void return type for syncfs functionRich Felker-2/+2
being nonstandard, the closest thing to a specification for this function is its man page, which documents it as returning int. it can fail with EBADF if the file descriptor passed is invalid.
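The failure case mentioned above is easy to observe. The sketch below uses a hypothetical wrapper `my_syncfs` around the raw syscall, with the int return type the man page documents.

```c
#include <assert.h>
#include <errno.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical wrapper: returns 0 on success, or -1 with errno set,
 * for example EBADF when fd is not a valid descriptor. */
int my_syncfs(int fd)
{
	return (int)syscall(SYS_syncfs, fd);
}
```

With a void return type, a caller would have no way to see the EBADF failure.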
2014-06-14fix missing argument to syscall in fanotify_markClément Vasseur-1/+1
2014-05-30fix breakage from recent syscall commits due to missing errno macrosRich Felker-0/+3
2014-05-30fix for broken kernel side RLIM_INFINITY on mipsSzabolcs Nagy-1/+16
On 32 bit mips the kernel uses -1UL/2 to mark RLIM_INFINITY (and this is the definition in the userspace api), but since it is in the middle of the valid range of limits and limits are often compared with relational operators, various kernel side logic is broken if larger than -1UL/2 limits are used. So we truncate the limits to -1UL/2 in get/setrlimit and prlimit.
Even if the kernel side logic consistently treated -1UL/2 as greater than any other limit value, there wouldn't be any clean workaround that allowed using large limits:
* using -1UL/2 as RLIM_INFINITY in userspace would mean different infinity value for get/setrlimit and prlimit (where infinity is always -1ULL) and userspace logic could break easily (just like the kernel is broken now) and more special case code would be needed for mips.
* translating -1UL/2 kernel side value to -1ULL in userspace would mean that -1UL/2 limit cannot be set (eg. -1UL/2+1 had to be passed to the kernel instead).
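One way to picture the truncation is as a mapping in which every value at or above the kernel's marker is treated as infinite. This is a sketch under that assumption, not the code from the commit. The macro and function names are hypothetical, and the constants assume a 32-bit long on the kernel side.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values: -1UL/2 with a 32-bit long, and the userspace -1ULL. */
#define KERNEL_RLIM_INFINITY 0x7fffffffULL
#define USER_RLIM_INFINITY   (~0ULL)

/* Hypothetical fixup: collapse everything from the kernel's marker
 * upward into the userspace infinity value, so no limit above -1UL/2
 * is ever exchanged with the kernel as a finite number. */
uint64_t fix_rlimit(uint64_t x)
{
	return x >= KERNEL_RLIM_INFINITY ? USER_RLIM_INFINITY : x;
}
```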
2014-05-29support linux kernel apis (new archs) with old syscalls removedRich Felker-8/+29
such archs are expected to omit definitions of the SYS_* macros for syscalls their kernels lack from arch/$ARCH/bits/syscall.h. the preprocessor is then able to select an appropriate implementation for affected functions. two basic strategies are used on a case-by-case basis: where the old syscalls correspond to deprecated library-level functions, the deprecated functions have been converted to wrappers for the modern function, and the modern function has fallback code (omitted at the preprocessor level on new archs) to make use of the old syscalls if the new syscall fails with ENOSYS. this also improves functionality on older kernels and eliminates the incentive to program with deprecated library-level functions for the sake of compatibility with older kernels. in other situations where the old syscalls correspond to library-level functions which are not deprecated but merely lack some new features, such as the *at functions, the old syscalls are still used on archs which support them. this may change at some point in the future if or when fallback code is added to the new functions to make them usable (possibly with reduced functionality) on old kernels.
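The second strategy, where the preprocessor picks the old syscall when the arch defines it and the *at form otherwise, might look roughly like this. `my_mkdir` is a hypothetical name and the sketch goes through the generic `syscall` function rather than libc internals.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical wrapper: use the old syscall where the arch still has
 * it, and the newer *at syscall on archs that omit SYS_mkdir. */
int my_mkdir(const char *path, mode_t mode)
{
#ifdef SYS_mkdir
	return (int)syscall(SYS_mkdir, path, mode);
#else
	return (int)syscall(SYS_mkdirat, AT_FDCWD, path, mode);
#endif
}
```

Either branch gives callers the same behavior; only the syscall number compiled in differs per arch.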
2014-04-15add namespace-protected name for sysinfo functionRich Felker-6/+5
it will be needed to implement some things in sysconf, and the syscall can't easily be used directly because the x32 syscall uses the wrong structure layout. the l (uncreative, for "linux") prefix is used since the symbol name __sysinfo is already taken for AT_SYSINFO from the aux vector. the way the x32 override of this function works is also changed to be simpler and avoid the useless jump instruction.
2014-03-06x32: fix sysinfo()rofl0r-0/+5
the kernel uses long longs in the struct, but the documentation says they're long. so we need to fixup the mismatch between the userspace and kernelspace structs. since the struct offers a mem_unit member, we can avoid truncation by adjusting that value.
2014-02-09clone: make clone a wrapper around __cloneBobby Bingham-0/+19
The architecture-specific assembly versions of clone did not set errno on failure, which is inconsistent with glibc. __clone still returns the error via its return value, and clone is now a wrapper that sets errno as needed. The public clone has also been moved to src/linux, as it's not directly related to the pthreads API. __clone is called by pthread_create, which does not report errors via errno. Though not strictly necessary, it's nice to avoid clobbering errno here.
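The split between a raw function that returns the error in its return value and a public wrapper that sets errno can be sketched with a small conversion helper. `ret_to_errno` is a hypothetical name, and the 4096 bound is the conventional Linux range for error returns.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical helper: turn a kernel-style return value (a negative
 * errno code on failure) into the libc convention of returning -1 and
 * setting errno. Successful values pass through unchanged. */
long ret_to_errno(long r)
{
	if ((unsigned long)r > -4096UL) {
		errno = (int)-r;
		return -1;
	}
	return r;
}
```

An internal caller such as a thread-creation routine can use the raw value directly and leave errno untouched, while the public wrapper applies the conversion.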
2014-01-07fix const-correctness of argument to stimeRich Felker-1/+1
it's unclear what the historical signature for this function was, but semantically, the argument should be a pointer to const, and this is what glibc uses. correct programs should not be using this function anyway, so it's unlikely to matter.
2014-01-07fix signedness of pgoff argument to remap_file_pagesRich Felker-1/+1
both the kernel and glibc agree that this argument is unsigned; the incorrect type ssize_t came from erroneous man pages.
2014-01-07fix incorrect type for wd argument of inotify_rm_watchRich Felker-1/+1
this was wrong since the original commit adding inotify, and I don't see any explanation for it. not even the man pages have it wrong. it was most likely a copy-and-paste error.
2014-01-06add some missing LFS64 aliases for fadvise/fallocate functionsRich Felker-0/+4
2014-01-03fanotify.c: fix typo in header inclusionrofl0r-1/+1
the header is included only as a guard to check that the declaration and definition match, so the typo didn't cause any breakage aside from omitting this check.
2014-01-02disable the brk functionRich Felker-1/+2
the reasons are the same as for sbrk. unlike sbrk, there is no safe usage because brk does not return any useful information, so it should just fail unconditionally.
2014-01-02disable sbrk for all values of increment except 0Rich Felker-3/+3
use of sbrk is never safe; it conflicts with malloc, and malloc may be used internally by the implementation basically anywhere. prior to this change, applications attempting to use sbrk to do their own heap management simply caused untrackable memory corruption; now, they will fail with ENOMEM allowing the errors to be fixed. sbrk(0) is still permitted as a way to get the current brk; some misguided applications use this as a measurement of their memory usage or for other related purposes, and such usage is harmless. eventually sbrk may be re-added if/when malloc is changed to avoid using the brk by using mmap for all allocations.
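The behavior described above, together with the unconditional failure of brk from the neighboring entry, can be sketched as follows. `my_sbrk` and `my_brk` are hypothetical names and the sketch uses the generic `syscall` function rather than libc internals.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Sketch: any attempt to move the break fails with ENOMEM, while a
 * zero increment still reports the current break. */
void *my_sbrk(intptr_t inc)
{
	if (inc) {
		errno = ENOMEM;
		return (void *)-1;
	}
	/* The kernel's brk returns the current break when asked for 0. */
	return (void *)syscall(SYS_brk, 0L);
}

/* Sketch: brk returns no useful information, so it always fails. */
int my_brk(void *end)
{
	(void)end;
	errno = ENOMEM;
	return -1;
}
```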
2014-01-02add fanotify syscall wrapper and headerrofl0r-0/+14
2013-12-20add sys/quota.h and quotactl syscall wrapperRich Felker-0/+7
based on patch by Timo Teräs.
2013-12-12include cleanups: remove unused headers and add feature test macrosSzabolcs Nagy-10/+12
2013-05-26fix the prototype of settimeofday to follow the original BSD declarationSzabolcs Nagy-1/+2