summaryrefslogtreecommitdiff
path: root/src
AgeCommit message (Collapse)AuthorLines
2019-11-03fix time64 link regression of dlsym stub for static-linked programsRich Felker-0/+4
in commit 22daaea39f1cc5f7391f0a5cd84576ffb58c2860, the __dlsym_redir_time64 function providing the backend for __dlsym_time64 was defined only in the dynamic linker, and thus was undefined when static linking a program referencing dlsym. use the same stub_dlsym definition that provides __dlsym (the non-redirecting backend) for static linked programs to provide it, conditional on _REDIR_TIME64.
2019-11-02add __dlsym_time64 asm entry point for all legacy-32bit-time_t archsRich Felker-0/+27
2019-10-28make fstatat fill in old time32 stat fields tooRich Felker-0/+16
here _REDIR_TIME64 is used as an indication that there's an old ABI, and thereby the old time32 timespec fields of struct stat. keeping struct stat compatible and providing both versions of the timespec fields is done so that ftw/nftw does not need painful compat shims, and (more importantly) so that similar interfaces between pairs of libc consumers (applications/libraries) will be less likely to break when one has been rebuilt for time64 but the other has not.
2019-10-28disable lfs64 aliases for remapped time64 functionsRich Felker-0/+14
these functions cannot provide the glibc lfs64-ABI-compatible symbols when time_t differs from what it was in that ABI. instead, the aliases need to be provided by the time32 compat shims or through some other mechanism.
2019-10-25update case mappings to unicode 12.1.0Rich Felker-85/+92
2019-10-25update ctype data to unicode 12.1.0u_quark-201/+232
2019-10-25overhaul wide character case mapping implementationRich Felker-290/+345
the existing implementation of case mappings was very small (typically around 1.5k), but unmaintainable, requiring manual addition of new case mappings with each new edition of Unicode. often, it turned out that newly-added case mappings were not easily representable in the existing tightly-constrained table structures, requiring new hacks to be invented and delaying support for new characters. the new implementation added here follows the pattern used for character class membership, with a two-level table allowing Unicode blocks for which no data is needed to be elided. however, rather than single-bit data, each character maps to a one of up to 6 case-mapping rules available to its block, where 6 is floor(cbrt(256)) and allow 3 characters to be represented per byte (vs 8 with bit tables). blocks that would need more than 6 rules designate one as an exception and let lookup pass into a binary search of exceptional cases for the block. the number 6 was chosen empirically; many blocks would be ok with 4 rules (uncased, lower, upper, possible exceptions), some even just with 2, but the latter are rare and fitting 4 characters per byte rather than 3 does not save significant space. moreover, somewhat surprisingly, there are sufficiently many blocks where even 4 rules don't suffice without a lot of exceptions (blocks where some case pairs are laced, others offset) that originally I was looking at supporting variable-width tables, with 1-, 2-, or 3-bit entries, thereby allowing blocks with 8 rules. as implemented in my experiments, that version was significantly larger and involved more memory accesses/cache lines. improvements in size at the expense of some performance might be possible by utilizing iswalpha data or merging the table of case mapping identity with alphabetic identity. these were explored somewhat when the code was first written, and might be worth revisiting in the future.
2019-10-25add missing case mapping between U+03F3 and U+037FRich Felker-0/+1
somehow this seems to have been overlooked. add it now so that subsequent overhaul of case mapping implementation will not introduce a functional change at the same time.
2019-10-24fix errno for posix_openpt with no free ptys availableRich Felker-1/+3
linux fails the open with ENOSPC, but POSIX mandates EAGAIN.
2019-10-20clock_adjtime: generalize time64 not to assume old struct layout matchRich Felker-11/+46
commit 2b4fd6f75b4fa66d28cddcf165ad48e8fda486d1 added time64 for this function, but did so with a hidden assumption that the new time64 version of struct timex will be layout-compatible with the old one. however, there is little benefit to doing it that way, and the cost is permanent special-casing of 32-bit archs with 64-bit time_t in the public interface definitions. instead, do a full translation of the structure going in and out. this commit is actually a revision to an earlier uncommited version of the code.
2019-10-19wait4, getrusage: add time64/x32 variantRich Felker-3/+61
presently the kernel does not actually define time64 versions of these syscalls, and they're not really needed except to represent extreme cpu time usage. however, x32's versions of the syscalls already behave as time64 ones, meaning the functions were broken on x32 if the caller used any part of the rusage result other than ru_utime and ru_stime. commit 7e8171143124f7f510db555dc6f6327a965a3e84 made it possible to fix this by treating x32's syscalls as time64 versions. in the non-time64-syscall case, make the syscall with the rusage destination pointer adjusted so that all members but the timevals line up between the libc and kernel structures. on 64-bit archs, or present 32-bit archs with 32-bit time_t, the timevals will line up too and no further work is needed. for future 32-bit archs with 64-bit time_t, the timevals are copied into place, contingent on time_t being larger than long.
2019-10-18fix return value of ungetc when argument is outside unsigned char rangeRich Felker-1/+1
aside from the special value EOF, ungetc is specified to accept and convert values outside the range of unsigned char. conversion takes place automatically as part of assignment when storing into the buffer, but the return value is also required to be the resulting converted value, and this requirement was not satisfied. simplified from patch by Wang Jianjian.
2019-10-18fix incorrect use of fabs on long double operand in floatscan.cRich Felker-4/+1
based on patch by Dan Gohman, who caught this via compiler warnings. analysis by Szabolcs Nagy determined that it's a bug, whereby errno can be set incorrectly for values where the coercion from long double to double causes rounding. it seems likely that floating point status flags may be set incorrectly as a result too. at the same time, clean up use of preprocessor concatenation involving LDBL_MANT_DIG, which spuriously depends on it being a single unadorned decimal integer literal, and instead use the equivalent formulation 2/LDBL_EPSILON. an equivalent change on the printf side was made in commit bff6095d915f3e41206e47ea2a570ecb937ef926.
2019-10-14mips: add single-instruction math functionsinfo@mobile-stream.com-0/+64
SQRT.fmt exists on MIPS II+ (float), MIPS III+ (double). ABS.fmt exists on MIPS I+ but only cores with ABS2008 flag in FCSR implement the required behaviour.
2019-10-14fix cacosh results for arguments with negative imaginary partMichael Morrell-3/+12
2019-10-13math: fix signed int left shift ub in sqrtSzabolcs Nagy-4/+2
Both sqrt and sqrtf shifted the signed exponent as signed int to adjust the bit representation of the result. There are signed right shifts too in the code but those are implementation defined and are expected to compile to arithmetic shift on supported compilers and targets.
2019-10-13fix aliasing-based undefined behavior in mbsrtowcsRich Felker-2/+8
mbsrtowcs contains "vectorized" loops to quickly step over bytes without the high bit set; these have undefined behavior by virtue of aliasing uint32_t over top of char data for the accesses. commit 4d0a82170a25464c39522d7190b9fe302045ddb2 fixed the corresponding usage in string functions by using the may_alias attribute conditional on __GNUC__ and disabled the vectorized code in its absence. do the same for mbsrtowcs.
2019-09-29remove remaining traces of __tls_get_newSzabolcs Nagy-12/+1
Some declarations of __tls_get_new were left in the code, even though the definition got removed in commit 9d44b6460ab603487dab4d916342d9ba4467e6b9 install dynamic tls synchronously at dlopen, streamline access this can make the build fail with ld: lib/libc.so: hidden symbol `__tls_get_new' isn't defined when libc.so is linked without --gc-sections, because a .hidden declaration in asm code creates a reference even if the symbol is not actually used.
2019-09-27math: optimize lrint on 32bit targetsSzabolcs Nagy-1/+27
lrint in (LONG_MAX, 1/DBL_EPSILON) and in (-1/DBL_EPSILON, LONG_MIN) is not trivial: rounding to int may be inexact, but the conversion to int may overflow and then the inexact flag must not be raised. (the overflow threshold is rounding mode dependent). this matters on 32bit targets (without single instruction lrint or rint), so the common case (when there is no overflow) is optimized by inlining the lrint logic, otherwise the old code is kept as a fallback. on my laptop an i486 lrint call is asm:10ns, old c:30ns, new c:21ns on a smaller arm core: old c:71ns, new c:34ns on a bigger arm core: old c:27ns, new c:19ns
2019-09-26fix mips setjmp/longjmp fpu state on r6, related issuesRich Felker-24/+12
mips32 has two fpu register file variants: FR=0 with 32 32-bit registers, where pairs of neighboring even/odd registers are used to represent doubles, and FR=1 with 32 64-bit registers, each of which can store a single or double. up through r5 (our "mips" arch), the supported ABI uses FR=0, but modern compilers generate "fpxx" model code that can safely operate with either model. r6, which is an incompatible but similar ISA, drops FR=0 and only provides the FR=1 model. as such, setjmp and longjmp, which depended on being able to save and restore call-saved doubles by storing and loading their 32-bit halves, were completely broken in the presence of floating point code on mips r6. to fix this, use the s.d and l.d mnemonics to store and load fpu registers. these expand to the existing swc1 and lwc1 instructions for pairs of 32-bit fpu registers on mips1, but on mips2 and later they translate directly to the 64-bit sdc1 and ldc1. with FR=0, sdc1 and ldc1 behave just like the pairs of swc1 and lwc1 instructions they replace, storing or loading the even/odd pair of fpu registers that can be treated as separate single-precision floats or as a unit representing a double. but with FR=1, they store/load individual 64-bit registers. this yields the ABI-correct behavior on mips r6, and should make linking of pre-r6 (plain "mips") code with "fp64" model code workable, although this is and will likely remain unsupported usage. in addition to the mips r6 problem this change fixes, reportedly clang's internal assembler refuses to assemble swc1 and lwc1 instructions for odd register indices when building for "fpxx" model (the default). this caused setjmp and longjmp not to build. by using the s.d and l.d forms, this problem is avoided too. as a bonus, code size is reduced everywhere but mips1.
2019-09-26arm: fix setjmp and longjmp asm for armv8-aSzabolcs Nagy-0/+14
armv8 removed the coprocessor instructions other than cp14, so on an armv8 system the related hwcaps should never be set. new llvm complains about the use of coprocessor instructions in armv8-a mode (even though they are never executed at runtime), so ifdef them out when musl is built for armv8.
2019-09-25fix data race in timer_create with SIGEV_THREAD notificationRich Felker-2/+2
in the timer thread start function, self->timer_id was accessed without synchronization; the timer thread could fail to see the store from the calling thread, resulting in timer_delete failing to delete the correct kernel-level timer. this fix is based on a patch by changdiankang, but with the load moved to after receiving the timer_delete signal rather than just after the start barrier, so as not to retain the possibility of data race with timer_delete.
2019-09-13harden thread start with failed scheduling against broken __cloneRich Felker-1/+1
commit 8a544ee3a2a75af278145b09531177cab4939b41 introduced a dependency of the failure path for explicit scheduling at thread creation on __clone's handling of the start function returning, which should result in SYS_exit. as noted in commit 05870abeaac0588fb9115cfd11f96880a0af2108, the arm version of __clone was broken in this case. in the past, the mips version was also broken; it was fixed in commit 8b2b61e0001281be0dcd3dedc899bf187172fecb. since this code path is pretty much entirely untested (previously only reachable in applications that call the public clone() and return from the start function) and consists of fragile per-arch asm, don't assume it works, at least not until it's been thoroughly tested. instead make the SYS_exit syscall from the start function's failure path.
2019-09-13fix %lf in wprintfBrion Vibber-0/+2
commit cc3a4466605fe8dfc31f3b75779110ac93055bc1 fixed this for printf but neglected to fix wprintf. Previously, %lf caused a failure to output.
2019-09-11fix arm __tlsdesc_dynamic when built as thumb code without __ARM_ARCH>=5Rich Felker-0/+4
we don't actually support building asm source files as thumb1, but it's possible that the condition __ARM_ARCH>=5 would be false on old compilers that did not define __ARM_ARCH at all. avoiding that would require enumerating all of the possible __ARM_ARCH_*__ macros for testing. as noted in commit 05870abeaac0588fb9115cfd11f96880a0af2108, mov lr,pc is not valid for saving a return address when in thumb mode. since this code is a hot path (dynamic TLS access), don't do the out-of-line bl->bx chaining to save the return value; instead, use the fact that this file is preprocessed asm to add the missing thumb bit with an add in place of the mov. the change here does not affect builds for ISA levels new enough to have a thread pointer read instruction, or for armv5 and later as long as the compiler properly defines __ARM_ARCH, or for any build as arm (not thumb) code. it's likely that it makes no difference whatsoever to any present-day practical build environments, but nonetheless now it's safe. as an alternative, we could just assume __thumb__ implies availability of blx since we don't support building asm source files as thumb1. I didn't do that in order to avoid having a wrong assumption here if that ever changes.
2019-09-11fix arm __a_barrier_oldkuser when built as thumbRich Felker-2/+2
as noted in commit 05870abeaac0588fb9115cfd11f96880a0af2108, mov lr,pc is not a valid method for saving the return address in code that might be built as thumb. this one is unlikely to matter, since any ISA level that has thumb2 should also have native implementations of atomics that don't involve kuser_helper, and the affected code is only used on very old kernels to begin with.
2019-09-11fix code path where child function returns in arm __clone built as thumbRich Felker-7/+3
mov lr,pc is not a valid way to save the return address in thumb mode since it omits the thumb bit. use a chain of bl and bx to emulate blx. this could be avoided by converting to a .S file with preprocessor conditions to use blx if available, but the time cost here is dominated by the syscall anyway. while making this change, also remove the remnants of support for pre-bx ISA levels. commit 9f290a49bf9ee247d540d3c83875288a7991699c removed the hack from the parent code paths, but left the unnecessary code in the child. keeping it would require rewriting two code paths rather than one, and is useless for reasons described in that commit.
2019-09-06synchronously clean up pthread_create failure due to scheduling errorsRich Felker-13/+18
previously, when pthread_create failed due to inability to set explicit scheduling according to the requested attributes, the nascent thread was detached and made responsible for its own cleanup via the standard pthread_exit code path. this left it consuming resources potentially well after pthread_create returned, in a way that the application could not see or mitigate, and unnecessarily exposed its existence to the rest of the implementation via the global thread list. instead, attempt explicit scheduling early and reuse the failure path for __clone failure if it fails. the nascent thread's exit futex is not needed for unlocking the thread list, since the thread calling pthread_create holds the thread list lock the whole time, so it can be repurposed to ensure the thread has finished exiting. no pthread_exit is needed, and freeing the stack, if needed, can happen just as it would if __clone failed.
2019-09-06set explicit scheduling for new thread from calling thread, not selfRich Felker-21/+12
if setting scheduling properties succeeds, the new thread may end up with lower priority than the caller, and may be unable to continue running due to another intermediate-priority thread. this produces a priority inversion situation for the thread calling pthread_create, since it cannot return until the new thread reports success. originally, the parent was responsible for setting the new thread's priority; commits b8742f32602add243ee2ce74d804015463726899 and 40bae2d32fd6f3ffea437fa745ad38a1fe77b27e changed it as part of trimming down the pthread structure. since then, commit 04335d9260c076cf4d9264bd93dd3b06c237a639 partly reversed the changes, but did not switch responsibilities back. do that now.
2019-09-06fix unsynchronized decrement of thread count on pthread_create errorRich Felker-1/+2
commit 8f11e6127fe93093f81a52b15bb1537edc3fc8af wrongly documented that all changes to libc.threads_minus_1 were guarded by the thread list lock, but the decrement for failed SYS_clone took place after the thread list lock was released.
2019-08-30add public declaration for optreset under appropriate feature profilesRich Felker-0/+1
commit 030e52639248ac8417a4934298caa78c21a228d1 added optreset, a BSD extension to getopt duplicating the functionality (also an extension) of setting optind to 0, but failed to provide a public declaration for it. according to the BSD documentation and headers, the application is not supposed to need to provide its own declaration.
2019-08-30add posix_spawn [f]chdir file actionsRich Felker-0/+45
these are presently extensions, thus named with _np to match glibc and other implementations that provide them; however they are likely to be standardized in the future without the _np suffix as a result of Austin Group issue 1208. if so, both names will be kept as aliases.
2019-08-23add copy_file_range system call wrapperÁrni Dagur-0/+8
2019-08-17fix external dummy_lock symbol inadvertently introduced in sigactionRich Felker-1/+1
commit 9b14ad541068d4f7d0be9bcd1ff4c70090d868d3 introduced this namespace violation.
2019-08-13fix accidentlly-external cmp symbol introduced with catgetsRich Felker-1/+1
commit 7590203c486d9002522019045d34ee3dee0a66f5 omitted static here.
2019-08-11add support for powerpc/powerpc64 unaligned relocationsSamuel Holland-0/+1
R_PPC_UADDR32 (R_PPC64_UADDR64) has the same meaning as R_PPC_ADDR32 (R_PPC64_ADDR64), except that its address need not be aligned. For powerpc64, BFD ld(1) will automatically convert between ADDR<->UADDR relocations when the address is/isn't at its native alignment. This will happen if, for example, there is a pointer in a packed struct. gold and lld do not currently generate R_PPC64_UADDR64, but pass through misaligned R_PPC64_ADDR64 relocations from object files, possibly relaxing them to misaligned R_PPC64_RELATIVE. In both cases (relaxed or not) this violates the PSABI, which defines the relevant field type as "a 64-bit field occupying 8 bytes, the alignment of which is 8 bytes unless otherwise specified." All three linkers violate the PSABI on 32-bit powerpc, where the only difference is that the field is 32 bits wide, aligned to 4 bytes. Currently musl fails to load executables linked by BFD ld containing R_PPC64_UADDR64, with the error "unsupported relocation type 43". This change provides compatibility with BFD ld on powerpc64, and any static linker on either architecture that starts following the PSABI more closely.
2019-08-08add secure_getenv functionPetr Vaněk-0/+8
This function is a GNU extension introduced in glibc 2.17.
2019-08-07in clock_getres, check for null pointer before storing resultRich Felker-1/+1
POSIX allows a null pointer, in which case the function only checks the validity of the clock id argument.
2019-08-07remove spurious null check in clock_settimeRich Felker-1/+1
at the point of this check, the pointer has already been dereferenced. clock_settime is not defined for null pointer arguments.
2019-08-07fix regression in recvmmsg with no timeoutRich Felker-1/+1
somewhat analogous to commit d0b547dfb5f7678cab6bc39dd736ed6454357ca4, but here the omission of the null timeout check was in the time64 syscall code path. this code is not yet used except on x32.
2019-08-07add non-stub implementation of catgets localization functionsRich Felker-3/+114
these accept the netbsd/openbsd message catalog file format, consisting of a sorted list of set headers and a sorted list of message headers for each set, admitting trivial binary search for lookups. the gnu format was not chosen because it's unusably bad. it does not admit efficient (log time or better) lookups; rather, it requires linear search or hash table lookups, and the hash function is awful: it's literally set_id*msg_id.
2019-08-07fix regression in select with no timeoutRich Felker-1/+2
commit 722a1ae3351a03ab25010dbebd492eced664853b inadvertently passed a copy of {s,us} to the syscall even if the timeout argument tv was null, thereby causing immediate timeout (polling) in place of unlimited timeout. only archs using SYS_select were affected.
2019-08-07fix failure of glob to match broken symlinks under some conditionsRich Felker-5/+12
when the pattern ended with one or more literal path components, or when the GLOB_MARK flag was passed to request that glob flag directory results and the type obtained by readdir was unknown or inconclusive (symlink), the stat function was called to evaluate existence and/or determine type. however, stat fails with ENOENT for broken symlinks, and this caused the match to be omitted from the results. instead, use stat only for the unknown/inconclusive cases with GLOB_MARK, and otherwise, or if stat fails, use lstat existence still needs to be determined. this minimizes the number of costly syscalls, performing both only in the case where GLOB_MARK is in use and there is a final literal path component which is a broken symlink. based on/simplified from patch by James Y Knight.
2019-08-06in arm cancellation point asm, don't unnecessarily preserve link registerPatrick Oppenlander-4/+4
The only reason we needed to preserve the link register was because we were using a branch-link instruction to branch to __cp_cancel. Replacing this with a branch means we can avoid the save/restore as the link register is no longer modified.
2019-08-06glob: implement GLOB_TILDE and GLOB_TILDE_CHECKIsmael Luceno-1/+41
2019-08-05use setitimer function rather than syscall to implement alarmRich Felker-3/+3
otherwise alarm will break on 32-bit archs when time_t is changed to 64-bit. a second itimerval object is introduced for retrieving the old value, since the setitimer function has restrict-qualified arguments.
2019-08-05fix build regression in i386 asm for atan2, atan2fRich Felker-2/+2
commit f3ed8bfe8a82af1870ddc8696ed4cc1d5aa6b441 inadvertently removed labels that were still needed.
2019-08-05fix x87 stack imbalance in corner cases of i386 math asmRich Felker-44/+14
commit 31c5fb80b9eae86f801be4f46025bc6532a554c5 introduced underflow code paths for the i386 math asm, along with checks on the fpu status word to skip the underflow-generation instructions if the underflow flag was already raised. unfortunately, at least one such path, in log1p, returned with 2 items on the x87 stack rather than just 1 item for the return value. this is a violation of the ABI's calling convention, and could cause subsequent floating point code to produce NANs due to x87 stack overflow. if floating point results are used in flow control, this can lead to runaway wrong code execution. rather than reviewing each "underflow already raised" code path for correctness, remove them all. they're likely slower than just performing the underflow code unconditionally, and significantly more complex. all of this code should be ripped out and replaced by C source files with inline asm. doing so would preclude this kind of error by having the compiler perform all x87 stack register allocation and stack manipulation, and would produce comparable or better code. however such a change is a much larger project.
2019-08-05fix regression in clock_gettime on 32-bit archs without vdsoRich Felker-0/+1
commit 72f50245d018af0c31b38dec83c557a4e5dd1ea8 broke this by creating a code path where r is uninitialized.
2019-08-02clock_gettime: add support for 32-bit vdso with 64-bit time_tRich Felker-0/+32
this fixes a major upcoming performance regression introduced by commit 72f50245d018af0c31b38dec83c557a4e5dd1ea8, whereby 32-bit archs would lose vdso clock_gettime after switching to 64-bit time_t, unless the kernel supports time64 and provides a time64 version of the vdso function. this would incur not just one but two syscalls: first, the failed time64 syscall, then the fallback time32 one. overflow of the 32-bit result is detected and triggers a revert to syscalls. normally, on a system that's not Y2038-ready, this would still overflow, but if the process has been migrated to a time64-capable kernel or if the kernel has been hot-patched to add time64 syscalls, it may conceivably work.