| Age | Commit message | Author | Lines |
|
|
|
on some kernel builds, the vdso exports symbols with DT_GNU_HASH but
omits DT_HASH, causing the vdso not to get used and clock_gettime
to fall back to using a syscall.
our vdso resolver does not use the hash table anyway, but does need
the symbol count from the standard sysv hash table. if it's missing,
use the GNU hashtable to calculate the number of symbols.
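as an illustration of the fallback described above, here is a minimal sketch of deriving the symbol count from a DT_GNU_HASH table. the function name is hypothetical and the code assumes the table uses the native ELF class (bloom words of sizeof(size_t) bytes); it is not necessarily how the resolver itself is written.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: count dynamic symbols from a DT_GNU_HASH table.
 * Layout: header {nbuckets, symoffset, bloom_size, bloom_shift},
 * then bloom_size words of sizeof(size_t) bytes each, then nbuckets
 * 32-bit bucket entries, then the 32-bit chain array. */
static size_t gnu_hash_count_syms(const uint32_t *gh)
{
	uint32_t nbuckets = gh[0], symoffset = gh[1], bloom_size = gh[2];
	const uint32_t *buckets = gh + 4 + bloom_size * (sizeof(size_t) / 4);
	const uint32_t *chain = buckets + nbuckets;
	size_t nsym = 0;

	/* The largest bucket value is the first symbol of the last chain. */
	for (uint32_t i = 0; i < nbuckets; i++)
		if (buckets[i] > nsym) nsym = buckets[i];

	/* No hashed symbols: only the unhashed ones below symoffset exist. */
	if (nsym < symoffset) return symoffset;

	/* Walk that chain; the low bit of an entry marks the end of it. */
	const uint32_t *h = chain + (nsym - symoffset);
	do nsym++;
	while (!(*h++ & 1));
	return nsym;
}
```

the result is one past the index of the last hashed symbol, which is the same quantity the sysv table stores as nchain.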
|
|
this corrects missing validation when using alternate group database
backends via nscd, as reported by 0rbitingZer0, which could result in
a heap-based buffer overflow.
while the source of truth for user (passwd) and group definitions is
generally an equal or higher-privilege domain than the application,
and compromise of nscd could inherently lead to bypass of some access
controls, it is still worthwhile to harden against direct attacks from
a compromised nscd.
this patch adds validation in the least invasive way possible,
erroring out at the point where a write past the end of the buffer
would previously have occurred.
a check is also added for member counts that would cause arithmetic
overflow in the existing buffer size computations, including negative
counts. this could be handled better by making adjustments where the
arithmetic is performed, but the way it's done here avoids making any
changes except for the actual bounds check.
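the kind of member-count check described above can be sketched as follows. the helper name, its signature, and the buffer layout (a NULL-terminated pointer array followed by string data) are assumptions for illustration only, not the code from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: compute the bytes needed for a NULL-terminated
 * array of `count` member pointers plus `strbytes` of string data.
 * Returns 0 on success, or -1 if the count is negative or the
 * arithmetic would overflow. */
static int member_buf_size(int32_t count, size_t strbytes, size_t *out)
{
	if (count < 0) return -1;
	size_t n = (size_t)count;

	/* (n + 1) * sizeof(char *) must not wrap. */
	if (n > SIZE_MAX / sizeof(char *) - 1) return -1;
	size_t ptrbytes = (n + 1) * sizeof(char *);

	/* ptrbytes + strbytes must not wrap either. */
	if (strbytes > SIZE_MAX - ptrbytes) return -1;

	*out = ptrbytes + strbytes;
	return 0;
}
```

rejecting the count before any multiplication keeps the check independent of how the size is later used.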
|
|
Add the MADV_COLLAPSE madvise flag, which performs a best-effort
synchronous collapse of the native pages mapped by the memory range
into Transparent Huge Pages (THPs).
see
linux commit 7d8faaf155454f8798ec56404faca29a82689c77
mm/madvise: introduce MADV_COLLAPSE sync hugepage collapse
|
|
this flag works like MADV_DONTNEED but also applies to locked memory
ranges.
see
linux commit 9457056ac426e5ed0671356509c8dcce69f8dee0
mm: madvise: MADV_DONTNEED_LOCKED
|
|
Add madvise flags to populate (prefault) page tables
see
linux commit 4ca9b3859dac14bbef0c27d00667bb5b10917adb
mm/madvise: introduce MADV_POPULATE_(READ|WRITE) to prefault page tables
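A minimal usage sketch for one of these flags follows. The helper name is hypothetical, the fallback definition assumes the Linux uapi value 23 for MADV_POPULATE_WRITE, and the page-touching fallback for kernels that reject the advice is an illustrative choice rather than anything mandated by the interface.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Fallback for older headers; assumed to match the Linux uapi value. */
#ifndef MADV_POPULATE_WRITE
#define MADV_POPULATE_WRITE 23
#endif

/* Hypothetical helper: prefault a writable mapping. Tries
 * MADV_POPULATE_WRITE and, if the kernel rejects the advice with
 * EINVAL, touches one byte per page instead.
 * Returns 0 on success, -1 on any other error. */
static int prefault_writable(void *addr, size_t len)
{
	if (madvise(addr, len, MADV_POPULATE_WRITE) == 0) return 0;
	if (errno != EINVAL) return -1;

	long pagesz = sysconf(_SC_PAGESIZE);
	if (pagesz <= 0) return -1;

	volatile unsigned char *p = addr;
	for (size_t off = 0; off < len; off += (size_t)pagesz)
		p[off] = p[off];   /* write fault without changing contents */
	return 0;
}
```

The caller is expected to pass a page-aligned, writable mapping, for example one obtained from mmap.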
|
|
The signal stack extension field of loongarch64 is mutable, and the
types are distinguished according to some magic.
|
|
These new LoongArch reloc types (101 to 126) were added in LoongArch
psABI v2.30; NT_LOONGARCH_HW_BREAK and NT_LOONGARCH_HW_WATCH are
synced with the Linux 6.12 elf.h.
|
|
analogous to fenv-sf.c for all other archs with softfloat variants.
|
|
Add two missing syscalls from v5.14 and new syscalls from v6.4 .. v6.19
add __NR_quotactl_fd from linux v5.14 see
linux commit 9dfa23c8de925041b7b45637a1a80a98a22f19dd
quota: Add mountpath based quota support
linux commit fa8b90070a80bb1a3042b4b25af4b3ee2c4c27e1
quota: wire up quotactl_path
linux commit 64c2c2c62f92339b176ea24403d8db16db36f9e6
quota: Change quotactl_path() systcall to an fd-based one
add __NR_memfd_secret from linux v5.14 see
linux commit 7bb7f2ac24a028b20fca466b9633847b289b156a
arch, mm: wire up memfd_secret system call where relevant
linux commit 1507f51255c9ff07d75909a84e7c0d7f3c4b2f49
mm: introduce memfd_secret system call to create "secret" memory areas
note: was already added on x86 and s390, now on aarch64 and riscv*.
add riscv __NR_riscv_hwprobe from linux v6.4 see
linux commit ea3de9ce8aa280c5175c835bd3e94a3a9b814b74
RISC-V: Add a syscall for HW probing
add x86_64 only map_shadow_stack syscall from linux v6.6 see
linux commit c35559f94ebc3e3bc82e56e07161bb5986cd9761
x86/shstk: Introduce map_shadow_stack syscall
add map_shadow_stack syscall from linux v6.7 see
linux commit 2fd0ebad27bcd4c8fc61c61a98d4283c47054bcf
arch: Reserve map_shadow_stack() syscall number for all architectures
add futex_* syscalls from linux v6.7 see
linux commit 9f6c532f59b20580acf8ede9409c9b8dce6e74e1
futex: Add sys_futex_wake()
linux commit cb8c4312afca1b2dc64107e7e7cea81911055612
futex: Add sys_futex_wait()
linux commit 0f4b5f972216782a4acb1ae00dcb55173847c2ff
futex: Add sys_futex_requeue()
add statmount, listmount syscalls from linux v6.8 see
linux commit d8b0f5465012538cc4bb10ddc4affadbab73465b
wire up syscalls for statmount/listmount
linux commit b4c2bea8ceaa50cd42a8f73667389d801a3ecf2d
add listmount(2) syscall
linux commit 46eae99ef73302f9fb3dddcd67c374b3dffe8fd6
add statmount(2) syscall
add lsm_* syscalls from linux v6.8 see
linux commit 5f42375904b08890f2e8e7cd955c5bf0c2c0d05a
LSM: wireup Linux Security Module syscalls
linux commit ad4aff9ec25f400608283c10d634cc4eeda83a02
LSM: Create lsm_list_modules system call
linux commit a04a1198088a1378d0389c250cc684f649bcc91e
LSM: syscalls for current process attributes
add mseal syscall from linux v6.10 see
linux commit ff388fe5c481d39cc0a5940d1ad46f7920f1d646
mseal: wire up mseal syscall
on sh add sync_file_range2 from linux v6.10 see
linux commit 30766f1105d6d2459c3b9fe34a3e52b637a72950
sh: rework sync_file_range ABI
on x86 add uretprobe from linux v6.11 see
linux commit 54233a4254036efca91b9bffbd398ecf39e90555
uretprobe: change syscall number, again
add *xattrat syscalls from linux v6.13 see
linux commit 6140be90ec70c39fa844741ca3cc807dd0866394
fs/xattr: add *at family syscalls
add open_tree_attr from linux v6.15 see
linux commit c4a16820d90199409c9bf01c4f794e1e9e8d8fd8
fs: add open_tree_attr()
add file_{get,set}attr from linux v6.17 see
linux commit be7efb2d20d67f334a7de2aef77ae6c69367e646
fs: introduce file_getattr and file_setattr syscalls
on x86 add uprobe from linux v6.18 see
linux commit 56101b69c9190667f473b9f93f8b6d8209aaa816
uprobes/x86: Add uprobe syscall to speed up uprobe
add listns from linux v6.19 see
linux commit b36d4b6aa88ef039647228b98c59a875e92f8c8e
arch: hookup listns() system call
|
|
sll_addr is a member of sockaddr_ll, but it is misspelled as
ssl_addr in a comment.
Signed-off-by: RocketDev <marocketbd@gmail.com>
|
|
The LOCK_OBJ_DEF macro is used with a trailing semicolon. However,
since the macro definition ends with the closing brace of a function
definition, the ISO C grammar does not allow an extra semicolon.
To fix this, swap the order of the two definitions, and drop the
semicolon from the __malloc_lock declaration.
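The pattern can be sketched with a hypothetical macro of the same shape. The names below are invented for illustration and are not the actual LOCK_OBJ_DEF contents: the function definition comes first, and the object declaration comes last without its own semicolon, so the semicolon written at the point of use terminates it.

```c
static int hook_calls;

/* Hypothetical macro: the function definition is placed first and the
 * object declaration last, with no semicolon of its own. With the
 * opposite order, "HOOK_OBJ_DEF;" would leave a stray ';' after the
 * function's closing brace, which the ISO C grammar does not allow
 * at file scope. */
#define HOOK_OBJ_DEF \
	static void hook_run(void) { hook_calls++; } \
	static int hook_lock[1]

/* The trailing semicolon now ends the hook_lock declaration. */
HOOK_OBJ_DEF;
```

Compilers typically only diagnose the stray semicolon in pedantic mode, which is why this kind of issue can go unnoticed.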
|
|
This fixes an error in 6af4f25b899e89e4b91f8c197ae5a6ce04bcce7b: The
r0 register is special in addressing modes on s390x and is interpreted
as constant zero, i.e. lg %r5, 8(%r0) would effectively become lg %r5,
8. So care should be taken to never use r0 as an address register in
s390x assembly.
|
|
commit f96e47a26102d537c29435f0abf9ec94676a030e introduced a new
overflow past the end of the base-1e9 buffer for floating point to
decimal conversion while fixing a different overflow below the start
of the buffer.
this bug has not been present in any release, and has not been
analyzed in depth for security considerations.
the root cause of the bug, incorrect size accounting for the mantissa,
long predates the above commit, but was only exposed once the
excessive offset causing overflow in the other direction was removed.
the number of slots for expanding the mantissa was computed as if each
slot could peel off at least 29 bits. this would be true if the
mantissa were placed and expanded to the left of the radix point, but
we don't do that because it would require repeated fmod and division.
instead, we start the mantissa with 29 bits to the left of the radix
point, so that they can be peeled off by conversion to integer and
subtraction, followed by a multiplication by 1e9 to prepare for the
next iteration. so while the first slot peels 29 bits, advancing to
the next slot adds back somewhere between 20 and 21 bits: the length
of the mantissa of 1e9. this means we need to account for a slot for
every 8 bits of mantissa past the initial 29.
add a comment to that effect and adjust the max_mant_slots formula.
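the accounting described above can be written out as a small sketch. the function name is hypothetical, and the real max_mant_slots formula may carry additional terms; this only encodes "one slot for the first 29 bits, then one slot per 8 bits".

```c
/* Hypothetical helper: number of base-1e9 slots needed to expand a
 * mantissa of mant_dig bits. The first slot peels 29 bits; each later
 * slot peels 29 but the multiplication by 1e9 adds back 20 to 21, so
 * only about 8 bits of net progress are made per slot. */
static int mant_slots(int mant_dig)
{
	if (mant_dig <= 29) return 1;
	return 1 + (mant_dig - 29 + 7) / 8;   /* round up to whole slots */
}
```

for example, a 53-bit mantissa needs 1 + 24/8 = 4 slots under this accounting, where the old one-slot-per-29-bits assumption would have given 2.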
|
|
per the psABI, floating point register contents beyond the register
size of the targeted ABI variant are never call-saved, so no
hwcap-conditional logic is needed here and the assembly-time
conditions are based purely on ABI variant macros, not the targeted
ISA level.
|
|
the ctr and xer special registers are call-clobbered and
syscall-clobbered. failure to include them in the clobber list may
result in wrong code that attempts to use a value which is no longer
present in the register after the syscall. this has been reported to
manifest newly with gcc 15.
|
|
v4-compatible addresses in ipv6 are a deprecated feature where the
high 96 bits are all zero and an ipv4 address is stored in the low
32 bits. however, since :: and ::1 are the unspecified and loopback
addresses, these two particular values are excluded from the
definition of the v4-compat class.
our version of the macro incorrectly assessed this condition by
checking only the high 96 and low 8 bits. this incorrectly excluded
the v4compat version of any ipv4 address ending in .1, not just ::1.
rather than writing out non-obvious or error-prone conditions on the
individual address bytes, just express the "not :: or ::1" condition
naturally using the existing IN6_IS_ADDR_UNSPECIFIED and
IN6_IS_ADDR_LOOPBACK macros, after checking that the high 96 bits are
all zero. any vaguely reasonable compiler will collapse out the
redundant tests of the upper bits as part of CSE.
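the condition described above can be sketched as a function. the name is_v4compat is hypothetical and this is not the macro text from the header; it only shows the "high 96 bits zero, and not :: or ::1" logic using the two standard macros.

```c
#include <netinet/in.h>
#include <string.h>

/* Hypothetical function form of the check: high 96 bits all zero,
 * excluding the unspecified (::) and loopback (::1) addresses. */
static int is_v4compat(const struct in6_addr *a)
{
	static const unsigned char zero96[12];
	return memcmp(a->s6_addr, zero96, sizeof zero96) == 0
		&& !IN6_IS_ADDR_UNSPECIFIED(a)
		&& !IN6_IS_ADDR_LOOPBACK(a);
}
```

with this formulation, ::10.0.0.1 is classified as v4-compatible while ::1 is not, which is the case the old low-8-bits test got wrong.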
|
|
as stated in the comment added, the ABI for SME requires libc to be
aware of and support the extension to the register file. this is
necessary to handle lazy saving correctly across setjmp/longjmp, and
on older kernels, in order not to introduce memory corruption bugs
that may be exploitable vulnerabilities when creating new threads.
previously, we did not expose __getauxval, the interface libgcc uses
to determine runtime availability of SME, so it was not usable when
following the intended ABI. since commit
ab4635fba6769e19fb411a1ab3c8aa7407e11188 has now exposed this
interface, a mitigation is needed to ensure SME is not used
unless/until we have proper support for it.
while SME is the current hwcap feature that needs this treatment,
as-yet-undefined hwcap bits are also masked in case other new cpu
features have similar ABI issues. this could be re-evaluated at some
point in the future.
for now, the masking is only on aarch64. arguably it should be
considered for all archs, but whether it's needed is really a matter
of how ABI policy & stability are handled by the maintainers of the
arch psABI, and aarch64 is the one that's demonstrated a necessity. if
it turns out something like this is needed for more/all archs, making
a generalized framework for it would make sense. for now, it's stuffed
into __set_thread_area the same way atomics detection is stuffed there
for 32-bit arm and sh, as it's a convenient point for "arch-specific
early setup code" without invasive changes.
|
|
this change both aligns with the intended future direction for most
assembly usage, and makes it possible to add arch-specific setup logic
based on hwcaps like we have for 32-bit arm.
|
|
there are probably more new auxv keys that should be added, but these
are added now specifically because we may need to mask them.
|
|
commit 572a2e2eb91f00f2f25d301cfb50f435e7ae16b3 adjusted the buffer
for decimal conversion to be a VLA that only uses the full size needed
for long double when the argument type was long double. however, it
failed to update a later expression for the positioning within the
buffer, which still used a fixed offset of LDBL_MANT_DIG. this caused
doubles with a large positive exponent to overflow below the start of
the array, producing wrong output and potentially runaway wrong
execution.
this bug has not been present in any release, and has not been
analyzed in depth for security considerations.
it turns out the original buffer offset expression involving
LDBL_MANT_DIG was incorrect as well, and only worked because the space
reserved for expanding the exponent is roughly 3 times the size it
needs to be when the exponent is positive, leaving plenty of extra
space to compensate for the error. the actual offset should be in
base-1000000000 slot units, not bits, and numerically equal to the
number of slots that were previously allocated for mantissa expansion.
in order to ensure consistency and make the code more comprehensible,
commented subexpressions are replaced by intermediate named variables,
and the newly introduced max_mant_slots is used for both the
allocation and the buffer offset adjustment. the included +1 term
accounts for a trailing zero slot that's always emitted.
|
|
the alias fp is only supported on some assemblers. use the actual
register name x29 instead.
|
|
This is needed so that libgcc can access AT_HWCAP without violating
link namespace rules.
Internally, musl already used the __getauxval symbol for the same
reason; we just remove the hidden marking.
|
|
As of Linux 6.11, these fields and mask macros have been added to
include/uapi/linux/stat.h.
|
|
Linux kernel commit ee988c11acf6f9464b7b44e9a091bf6afb3b3a49 added two
new HWCAP bits: one for ARCH_3_1, which is the Power10 ISA revision, and
one for MMA, which is the optional Matrix Multiply Assist extension.
|
|
When buffering on a FILE is disabled, we still send both iovecs, even
though the first one is always empty. Clean things up by skipping the
empty iovec instead.
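A minimal sketch of the idea, assuming a two-element iovec array where the first element is the stdio buffer and the second is the caller's data. The helper name and signature are hypothetical and do not reflect the actual stdio write path.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical helper: write buffered bytes followed by new data,
 * dropping the first iovec when the buffer is empty (unbuffered FILE). */
static ssize_t write_buf_then_data(int fd, const void *buf, size_t buflen,
                                   const void *data, size_t len)
{
	struct iovec iov[2] = {
		{ .iov_base = (void *)buf,  .iov_len = buflen },
		{ .iov_base = (void *)data, .iov_len = len },
	};
	struct iovec *v = iov;
	int cnt = 2;

	if (!buflen) { v++; cnt--; }   /* nothing buffered: payload only */
	return writev(fd, v, cnt);
}
```

The bytes written are the same either way; the kernel just no longer has to process a zero-length entry.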
|
|
the loop condition ending on end-of-haystack ends before a zero-length
needle can be matched, so just explicitly check it before the loop.
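the shape of the fix can be illustrated with a naive case-insensitive search. the affected function is not named in the message above, so find_nocase here is a hypothetical stand-in chosen only to show the explicit check.

```c
#include <stddef.h>
#include <string.h>
#include <strings.h>

/* Hypothetical naive search: without the explicit check, an empty
 * haystack never enters the loop, so an empty needle would wrongly
 * fail to match. */
static char *find_nocase(const char *h, const char *n)
{
	size_t l = strlen(n);
	if (!l) return (char *)h;   /* empty needle matches at the start */
	for (; *h; h++)
		if (!strncasecmp(h, n, l)) return (char *)h;
	return NULL;
}
```

the case that exercises the check is an empty needle searched for in an empty haystack.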
|
|
POSIX 2024 added a requirement that mbsnrtowcs, like mbrtowc, consume
any final partial character and store it in the mbstate_t object
before returning. this was previously unspecified but documented as a
potential future change.
an internal mbstate_t object is added for the case where the argument
is a null pointer. previously this was not needed since no operations
could modify the internal object and not processing it at all gave the
same behavior "as if" there were an internal object.
|
|
some recent compilers have adopted a dubious interpretation of the C
specification for union initializers, that when the initialized member
is smaller than the size of the union, the remaining padding does not
have to be zero-initialized. in the interests of not depending on any
particular interpretation, place the larger member first so it's
initialized and ensures the whole object is zero-filled.
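a small sketch of the ordering described above, using a hypothetical union rather than the one touched by the change:

```c
#include <string.h>

/* Hypothetical union: the larger member comes first, so a brace
 * initializer targets the member that spans the whole object. */
union bytes_or_tag {
	unsigned char bytes[16];
	unsigned char tag;
};

/* Returns 1 if a brace-initialized object is entirely zero bytes. */
static int union_is_zero_filled(void)
{
	union bytes_or_tag u = { { 0 } };
	static const unsigned char zeros[sizeof u];
	return memcmp(&u, zeros, sizeof u) == 0;
}
```

with the members in the opposite order, the initializer would only be guaranteed to cover the single tag byte under the interpretation the message describes.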
|
|
traditionally, our cfsetispeed just set the output speed. this was not
conforming or reasonable behavior.
use of the input baud bits in termios c_cflag depends on kernel
support, which was added to linux along with TCSETS2 ioctl and
arbitrary-baud functionality sometime in the 2.6 series. with older
kernels, the separate input baud will not take, but this is the best
behavior we can hope for anyway, certainly better than wrongly
clobbering output baud setting.
the nonstandard cfsetspeed is now moved to a separate file, since it
no longer admits the weak alias implementation that made it
namespace-safe. it now sets the output speed, and on success, sets the
input speed to 0 (matched to output).
|
|
This just mirrors what is done in the start code for the affected
ports, as well as what is already done for the three x86 ports.
Clearing the frame pointer helps protect FP-based unwinders from
wrongly attempting to traverse into the parent thread's call frame
stack.
|
|
This was an oversight specific to these archs; others have always
aligned the new stack pointer correctly.
|
|
this function is documented as returning a null pointer on failure and
the current textdomain encoding, which is always UTF-8 in our
implementation, on success. there was some confusion over whether it's
expected to also return a null pointer in the case where it's using
the locale's encoding by default, rather than an explicitly bound one,
but it does not seem like that behavior would match applications'
expectations, and it would require gratuitously storing a meaningless
1-bit state for the textdomain.
|
|
the UTF-8 output code was written assuming an invariant that iconv's
decoders only emit valid Unicode Scalar Values which wctomb can encode
successfully, thereby always returning a value between 1 and 4.
if this invariant is not satisfied, wctomb returns (size_t)-1, and the
subsequent adjustments to the output buffer pointer and remaining
output byte count overflow, moving the output position backwards,
potentially past the beginning of the buffer, without storing any
bytes.
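the hazard and its guard can be sketched with a self-contained encoder. both functions below are hypothetical stand-ins (the real code goes through wctomb), and the encoder reports failure as 0 rather than (size_t)-1 to keep the example simple.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical encoder: returns the UTF-8 length (1..4), or 0 for
 * values that are not Unicode scalar values. */
static size_t utf8_encode(uint32_t c, unsigned char *s)
{
	if (c < 0x80) { s[0] = (unsigned char)c; return 1; }
	if (c < 0x800) {
		s[0] = (unsigned char)(0xc0 | (c >> 6));
		s[1] = (unsigned char)(0x80 | (c & 0x3f));
		return 2;
	}
	if (c >= 0xd800 && c < 0xe000) return 0;   /* surrogates */
	if (c < 0x10000) {
		s[0] = (unsigned char)(0xe0 | (c >> 12));
		s[1] = (unsigned char)(0x80 | ((c >> 6) & 0x3f));
		s[2] = (unsigned char)(0x80 | (c & 0x3f));
		return 3;
	}
	if (c < 0x110000) {
		s[0] = (unsigned char)(0xf0 | (c >> 18));
		s[1] = (unsigned char)(0x80 | ((c >> 12) & 0x3f));
		s[2] = (unsigned char)(0x80 | ((c >> 6) & 0x3f));
		s[3] = (unsigned char)(0x80 | (c & 0x3f));
		return 4;
	}
	return 0;
}

/* Append c to *out and shrink *left. Returns 0 on success, or -1 for
 * an invalid code point or insufficient space; on failure the output
 * position and remaining count are left untouched. */
static int emit_utf8(uint32_t c, unsigned char **out, size_t *left)
{
	unsigned char tmp[4];
	size_t k = utf8_encode(c, tmp);
	if (!k || k > *left) return -1;   /* check before any adjustment */
	for (size_t i = 0; i < k; i++) (*out)[i] = tmp[i];
	*out += k;
	*left -= k;
	return 0;
}
```

the point is that the length is validated before it is added to the pointer or subtracted from the count, so a failure value can never move the output position.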
|
|
the man page for this nonstandardized function has historically
documented it as scanning for a substring; however, this is
functionally incorrect (matches the substring "atime" in the "noatime"
option, for example) and differs from other existing implementations.
with the change made here, it should match glibc and other
implementations, only matching whole options delimited by commas or
separated from a value by an equals sign.
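the matching rule described above can be sketched on a plain option string. find_mntopt is a hypothetical name, and it takes the option string directly rather than a struct mntent, so this is an illustration of the rule and not the library function.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: find `opt` in a comma-separated option string,
 * matching only whole options, optionally followed by "=value". */
static const char *find_mntopt(const char *opts, const char *opt)
{
	size_t l = strlen(opt);
	if (!l) return NULL;
	for (const char *p = opts; *p; ) {
		size_t n = strcspn(p, ",");   /* length of this option */
		if (n >= l && !memcmp(p, opt, l) && (n == l || p[l] == '='))
			return p;
		p += n;
		if (*p == ',') p++;
	}
	return NULL;
}
```

given "rw,noatime,size=10", a lookup of "atime" fails, while "noatime" and "size" both succeed.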
|
|
as a result of incorrect bounds checking on the lead byte being
decoded, certain invalid inputs which should produce an encoding
error, such as "\xc8\x41", instead produced out-of-bounds loads from
the ksc table.
in a worst case, the loaded value may not be a valid unicode scalar
value, in which case, if the output encoding was UTF-8, wctomb would
return (size_t)-1, causing an overflow in the output pointer and
remaining buffer size which could clobber memory outside of the output
buffer.
bug report was submitted in private by Nick Wellnhofer on account of
potential security implications.
|
|
out-of-range second bytes were not handled, leading to wrong character
output rather than a reported encoding error.
fix based on bug report by Nick Wellnhofer, submitted in private in
case the issue turned out to have security implications.
|
|
Calling __tls_get_addr with brasl is not valid since it's a global symbol; doing
so results in an R_390_PC32DBL relocation error from lld. We could fix this by
marking __tls_get_addr hidden since it is not part of the s390x ABI, or by using
a different instruction. However, given its simplicity, it makes more sense to
just manually inline it into __tls_get_offset for performance.
The patch has been tested by applying it to Zig's bundled musl copy and running
the full Zig test suite under qemu-s390x.
|
|
Some weird linkers may emit PT_LOAD segments with p_memsz = 0. The ELF
specification does not forbid this, but such a segment with a non-zero
p_vaddr results in an attempt to reclaim an invalid memory address.
This patch skips such segments during reclaiming for better
compatibility.
|
|
we have the cpuset macros call calloc/free/memset/memcmp directly so
that they don't depend on any further ABI surface. this is not
namespace-clean, but only affects the _GNU_SOURCE feature profile,
which is not intended to be namespace-clean. nonetheless, reports come
up now and then of things which are gratuitously broken, usually when
an application has wrapped malloc with macros.
this patch parenthesizes the function names so that function-like
macros will not be expanded, and removes the unused declaration of
memcpy. this is not a complete solution, but it should improve things
for affected applications, particularly ones which are not even trying
to use the cpuset interfaces which got them just because g++ always
defines _GNU_SOURCE.
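the effect of parenthesizing a function name can be shown with a small sketch. the wrapping macro below simulates an application that has redefined calloc, and ALLOC_WORDS is a hypothetical macro, not one of the actual cpuset macros.

```c
#include <stdlib.h>

/* Simulates an application that wraps calloc with a function-like
 * macro (here one that always "fails"). */
#define calloc(n, sz) ((void *)0)

/* Hypothetical allocation macro written the way the patch describes:
 * in "(calloc)(...)" the name is not immediately followed by '(',
 * so the function-like macro above is not expanded and the real
 * function is called. */
#define ALLOC_WORDS(n) \
	((unsigned long *)(calloc)((n), sizeof(unsigned long)))
```

this protects only against function-like macros; an object-like macro redefining the name would still be expanded, which is one reason the message calls it an incomplete solution.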
|
|
the kernel mq_attr structure has 8 64-bit longs instead of 8 32-bit
longs.
it's not clear that this is the nicest way to implement the fix, but
the concept (translation) is right, and the details can be changed
later if desired.
|
|
previously, we left any changes made by the application to the timer
thread's signal mask active when resetting the thread state for reuse.
not only did this violate the intended invariant that timer threads
start with all signals blocked; it also allowed application code to
execute in a thread that, formally, did not exist. and further, if the
internal SIGTIMER signal became unblocked, it could also lead to
missed timer expiration events.
|
|
commit 6ae2568bc2367b4d47e0ea1cb043fd56e697912f introduced a fatal
signal condition if the internal timer signal used for SIGEV_THREAD
timers is unblocked. this can happen whenever the application alters
the signal mask with SIG_SETMASK, since sigset_t objects never include
the bits used for implementation-internal signals.
this patch effectively reverts the breakage by adding back a no-op
signal handler.
overruns will not be accounted if the timer signal becomes unblocked,
but POSIX does not specify them except for SIGEV_SIGNAL timers anyway.
|