<feed xmlns='http://www.w3.org/2005/Atom'>
<title>musl/src/thread, branch v1.2.6</title>
<subtitle>musl - an implementation of the standard library for Linux-based systems</subtitle>
<link rel='alternate' type='text/html' href='http://git.musl-libc.org/cgit/musl/'/>
<entry>
<title>s390x: shuffle register usage in __tls_get_offset to avoid r0 as address</title>
<updated>2025-10-12T20:15:47+00:00</updated>
<author>
<name>Alex Rønne Petersen</name>
<email>alex@alexrp.com</email>
</author>
<published>2025-10-12T03:35:19+00:00</published>
<link rel='alternate' type='text/html' href='http://git.musl-libc.org/cgit/musl/commit/?id=1b76ff0767d01df72f692806ee5adee13c67ef88'/>
<id>1b76ff0767d01df72f692806ee5adee13c67ef88</id>
<content type='text'>
This fixes an error in 6af4f25b899e89e4b91f8c197ae5a6ce04bcce7b: the
r0 register is special in addressing modes on s390x and is interpreted
as constant zero, e.g. lg %r5, 8(%r0) would effectively become lg %r5,
8. Care should therefore be taken never to use r0 as an address
register in s390x assembly.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This fixes an error in 6af4f25b899e89e4b91f8c197ae5a6ce04bcce7b: the
r0 register is special in addressing modes on s390x and is interpreted
as constant zero, e.g. lg %r5, 8(%r0) would effectively become lg %r5,
8. Care should therefore be taken never to use r0 as an address
register in s390x assembly.
</pre>
</div>
</content>
</entry>
<entry>
<title>aarch64: mask off SME and unknown/future hwcap bits</title>
<updated>2025-07-16T16:04:39+00:00</updated>
<author>
<name>Rich Felker</name>
<email>dalias@aerifal.cx</email>
</author>
<published>2025-07-16T16:04:39+00:00</published>
<link rel='alternate' type='text/html' href='http://git.musl-libc.org/cgit/musl/commit/?id=8fd5d031876345e42ae3d11cc07b962f8625bc3b'/>
<id>8fd5d031876345e42ae3d11cc07b962f8625bc3b</id>
<content type='text'>
as stated in the comment added, the ABI for SME requires libc to be
aware of and support the extension to the register file. this is
necessary to handle lazy saving correctly across setjmp/longjmp, and
on older kernels, in order not to introduce memory corruption bugs
that may be exploitable vulnerabilities when creating new threads.

previously, we did not expose __getauxval, the interface libgcc uses
to determine runtime availability of SME, so it was not usable when
following the intended ABI. since commit
ab4635fba6769e19fb411a1ab3c8aa7407e11188 has now exposed this
interface, a mitigation is needed to ensure SME is not used
unless/until we have proper support for it.

while SME is the current hwcap feature that needs this treatment,
as-yet-undefined hwcap bits are also masked in case other new cpu
features have similar ABI issues. this could be re-evaluated at some
point in the future.

for now, the masking is only on aarch64. arguably it should be
considered for all archs, but whether it's needed is really a matter
of how ABI policy &amp; stability are handled by the maintainers of the
arch psABI, and aarch64 is the one that's demonstrated a necessity. if
it turns out something like this is needed for more/all archs, making
a generalized framework for it would make sense. for now, it's stuffed
into __set_thread_area the same way atomics detection is stuffed there
for 32-bit arm and sh, as it's a convenient point for "arch-specific
early setup code" without invasive changes.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
as stated in the comment added, the ABI for SME requires libc to be
aware of and support the extension to the register file. this is
necessary to handle lazy saving correctly across setjmp/longjmp, and
on older kernels, in order not to introduce memory corruption bugs
that may be exploitable vulnerabilities when creating new threads.

previously, we did not expose __getauxval, the interface libgcc uses
to determine runtime availability of SME, so it was not usable when
following the intended ABI. since commit
ab4635fba6769e19fb411a1ab3c8aa7407e11188 has now exposed this
interface, a mitigation is needed to ensure SME is not used
unless/until we have proper support for it.

while SME is the current hwcap feature that needs this treatment,
as-yet-undefined hwcap bits are also masked in case other new cpu
features have similar ABI issues. this could be re-evaluated at some
point in the future.

for now, the masking is only on aarch64. arguably it should be
considered for all archs, but whether it's needed is really a matter
of how ABI policy &amp; stability are handled by the maintainers of the
arch psABI, and aarch64 is the one that's demonstrated a necessity. if
it turns out something like this is needed for more/all archs, making
a generalized framework for it would make sense. for now, it's stuffed
into __set_thread_area the same way atomics detection is stuffed there
for 32-bit arm and sh, as it's a convenient point for "arch-specific
early setup code" without invasive changes.
</pre>
</div>
</content>
</entry>
<entry>
<title>aarch64: replace asm source file for __set_thread_area with inline asm</title>
<updated>2025-07-13T01:56:08+00:00</updated>
<author>
<name>Rich Felker</name>
<email>dalias@aerifal.cx</email>
</author>
<published>2025-07-13T01:56:08+00:00</published>
<link rel='alternate' type='text/html' href='http://git.musl-libc.org/cgit/musl/commit/?id=709fee55fd1f83faef91cf0542766da4421424f3'/>
<id>709fee55fd1f83faef91cf0542766da4421424f3</id>
<content type='text'>
this change both aligns with the intended future direction for most
assembly usage, and makes it possible to add arch-specific setup logic
based on hwcaps like we have for 32-bit arm.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
this change both aligns with the intended future direction for most
assembly usage, and makes it possible to add arch-specific setup logic
based on hwcaps like we have for 32-bit arm.
</pre>
</div>
</content>
</entry>
<entry>
<title>fix register name usage in aarch64 clone.s</title>
<updated>2025-07-01T15:57:40+00:00</updated>
<author>
<name>Rich Felker</name>
<email>dalias@aerifal.cx</email>
</author>
<published>2025-07-01T15:57:40+00:00</published>
<link rel='alternate' type='text/html' href='http://git.musl-libc.org/cgit/musl/commit/?id=caae5a8b272861607c25f8ed86087bae960a07f0'/>
<id>caae5a8b272861607c25f8ed86087bae960a07f0</id>
<content type='text'>
the alias fp is only supported on some assemblers. use the actual
register name x29 instead.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
the alias fp is only supported on some assemblers. use the actual
register name x29 instead.
</pre>
</div>
</content>
</entry>
<entry>
<title>clone: clear the frame pointer in the child process on relevant ports</title>
<updated>2025-02-22T01:53:41+00:00</updated>
<author>
<name>Alex Rønne Petersen</name>
<email>alex@alexrp.com</email>
</author>
<published>2024-12-12T16:56:04+00:00</published>
<link rel='alternate' type='text/html' href='http://git.musl-libc.org/cgit/musl/commit/?id=b6b81f697b38ef915a5dbf1311baba164822e917'/>
<id>b6b81f697b38ef915a5dbf1311baba164822e917</id>
<content type='text'>
This just mirrors what is done in the start code for the affected
ports, as well as what is already done for the three x86 ports.

Clearing the frame pointer helps protect FP-based unwinders from
wrongly attempting to traverse into the parent thread's call frame
stack.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This just mirrors what is done in the start code for the affected
ports, as well as what is already done for the three x86 ports.

Clearing the frame pointer helps protect FP-based unwinders from
wrongly attempting to traverse into the parent thread's call frame
stack.
</pre>
</div>
</content>
</entry>
<entry>
<title>clone: align the given stack pointer on or1k and riscv</title>
<updated>2025-02-22T00:56:28+00:00</updated>
<author>
<name>Alex Rønne Petersen</name>
<email>alex@alexrp.com</email>
</author>
<published>2025-02-08T04:39:59+00:00</published>
<link rel='alternate' type='text/html' href='http://git.musl-libc.org/cgit/musl/commit/?id=5e03c03fcde3534b37a0b995a438cd176d6882d3'/>
<id>5e03c03fcde3534b37a0b995a438cd176d6882d3</id>
<content type='text'>
This was an oversight specific to these archs; others have always
aligned the new stack pointer correctly.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This was an oversight specific to these archs; others have always
aligned the new stack pointer correctly.
</pre>
</div>
</content>
</entry>
<entry>
<title>s390x: manually inline __tls_get_addr in __tls_get_offset</title>
<updated>2025-02-09T14:46:53+00:00</updated>
<author>
<name>Alex Rønne Petersen</name>
<email>alex@alexrp.com</email>
</author>
<published>2025-01-24T05:12:13+00:00</published>
<link rel='alternate' type='text/html' href='http://git.musl-libc.org/cgit/musl/commit/?id=6af4f25b899e89e4b91f8c197ae5a6ce04bcce7b'/>
<id>6af4f25b899e89e4b91f8c197ae5a6ce04bcce7b</id>
<content type='text'>
Calling __tls_get_addr with brasl is not valid since it's a global symbol; doing
so results in an R_390_PC32DBL relocation error from lld. We could fix this by
marking __tls_get_addr hidden since it is not part of the s390x ABI, or by using
a different instruction. However, given its simplicity, it makes more sense to
just manually inline it into __tls_get_offset for performance.

The patch has been tested by applying to Zig's bundled musl copy and running the
full Zig test suite under qemu-s390x.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Calling __tls_get_addr with brasl is not valid since it's a global symbol; doing
so results in an R_390_PC32DBL relocation error from lld. We could fix this by
marking __tls_get_addr hidden since it is not part of the s390x ABI, or by using
a different instruction. However, given its simplicity, it makes more sense to
just manually inline it into __tls_get_offset for performance.

The patch has been tested by applying to Zig's bundled musl copy and running the
full Zig test suite under qemu-s390x.
</pre>
</div>
</content>
</entry>
<entry>
<title>fix lost or delayed wakes in sem_post under certain race conditions</title>
<updated>2024-08-10T20:30:28+00:00</updated>
<author>
<name>Rich Felker</name>
<email>dalias@aerifal.cx</email>
</author>
<published>2024-08-10T20:30:28+00:00</published>
<link rel='alternate' type='text/html' href='http://git.musl-libc.org/cgit/musl/commit/?id=882aedf6a13891f887d20f6a4184a13e94793b84'/>
<id>882aedf6a13891f887d20f6a4184a13e94793b84</id>
<content type='text'>
if sem_post is interrupted between clearing the waiters bit from the
semaphore value and performing the futex wake operation, subsequent
calls to sem_post will not perform a wake operation unless a new
waiter has arrived.

usually, this is at most a minor nuisance, since the original wake
operation will eventually happen. however, it's possible that the wake
is delayed indefinitely if interrupted by a signal handler, or that
the address the wake needs to be performed on is no longer mapped if
the semaphore was a process-shared one that has since been unmapped
but has a waiter on a different mapping of the same semaphore. this
can happen when another thread using the same mapping "steals the
post" atomically before actually becoming a second waiter, deduces
from success that it was the last user of the semaphore mapping, then
re-posts and unmaps the semaphore mapping. this scenario was described
in a report by Markus Wichmann.

instead of checking only the waiters bit, also check the waiter count
that was sampled before the atomic post operation, and perform the
wake if it's nonzero. this will not produce any additional wakes under
non-race conditions, since the waiters bit only becomes zero when
targeting a single waiter for wake. checking both was already the
behavior prior to commit 159d1f6c02569091c7a48bdb2e2e824b844a1902.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
if sem_post is interrupted between clearing the waiters bit from the
semaphore value and performing the futex wake operation, subsequent
calls to sem_post will not perform a wake operation unless a new
waiter has arrived.

usually, this is at most a minor nuisance, since the original wake
operation will eventually happen. however, it's possible that the wake
is delayed indefinitely if interrupted by a signal handler, or that
the address the wake needs to be performed on is no longer mapped if
the semaphore was a process-shared one that has since been unmapped
but has a waiter on a different mapping of the same semaphore. this
can happen when another thread using the same mapping "steals the
post" atomically before actually becoming a second waiter, deduces
from success that it was the last user of the semaphore mapping, then
re-posts and unmaps the semaphore mapping. this scenario was described
in a report by Markus Wichmann.

instead of checking only the waiters bit, also check the waiter count
that was sampled before the atomic post operation, and perform the
wake if it's nonzero. this will not produce any additional wakes under
non-race conditions, since the waiters bit only becomes zero when
targeting a single waiter for wake. checking both was already the
behavior prior to commit 159d1f6c02569091c7a48bdb2e2e824b844a1902.
</pre>
</div>
</content>
</entry>
<entry>
<title>riscv32: add thread support</title>
<updated>2024-02-29T21:36:55+00:00</updated>
<author>
<name>Stefan O'Rear</name>
<email>sorear@fastmail.com</email>
</author>
<published>2020-09-03T09:56:46+00:00</published>
<link rel='alternate' type='text/html' href='http://git.musl-libc.org/cgit/musl/commit/?id=b28c44de8c3131b45588f61569b1711c987ba1c3'/>
<id>b28c44de8c3131b45588f61569b1711c987ba1c3</id>
<content type='text'>
Identical to riscv64 except for stack offsets in clone.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Identical to riscv64 except for stack offsets in clone.
</pre>
</div>
</content>
</entry>
<entry>
<title>loongarch64 __clone: align stack pointer mod 16</title>
<updated>2024-02-26T20:23:01+00:00</updated>
<author>
<name>wanghongliang</name>
<email>wanghongliang@loongson.cn</email>
</author>
<published>2024-02-25T18:12:28+00:00</published>
<link rel='alternate' type='text/html' href='http://git.musl-libc.org/cgit/musl/commit/?id=80e3b09823a1d718664bc13704f3f7c19038a19e'/>
<id>80e3b09823a1d718664bc13704f3f7c19038a19e</id>
<content type='text'>
According to the LoongArch ABI specs, the stack needs to be 16-byte
aligned, both for performance and for the compiler's layout of stack
frames.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
According to the LoongArch ABI specs, the stack needs to be 16-byte
aligned, both for performance and for the compiler's layout of stack
frames.
</pre>
</div>
</content>
</entry>
</feed>
