<feed xmlns='http://www.w3.org/2005/Atom'>
<title>musl/src/thread, branch v1.2.0</title>
<subtitle>musl - an implementation of the standard library for Linux-based systems</subtitle>
<link rel='alternate' type='text/html' href='http://git.musl-libc.org/cgit/musl/'/>
<entry>
<title>harden thread start with failed scheduling against broken __clone</title>
<updated>2019-09-13T18:17:36+00:00</updated>
<author>
<name>Rich Felker</name>
<email>dalias@aerifal.cx</email>
</author>
<published>2019-09-13T18:17:36+00:00</published>
<link rel='alternate' type='text/html' href='http://git.musl-libc.org/cgit/musl/commit/?id=f5eee489f7662b08ad1bba4b1267e34eb9565bba'/>
<id>f5eee489f7662b08ad1bba4b1267e34eb9565bba</id>
<content type='text'>
commit 8a544ee3a2a75af278145b09531177cab4939b41 introduced a
dependency of the failure path for explicit scheduling at thread
creation on __clone's handling of the start function returning, which
should result in SYS_exit.

as noted in commit 05870abeaac0588fb9115cfd11f96880a0af2108, the arm
version of __clone was broken in this case. in the past, the mips
version was also broken; it was fixed in commit
8b2b61e0001281be0dcd3dedc899bf187172fecb.

since this code path is pretty much entirely untested (previously only
reachable in applications that call the public clone() and return from
the start function) and consists of fragile per-arch asm, don't assume
it works, at least not until it's been thoroughly tested. instead make
the SYS_exit syscall from the start function's failure path.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 8a544ee3a2a75af278145b09531177cab4939b41 introduced a
dependency of the failure path for explicit scheduling at thread
creation on __clone's handling of the start function returning, which
should result in SYS_exit.

as noted in commit 05870abeaac0588fb9115cfd11f96880a0af2108, the arm
version of __clone was broken in this case. in the past, the mips
version was also broken; it was fixed in commit
8b2b61e0001281be0dcd3dedc899bf187172fecb.

since this code path is pretty much entirely untested (previously only
reachable in applications that call the public clone() and return from
the start function) and consists of fragile per-arch asm, don't assume
it works, at least not until it's been thoroughly tested. instead make
the SYS_exit syscall from the start function's failure path.
</pre>
</div>
</content>
</entry>
<entry>
<title>fix arm __a_barrier_oldkuser when built as thumb</title>
<updated>2019-09-11T17:21:28+00:00</updated>
<author>
<name>Rich Felker</name>
<email>dalias@aerifal.cx</email>
</author>
<published>2019-09-11T17:21:28+00:00</published>
<link rel='alternate' type='text/html' href='http://git.musl-libc.org/cgit/musl/commit/?id=b0301f47f3cf510b0237a024a3a073d55799101f'/>
<id>b0301f47f3cf510b0237a024a3a073d55799101f</id>
<content type='text'>
as noted in commit 05870abeaac0588fb9115cfd11f96880a0af2108, mov lr,pc
is not a valid method for saving the return address in code that might
be built as thumb.

this one is unlikely to matter, since any ISA level that has thumb2
should also have native implementations of atomics that don't involve
kuser_helper, and the affected code is only used on very old kernels
to begin with.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
as noted in commit 05870abeaac0588fb9115cfd11f96880a0af2108, mov lr,pc
is not a valid method for saving the return address in code that might
be built as thumb.

this one is unlikely to matter, since any ISA level that has thumb2
should also have native implementations of atomics that don't involve
kuser_helper, and the affected code is only used on very old kernels
to begin with.
</pre>
</div>
</content>
</entry>
<entry>
<title>fix code path where child function returns in arm __clone built as thumb</title>
<updated>2019-09-11T17:13:57+00:00</updated>
<author>
<name>Rich Felker</name>
<email>dalias@aerifal.cx</email>
</author>
<published>2019-09-11T17:13:57+00:00</published>
<link rel='alternate' type='text/html' href='http://git.musl-libc.org/cgit/musl/commit/?id=05870abeaac0588fb9115cfd11f96880a0af2108'/>
<id>05870abeaac0588fb9115cfd11f96880a0af2108</id>
<content type='text'>
mov lr,pc is not a valid way to save the return address in thumb mode
since it omits the thumb bit. use a chain of bl and bx to emulate blx.
this could be avoided by converting to a .S file with preprocessor
conditions to use blx if available, but the time cost here is
dominated by the syscall anyway.

while making this change, also remove the remnants of support for
pre-bx ISA levels. commit 9f290a49bf9ee247d540d3c83875288a7991699c
removed the hack from the parent code paths, but left the unnecessary
code in the child. keeping it would require rewriting two code paths
rather than one, and is useless for reasons described in that commit.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
mov lr,pc is not a valid way to save the return address in thumb mode
since it omits the thumb bit. use a chain of bl and bx to emulate blx.
this could be avoided by converting to a .S file with preprocessor
conditions to use blx if available, but the time cost here is
dominated by the syscall anyway.

while making this change, also remove the remnants of support for
pre-bx ISA levels. commit 9f290a49bf9ee247d540d3c83875288a7991699c
removed the hack from the parent code paths, but left the unnecessary
code in the child. keeping it would require rewriting two code paths
rather than one, and is useless for reasons described in that commit.
</pre>
</div>
</content>
</entry>
<entry>
<title>synchronously clean up pthread_create failure due to scheduling errors</title>
<updated>2019-09-06T20:17:44+00:00</updated>
<author>
<name>Rich Felker</name>
<email>dalias@aerifal.cx</email>
</author>
<published>2019-09-06T20:17:44+00:00</published>
<link rel='alternate' type='text/html' href='http://git.musl-libc.org/cgit/musl/commit/?id=8a544ee3a2a75af278145b09531177cab4939b41'/>
<id>8a544ee3a2a75af278145b09531177cab4939b41</id>
<content type='text'>
previously, when pthread_create failed due to inability to set
explicit scheduling according to the requested attributes, the nascent
thread was detached and made responsible for its own cleanup via the
standard pthread_exit code path. this left it consuming resources
potentially well after pthread_create returned, in a way that the
application could not see or mitigate, and unnecessarily exposed its
existence to the rest of the implementation via the global thread
list.

instead, attempt explicit scheduling early and reuse the failure path
for __clone failure if it fails. the nascent thread's exit futex is
not needed for unlocking the thread list, since the thread calling
pthread_create holds the thread list lock the whole time, so it can be
repurposed to ensure the thread has finished exiting. no pthread_exit
is needed, and freeing the stack, if needed, can happen just as it
would if __clone failed.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
previously, when pthread_create failed due to inability to set
explicit scheduling according to the requested attributes, the nascent
thread was detached and made responsible for its own cleanup via the
standard pthread_exit code path. this left it consuming resources
potentially well after pthread_create returned, in a way that the
application could not see or mitigate, and unnecessarily exposed its
existence to the rest of the implementation via the global thread
list.

instead, attempt explicit scheduling early and reuse the failure path
for __clone failure if it fails. the nascent thread's exit futex is
not needed for unlocking the thread list, since the thread calling
pthread_create holds the thread list lock the whole time, so it can be
repurposed to ensure the thread has finished exiting. no pthread_exit
is needed, and freeing the stack, if needed, can happen just as it
would if __clone failed.
</pre>
</div>
</content>
</entry>
<entry>
<title>set explicit scheduling for new thread from calling thread, not self</title>
<updated>2019-09-06T19:54:32+00:00</updated>
<author>
<name>Rich Felker</name>
<email>dalias@aerifal.cx</email>
</author>
<published>2019-09-06T19:26:44+00:00</published>
<link rel='alternate' type='text/html' href='http://git.musl-libc.org/cgit/musl/commit/?id=022f27d541af68f047d367ad9d93f941135555c1'/>
<id>022f27d541af68f047d367ad9d93f941135555c1</id>
<content type='text'>
if setting scheduling properties succeeds, the new thread may end up
with lower priority than the caller, and may be unable to continue
running due to another intermediate-priority thread. this produces a
priority inversion situation for the thread calling pthread_create,
since it cannot return until the new thread reports success.

originally, the parent was responsible for setting the new thread's
priority; commits b8742f32602add243ee2ce74d804015463726899 and
40bae2d32fd6f3ffea437fa745ad38a1fe77b27e changed it as part of
trimming down the pthread structure. since then, commit
04335d9260c076cf4d9264bd93dd3b06c237a639 partly reversed the changes,
but did not switch responsibilities back. do that now.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
if setting scheduling properties succeeds, the new thread may end up
with lower priority than the caller, and may be unable to continue
running due to another intermediate-priority thread. this produces a
priority inversion situation for the thread calling pthread_create,
since it cannot return until the new thread reports success.

originally, the parent was responsible for setting the new thread's
priority; commits b8742f32602add243ee2ce74d804015463726899 and
40bae2d32fd6f3ffea437fa745ad38a1fe77b27e changed it as part of
trimming down the pthread structure. since then, commit
04335d9260c076cf4d9264bd93dd3b06c237a639 partly reversed the changes,
but did not switch responsibilities back. do that now.
</pre>
</div>
</content>
</entry>
<entry>
<title>fix unsynchronized decrement of thread count on pthread_create error</title>
<updated>2019-09-06T19:54:32+00:00</updated>
<author>
<name>Rich Felker</name>
<email>dalias@aerifal.cx</email>
</author>
<published>2019-09-06T19:52:00+00:00</published>
<link rel='alternate' type='text/html' href='http://git.musl-libc.org/cgit/musl/commit/?id=dd0a23dd9e157624347fdcc9dca452675b93da70'/>
<id>dd0a23dd9e157624347fdcc9dca452675b93da70</id>
<content type='text'>
commit 8f11e6127fe93093f81a52b15bb1537edc3fc8af wrongly documented
that all changes to libc.threads_minus_1 were guarded by the thread
list lock, but the decrement for failed SYS_clone took place after the
thread list lock was released.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 8f11e6127fe93093f81a52b15bb1537edc3fc8af wrongly documented
that all changes to libc.threads_minus_1 were guarded by the thread
list lock, but the decrement for failed SYS_clone took place after the
thread list lock was released.
</pre>
</div>
</content>
</entry>
<entry>
<title>in arm cancellation point asm, don't unnecessarily preserve link register</title>
<updated>2019-08-06T18:03:56+00:00</updated>
<author>
<name>Patrick Oppenlander</name>
<email>patrick.oppenlander@gmail.com</email>
</author>
<published>2019-08-01T04:34:59+00:00</published>
<link rel='alternate' type='text/html' href='http://git.musl-libc.org/cgit/musl/commit/?id=e0e8ae754cc7653fcff489a0e229adbbb49fde6c'/>
<id>e0e8ae754cc7653fcff489a0e229adbbb49fde6c</id>
<content type='text'>
The only reason we needed to preserve the link register was because we
were using a branch-link instruction to branch to __cp_cancel.
Replacing this with a branch means we can avoid the save/restore as
the link register is no longer modified.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The only reason we needed to preserve the link register was because we
were using a branch-link instruction to branch to __cp_cancel.
Replacing this with a branch means we can avoid the save/restore as
the link register is no longer modified.
</pre>
</div>
</content>
</entry>
<entry>
<title>fix missing declarations for pthread_join extensions in source file</title>
<updated>2019-08-02T04:08:23+00:00</updated>
<author>
<name>Rich Felker</name>
<email>dalias@aerifal.cx</email>
</author>
<published>2019-07-31T21:18:21+00:00</published>
<link rel='alternate' type='text/html' href='http://git.musl-libc.org/cgit/musl/commit/?id=3541925fb1db0bce7eaca7900fdd624e2f50ed6d'/>
<id>3541925fb1db0bce7eaca7900fdd624e2f50ed6d</id>
<content type='text'>
per policy, define the feature test macro to get declarations for the
pthread_tryjoin_np and pthread_timedjoin_np functions. in the past
this has been only for checking; with 32-bit archs getting 64-bit
time_t it will also be necessary for symbols to get redirected
correctly.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
per policy, define the feature test macro to get declarations for the
pthread_tryjoin_np and pthread_timedjoin_np functions. in the past
this has been only for checking; with 32-bit archs getting 64-bit
time_t it will also be necessary for symbols to get redirected
correctly.
</pre>
</div>
</content>
</entry>
<entry>
<title>remove x32 syscall timespec fixup hacks</title>
<updated>2019-07-29T04:19:21+00:00</updated>
<author>
<name>Rich Felker</name>
<email>dalias@aerifal.cx</email>
</author>
<published>2019-07-28T23:29:28+00:00</published>
<link rel='alternate' type='text/html' href='http://git.musl-libc.org/cgit/musl/commit/?id=4c307bed03d97da4cfd9224c5b7669bb22cd9faf'/>
<id>4c307bed03d97da4cfd9224c5b7669bb22cd9faf</id>
<content type='text'>
the x32 syscall interfaces treat timespec's tv_nsec member as 64-bit
despite the API type being long and long being 32-bit in the ABI. this
is no problem for syscalls that store timespecs to userspace as
results, but caused uninitialized padding to be misinterpreted as the
high bits in syscalls that take timespecs as input.

since the beginning of the port, we've dealt with this situation with
hacks in syscall_arch.h, and injected between __syscall_cp_c and
__syscall_cp_asm, to special-case the syscall numbers that involve
timespecs as inputs and copy them to a form suitable to pass to the
kernel.

commit 40aa18d55ab763e69ad16d0cf1cebea708ffde47 set the stage for
removal of these hacks by letting us treat the "normal" x32 syscalls
dealing with timespec as if they're x32's "time64" syscalls,
effectively making x32 ax "time64-only 32-bit arch" like riscv32 will
be when it's added. since then, all users of syscalls that x32's
syscall_arch.h had hacks for have been updated to use time64 syscalls,
so the hacks can be removed.

there are still at least a few other timespec-related syscalls broken
on x32, which were overlooked when the x32 hacks were done or added
later. these include at least recvmmsg, adjtimex/clock_adjtime, and
timerfd_settime, and they will be fixed independently later on.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
the x32 syscall interfaces treat timespec's tv_nsec member as 64-bit
despite the API type being long and long being 32-bit in the ABI. this
is no problem for syscalls that store timespecs to userspace as
results, but caused uninitialized padding to be misinterpreted as the
high bits in syscalls that take timespecs as input.

since the beginning of the port, we've dealt with this situation with
hacks in syscall_arch.h, and injected between __syscall_cp_c and
__syscall_cp_asm, to special-case the syscall numbers that involve
timespecs as inputs and copy them to a form suitable to pass to the
kernel.

commit 40aa18d55ab763e69ad16d0cf1cebea708ffde47 set the stage for
removal of these hacks by letting us treat the "normal" x32 syscalls
dealing with timespec as if they're x32's "time64" syscalls,
effectively making x32 ax "time64-only 32-bit arch" like riscv32 will
be when it's added. since then, all users of syscalls that x32's
syscall_arch.h had hacks for have been updated to use time64 syscalls,
so the hacks can be removed.

there are still at least a few other timespec-related syscalls broken
on x32, which were overlooked when the x32 hacks were done or added
later. these include at least recvmmsg, adjtimex/clock_adjtime, and
timerfd_settime, and they will be fixed independently later on.
</pre>
</div>
</content>
</entry>
<entry>
<title>futex wait operations: add time64 syscall support, decouple 32-bit time_t</title>
<updated>2019-07-28T21:44:51+00:00</updated>
<author>
<name>Rich Felker</name>
<email>dalias@aerifal.cx</email>
</author>
<published>2019-07-28T21:44:51+00:00</published>
<link rel='alternate' type='text/html' href='http://git.musl-libc.org/cgit/musl/commit/?id=1492bdf53c70f436b0fc2a7238011f9786bf4cd5'/>
<id>1492bdf53c70f436b0fc2a7238011f9786bf4cd5</id>
<content type='text'>
thanks to the original factorization using the __timedwait function,
there are no FUTEX_WAIT calls anywhere else, giving us a single point
of change to make nearly all the timed thread primitives time64-ready.
the one exception is the FUTEX_LOCK_PI command for PI mutex timedlock.
I haven't tried to make these two points share code, since they have
different fallbacks (no non-private fallback needed for PI since PI
was added later) and FUTEX_LOCK_PI isn't a cancellation point (thus
allowing the whole code path to inline into pthread_mutex_timedlock).

as for other changes in this series, the time64 syscall is used only
if it's the only one defined for the arch, or if the requested timeout
does not fit in 32 bits. on current 32-bit archs where time_t is a
32-bit type, this makes it statically unreachable.

on 64-bit archs, there are only superficial changes to the code after
preprocessing. on current 32-bit archs, the time is passed via an
intermediate copy to remove the assumption that time_t is a 32-bit
type.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
thanks to the original factorization using the __timedwait function,
there are no FUTEX_WAIT calls anywhere else, giving us a single point
of change to make nearly all the timed thread primitives time64-ready.
the one exception is the FUTEX_LOCK_PI command for PI mutex timedlock.
I haven't tried to make these two points share code, since they have
different fallbacks (no non-private fallback needed for PI since PI
was added later) and FUTEX_LOCK_PI isn't a cancellation point (thus
allowing the whole code path to inline into pthread_mutex_timedlock).

as for other changes in this series, the time64 syscall is used only
if it's the only one defined for the arch, or if the requested timeout
does not fit in 32 bits. on current 32-bit archs where time_t is a
32-bit type, this makes it statically unreachable.

on 64-bit archs, there are only superficial changes to the code after
preprocessing. on current 32-bit archs, the time is passed via an
intermediate copy to remove the assumption that time_t is a 32-bit
type.
</pre>
</div>
</content>
</entry>
</feed>
