<feed xmlns='http://www.w3.org/2005/Atom'>
<title>musl/src/thread, branch v1.2.2</title>
<subtitle>musl - an implementation of the standard library for Linux-based systems</subtitle>
<link rel='alternate' type='text/html' href='http://git.musl-libc.org/cgit/musl/'/>
<entry>
<title>fix omission of non-stub pthread_mutexattr_getprotocol</title>
<updated>2020-12-07T22:25:08+00:00</updated>
<author>
<name>Rich Felker</name>
<email>dalias@aerifal.cx</email>
</author>
<published>2020-12-07T22:25:08+00:00</published>
<link rel='alternate' type='text/html' href='http://git.musl-libc.org/cgit/musl/commit/?id=90ff016996753d83263445940710c87d9afa71f3'/>
<id>90ff016996753d83263445940710c87d9afa71f3</id>
<content type='text'>
this change should have been made when priority inheritance mutex
support was added. if priority protection is also added at some point
the implementation will need to change and will probably no longer be
a simple bit shuffling.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
this change should have been made when priority inheritance mutex
support was added. if priority protection is also added at some point
the implementation will need to change and will probably no longer be
a simple bit shuffling.
</pre>
</div>
</content>
</entry>
<entry>
<title>fix failure to preserve r6 in s390x asm; per ABI it is call-saved</title>
<updated>2020-12-04T22:01:05+00:00</updated>
<author>
<name>Rich Felker</name>
<email>dalias@aerifal.cx</email>
</author>
<published>2020-12-04T22:01:05+00:00</published>
<link rel='alternate' type='text/html' href='http://git.musl-libc.org/cgit/musl/commit/?id=50c7935cd21f33f42013447d57bc19d2e6318aae'/>
<id>50c7935cd21f33f42013447d57bc19d2e6318aae</id>
<content type='text'>
both __clone and __syscall_cp_asm failed to restore the original value
of r6 after using it as a syscall argument register. the extent of
breakage is not known, and in some cases may be mitigated by the only
callers being internal to libc; if they used r6 but no longer needed
its value after the call, they may not have noticed the problem.
however at least posix_spawn (which uses __clone) was observed
returning to the application with the wrong value in r6, leading to
crash.

since the call frame ABI already provides a place to spill registers,
fixing this is just a matter of using it. in __clone, we also
spuriously restore r6 in the child, since the parent branch directly
returns to the caller. this takes the value from an uninitialized slot
of the child's stack, but is harmless since there is no caller to
return to in the child.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
both __clone and __syscall_cp_asm failed to restore the original value
of r6 after using it as a syscall argument register. the extent of
breakage is not known, and in some cases may be mitigated by the only
callers being internal to libc; if they used r6 but no longer needed
its value after the call, they may not have noticed the problem.
however at least posix_spawn (which uses __clone) was observed
returning to the application with the wrong value in r6, leading to
crash.

since the call frame ABI already provides a place to spill registers,
fixing this is just a matter of using it. in __clone, we also
spuriously restore r6 in the child, since the parent branch directly
returns to the caller. this takes the value from an uninitialized slot
of the child's stack, but is harmless since there is no caller to
return to in the child.
</pre>
</div>
</content>
</entry>
<entry>
<title>fix regression in pthread_exit</title>
<updated>2020-11-20T15:43:20+00:00</updated>
<author>
<name>Rich Felker</name>
<email>dalias@aerifal.cx</email>
</author>
<published>2020-11-20T15:43:20+00:00</published>
<link rel='alternate' type='text/html' href='http://git.musl-libc.org/cgit/musl/commit/?id=debbddf7c86dfe7fb2f44f057123ccfd950ff555'/>
<id>debbddf7c86dfe7fb2f44f057123ccfd950ff555</id>
<content type='text'>
commit d26e0774a59bb7245b205bc8e7d8b35cc2037095 moved the detach state
transition at exit before the thread list lock was taken. this
inadvertently allowed pthread_join to race to take the thread list
lock first, and proceed with unmapping of the exiting thread's memory.

we could fix this by just revering the offending commit and instead
performing __vm_wait unconditionally before taking the thread list
lock, but that may be costly. instead, bring back the old DT_EXITING
vs DT_EXITED state distinction that was removed in commit
8f11e6127fe93093f81a52b15bb1537edc3fc8af, and don't transition to
DT_EXITED (a value of 0, which is what pthread_join waits for) until
after the lock has been taken.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit d26e0774a59bb7245b205bc8e7d8b35cc2037095 moved the detach state
transition at exit before the thread list lock was taken. this
inadvertently allowed pthread_join to race to take the thread list
lock first, and proceed with unmapping of the exiting thread's memory.

we could fix this by just revering the offending commit and instead
performing __vm_wait unconditionally before taking the thread list
lock, but that may be costly. instead, bring back the old DT_EXITING
vs DT_EXITED state distinction that was removed in commit
8f11e6127fe93093f81a52b15bb1537edc3fc8af, and don't transition to
DT_EXITED (a value of 0, which is what pthread_join waits for) until
after the lock has been taken.
</pre>
</div>
</content>
</entry>
<entry>
<title>protect destruction of process-shared mutexes against robust list races</title>
<updated>2020-11-19T21:36:49+00:00</updated>
<author>
<name>Rich Felker</name>
<email>dalias@aerifal.cx</email>
</author>
<published>2020-11-19T21:20:45+00:00</published>
<link rel='alternate' type='text/html' href='http://git.musl-libc.org/cgit/musl/commit/?id=233bb6972d84e9cb908ff038f78d97e487082225'/>
<id>233bb6972d84e9cb908ff038f78d97e487082225</id>
<content type='text'>
after a non-normal-type process-shared mutex is unlocked, it's
immediately available to another thread to lock, unlock, and destroy,
but the first unlocking thread may still have a pointer to it in its
robust_list pending slot. this means, on async process termination,
the kernel may attempt to access and modify the memory that used to
contain the mutex -- memory that may have been reused for some other
purpose after the mutex was destroyed.

setting up for this kind of race to occur is difficult to begin with,
requiring dynamic use of shared memory maps, and actually hitting the
race is very difficult even with a suitable setup. so this is mostly a
theoretical fix, but in any case the cost is very low.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
after a non-normal-type process-shared mutex is unlocked, it's
immediately available to another thread to lock, unlock, and destroy,
but the first unlocking thread may still have a pointer to it in its
robust_list pending slot. this means, on async process termination,
the kernel may attempt to access and modify the memory that used to
contain the mutex -- memory that may have been reused for some other
purpose after the mutex was destroyed.

setting up for this kind of race to occur is difficult to begin with,
requiring dynamic use of shared memory maps, and actually hitting the
race is very difficult even with a suitable setup. so this is mostly a
theoretical fix, but in any case the cost is very low.
</pre>
</div>
</content>
</entry>
<entry>
<title>pthread_exit: don't __vm_wait under thread list lock</title>
<updated>2020-11-19T21:09:16+00:00</updated>
<author>
<name>Rich Felker</name>
<email>dalias@aerifal.cx</email>
</author>
<published>2020-11-19T21:09:16+00:00</published>
<link rel='alternate' type='text/html' href='http://git.musl-libc.org/cgit/musl/commit/?id=d26e0774a59bb7245b205bc8e7d8b35cc2037095'/>
<id>d26e0774a59bb7245b205bc8e7d8b35cc2037095</id>
<content type='text'>
the __vm_wait operation can delay forward progress arbitrarily long if
a thread holding the lock is interrupted by a signal. in a worst case
this can deadlock. any critical section holding the thread list lock
must respect lock ordering contracts and must not take any lock which
is not AS-safe.

to fix, move the determination of thread joinable/detached state to
take place before the killlock and thread list lock are taken. this
requires reverting the atomic state transition if we determine that
the exiting thread is the last thread and must call exit, but that's
easy to do since it's a single-threaded context with application
signals blocked.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
the __vm_wait operation can delay forward progress arbitrarily long if
a thread holding the lock is interrupted by a signal. in a worst case
this can deadlock. any critical section holding the thread list lock
must respect lock ordering contracts and must not take any lock which
is not AS-safe.

to fix, move the determination of thread joinable/detached state to
take place before the killlock and thread list lock are taken. this
requires reverting the atomic state transition if we determine that
the exiting thread is the last thread and must call exit, but that's
easy to do since it's a single-threaded context with application
signals blocked.
</pre>
</div>
</content>
</entry>
<entry>
<title>lift child restrictions after multi-threaded fork</title>
<updated>2020-11-11T20:55:30+00:00</updated>
<author>
<name>Rich Felker</name>
<email>dalias@aerifal.cx</email>
</author>
<published>2020-11-11T18:37:33+00:00</published>
<link rel='alternate' type='text/html' href='http://git.musl-libc.org/cgit/musl/commit/?id=167390f05564e0a4d3fcb4329377fd7743267560'/>
<id>167390f05564e0a4d3fcb4329377fd7743267560</id>
<content type='text'>
as the outcome of Austin Group tracker issue #62, future editions of
POSIX have dropped the requirement that fork be AS-safe. this allows
but does not require implementations to synchronize fork with internal
locks and give forked children of multithreaded parents a partly or
fully unrestricted execution environment where they can continue to
use the standard library (per POSIX, they can only portably use
AS-safe functions).

up until recently, taking this allowance did not seem desirable.
however, commit 8ed2bd8bfcb4ea6448afb55a941f4b5b2b0398c0 exposed the
extent to which applications and libraries are depending on the
ability to use malloc and other non-AS-safe interfaces in MT-forked
children, by converting latent very-low-probability catastrophic state
corruption into predictable deadlock. dealing with the fallout has
been a huge burden for users/distros.

while it looks like most of the non-portable usage in applications
could be fixed given sufficient effort, at least some of it seems to
occur in language runtimes which are exposing the ability to run
unrestricted code in the child as part of the contract with the
programmer. any attempt at fixing such contracts is not just a
technical problem but a social one, and is probably not tractable.

this patch extends the fork function to take locks for all libc
singletons in the parent, and release or reset those locks in the
child, so that when the underlying fork operation takes place, the
state protected by these locks is consistent and ready for the child
to use. locking is skipped in the case where the parent is
single-threaded so as not to interfere with legacy AS-safety property
of fork in single-threaded programs. lock order is mostly arbitrary,
but the malloc locks (including bump allocator in case it's used) must
be taken after the locks on any subsystems that might use malloc, and
non-AS-safe locks cannot be taken while the thread list lock is held,
imposing a requirement that it be taken last.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
as the outcome of Austin Group tracker issue #62, future editions of
POSIX have dropped the requirement that fork be AS-safe. this allows
but does not require implementations to synchronize fork with internal
locks and give forked children of multithreaded parents a partly or
fully unrestricted execution environment where they can continue to
use the standard library (per POSIX, they can only portably use
AS-safe functions).

up until recently, taking this allowance did not seem desirable.
however, commit 8ed2bd8bfcb4ea6448afb55a941f4b5b2b0398c0 exposed the
extent to which applications and libraries are depending on the
ability to use malloc and other non-AS-safe interfaces in MT-forked
children, by converting latent very-low-probability catastrophic state
corruption into predictable deadlock. dealing with the fallout has
been a huge burden for users/distros.

while it looks like most of the non-portable usage in applications
could be fixed given sufficient effort, at least some of it seems to
occur in language runtimes which are exposing the ability to run
unrestricted code in the child as part of the contract with the
programmer. any attempt at fixing such contracts is not just a
technical problem but a social one, and is probably not tractable.

this patch extends the fork function to take locks for all libc
singletons in the parent, and release or reset those locks in the
child, so that when the underlying fork operation takes place, the
state protected by these locks is consistent and ready for the child
to use. locking is skipped in the case where the parent is
single-threaded so as not to interfere with legacy AS-safety property
of fork in single-threaded programs. lock order is mostly arbitrary,
but the malloc locks (including bump allocator in case it's used) must
be taken after the locks on any subsystems that might use malloc, and
non-AS-safe locks cannot be taken while the thread list lock is held,
imposing a requirement that it be taken last.
</pre>
</div>
</content>
</entry>
<entry>
<title>convert malloc use under libc-internal locks to use internal allocator</title>
<updated>2020-11-11T18:31:50+00:00</updated>
<author>
<name>Rich Felker</name>
<email>dalias@aerifal.cx</email>
</author>
<published>2020-11-11T18:08:42+00:00</published>
<link rel='alternate' type='text/html' href='http://git.musl-libc.org/cgit/musl/commit/?id=34952fe5de44a833370cbe87b63fb8eec61466d7'/>
<id>34952fe5de44a833370cbe87b63fb8eec61466d7</id>
<content type='text'>
this change lifts undocumented restrictions on calls by replacement
mallocs to libc functions that might take these locks, and sets the
stage for lifting restrictions on the child execution environment
after multithreaded fork.

care is taken to #define macros to replace all four functions (malloc,
calloc, realloc, free) even if not all of them will be used, using an
undefined symbol name for the ones intended not to be used so that any
inadvertent future use will be caught at compile time rather than
directed to the wrong implementation.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
this change lifts undocumented restrictions on calls by replacement
mallocs to libc functions that might take these locks, and sets the
stage for lifting restrictions on the child execution environment
after multithreaded fork.

care is taken to #define macros to replace all four functions (malloc,
calloc, realloc, free) even if not all of them will be used, using an
undefined symbol name for the ones intended not to be used so that any
inadvertent future use will be caught at compile time rather than
directed to the wrong implementation.
</pre>
</div>
</content>
</entry>
<entry>
<title>fix erroneous pthread_cond_wait mutex waiter count logic due to typo</title>
<updated>2020-10-30T20:50:08+00:00</updated>
<author>
<name>Rich Felker</name>
<email>dalias@aerifal.cx</email>
</author>
<published>2020-10-30T20:50:08+00:00</published>
<link rel='alternate' type='text/html' href='http://git.musl-libc.org/cgit/musl/commit/?id=d91a6cf6e369a79587c5665fce9635e5634ca201'/>
<id>d91a6cf6e369a79587c5665fce9635e5634ca201</id>
<content type='text'>
introduced in commit 27b2fc9d6db956359727a66c262f1e69995660aa.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
introduced in commit 27b2fc9d6db956359727a66c262f1e69995660aa.
</pre>
</div>
</content>
</entry>
<entry>
<title>fix missing-wake regression in pthread_cond_wait</title>
<updated>2020-10-30T15:21:06+00:00</updated>
<author>
<name>Rich Felker</name>
<email>dalias@aerifal.cx</email>
</author>
<published>2020-10-30T15:21:06+00:00</published>
<link rel='alternate' type='text/html' href='http://git.musl-libc.org/cgit/musl/commit/?id=27b2fc9d6db956359727a66c262f1e69995660aa'/>
<id>27b2fc9d6db956359727a66c262f1e69995660aa</id>
<content type='text'>
the reasoning in commit 2d0bbe6c788938d1332609c014eeebc1dff966ac was
not entirely correct. while it's true that setting the waiters flag
ensures that the next unlock will perform a wake, it's possible that
the wake is consumed by a mutex waiter that has no relationship with
the condvar wait queue being processed, which then takes the mutex.
when that thread subsequently unlocks, it sees no waiters, and leaves
the rest of the condvar queue stuck.

bring back the waiter count adjustment, but skip it for PI mutexes,
for which a successful lock-after-waiting always sets the waiters bit.
if future changes are made to bring this same waiters-bit contract to
all lock types, this can be reverted.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
the reasoning in commit 2d0bbe6c788938d1332609c014eeebc1dff966ac was
not entirely correct. while it's true that setting the waiters flag
ensures that the next unlock will perform a wake, it's possible that
the wake is consumed by a mutex waiter that has no relationship with
the condvar wait queue being processed, which then takes the mutex.
when that thread subsequently unlocks, it sees no waiters, and leaves
the rest of the condvar queue stuck.

bring back the waiter count adjustment, but skip it for PI mutexes,
for which a successful lock-after-waiting always sets the waiters bit.
if future changes are made to bring this same waiters-bit contract to
all lock types, this can be reverted.
</pre>
</div>
</content>
</entry>
<entry>
<title>fix sem_close unmapping of still-referenced semaphore</title>
<updated>2020-10-28T20:13:45+00:00</updated>
<author>
<name>Rich Felker</name>
<email>dalias@aerifal.cx</email>
</author>
<published>2020-10-28T20:13:45+00:00</published>
<link rel='alternate' type='text/html' href='http://git.musl-libc.org/cgit/musl/commit/?id=f70375df85d26235a45e74559afd69be59e5ff99'/>
<id>f70375df85d26235a45e74559afd69be59e5ff99</id>
<content type='text'>
sem_open is required to return the same sem_t pointer for all
references to the same named semaphore when it's opened more than once
in the same process. thus we keep a table of all the mapped semaphores
and their reference counts. the code path for sem_close checked the
reference count, but then proceeded to unmap the semaphore regardless
of whether the count had reached zero.

add an immediate unlock-and-return for the nonzero refcnt case so the
property of performing the munmap syscall after releasing the lock can
be preserved.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
sem_open is required to return the same sem_t pointer for all
references to the same named semaphore when it's opened more than once
in the same process. thus we keep a table of all the mapped semaphores
and their reference counts. the code path for sem_close checked the
reference count, but then proceeded to unmap the semaphore regardless
of whether the count had reached zero.

add an immediate unlock-and-return for the nonzero refcnt case so the
property of performing the munmap syscall after releasing the lock can
be preserved.
</pre>
</div>
</content>
</entry>
</feed>
