This function is a GNU extension introduced in glibc 2.17.
currently the bfd linker does not seem to create tls segments where
p_vaddr%p_align != 0, but this is valid in ELF and then the runtime
computed tls offset must satisfy
offset%p_align == (base+p_vaddr)%p_align
and in case of local exec tls (main executable) the smallest such
offset must be used (otherwise it is incompatible with the offset
computed by the static linker). the !TLS_ABOVE_TP case is handled
correctly (the offset is negative then in the formula).
the ldso code for TLS_ABOVE_TP is changed so the static tls offset
of each module satisfies the formula.
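as a rough illustration of the formula above for the TLS_ABOVE_TP case, the smallest suitable offset at or after a given starting point can be computed with one masked subtraction. this is a sketch, not the ldso code; the function name and parameters are made up for the example, and align is assumed to be a power of two.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: return the smallest offset >= start such that
 * offset % align == addr % align, where addr stands for base+p_vaddr.
 * align must be a power of two. Unsigned wraparound in (addr - start)
 * is harmless because only the low bits survive the mask. */
static uintptr_t tls_offset_above_tp(uintptr_t start, uintptr_t addr,
                                     uintptr_t align)
{
    return start + ((addr - start) & (align - 1));
}
```

for example, with start 16, addr 0x1008 and align 16, the result is 24, which is congruent to 0x1008 modulo 16.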
this is the first part of a series of patches intended to make
__syscall fully self-contained in the object file produced using
syscall.h, which will make it possible for crt1 code to perform
the (confusingly named) i386 __vsyscall mechanism, which this commit
removes, was introduced before the presence of a valid thread pointer
was mandatory; back then the thread pointer was set up lazily only if
threads were used. the intent was to be able to perform syscalls using
the kernel's fast entry point in the VDSO, which can use the sysenter
(Intel) or syscall (AMD) instruction instead of int $128, but without
inlining an access to the __syscall global at the point of each
syscall, which would incur a significant size cost from PIC setup
everywhere. the mechanism also shuffled registers/calling convention
around to avoid spills of call-saved registers, and to avoid
allocating ebx or ebp via asm constraints, since there are plenty of
broken-but-supported compiler versions which are incapable of
allocating ebx with -fPIC or ebp with -fno-omit-frame-pointer.
the new mechanism preserves the properties of avoiding spills and
avoiding allocation of ebx/ebp in constraints, but does it inline,
using some fairly simple register shuffling, and uses a field of the
thread structure rather than global data for the vdso-provided syscall
for now, the external __syscall function is refactored not to use the
old __vsyscall so it can be kept, but the intent is to remove it too.
the hard problem here is unlinking threads from a list when they exit
without creating a window of inconsistency where the kernel task for a
thread still exists and is still executing instructions in userspace,
but is not reflected in the list. the magic solution here is getting
rid of per-thread exit futex addresses (set_tid_address), and instead
using the exit futex to unlock the global thread list.
since pthread_join can no longer see the thread enter a detach_state
of EXITED (which depended on the exit futex address pointing to the
detach_state), it must now observe the unlocking of the thread list
lock before it can unmap the joined thread and return. it doesn't
actually have to take the lock. for this, a __tl_sync primitive is
offered, with a signature that will allow it to be enhanced for quick
return even under contention on the lock, if needed. for now, the
exiting thread always performs a futex wake on its detach_state. a
future change could optimize this out except when there is already a
initial/dynamic variants of detached state no longer need to be
tracked separately, since the futex address is always set to the
global list lock, not a thread-local address that could become invalid
on detached thread exit. all detached threads, however, must perform a
second sigprocmask syscall to block implementation-internal signals,
since locking the thread list with them already blocked is not
the arch-independent C version of __unmapself no longer needs to take
a lock or set up its own futex address to release the lock, since it
must necessarily be called with the thread list lock already held,
guaranteeing exclusive access to the temporary stack.
changes to libc.threads_minus_1 no longer need to be atomic, since
they are guarded by the thread list lock. it is largely vestigial at
this point, and can be replaced with a cheaper boolean indicating
whether the process is multithreaded at some point in the future.
Use "+r" in the asm instead of implementing a non-transparent copy by
applying "0" constraint to the source value. Introduce a typedef for
the function type to avoid spelling it out twice.
this is not needed for correctness, but doesn't hurt, and in some
cases the compiler may pessimize the call assuming the callee might be
variadic when it lacks a prototype.
commit 4390383b32250a941ec616e8bff6f568a801b1c0 inadvertently used "r"
instead of "0" for the input constraint, which only happened to work
for the configuration I tested it on because it usually makes sense
for the compiler to choose the same input and output register.
on multiple occasions I've started to flatten/inline the code in
__init_libc, only to rediscover the reason it was not inlined: GCC
fails to deallocate its stack (and now, with the changes in commit
4390383b32250a941ec616e8bff6f568a801b1c0, fails to produce a tail call
to the stage 2 function; see PR #87639) before calling main if it was
document this with a comment and use an explicit noinline attribute if
__GNUC__ is defined so that even with CFLAGS that heavily favor
inlining it won't get inlined.
this is the analog of commit 1c84c99913bf1cd47b866ed31e665848a0da84a2
for static linking. unlike with dynamic linking, we don't have
symbolic lookup to use as a barrier. use a dummy (target-agnostic)
degenerate inline asm fragment instead. this technique has precedent
in commit 05ac345f895098657cf44d419b5d572161ebaf43 where it's used for
explicit_bzero. if it proves problematic in any way, loading the
address of the stage 2 function from a pointer object whose address
leaks to kernelspace during thread pointer init could be used as an
even stronger barrier.
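a minimal sketch of the degenerate asm barrier idea, assuming a GCC-compatible compiler. the names stage1, stage2 and stage2_func are placeholders for this example, not the actual libc symbols. the empty asm with a "+r" operand forces the compiler to treat the pointer as possibly modified, so it cannot inline or reorder across the call based on knowing the callee.

```c
/* Typedef for the function type so it is spelled out only once. */
typedef int stage2_func(int);

static int stage2(int x)
{
    return x + 1;
}

static int stage1(int x)
{
    stage2_func *f = stage2;
    /* Empty, target-agnostic asm: "+r" marks f as both read and
     * written, so the compiler must assume its value is unknown
     * afterwards and has to emit a real indirect call. */
    __asm__ volatile ("" : "+r"(f) : : "memory");
    return f(x);
}
```

calling stage1(41) still yields 42; only the compiler's knowledge of the call target changes, not the behavior.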
as explained in commit 6ba5517a460c6c438f64d69464fdfc3269a4c91a, some
archs use an offset (typically -0x8000) with their DTPOFF relocations,
which __tls_get_addr needs to invert. on affected archs, which lack
direct support for large immediates, this can cost multiple extra
instructions in the hot path. instead, incorporate the DTP_OFFSET into
the DTV entries. this means they are no longer valid pointers, so
store them as an array of uintptr_t rather than void *; this also
makes it easier to access slot 0 as a valid slot count.
commit e75b16cf93ebbc1ce758d3ea6b2923e8b2457c68 left behind cruft in
two places, __reset_tls and __tls_get_new, from back when it was
possible to have uninitialized gap slots indicated by a null pointer
in the DTV. since the concept of null pointer is no longer meaningful
with an offset applied, remove this cruft.
presently there are no archs with both TLSDESC and nonzero DTP_OFFSET,
but the dynamic TLSDESC relocation code is also updated to apply an
inverted offset to its offset field, so that the offset DTV would not
impose a runtime cost in TLSDESC resolver functions.
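the effect on the hot path can be sketched as follows. this is an illustration only: dtv_entry and tls_lookup are made-up names, the DTP_OFFSET value is assumed, and v[] stands for the module id and already-biased offset pair that __tls_get_addr receives.

```c
#include <stdint.h>

#define DTP_OFFSET 0x8000 /* assumed value for the example */

/* Bias applied once, when the DTV slot is filled in. The result is
 * not a valid pointer, hence uintptr_t rather than void *. */
static uintptr_t dtv_entry(void *block)
{
    return (uintptr_t)block + DTP_OFFSET;
}

/* Hot path: v[0] is the module id, v[1] is the relocated offset,
 * which already has DTP_OFFSET subtracted by the relocation.
 * The two biases cancel, so no extra add is needed here. */
static void *tls_lookup(const uintptr_t *dtv, const uintptr_t v[2])
{
    return (void *)(dtv[v[0]] + v[1]);
}
```

with slot 0 holding the slot count, a lookup of offset 24 in module 1 passes v = {1, 24 - DTP_OFFSET} and gets back the block address plus 24.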
this facilitates building software that assumes a large default stack
size without any patching to call pthread_setattr_default_np or
pthread_attr_setstacksize at each thread creation site, using just
normally the PT_GNU_STACK header is used only to reflect whether
executable stack is desired, but with GNU ld at least, passing
-Wl,-z,stack-size=N will set a size on the program header. with this
patch, that size will be incorporated into the default stack size
(subject to increase-only rule and DEFAULT_STACK_MAX limit).
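the increase-only rule and the cap can be sketched like this. the struct is a simplified stand-in for an ELF program header, apply_stack_hints is a hypothetical name, and the DEFAULT_STACK_MAX value is assumed for the example; the current default is assumed not to exceed the cap on entry.

```c
#include <stddef.h>
#include <stdint.h>

#define PT_GNU_STACK 0x6474e551u
#define DEFAULT_STACK_MAX (8u << 20) /* assumed cap for the example */

/* Simplified program header: only the fields this sketch needs. */
struct stack_phdr {
    uint32_t p_type;
    size_t p_memsz;
};

/* Return the new default stack size after considering n headers.
 * Sizes smaller than the current default are ignored (increase-only),
 * and larger ones are clamped to DEFAULT_STACK_MAX. */
static size_t apply_stack_hints(size_t cur, const struct stack_phdr *ph,
                                size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (ph[i].p_type != PT_GNU_STACK) continue;
        if (ph[i].p_memsz <= cur) continue;
        cur = ph[i].p_memsz < DEFAULT_STACK_MAX
            ? ph[i].p_memsz : DEFAULT_STACK_MAX;
    }
    return cur;
}
```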
both static and dynamic linking honor the program header. for dynamic
linking, all libraries loaded at program start, including preloaded
ones, are considered. dlopened libraries are not considered, for
several reasons. extra logic would be needed to defer processing until
the load of the new library is committed, synchronization would be
needed since other threads may be running concurrently, and the
effectiveness would be limited since the larger size would not apply to
threads that already existed at the time of dlopen. programs that will
dlopen code expecting a large stack need to declare the requirement
themselves, or pthread_setattr_default_np can be used.
libc.h was intended to be a header for access to global libc state and
related interfaces, but ended up included all over the place because
it was the way to get the weak_alias macro. most of the inclusions
removed here are places where weak_alias was needed. a few were
recently introduced for hidden. some go all the way back to when
libc.h defined CANCELPT_BEGIN and _END, and all (wrongly implemented)
cancellation points had to include it.
remaining spurious users are mostly callers of the LOCK/UNLOCK macros
and files that use the LFS64 macro to define the awful *64 aliases.
in a few places, new inclusion of libc.h is added because several
internal headers no longer implicitly include libc.h.
declarations for __lockfile and __unlockfile are moved from libc.h to
stdio_impl.h so that the latter does not need libc.h. putting them in
libc.h made no sense at all, since the macros in stdio_impl.h are
needed to use them correctly anyway.
commits leading up to this one have moved the vast majority of
libc-internal interface declarations to appropriate internal headers,
allowing them to be type-checked and setting the stage to limit their
visibility. the ones that have not yet been moved are mostly
namespace-protected aliases for standard/public interfaces, which
exist to facilitate implementing plain C functions in terms of POSIX
functionality, or C or POSIX functionality in terms of extensions that
are not standardized. some don't quite fit this description, but are
"internally public" interfaces between subsystems of libc.
rather than create a number of newly-named headers to declare these
functions, and having to add explicit include directives for them to
every source file where they're needed, I have introduced a method of
wrapping the corresponding public headers.
parallel to the public headers in $(srcdir)/include, we now have
wrappers in $(srcdir)/src/include that come earlier in the include
path order. they include the public header they're wrapping, then add
declarations for namespace-protected versions of the same interfaces
and any "internally public" interfaces for the subsystem they
along these lines, the wrapper for features.h is now responsible for
the definition of the hidden, weak, and weak_alias macros. this means
source files will no longer need to include any special headers to
access these features.
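for readers unfamiliar with these macros, definitions along the following lines are typical for GCC-compatible compilers on ELF targets. this is a sketch under that assumption, not a copy of the wrapper header; demo_impl and demo_answer are invented names.

```c
/* Attribute shorthands, assuming GNU C attribute syntax. */
#define weak __attribute__((__weak__))
#define hidden __attribute__((__visibility__("hidden")))

/* Declare `new` as a weak symbol that aliases the definition `old`.
 * Requires alias support in the object format (e.g. ELF). */
#define weak_alias(old, new) \
    extern __typeof(old) new __attribute__((__weak__, __alias__(#old)))

/* Internal implementation, not exported from a shared library. */
hidden int demo_impl(void)
{
    return 42;
}

/* Public name that another definition could override at link time. */
weak_alias(demo_impl, demo_answer);
```

a caller using demo_answer() reaches demo_impl unless a strong definition of demo_answer is linked in.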
over time, it is my expectation that the scope of what is "internally
public" will expand, reducing the number of source files which need to
include *_impl.h and related headers down to those which are actually
implementing the corresponding subsystems, not just using them.
this cleans up what had become widespread direct inline use of "GNU C"
style attributes directly in the source, and lowers the barrier to
increased use of hidden visibility, which will be useful to recovering
some of the efficiency lost when the protected visibility hack was
dropped in commit dc2f368e565c37728b0d620380b849c3a1ddd78f, especially
on archs where the PLT ABI is costly.
In TLS variant I the TLS is above TP (or above a fixed offset from TP)
but on some targets there is a reserved gap above TP before TLS starts.
This matters for the local-exec tls access model when the offsets of
TLS variables from the TP are hard coded by the linker into the
executable, so the libc must compute these offsets the same way as the
linker. The tls offset of the main module has to be
If there is no TLS in the main module then the gap can be ignored
since musl does not use it and the tls access models of shared
libraries are not affected.
The previous setup only worked if (tls_align & -GAP_ABOVE_TP) == 0
(i.e. TLS did not require large alignment) because the gap was
treated as a fixed offset from TP. Now the TP points at the end
of the pthread struct (which is aligned) and there is a gap above
it (which may also need alignment).
The fix required changing TP_ADJ and __pthread_self on affected
targets (aarch64, arm and sh) and in the tlsdesc asm the offset to
access the dtv changed too.
previously, some accesses to the detached state (from pthread_join and
pthread_getattr_np) were unsynchronized; they were harmless in
programs with well-defined behavior, but ugly. other accesses (in
pthread_exit and pthread_detach) were synchronized by a poorly named
"exitlock", with an ad-hoc trylock operation on it open-coded in
pthread_detach, whose only purpose was establishing protocol for which
thread is responsible for deallocation of detached-thread resources.
instead, use an atomic detach_state and unify it with the futex used
to wait for thread exit. this eliminates 2 members from the pthread
structure, gets rid of the hackish lock usage, and makes rigorous the
trap added in commit 80bf5952551c002cf12d96deb145629765272db0 for
catching attempts to join detached threads. it should also make
attempts to detach an already-detached thread reliably trap.
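the core of the atomic detach_state idea can be sketched with C11 atomics. the enum names, the struct and try_detach are assumptions for this example; the real code uses the libc-internal atomic primitives and does more work when the transition fails.

```c
#include <stdatomic.h>

/* Assumed state names for the example. */
enum { DT_EXITED, DT_EXITING, DT_JOINABLE, DT_DETACHED };

struct thread_state {
    atomic_int detach_state;
};

/* Try to move the thread from joinable to detached in one atomic
 * step. Returns nonzero only for the single caller that performed
 * the transition; any other state (already detached, exiting or
 * exited) makes the compare-and-swap fail. */
static int try_detach(struct thread_state *t)
{
    int expected = DT_JOINABLE;
    return atomic_compare_exchange_strong(&t->detach_state,
                                          &expected, DT_DETACHED);
}
```

because exactly one caller can win the compare-and-swap, it is unambiguous which thread is responsible for freeing the detached thread's resources, and a second detach attempt is reliably detected.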
the tid field in the pthread structure is not volatile, and really
shouldn't be, so as not to limit the compiler's ability to reorder,
merge, or split loads in code paths that may be relevant to
performance (like controlling lock ownership).
however, use of objects which are not volatile or atomic with futex
wait is inherently broken, since the compiler is free to transform a
single load into multiple loads, thereby using a different value for
the controlling expression of the loop and the value passed to the
futex syscall, leading the syscall to block instead of returning.
reportedly glibc's pthread_join was actually affected by an equivalent
issue on s390.
add a separate, dedicated join_futex object for pthread_join to use.
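the safe pattern is to load the futex word exactly once per iteration through a volatile (or atomic) object and use that same value for both the loop test and the wait call. the sketch below uses a stub in place of the real futex syscall so it stays portable; fake_futex_wait, wait_calls and wait_until_zero are invented for the example.

```c
/* Counts how often the stub was entered, for demonstration only. */
static int wait_calls;

/* Stand-in for futex wait: a real wait would block only if *addr
 * still equals val. Here a wake-up is simulated by clearing the
 * word when the values match. */
static void fake_futex_wait(volatile int *addr, int val)
{
    wait_calls++;
    if (*addr == val) *addr = 0;
}

/* One load per iteration: val is used both as the loop condition
 * and as the expected value passed to the wait, so the two can
 * never disagree. The volatile qualifier prevents the compiler
 * from splitting the load. */
static void wait_until_zero(volatile int *addr)
{
    int val;
    while ((val = *addr))
        fake_futex_wait(addr, val);
}
```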
it was reported by Erik Bosman that poll fails without setting revents
when the nfds argument exceeds the current value for RLIMIT_NOFILE,
causing the subsequent open calls to be bypassed. if the rlimit is
either 1 or 2, this leaves fd 0 and 1 potentially closed but openable
when the application code is reached.
based on a brief reading of the poll syscall documentation and code,
it may be possible for poll to fail under other attacker-controlled
conditions as well. if it turns out these are reasonable conditions
that may happen in the real world, we may have to go back and
implement fallbacks to probe each fd individually if poll fails, but
for now, keep things simple and treat all poll failures as fatal.
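a sketch of the check described above, using the public poll function rather than a raw syscall. count_closed_std_fds is a hypothetical helper; a caller following the policy in this commit would abort on a negative return instead of continuing.

```c
#include <poll.h>

/* Return how many of fds 0, 1 and 2 are not open, or -1 if poll
 * itself fails. With events left at 0, poll still reports POLLNVAL
 * in revents for descriptors that are not open. */
static int count_closed_std_fds(void)
{
    struct pollfd pfd[3] = { {.fd = 0}, {.fd = 1}, {.fd = 2} };
    int closed = 0;

    if (poll(pfd, 3, 0) < 0)
        return -1; /* treat as fatal in the caller */

    for (int i = 0; i < 3; i++)
        if (pfd[i].revents & POLLNVAL)
            closed++;
    return closed;
}
```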
this is for consistency with the way it's done in the dynamic
linker, avoiding a deprecated C feature (non-prototype function
types), and improving code generation. GCC unnecessarily uses the
variadic calling convention (e.g. clearing rax on x86_64) when making
a call where the argument types are not known for compatibility with
wrong code which calls variadic functions this way. (C on the other
hand is clear that such calls have undefined behavior.)
This aligns clearenv with the Linux man page by setting 'environ'
rather than '*environ' to NULL, and stops it from leaking entries
allocated by the libc.
Rewrite environment access functions to slim down code, fix bugs and
avoid invoking undefined behavior.
* avoid using int-typed iterators where size_t would be correct;
* use strncmp instead of memcmp consistently;
* tighten prologues by invoking __strchrnul;
* handle NULL environ.
* handle "=value" input via unsetenv too (will return -1/EINVAL);
* rewrite and simplify __putenv; fix the leak caused by failure to
deallocate entry added by preceding setenv when called from putenv.
* move management of libc-allocated entries to this translation unit,
and use no-op weak symbols in putenv/unsetenv;
* rewrite; this fixes UB caused by testing a free'd pointer against
NULL on entry to subsequent loops.
Failure to extend the allocation tracking array (previously __env_map,
now env_alloced) is ignored rather than reported to the caller as
-1/ENOMEM; the worst-case consequence is leaking this allocation when
it is removed or replaced in a subsequent environment access.
Initially UB in unsetenv was reported by Alexander Cherepanov.
Using a weak alias to avoid pulling in malloc via unsetenv was
suggested by Rich Felker.
It is possible for argv to be a null pointer, but the __progname
variable is used to implement functions in src/legacy/err.c that do not
expect it to be null. It is also available to the user via the
program_invocation_name alias as a GNU extension, and the implementation
in Glibc initializes it to a pointer to empty string rather than NULL.
Since argv is usually non-null and it's preferable to keep those
variables in BSS, implement the fallbacks in __init_libc, which also
allows an intermediate fallback to AT_EXECFN.
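The fallback chain can be sketched as below. This is an illustration, not the __init_libc code: pick_progname and struct prognames are invented names, and execfn stands for the AT_EXECFN string from the aux vector (or a null pointer if absent).

```c
#include <stddef.h>

/* Full name and the part after the last slash. */
struct prognames {
    const char *full;
    const char *shortname;
};

/* Prefer argv[0], then the AT_EXECFN string, then an empty string,
 * so neither result is ever a null pointer. */
static struct prognames pick_progname(const char *argv0,
                                      const char *execfn)
{
    const char *pn = argv0;
    if (!pn) pn = execfn;
    if (!pn) pn = "";

    struct prognames r = { pn, pn };
    for (size_t i = 0; pn[i]; i++)
        if (pn[i] == '/')
            r.shortname = pn + i + 1;
    return r;
}
```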
the static-linked version of __init_tls needs to locate the TLS
initialization image via the ELF program headers, which requires
determining the base address at which the program was loaded. the
existing code attempted to do this by comparing the actual address of
the program headers (obtained via auxv) with the virtual address for
the PT_PHDR record in the program headers. however, the linker seems
to produce a PT_PHDR record only when a program interpreter (dynamic
linker) is used. thus the computation failed and used the default base
address of 0, leading to a crash when trying to access the TLS image
at the wrong address.
the dynamic linker entry point and static-PIE rcrt1.o startup code
compute the base address instead by taking the difference between the
run-time address of _DYNAMIC and the virtual address in the PT_DYNAMIC
record. this patch copies the approach they use, but with a weak
symbolic reference to _DYNAMIC instead of obtaining the address from
the crt_arch.h asm. this works because relocations have already been
performed at the time __init_tls is called.
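the base address computation amounts to one subtraction once the PT_DYNAMIC record is found. the sketch below uses a simplified program header struct and a hypothetical load_base function; dynamic_addr stands for the run-time address of _DYNAMIC.

```c
#include <stddef.h>
#include <stdint.h>

#define PT_DYNAMIC 2

/* Simplified program header: only the fields this sketch needs. */
struct dyn_phdr {
    uint32_t p_type;
    uintptr_t p_vaddr;
};

/* Load base = run-time address of _DYNAMIC minus the link-time
 * virtual address recorded in PT_DYNAMIC. Returns 0 when there is
 * no PT_DYNAMIC record, matching a non-relocated program. */
static uintptr_t load_base(const struct dyn_phdr *ph, size_t n,
                           uintptr_t dynamic_addr)
{
    for (size_t i = 0; i < n; i++)
        if (ph[i].p_type == PT_DYNAMIC)
            return dynamic_addr - ph[i].p_vaddr;
    return 0;
}
```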
This is the minimal fix for __putenv leaving a pointer to freed heap
storage in __env_map array, which could later on lead to errors such
commit ad1cd43a86645ba2d4f7c8747240452a349d6bc1 eliminated
preprocessor-level omission of references to the init/fini array
symbols from object files going into libc.so. the references are weak,
and the intent was that the linker would resolve them to zero in
libc.so, but instead it leaves undefined references that could be
satisfied at runtime. normally these references would be harmless,
since the code using them does not even get executed, but some older
binutils versions produce a linking error: when linking a program
against libc.so, ld first tries to use the hidden init/fini array
symbols produced by the linker script to satisfy the references in
libc.so, then produces an error because the definitions are hidden.
ideally ld would have already provided definitions of these symbols
when linking libc.so, but the linker script for -shared omits them.
to avoid this situation, the dynamic linker now provides its own dummy
definitions of the init/fini array symbols for libc.so. since they are
hidden, everything binds at ld time and no references remain in the
dynamic symbol table. with modern binutils and --gc-sections, both
the dummy empty array objects and the code referencing them get
dropped at link time, anyway.
the _init and _fini symbols are also switched back to using weak
definitions rather than weak references since the latter behave
somewhat problematically in general, and the weak definition approach
was known to work well.
this both allows removal of some of the main remaining uses of the
SHARED macro and clears one obstacle to static-linked dlopen support,
which may be added at some point in the future.
specialized single-TLS-module versions of __copy_tls and __reset_tls
are removed and replaced with code adapted from their dynamic-linked
versions, capable of operating on a whole chain of TLS modules, and
use of the dynamic linker's DSO chain (which contains large struct dso
objects) by these functions is replaced with a new chain of struct
tls_module objects containing only the information needed for
implementing TLS. this may also yield some performance benefit
initializing TLS for a new thread when a large number of modules
without TLS have been loaded, since there is no need to walk
structures for modules without TLS.
use weak definitions that the dynamic linker can override instead of
preprocessor conditionals on SHARED so that the same libc start and
exit code can be used for both static and dynamic linking.
this is the first and simplest stage of removal of the SHARED macro,
which will eventually allow libc.a and libc.so to be produced from the
same object files.
the original motivation for these #ifdefs which are now being removed
was to allow building a static-only libc using a compiler that does
not support visibility. however, SHARED was the wrong condition to
test for this anyway; various assembly-language sources refer to
hidden symbols and declare them with the .hidden directive, making it
wrong to define the referenced symbols as non-hidden. if there is a
need in the future to build libc using compilers that lack visibility,
support could be moved to the build system or perhaps the __PIC__
macro could be checked instead of SHARED.
this change is needed to be compatible with fdpic, where some of the
main application's relocations may be performed as part of the crt1
entry point. if we call init functions before passing control, these
relocations will not yet have been performed, and the init code will
potentially make use of invalid pointers.
conceptually, no code provided by the application or third-party
libraries should run before the application entry point. the
difference is not observable to programs using the crt1 we provide,
but it could come into play if custom entry point code is used, so
it's better to be doing this right anyway.
this symbol is needed only on archs where the PLT call ABI is klunky,
and only for position-independent code compiled with stack protector.
thus references usually only appear in shared libraries or PIE
executables, but they can also appear when linking statically if some
of the object files being linked were built as PIC/PIE.
normally libssp_nonshared.a from the compiler toolchain should provide
__stack_chk_fail_local, but reportedly it appears prior to -lc in the
link order, thus failing to satisfy references from libc itself (which
arise only if libc.a was built as PIC/PIE with stack protector
i386, x86_64, x32, and powerpc all use TLS for stack protector canary
values in the default stack protector ABI, but the location only
matched the ABI on i386 and x86_64. on x32, the expected location for
the canary contained the tid, thus producing spurious mismatches
(resulting in process termination) upon fork. on powerpc, the expected
location contained the stdio_locks list head, so returning from a
function after calling flockfile produced spurious mismatches. in both
cases, the random canary was not present, and a predictable value was
used instead, making the stack protector hardening much less effective
than it should be.
in the current fix, the thread structure has been expanded to have
canary fields at all three possible locations, and archs that use a
non-default location must define a macro in pthread_arch.h to choose
which location is used. for most archs (which lack TLS canary ABI) the
choice does not matter.
both static and dynamic linked versions of the __copy_tls function
have a hidden assumption that the alignment of the beginning or end of
the memory passed is suitable for storing an array of pointers for the
dtv. pthread_create satisfies this requirement except when
libc.tls_size is misaligned, which cannot happen with dynamic linking
due to the way update_tls_size computes the total size, but could happen
with static linking and odd-sized TLS.
commit dab441aea240f3b7c18a26d2ef51979ea36c301c, which made thread
pointer init mandatory for all programs, rendered this store obsolete
by removing the early-return path for static programs with no TLS.
this slightly reduces the code size cost of TLS/thread-pointer for
static linking since __init_tp can be inlined into its only caller and
removed. this is analogous to the handling of __init_libc in
__libc_start_main, where the function only has external linkage when
it needs to be called from the dynamic linker.
these are used as hidden by asm files (and such use is the whole
reason they exist), but their actual definitions were not hidden.
part of the goal here is to eliminate use of the ATTR_LIBC_VISIBILITY
macro outside of libc.h, since it was never intended to be 'public'.
this was already essentially possible as a result of the previous
commits changing the dynamic linker/thread pointer bootstrap process.
this commit mainly adds build system infrastructure:
configure no longer attempts to disable stack protector. instead it
simply determines how to do so, so that the makefile can disable stack
protector for a few translation units used during early startup.
stack protector is also disabled for memcpy and memset since compilers
(incorrectly) generate calls to them on some archs to implement
struct initialization and assignment, and such calls may creep into
no explicit attempt to enable stack protector is made by configure at
this time; any stack protector option supported by the compiler can be
passed to configure in CFLAGS, and if the compiler uses stack
protector by default, this default is respected.
since 1.1.0, musl has nominally required a thread pointer to be set up.
most of the remaining code that was checking for its availability was
doing so for the sake of being usable by the dynamic linker. as of
commit 71f099cb7db821c51d8f39dfac622c61e54d794c, this is no longer
necessary; the thread pointer is now valid before any libc code
(outside of dynamic linker bootstrap functions) runs.
this commit essentially concludes "phase 3" of the "transition path
for removing lazy init of thread pointer" project that began during
the 1.1.0 release cycle.
as a result of commit 12e1e324683a1d381b7f15dd36c99b37dd44d940, kernel
processing of the robust list is only needed for process-shared
mutexes. previously the first attempt to lock any owner-tracked mutex
resulted in robust list initialization and a set_robust_list syscall.
this is no longer necessary, and since the kernel's record of the
robust list must now be cleared at thread exit time for detached
threads, optimizing it out is more worthwhile than before too.
There are two main abi variants for thread local storage layout:
(1) TLS is above the thread pointer at a fixed offset and the pthread
struct is below that. So the end of the struct is at known offset.
(2) the thread pointer points to the pthread struct and TLS starts
below it. So the start of the struct is at known (zero) offset.
Assembly code for the dynamic TLSDESC callback needs to access the
dynamic thread vector (dtv) pointer which is currently at the front
of the pthread struct. So in case of (1) the asm code needs to hard
code the offset from the end of the struct which can easily break if
the struct changes.
This commit adds a copy of the dtv at the end of the struct. New members
must not be added after dtv_copy, only before it. The size of the struct
is increased a bit, but there is opportunity for size optimizations.
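The layout rule can be made checkable at compile time. The struct below is a cut-down stand-in, not the real pthread struct, and DTV_COPY_FROM_END is a name invented for this sketch; it shows the kind of constant that asm for variant (1) could rely on.

```c
#include <stddef.h>

/* Reduced stand-in for the thread structure. */
struct thread_tail_demo {
    struct thread_tail_demo *self;
    void **dtv;      /* at the front, used by variant (2) */
    int tid;
    /* ... new members must be added above dtv_copy ... */
    void **dtv_copy; /* at the end, used by variant (1) */
};

/* Offset of dtv_copy measured back from the end of the struct. */
#define DTV_COPY_FROM_END \
    ((ptrdiff_t)offsetof(struct thread_tail_demo, dtv_copy) \
     - (ptrdiff_t)sizeof(struct thread_tail_demo))

/* Fails to compile if a member is added after dtv_copy. */
_Static_assert(offsetof(struct thread_tail_demo, dtv_copy)
               + sizeof(void **) == sizeof(struct thread_tail_demo),
               "dtv_copy must be the last member");
```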
a conservative estimate of 4*sizeof(size_t) was used as the minimum
alignment for thread-local storage, despite the only requirements
being alignment suitable for struct pthread and void* (which struct
pthread already contains). additional alignment required by the
application or libraries is encoded in their headers and is already
over-alignment prevented the builtin_tls array from ever being used in
dynamic-linked programs on 64-bit archs, thereby requiring allocation
at startup even in programs with no TLS of their own.
C99 6.10.3p11 disallows such constructs
so use an #ifdef outside of the argument list of __syscall
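a small illustration of the conforming form, with invented names (CALL2, add, compute, DEMO_BIG_BIAS) standing in for __syscall and the real condition. the conditional wraps whole macro invocations instead of appearing between the parentheses of one.

```c
/* Function-like macro standing in for __syscall. */
#define CALL2(fn, a, b) fn((a), (b))

static int add(int a, int b)
{
    return a + b;
}

static int compute(int x)
{
    /* The #ifdef selects between two complete invocations; no
     * preprocessing directive appears inside an argument list. */
#ifdef DEMO_BIG_BIAS
    return CALL2(add, x, 1000);
#else
    return CALL2(add, x, 1);
#endif
}
```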
the main motivation for this change is to remove the assumption that
the tid of the main thread is also the pid of the process. (the value
returned by the set_tid_address syscall was used to fill both fields
despite it semantically being the tid.) this is historically and
presently true on linux and unlikely to change, but it conceivably
could be false on other systems that otherwise reproduce the linux
only a few parts of the code were actually still using the cached pid.
in a couple places (aio and synccall) it was a minor optimization to
avoid a syscall. caching could be reintroduced, but lazily as part of
the public getpid function rather than at program startup, if it's
deemed important for performance later. in other places (cancellation
and pthread_kill) the pid was completely unnecessary; the tkill
syscall can be used instead of tgkill. this is actually a rather
subtle issue, since tgkill is supposedly a solution to race conditions
that can affect use of tkill. however, as documented in the commit
message for commit 7779dbd2663269b465951189b4f43e70839bc073, tgkill
does not actually solve this race; it just limits it to happening
within one process rather than between processes. we use a lock that
avoids the race in pthread_kill, and the use in the cancellation
signal handler is self-targeted and thus not subject to tid reuse
races, so both are safe regardless of which syscall (tgkill or tkill)
this commit adds non-stub implementations of setlocale, duplocale,
newlocale, and uselocale, along with the data structures and minimal
code needed for representing the active locale on a per-thread basis
and optimizing the common case where thread-local locale settings are
not in use.
at this point, the data structures only contain what is necessary to
represent LC_CTYPE (a single flag) and LC_MESSAGES (a name for use in
finding message translation files). representation for the other
categories will be added later; the expectation is that a single
pointer will suffice for each.
for LC_CTYPE, the strings "C" and "POSIX" are treated as special; any
other string is accepted and treated as "C.UTF-8". for other
categories, any string is accepted after being truncated to a maximum
supported length (currently 15 bytes). for LC_MESSAGES, the name is
kept regardless of whether libc itself can use such a message
translation locale, since applications using catgets or gettext should
be able to use message locales libc is not aware of. for other
categories, names which are not successfully loaded as locales (which,
at present, means all names) are treated as aliases for "C". setlocale
locale settings are not yet used anywhere, so this commit should have
no visible effects except for the contents of the string returned by
such separation serves multiple purposes:
- by having the common path for __tls_get_addr alone in its own
function with a tail call to the slow case, code generation is
- by having __tls_get_addr in its own file, it can be replaced on a
per-arch basis as needed, for optimization or ABI-specific purposes.
- by removing __tls_get_addr from __init_tls.c, a few bytes of code
are shaved off of static binaries (which are unlikely to use this
function unless the linker messed up).
the motivation for the errno_ptr field in the thread structure, which
this commit removes, was to allow the main thread's errno to keep its
address when lazy thread pointer initialization was used. &errno was
evaluated prior to setting up the thread pointer and stored in
errno_ptr for the main thread; subsequently created threads would have
errno_ptr pointing to their own errno_val in the thread structure.
since lazy initialization was removed, there is no need for this extra
level of indirection; __errno_location can simply return the address
of the thread's errno_val directly. this does cause &errno to change,
but the change happens before entry to application code, and thus is
such kernels cannot support threads, but the thread pointer is also
important for other purposes, most notably stack protector. without a
valid thread pointer, all code compiled with stack protector will
crash. the same applies to any use of thread-local storage by
applications or libraries.
the concept of this patch is to fall back to using the modify_ldt
syscall, which has been around since linux 1.0, to set up the gs
segment register. since the kernel does not have a way to
automatically assign ldt entries, use of slot zero is hard-coded. if
this fallback path is used, __set_thread_area returns a positive value
(rather than the usual zero for success, or negative for error)
indicating to the caller that the thread pointer was successfully set,
but only for the main thread, and that thread creation will not work
properly. the code in __init_tp has been changed accordingly to record
this result for later use by pthread_create.
such archs are expected to omit definitions of the SYS_* macros for
syscalls their kernels lack from arch/$ARCH/bits/syscall.h. the
preprocessor is then able to select an appropriate implementation
for affected functions. two basic strategies are used on a
where the old syscalls correspond to deprecated library-level
functions, the deprecated functions have been converted to wrappers
for the modern function, and the modern function has fallback code
(omitted at the preprocessor level on new archs) to make use of the
old syscalls if the new syscall fails with ENOSYS. this also improves
functionality on older kernels and eliminates the incentive to program
with deprecated library-level functions for the sake of compatibility
with older kernels.
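the first strategy can be sketched with stubs in place of real syscalls. everything here is hypothetical: HAVE_OLD_CALL plays the role of an old SYS_* macro being defined, new_call always reports ENOSYS to simulate an old kernel, and old_call is the legacy path.

```c
#include <errno.h>

/* Defined only on "old" archs whose kernels have the old syscall. */
#define HAVE_OLD_CALL 1

/* Stub for the modern syscall; pretends the kernel lacks it. */
static int new_call(int x)
{
    (void)x;
    return -ENOSYS;
}

#ifdef HAVE_OLD_CALL
/* Stub for the legacy syscall. */
static int old_call(int x)
{
    return x * 2;
}
#endif

/* Modern function: try the new syscall first, and fall back to the
 * old one only if the kernel reports ENOSYS. On new archs the
 * fallback is omitted entirely by the preprocessor. */
static int modern_function(int x)
{
    int r = new_call(x);
#ifdef HAVE_OLD_CALL
    if (r == -ENOSYS)
        r = old_call(x);
#endif
    return r;
}
```

the deprecated library function would then be a thin wrapper that calls modern_function, so both entry points share the same fallback logic.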
in other situations where the old syscalls correspond to library-level
functions which are not deprecated but merely lack some new features,
such as the *at functions, the old syscalls are still used on archs
which support them. this may change at some point in the future if or
when fallback code is added to the new functions to make them usable
(possibly with reduced functionality) on old kernels.
open is handled specially because it is used from so many places, in
so many variants (2 or 3 arguments, setting errno or not, and
cancellable or not). trying to do it as a function would not only
increase bloat, but would also risk subtle breakage.
this is the first step towards supporting "new" archs where linux
lacks "old" syscalls.