|Age||Commit message (Collapse)||Author||Lines|
this is the first part of a series of patches intended to make
__syscall fully self-contained in the object file produced using
syscall.h, which will make it possible for crt1 code to perform
the (confusingly named) i386 __vsyscall mechanism, which this commit
removes, was introduced before the presence of a valid thread pointer
was mandatory; back then the thread pointer was setup lazily only if
threads were used. the intent was to be able to perform syscalls using
the kernel's fast entry point in the VDSO, which can use the sysenter
(Intel) or syscall (AMD) instruction instead of int $128, but without
inlining an access to the __syscall global at the point of each
syscall, which would incur a significant size cost from PIC setup
everywhere. the mechanism also shuffled registers/calling convention
around to avoid spills of call-saved registers, and to avoid
allocating ebx or ebp via asm constraints, since there are plenty of
broken-but-supported compiler versions which are incapable of
allocating ebx with -fPIC or ebp with -fno-omit-frame-pointer.
the new mechanism preserves the properties of avoiding spills and
avoiding allocation of ebx/ebp in constraints, but does it inline,
using some fairly simple register shuffling, and uses a field of the
thread structure rather than global data for the vdso-provided syscall
for now, the external __syscall function is refactored not to use the
old __vsyscall so it can be kept, but the intent is to remove it too.
Use "+r" in the asm instead of implementing a non-transparent copy by
applying "0" constraint to the source value. Introduce a typedef for
the function type to avoid spelling it out twice.
this is not needed for correctness, but doesn't hurt, and in some
cases the compiler may pessimize the call assuming the callee might be
variadic when it lacks a prototype.
commit 4390383b32250a941ec616e8bff6f568a801b1c0 inadvertently used "r"
instead of "0" for the input constraint, which only happened to work
for the configuration I tested it on because it usually makes sense
for the compiler to choose the same input and output register.
on multiple occasions I've started to flatten/inline the code in
__init_libc, only to rediscover the reason it was not inlined: GCC
fails to deallocate its stack (and now, with the changes in commit
4390383b32250a941ec616e8bff6f568a801b1c0, fails to produce a tail call
to the stage 2 function; see PR #87639) before calling main if it was
document this with a comment and use an explicit noinline attribute if
__GNUC__ is defined so that even with CFLAGS that heavily favor
inlining it won't get inlined.
this is the analog of commit 1c84c99913bf1cd47b866ed31e665848a0da84a2
for static linking. unlike with dynamic linking, we don't have
symbolic lookup to use as a barrier. use a dummy (target-agnostic)
degenerate inline asm fragment instead. this technique has precedent
in commit 05ac345f895098657cf44d419b5d572161ebaf43 where it's used for
explicit_bzero. if it proves problematic in any way, loading the
address of the stage 2 function from a pointer object whose address
leaks to kernelspace during thread pointer init could be used as an
even stronger barrier.
commits leading up to this one have moved the vast majority of
libc-internal interface declarations to appropriate internal headers,
allowing them to be type-checked and setting the stage to limit their
visibility. the ones that have not yet been moved are mostly
namespace-protected aliases for standard/public interfaces, which
exist to facilitate implementing plain C functions in terms of POSIX
functionality, or C or POSIX functionality in terms of extensions that
are not standardized. some don't quite fit this description, but are
"internally public" interfacs between subsystems of libc.
rather than create a number of newly-named headers to declare these
functions, and having to add explicit include directives for them to
every source file where they're needed, I have introduced a method of
wrapping the corresponding public headers.
parallel to the public headers in $(srcdir)/include, we now have
wrappers in $(srcdir)/src/include that come earlier in the include
path order. they include the public header they're wrapping, then add
declarations for namespace-protected versions of the same interfaces
and any "internally public" interfaces for the subsystem they
along these lines, the wrapper for features.h is now responsible for
the definition of the hidden, weak, and weak_alias macros. this means
source files will no longer need to include any special headers to
access these features.
over time, it is my expectation that the scope of what is "internally
public" will expand, reducing the number of source files which need to
include *_impl.h and related headers down to those which are actually
implementing the corresponding subsystems, not just using them.
this cleans up what had become widespread direct inline use of "GNU C"
style attributes directly in the source, and lowers the barrier to
increased use of hidden visibility, which will be useful to recovering
some of the efficiency lost when the protected visibility hack was
dropped in commit dc2f368e565c37728b0d620380b849c3a1ddd78f, especially
on archs where the PLT ABI is costly.
it was reported by Erik Bosman that poll fails without setting revents
when the nfds argument exceeds the current value for RLIMIT_NOFILE,
causing the subsequent open calls to be bypassed. if the rlimit is
either 1 or 2, this leaves fd 0 and 1 potentially closed but openable
when the application code is reached.
based on a brief reading of the poll syscall documentation and code,
it may be possible for poll to fail under other attacker-controlled
conditions as well. if it turns out these are reasonable conditions
that may happen in the real world, we may have to go back and
implement fallbacks to probe each fd individually if poll fails, but
for now, keep things simple and treat all poll failures as fatal.
this is for consistency with the way it's done in in the dynamic
linker, avoiding a deprecated C feature (non-prototype function
types), and improving code generation. GCC unnecessarily uses the
variadic calling convention (e.g. clearing rax on x86_64) when making
a call where the argument types are not known for compatibility with
wrong code which calls variadic functions this way. (C on the other
hand is clear that such calls have undefined behavior.)
It is possible for argv to be a null pointer, but the __progname
variable is used to implement functions in src/legacy/err.c that do not
expect it to be null. It is also available to the user via the
program_invocation_name alias as a GNU extension, and the implementation
in Glibc initializes it to a pointer to empty string rather than NULL.
Since argv is usually non-null and it's preferable to keep those
variables in BSS, implement the fallbacks in __init_libc, which also
allows to have an intermediate fallback to AT_EXECFN.
commit ad1cd43a86645ba2d4f7c8747240452a349d6bc1 eliminated
preprocessor-level omission of references to the init/fini array
symbols from object files going into libc.so. the references are weak,
and the intent was that the linker would resolve them to zero in
libc.so, but instead it leaves undefined references that could be
satisfied at runtime. normally these references would be harmless,
since the code using them does not even get executed, but some older
binutils versions produce a linking error: when linking a program
against libc.so, ld first tries to use the hidden init/fini array
symbols produced by the linker script to satisfy the references in
libc.so, then produces an error because the definitions are hidden.
ideally ld would have already provided definitions of these symbols
when linking libc.so, but the linker script for -shared omits them.
to avoid this situation, the dynamic linker now provides its own dummy
definitions of the init/fini array symbols for libc.so. since they are
hidden, everything binds at ld time and no references remain in the
dynamic symbol table. with modern binutils and --gc-sections, both
the dummy empty array objects and the code referencing them get
dropped at link time, anyway.
the _init and _fini symbols are also switched back to using weak
definitions rather than weak references since the latter behave
somewhat problematically in general, and the weak definition approach
was known to work well.
use weak definitions that the dynamic linker can override instead of
preprocessor conditionals on SHARED so that the same libc start and
exit code can be used for both static and dynamic linking.
this change is needed to be compatible with fdpic, where some of the
main application's relocations may be performed as part of the crt1
entry point. if we call init functions before passing control, these
relocations will not yet have been performed, and the init code will
potentially make use of invalid pointers.
conceptually, no code provided by the application or third-party
libraries should run before the application entry point. the
difference is not observable to programs using the crt1 we provide,
but it could come into play if custom entry point code is used, so
it's better to be doing this right anyway.
these are used as hidden by asm files (and such use is the whole
reason they exist), but their actual definitions were not hidden.
such archs are expected to omit definitions of the SYS_* macros for
syscalls their kernels lack from arch/$ARCH/bits/syscall.h. the
preprocessor is then able to select the an appropriate implementation
for affected functions. two basic strategies are used on a
where the old syscalls correspond to deprecated library-level
functions, the deprecated functions have been converted to wrappers
for the modern function, and the modern function has fallback code
(omitted at the preprocessor level on new archs) to make use of the
old syscalls if the new syscall fails with ENOSYS. this also improves
functionality on older kernels and eliminates the incentive to program
with deprecated library-level functions for the sake of compatibility
with older kernels.
in other situations where the old syscalls correspond to library-level
functions which are not deprecated but merely lack some new features,
such as the *at functions, the old syscalls are still used on archs
which support them. this may change at some point in the future if or
when fallback code is added to the new functions to make them usable
(possibly with reduced functionality) on old kernels.
open is handled specially because it is used from so many places, in
so many variants (2 or 3 arguments, setting errno or not, and
cancellable or not). trying to do it as a function would not only
increase bloat, but would also risk subtle breakage.
this is the first step towards supporting "new" archs where linux
lacks "old" syscalls.
being static allows it to be inlined in __libc_start_main; inlining
should take place at all levels since the function is called exactly
once. this further reduces mandatory startup code size for static
there is no reason (and seemingly there never was any) for
__init_security to be its own function. it's linked unconditionally
so it can just be placed inline in __init_libc.
moving the call to __init_ssp from __init_security to __init_libc
makes __init_security a leaf function, which allows the compiler to
make it smaller. __init_libc is already non-leaf, and the additional
call makes no difference to the amount of register spillage.
in addition, it really made no sense for the call to __init_ssp to be
buried inside __init_security rather than parallel with other init
PAGE_SIZE was hardcoded to 4096, which is historically what most
systems use, but on several archs it is a kernel config parameter,
user space can only know it at execution time from the aux vector.
PAGE_SIZE and PAGESIZE are not defined on archs where page size is
a runtime parameter, applications should use sysconf(_SC_PAGE_SIZE)
to query it. Internally libc code defines PAGE_SIZE to libc.page_size,
which is set to aux[AT_PAGESZ] in __init_libc and early in __dynlink
as well. (Note that libc.page_size can be accessed without GOT, ie.
before relocations are done)
Some fpathconf settings are hardcoded to 4096, these should be actually
queried from the filesystem using statfs.
modern (4.7.x and later) gcc uses init/fini arrays, rather than the
legacy _init/_fini function pasting and crtbegin/crtend ctors/dtors
system, on most or all archs. some archs had already switched a long
time ago. without following this change, global ctors/dtors will cease
to work under musl when building with new gcc versions.
the most surprising part of this patch is that it actually reduces the
size of the init code, for both static and shared libc. this is
achieved by (1) unifying the handling main program and shared
libraries in the dynamic linker, and (2) eliminating the
glibc-inspired rube goldberg machine for passing around init and fini
function pointers. to clarify, some background:
the function signature for __libc_start_main was based on glibc, as
part of the original goal of being able to run some glibc-linked
binaries. it worked by having the crt1 code, which is linked into
every application, static or dynamic, obtain and pass pointers to the
init and fini functions, which __libc_start_main is then responsible
for using and recording for later use, as necessary. however, in
neither the static-linked nor dynamic-linked case do we actually need
crt1.o's help. with dynamic linking, all the pointers are available in
the _DYNAMIC block. with static linking, it's safe to simply access
the _init/_fini and __init_array_start, etc. symbols directly.
obviously changing the __libc_start_main function signature in an
incompatible way would break both old musl-linked programs and
glibc-linked programs, so let's not do that. instead, the function can
just ignore the information it doesn't need. new archs need not even
provide the useless args in their versions of crt1.o. existing archs
should continue to provide it as long as there is an interest in
having newly-linked applications be able to run on old versions of
musl; at some point in the future, this support can be removed.
this is a bit ugly, and the motivation for supporting it is
questionable. however the main factors were:
1. it will be useful to have this for certain internal purposes
anyway -- things like syslog.
2. applications can just save argv in main, but it's hard to fix
non-portable library code that's depending on being able to get the
invocation name without the main application's help.
previously, shared library constructors were being called before
important internal things like the environment (extern char **environ)
and hwcap flags (needed for sjlj to work right with float on arm) were
initialized in __libc_start_main. rather than trying to have to
dynamic linker make sure this stuff all gets initialized right, I've
opted to just defer calling shared library constructors until after
the main program's entry point is reached. this also fixes the order
of ctors to be the exact reverse of dtors, which is a desirable
property and possibly even mandated by some languages.
the main practical effect of this change is that shared libraries
calling getenv from ctors will no longer fail.
this doubles the performance of the fastest syscalls on the atom I
tested it on; improvement is reportedly much more dramatic on
worst-case cpus. cannot be used for cancellable syscalls.
the code in __libc_start_main is now responsible for parsing auxv,
rather than duplicating the parsing all over the place. this should
shave off a few cycles and some code size. __init_libc is left as an
external-linkage function despite the fact that it could be static, to
prevent it from being inlined and permanently wasting stack space when
main is called.
a few other minor changes are included, like eliminating per-thread
ssp canaries (they were likely broken when combined with certain
dlopen usages, and completely unnecessary) and some other unnecessary
checks. since this code gets linked into every program, it should be
as small and simple as possible.
the design for TLS in dynamic-linked programs is mostly complete too,
but I have not yet implemented it. cost is nonzero but still low for
programs which do not use TLS and/or do not use threads (a few hundred
bytes of new code, plus dependency on memcpy). i believe it can be
made smaller at some point by merging __init_tls and __init_security
into __libc_start_main and avoiding duplicate auxv-parsing code.
at the same time, I've also slightly changed the logic pthread_create
uses to allocate guard pages to ensure that guard pages are not
counted towards commit charge.
this behavior (opening fds 0-2 for a suid program) is explicitly
allowed (but not required) by POSIX to protect badly-written suid
programs from clobbering files they later open.
this commit does add some cost in startup code, but the availability
of auxv and the security flag will be useful elsewhere in the future.
in particular auxv is needed for static-linked vdso support, which is
still waiting to be committed (sorry nik!)