|Age||Commit message (Collapse)||Author||Lines|
allowing the application to replace malloc (since commit
c9f415d7ea2dace5bf77f6518b6afc36bb7a5732) has brought multiple
headaches where it's used from various critical sections in libc
components. for example:
- the thread-local message buffers allocated for dlerror can't be
freed at thread exit time because application code would then run in
the context of a non-existant thread. this was handled in commit
aa5a9d15e09851f7b4a1668e9dbde0f6234abada by queuing them for free
- the dynamic linker has to be careful not to pass memory allocated at
early startup time (necessarily using its own malloc) to realloc or
free after redoing relocations with the application and all
libraries present. bugs in this area were fixed several times, at
least in commits 0c5c8f5da6e36fe4ab704bee0cd981837859e23f and
2f1f51ae7b2d78247568e7fdb8462f3c19e469a4 and possibly others.
- by calling the allocator from contexts where libc-internal locks are
held, we impose undocumented requirements on alternate malloc
implementations not to call into any libc function that might
attempt to take these locks; if they do, deadlock results.
- work to make fork of a multithreaded parent give the child an
unrestricted execution environment is blocked by lock order issues
as long as the application-provided allocator can be called with
libc-internal locks held.
these problems are all fixed by giving libc internals access to the
original, non-replaced allocator, for use where needed. it can't be
used everywhere, as some interfaces like str[n]dup, open_[w]memstream,
getline/getdelim, etc. are required to provide the called memory
obtained as if by (the public) malloc. and there are a number of libc
interfaces that are "pure library" code, not part of some internal
singleton, and where using the application's choice of malloc
implementation is preferable -- things like glob, regex, etc.
one might expect there to be significant cost to static-linked
programs, pulling in two malloc implementations, one of them
mostly-unused, if malloc is replaced. however, in almost all of the
places where malloc is used internally, care has been taken already
not to pull in realloc/free (i.e. to link with just the bump
allocator). this size optimization carries over automatically.
the newly-exposed internal allocator functions are obtained by
renaming the actual definitions, then adding new wrappers around them
with the public names. technically __libc_realloc and __libc_free
could be aliases rather than needing a layer of wrapper, but this
would almost surely break certain instrumentation (valgrind) and the
size and performance difference is negligible. __libc_calloc needs to
be handled specially since calloc is designed to work with either the
internal or the replaced malloc.
as a bonus, this change also eliminates the longstanding ugly
dependency of the static bump allocator on order of object files in
libc.a, by making it so there's only one definition of the malloc
function and having it in the same source file as the bump allocator.
also fix the lack of declaration (and thus hidden visibility) in
__stdio_close's use of __aio_close.
for namespace-safety with thrd_sleep, this requires an alias, which is
also added. this eliminates all but one direct call point for
nanosleep syscalls, and arranges that 64-bit time_t conversion logic
will only need to exist in one file rather than three.
as a bonus, clock_nanosleep with CLOCK_REALTIME and empty flags is now
implemented as SYS_nanosleep, thereby working on older kernels that
may lack POSIX clocks functionality.
this probably saves a few bytes, avoids duplicating the clunky
lseek/_llseek syscall convention in two places, and sets the stage for
fixing broken seeks on x32 and mipsn32.
In the public header, __errno_location is declared with the "const"
attribute, conditional on __GNUC__. Ensure that its internal alias has
the same attributes.
Maintainer's note: This change also fixes a regression in quality of
code generation -- multiple references to errno in a single function
started generating multiple calls again -- introduced by commit
C11 removed the requirement that FILE be a complete type, which was
deemed erroneous, as part of the changes introduced by N1439 regarding
completeness of types (see footnote 6 for specific mention of FILE).
however the current version of POSIX is still based on C99 and
incorporates the old requirement that FILE be a complete type.
expose an arbitrary, useless complete type definition because the
actual object used to represent FILE streams cannot be public/ABI.
thanks to commit 13d1afa46f8098df290008c681816c9eb89ffbdb, we now have
a framework for suppressing the public complete-type definition of FILE
when stdio.h is included internally, so that a different internal
definition can be provided. this is perfectly well-defined, since the
same struct tag can refer to different types in different translation
units. it would be a problem if the implementation were accessing the
application's FILE objects or vice versa, but either would be
the motivation for this change is twofold. first, it gets the fallback
logic out of the dynamic linker, improving code readability and
organization. second, it provides application code that wants to use
the membarrier syscall, which depends on preregistration of intent
before the process becomes multithreaded unless unbounded latency is
acceptable, with a symbol that, when linked, ensures that this
commit 84d061d5a31c9c773e29e1e2b1ffe8cb9557bc58 inadvertently
introduced namespace violations by using the pthread-namespace rwlock
functions in pthread_key_create, which is in turn used for C11 tss.
fix that and possible future uses of rwlocks elsewhere.
by ABI, the public stdin/out/err macros use extern pointer objects,
and this is necessary to avoid copy relocations that would be
expensive and make the size of the FILE structure part of the ABI.
however, internally it makes sense to access the underlying FILE
objects directly. this avoids both an indirection through the GOT to
find the address of the stdin/out/err pointer objects (which can't be
computed PC-relative because they may have been moved to the main
program by copy relocations) and an indirection through the resulting
in most places this is just a minor optimization, but in the case of
getchar and putchar (and the unlocked versions thereof), ipa constant
propagation makes all accesses to members of stdin/out PC-relative or
GOT-relative, possibly reducing register pressure as well.
this significantly improves codegen in functions that need to access
errno but otherwise have no need for a GOT pointer.
we could probably improve it much more by including an inline version
of the &errno accessor function, but that depends on having the
definitions of struct __pthread and __pthread_self(), which at present
would expose a lot more than is appropriate. moving them to a small
tls.h later might make this more reasonable.
not all prefixed symbols can be made hidden. some are part of
ABI-compat (e.g. __nl_langinfo_l) and others are ABI as a consequence
of the way copy relocations for weak aliases work in ELF shared
libraries. most, however, can be made hidden.
with this commit, there should be no remaining unintentionally visible
symbols exported from libc.so.
this is perhaps not the ideal place, but no better alternatives stand
commits leading up to this one have moved the vast majority of
libc-internal interface declarations to appropriate internal headers,
allowing them to be type-checked and setting the stage to limit their
visibility. the ones that have not yet been moved are mostly
namespace-protected aliases for standard/public interfaces, which
exist to facilitate implementing plain C functions in terms of POSIX
functionality, or C or POSIX functionality in terms of extensions that
are not standardized. some don't quite fit this description, but are
"internally public" interfacs between subsystems of libc.
rather than create a number of newly-named headers to declare these
functions, and having to add explicit include directives for them to
every source file where they're needed, I have introduced a method of
wrapping the corresponding public headers.
parallel to the public headers in $(srcdir)/include, we now have
wrappers in $(srcdir)/src/include that come earlier in the include
path order. they include the public header they're wrapping, then add
declarations for namespace-protected versions of the same interfaces
and any "internally public" interfaces for the subsystem they
along these lines, the wrapper for features.h is now responsible for
the definition of the hidden, weak, and weak_alias macros. this means
source files will no longer need to include any special headers to
access these features.
over time, it is my expectation that the scope of what is "internally
public" will expand, reducing the number of source files which need to
include *_impl.h and related headers down to those which are actually
implementing the corresponding subsystems, not just using them.