|Age||Commit message (Collapse)||Author||Lines|
this change is to get the right tags for C++ ABI matching. it should
have no other effects.
aside from the obvious C++ ABI purpose for this change, it also brings
musl into alignment with the compiler's idea of the definition of
wint_t (use in -Wformat), and makes the situation less awkward on ARM,
where wchar_t is unsigned.
internal code using wint_t and WEOF was checked against this change,
and while a few cases of storing WEOF into wchar_t were found, they
all seem to operate properly with the natural conversion from unsigned
the arch-specific bits/alltypes.h.sh has been replaced with a generic
alltypes.h.in and minimal arch-specific bits/alltypes.h.in.
this commit is intended to have no functional changes except:
- exposing additional symbols that POSIX allows but does not require
- changing the C++ name mangling for some types
- fixing the signedness of blksize_t on powerpc (POSIX requires signed)
- fixing the limit macros for sig_atomic_t on x86_64
- making dev_t an unsigned type (ABI matching goal, and more logical)
in addition, some types that were wrongly defined with long on 32-bit
archs were changed to int, and vice versa; this change is
non-functional except for the possibility of making pointer types
mismatch, and only affects programs that were using them incorrectly,
and only at build-time, not runtime.
the following changes were made in the interest of moving
non-arch-specific types out of the alltypes system and into the
headers they're associated with, and also will tend to improve
- netdb.h now includes netinet/in.h (for socklen_t and uint32_t)
- netinet/in.h now includes sys/socket.h and inttypes.h
- sys/resource.h now includes sys/time.h (for struct timeval)
- sys/wait.h now includes signal.h (for siginfo_t)
- langinfo.h now includes nl_types.h (for nl_item)
for the types in stdint.h:
- types which are of no interest to other headers were moved out of
the alltypes system.
- fast types for 8- and 64-bit are hard-coded (at least for now); only
the 16- and 32-bit ones have reason to vary by arch.
and the following types have been changed for C++ ABI purposes;
- mbstate_t now has a struct tag, __mbstate_t
- FILE's struct tag has been changed to _IO_FILE
- DIR's struct tag has been changed to __dirstream
- locale_t's struct tag has been changed to __locale_struct
- pthread_t is defined as unsigned long in C++ mode only
- fpos_t now has a struct tag, _G_fpos64_t
- fsid_t's struct tag has been changed to __fsid_t
- idtype_t has been made an enum type (also required by POSIX)
- nl_catd has been changed from long to void *
- siginfo_t's struct tag has been removed
- sigset_t's has been given a struct tag, __sigset_t
- stack_t has been given a struct tag, sigaltstack
- suseconds_t has been changed to long on 32-bit archs
- [u]intptr_t have been changed from long to int rank on 32-bit archs
- dev_t has been made unsigned
summary of tests that have been performed against these changes:
- nsz's libc-test (diff -u before and after)
- C++ ABI check symbol dump (diff -u before, after, glibc)
- grepped for __NEED, made sure types needed are still in alltypes
- built gcc 3.4.6
this code has been replaced by portable C code that works on all
archs. the old asm needs to be removed or ctors/dtors will run twice.
these functions were mistakenly assumed to be needed to match glibc
ABI, but glibc has them as part of the non-shared part of libc that's
always statically linked into the main program. moreover, the only
place they are referenced from is glibc's crt1.o.
modern (4.7.x and later) gcc uses init/fini arrays, rather than the
legacy _init/_fini function pasting and crtbegin/crtend ctors/dtors
system, on most or all archs. some archs had already switched a long
time ago. without following this change, global ctors/dtors will cease
to work under musl when building with new gcc versions.
the most surprising part of this patch is that it actually reduces the
size of the init code, for both static and shared libc. this is
achieved by (1) unifying the handling main program and shared
libraries in the dynamic linker, and (2) eliminating the
glibc-inspired rube goldberg machine for passing around init and fini
function pointers. to clarify, some background:
the function signature for __libc_start_main was based on glibc, as
part of the original goal of being able to run some glibc-linked
binaries. it worked by having the crt1 code, which is linked into
every application, static or dynamic, obtain and pass pointers to the
init and fini functions, which __libc_start_main is then responsible
for using and recording for later use, as necessary. however, in
neither the static-linked nor dynamic-linked case do we actually need
crt1.o's help. with dynamic linking, all the pointers are available in
the _DYNAMIC block. with static linking, it's safe to simply access
the _init/_fini and __init_array_start, etc. symbols directly.
obviously changing the __libc_start_main function signature in an
incompatible way would break both old musl-linked programs and
glibc-linked programs, so let's not do that. instead, the function can
just ignore the information it doesn't need. new archs need not even
provide the useless args in their versions of crt1.o. existing archs
should continue to provide it as long as there is an interest in
having newly-linked applications be able to run on old versions of
musl; at some point in the future, this support can be removed.
for conversion specifiers, alloc is always set when the specifier is
parsed. however, if scanf stops due to mismatching literal text,
either an uninitialized (if no conversions have been performed yet) or
stale (from the previous conversion) of the flag will be used,
possibly causing an invalid pointer to be passed to free when the
the sizes in the header and footer for a chunk should always match. if
they don't, the program has definitely invoked undefined behavior, and
the most likely cause is a simple overflow, either of a buffer in the
block being freed or the one just below it.
crashing here should not only improve security of buggy programs, but
also aid in debugging, since the crash happens in a context where you
have a pointer to the likely-overflowed buffer.
while there's no POSIX namespace provision for UIO_* in uio.h, this
exact macro name is reserved in XBD 2.2.2. apparently some
glibc-centric software expects it to exist, so let's provide it.
the main aim of this patch is to ensure that if not all fields are
filled in, they contain zeros, so as not to confuse applications.
reportedly some older kernels, including commonly used openvz kernels,
lack the f_flags field, resulting in applications reading random junk
as the mount flags; the common symptom seems to be wrongly considering
the filesystem to be mounted read-only and refusing to operate. glibc
has some amazingly ugly fallback code to get the mount flags for old
kernels, but having them really is not that important anyway; what
matters most is not presenting incorrect flags to the application.
I have also aimed to fill in some fields of statvfs that were
previously missing, and added code to explicitly zero the reserved
space at the end of the structure, which will make things easier in
the future if this space someday needs to be used.
this change is both to fix one of the remaining type (and thus C++
ABI) mismatches with glibc/LSB and to allow use of the full range of
uid and gid values, if so desired.
passwd/group access functions were not prepared to deal with unsigned
values, so they too have been fixed with this commit.
an empty program is not valid and would be reasonable grounds for the
compiler to give an error, which would break these tests.
prior to this change, using a non-default syslibdir was impractical on
systems where the ordinary library paths contain musl-incompatible
library files. the file containing search paths was always taken from
/etc, which would either correspond to a system-wide musl
installation, or fail to exist at all, resulting in searching of the
default library path.
the new search strategy is safe even for suid programs because the
pathname used comes from the PT_INTERP header of the program being
run, rather than any external input.
as part of this change, I have also begun differentiating the names of
arch variants that differ by endianness or floating point calling
convention. the corresponding changes in the build system and and gcc
wrapper script (to use an alternate dynamic linker name) for these
configurations have not yet been made.
POSIX is not clear on whether it includes the termination, but ISO C
requires that it does. the whole concept of this macro is rather
useless, but it's better to be correct anyway.
patch by Luka Perkov, who noted that all other archs have a newline.
this is both a minor scheduling optimization and a workaround for a
difficult-to-fix bug in qemu app-level emulation.
from the scheduling standpoint, it makes no sense to schedule the
parent thread again until the child has exec'd or exited, since the
parent will immediately block again waiting for it.
on the qemu side, as regular application code running on an underlying
libc, qemu cannot make arbitrary clone syscalls itself without
confusing the underlying implementation. instead, it breaks them down
into either fork-like or pthread_create-like cases. it was treating
the code in posix_spawn as pthread_create-like, due to CLONE_VM, which
caused horribly wrong behavior: CLONE_FILES broke the synchronization
mechanism, CLONE_SIGHAND broke the parent's signals, and CLONE_THREAD
caused the child's exec to end the parent -- if it hadn't already
crashed. however, qemu special-cases CLONE_VFORK and emulates that
with fork, even when CLONE_VM is also specified. this also gives
incorrect semantics for code that really needs the memory sharing, but
posix_spawn does not make use of the vm sharing except to avoid
momentary double commit charge.
programs using posix_spawn (including via popen) should now work
correctly under qemu app-level emulation.
for 0-argument syscalls (1 argument to the macro, the syscall number),
the __SYSCALL_NARGS_X macro's ... argument was not satisfied. newer
compilers seem to care about this.
POSIX mandates EOVERFLOW for this condition.
this commit has two major user-visible parts: zoneinfo-format time
zones are now supported, and overflow handling is intended to be
complete in the sense that all functions return a correct result if
and only if the result fits in the destination type, and otherwise
return an error. also, some noticable bugs in the way DST detection
and normalization worked have been fixed, and performance may be
better than before, but it has not been tested.
apparently this was never noticed before because the linker normally
optimizes dynamic TLS models to non-dynamic ones when static linking,
thus eliminating the calls to __tls_get_addr which crash when the dtv
is missing. however, some libsupc++ code on ARM was calling
__tls_get_addr when static linked and crashing. the reason is unclear
to me, but with this issue fixed it should work now anyway.
patch by Timo Teräs
map_library was saving pointers to an automatic-storage buffer rather
than pointers into the mapping. this should be a fairly simple fix,
but the patch here is slightly complicated by two issues:
1. supporting gratuitously obfuscated ELF files where the program
headers are not right at the beginning of the file.
2. cleaning up the map_library function so that data isn't clobbered
by the time we need it.
there are still several more that are misleading, but SIGFPE (integer
division error misdescribed as floating point) and and SIGCHLD
(possibly non-exit status change events described as exiting) were the
the name format RTnn/RTnnn was chosen to minimized bloat while
uniquely identifying the signal.
also clean up, optimize, and simplify the code, removing branches by
simply pre-setting the result string to an empty string, which will be
preserved if other operations fail.
the main use for this macro seems to be knowing the correct allocation
granularity for dynamic-sized fd_set objects. such usage is
non-conforming and results in undefined behavior, but it is widespread
there are two motivations for this change. one is to avoid
gratuitously depending on a C11 symbol for implementing a POSIX
function. the other pertains to the documented semantics. C11 does not
define any behavior for aligned_alloc when the length argument is not
a multiple of the alignment argument. posix_memalign on the other hand
places no requirements on the length argument. using __memalign as the
implementation of both, rather than trying to implement one in terms
of the other when their documented contracts differ, eliminates this
C11 has no requirement that the alignment be a multiple of
sizeof(void*), and in fact seems to require any "valid alignment
supported by the implementation" to work. since the alignment of char
is 1 and thus a valid alignment, an alignment argument of 1 should be
a research in debian codesearch and grepping over the pkgsrc
directory tree have shown that these macros are all either unused,
or defined by programs in case they need them.
these would not be expensive to actually implement, but reading
/etc/ethers does not sound like a particularly useful feature, so for
now I'm leaving them as stubs.
previously, determination of the list of header files for installation
depended on the include/bits symlink (to the arch-specific files)
already having been created. in other words, running "make install"
immediately after configure without first running "make" caused the
bits headers not to be installed.
the solution I have applied is to pull the list of headers directly
from arch/$(ARCH)/bits rather than include/bits, and likewise to
install directly from arch/$(ARCH)/bits rather than via the symlink.
at this point, the only purpose served by keeping the symlink around
is that it enables use of the in-tree headers and libs directly via -I
and -L, which can be useful when testing against a new version of the
library before installing it. on the other hand, removing the bits
symlink would be beneficial if we ever want to support building
multiple archs in the same source tree.
in theory this should not be an issue, since major() should only be
applied to type dev_t, which is 64-bit. however, it appears some
applications are not using dev_t but a smaller integer type (which
works on Linux because the kernel's dev_t is really only 32-bit). to
avoid the undefined behavior, do it as two shifts.
this change is needed to correctly handle the case where a constructor
creates a new thread which calls dlopen. previously, the lock was not
held in this case. the reason for the complex logic to avoid locking
whenever possible is that, since the mutex is recursive, it will need
to inspect the thread pointer to get the current thread's tid, and
this requires initializing the thread pointer. we do not want
non-multi-threaded programs to attempt to access the thread pointer
unnecessarily; doing so could make them crash on ancient kernels that
don't support threads but which may otherwise be capable of running