|Age||Commit message (Collapse)||Author||Lines|
commit 7be59733d71ada3a32a98622507399253f1d5e48 introduced the
hwcap-based branches to support the SPE FPU, but wrongly coded them as
bitwise tests on the computed address of __hwcap, not a value loaded
from that address. replace the add with indexed load to fix it.
When the soft-float ABI for PowerPC was added in commit
5a92dd95c77cee81755f1a441ae0b71e3ae2bcdb, with Freescale cpus using
the alternative SPE FPU as the main use case, it was noted that we
could probably support hard float on them, but that it would involve
determining some difficult ABI constraints. This commit is the
completion of that work.
The Power-Arch-32 ABI supplement defines the ABI profiles, and indeed
ATR-SPE is built on ATR-SOFT-FLOAT. But setjmp/longjmp compatibility
are problematic for the same reason they're problematic on ARM, where
optional float-related parts of the register file are "call-saved if
present". This requires testing __hwcap, which is now done.
In keeping with the existing powerpc-sf subarch definition, which did
not have fenv, the fenv macros are not defined for SPE and the SPEFSCR
control register is left (and assumed to start in) the default mode.
longjmp should set the return value of setjmp, but 64bit
registers were used for the 0 check while the type is int.
use the code that gcc generates for return val ? val : 1;
Use a branchless sequence that is one byte shorter on 64-bit, same size
on 32-bit. Thanks to Pete Cawley for suggesting this variant.
longjmp 'val' argument is an int, but the assembly is referencing 64-bit
registers as if the argument was a long, or the caller was responsible
for extending the argument. Though the psABI is not clear on this, the
interpretation in GCC is that high bits may be arbitrary and the callee
is responsible for sign/zero-extending the value as needed (likewise for
return values: callers must anticipate that high bits may be garbage).
Therefore testing %rax is a functional bug: setjmp would wrongly return
zero if longjmp was called with val==0, but high bits of %rsi happened
to be non-zero.
Rewrite the prologue to refer to 32-bit registers. In passing, change
'test' to use %rsi, as there's no advantage to using %rax and the new
form is cheaper on processors that do not perform move elimination.
mips32 has two fpu register file variants: FR=0 with 32 32-bit
registers, where pairs of neighboring even/odd registers are used to
represent doubles, and FR=1 with 32 64-bit registers, each of which
can store a single or double.
up through r5 (our "mips" arch), the supported ABI uses FR=0, but
modern compilers generate "fpxx" model code that can safely operate
with either model. r6, which is an incompatible but similar ISA, drops
FR=0 and only provides the FR=1 model. as such, setjmp and longjmp,
which depended on being able to save and restore call-saved doubles by
storing and loading their 32-bit halves, were completely broken in the
presence of floating point code on mips r6.
to fix this, use the s.d and l.d mnemonics to store and load fpu
registers. these expand to the existing swc1 and lwc1 instructions for
pairs of 32-bit fpu registers on mips1, but on mips2 and later they
translate directly to the 64-bit sdc1 and ldc1.
with FR=0, sdc1 and ldc1 behave just like the pairs of swc1 and lwc1
instructions they replace, storing or loading the even/odd pair of fpu
registers that can be treated as separate single-precision floats or
as a unit representing a double. but with FR=1, they store/load
individual 64-bit registers. this yields the ABI-correct behavior on
mips r6, and should make linking of pre-r6 (plain "mips") code with
"fp64" model code workable, although this is and will likely remain
in addition to the mips r6 problem this change fixes, reportedly
clang's internal assembler refuses to assemble swc1 and lwc1
instructions for odd register indices when building for "fpxx" model
(the default). this caused setjmp and longjmp not to build. by using
the s.d and l.d forms, this problem is avoided too.
as a bonus, code size is reduced everywhere but mips1.
armv8 removed the coprocessor instructions other than cp14, so
on an armv8 system the related hwcaps should never be set.
new llvm complains about the use of coprocessor instructions in
armv8-a mode (even though they are never executed at runtime),
so ifdef them out when musl is built for armv8.
Author: Alex Suykov <firstname.lastname@example.org>
Author: Aric Belsito <email@example.com>
Author: Drew DeVault <firstname.lastname@example.org>
Author: Michael Clark <email@example.com>
Author: Michael Forney <firstname.lastname@example.org>
Author: Stefan O'Rear <email@example.com>
This port has involved the work of many people over several years. I
have tried to ensure that everyone with substantial contributions has
been credited above; if any omissions are found they will be noted
later in an update to the authors/contributors list in the COPYRIGHT
The version committed here comes from the riscv/riscv-musl repo's
commit 3fe7e2c75df78eef42dcdc352a55757729f451e2, with minor changes by
me for issues found during final review:
- a_ll/a_sc atomics are removed (according to the ISA spec, lr/sc
are not safe to use in separate inline asm fragments)
- a_cas[_p] is fixed to be a memory barrier
- the call from the _start assembly into the C part of crt1/ldso is
changed to allow for the possibility that the linker does not place
them nearby each other.
- DTP_OFFSET is defined correctly so that local-dynamic TLS works
- reloc.h LDSO_ARCH logic is simplified and made explicit.
- unused, non-functional crti/n asm files are removed.
- an empty .sdata section is added to crt1 so that the
__global_pointer reference is resolvable.
- indentation style errors in some asm files are fixed.
three ABIs are supported: the default with 68881 80-bit fpu format and
results returned in floating point registers, softfloat-only with the
same format, and coldfire fpu with IEEE single/double only. only the
first is tested at all, and only under qemu which has fpu emulation
basic functionality smoke tests have been performed for the most
common arch-specific breakage via libc-test and qemu user-level
emulation. some sysvipc failures remain, but are shared with other big
endian archs and will be fixed separately.
this is a subtle issue with how the assembler/linker work. for the adr
pseudo-instruction used to find __hwcap, the assembler in thumb mode
generates a 16-bit thumb add instruction which can only represent
word-aligned addresses, despite not knowing the alignment of the
label. if the setjmp function is assigned a non-multiple-of-4 address
at link time, the load then loads from the wrong address (the last
instruction rather than the data containing the offset) and ends up
reading nonsense instead of the value of __hwcap. this in turn causes
the checks for floating-point/vector register sets (e.g. IWMMX) to
evaluate incorrectly, crashing when setjmp/longjmp try to save/restore
fix based on bug report by Felix Hädicke.
The TOC pointer is constant within a single dso, but needs to be saved
and restored around cross-dso calls. The PLT stub saves it to the
caller's stack frame, and the linker adds code to the caller to restore
With a local call, as within a single dso or with static linking, this
doesn't happen and the TOC pointer is always in r2. Therefore,
setjmp/longjmp need to save/restore the TOC pointer from/to different
locations depending on whether the call to setjmp was a local or non-local
It is always safe for longjmp to restore to both r2 and the caller's stack.
If the call to setjmp was local, and only r2 matters and the stack location
will be ignored, but is required by the ABI to be reserved for the TOC
pointer. If the call was non-local, then only the stack location matters,
and whatever is restored into r2 will be clobbered anyway when the caller
reloads r2 from the stack.
A little extra care is required for sigsetjmp, because it uses setjmp
internally. After the second return from this setjmp call, r2 will contain
the caller's TOC pointer instead of libc's TOC pointer. We need to save
and restore the correct libc pointer before we can tail call to
sp cannot be used in the ldm/stm register set in thumb mode.
based on patch submitted by Jaydeep Patil, with minor changes.
Some PowerPC CPUs (e.g. Freescale MPC85xx) have a completely different
instruction set for floating point operations (SPE).
Executing regular PowerPC floating point instructions results in
"Illegal instruction" errors.
Make it possible to run these devices in soft-float mode.
patch by Mahesh Bodapati and Jaydeep Patil of Imagination
when adding the fdpic subarchs, the need for these sub files was
overlooked. thus setjmp and longjmp performed illegal instructions.
these files are all accepted as legacy arm syntax when producing arm
code, but legacy syntax cannot be used for producing thumb2 with
access to the full ISA. even after switching to UAL, some asm source
files contain instructions which are not valid in thumb mode, so these
will need to be addressed separately.
the idea of the three-instruction sequence being removed was to be
able to return to thumb code when used on armv4t+ from a thumb caller,
but also to be able to run on armv4 without the bx instruction
available (in which case the low bit of lr would always be 0).
however, without compiler support for generating such a sequence from
C code, which does not exist and which there is unlikely to be
interest in implementing, there is little point in having it in the
asm, and it would likely be easier to add pre-armv4t support via
enhanced linker handling of R_ARM_V4BX than at the compiler level.
removing this code simplifies adding support for building libc in
thumb2-only form (for cortex-m).
the code to save/restore vfp registers needs to build even when the
configured target does not have fpu; this is because code using vfp
fpu (but with the standard soft-float EABI) may call a libc built for
a soft-float only, and the EABI considers these registers call-saved
when they exist. thus, extra directives are used to force the
assembler to allow vfp instructions and to avoid marking the resulting
object files as requiring vfp.
moving away from using hard-coded opcode words is necessary in order
to eventually support producing thumb2-only output for cortex-m.
conditional execution of these instructions based on hwcap flags was
already implemented. when building for arm (non-thumb) output, the
only currently-supported configuration, this commit does not change
the code emitted.
commit 646cb9a4a04e5ed78e2dd928bf9dc6e79202f609 switched sigsetjmp to
use the new hidden ___setjmp symbol for setjmp, but the nofpu variant
of setjmp.s was not updated to match.
analogous to commit 646cb9a4a04e5ed78e2dd928bf9dc6e79202f609 for sh.
these are perfectly fine with ld-time symbol binding, but otherwise
result in textrels. they cannot be replaced with @PLT jump targets
because the PLT thunks require a GOT register to be setup, so use a
hidden alias instead.
analogous to commit 646cb9a4a04e5ed78e2dd928bf9dc6e79202f609 for sh.
these are perfectly fine with ld-time symbol binding, but if the calls
go through a PLT thunk, they are invalid because the caller does not
setup a GOT register. use a hidden alias to bypass the issue.
none of these are actual textrels because of ld-time binding performed
by -Bsymbolic-functions, but I'm changing them with the goal of making
ld-time binding purely an optimization rather than relying on it for
in the case of memmove's call to memcpy, making it explicit that the
memmove asm is assuming the forward-copying behavior of the memcpy asm
is desirable anyway; in case memcpy is ever changed, the semantic
mismatch would be apparent while editing memmcpy.s.
This adds complete aarch64 target support including bigendian subarch.
Some of the long double math functions are known to be broken otherwise
interfaces should be fully functional, but at this point consider this
Initial work on this port was done by Sireesh Tripurari and Kevin Bortis.
this typo did not result in an erroneous setjmp with at least binutils
2.22 but fix it for clarity and compatibility with potentially stricter
With the exception of a fenv implementation, the port is fully featured.
The port has been tested in or1ksim, the golden reference functional
simulator for OpenRISC 1000.
It passes all libc-test tests (except the math tests that
requires a fenv implementation).
The port assumes an or1k implementation that has support for
atomic instructions (l.lwa/l.swa).
Although it passes all the libc-test tests, the port is still
in an experimental state, and has yet experienced very little
r24 was wrongly being saved at a misaligned offset of 30 rather than
the correct offset of 40 in the jmp_buf. the exact effects of this
error have not been studied, but it's clear that the value of r24 was
lost across setjmp/longjmp and the saved values of r21 and/or r22 may
also have been corrupted.
linux, gcc, etc. all use "sh" as the name for the superh arch. there
was already some inconsistency internally in musl: the dynamic linker
was searching for "ld-musl-sh.path" as its path file despite its own
name being "ld-musl-superh.so.1". there was some sentiment in both
directions as to how to resolve the inconsistency, but overall "sh"
the build system has no automatic way to know this code applies to
both big (default) and little endian variants, so explicit .sub files
Userspace emulated floating-point (gcc -msoft-float) is not compatible
with the default mips abi (assumes an FPU or in kernel emulation of it).
Soft vs hard float abi should not be mixed, __mips_soft_float is checked
in musl's configure script and there is no runtime check. The -sf subarch
does not save/restore floating-point registers in setjmp/longjmp and only
provides dummy fenv implementation.
the issue is identical to the recent commit fixing the mips versions:
despite other implementations doing this, it conflicts with the
requirements of ISO C and it's a waste of time and code size.
nothing in the standard requires or even allows the fenv state to be
restored by longjmp. restoring the exception flags is not such a big
deal since it's probably valid to clobber them completely, but
restoring the rounding mode yields an observable side effect not
sanctioned by ISO C. saving/restoring it also wastes a few cycles and
16 bytes of code.
as for historical behavior, reportedly SGI IRIX did save/restore fenv,
and this is where glibc and uClibc got the behavior from. a few other
systems save/restore it too (on archs other than mips), even though
this is apparently wrong. further details are documented here:
as musl aims for standards conformance rather than coddling historical
programs expecting non-conforming behavior, and as it's unlikely that
any historical programs actually depend on the incorrect behavior
(such programs would break on other archs, anyway), I'm making the
change not to save/restore fenv on mips.
based on initial work by rdp, with heavy modifications. some features
including threads are untested because qemu app-level emulation seems
to be broken and I do not have a proper system image for testing.
not heavily tested, but at least they don't seem to break anything on
soft float targets with or without coprocessors. they check the auxv
AT_HWCAP flags to determine which coprocessor, if any, is available.