musl/src/string, branch v1.1.16

disable use of arm memcpy asm if building as thumb code

2016-12-18T00:39:28+00:00

the thumb incompatibilities in the asm are probably only minor and
should be fixable, but for now just use the C version.

fix read past end of haystack buffer for short needles in memmem

2016-04-01T17:36:15+00:00

the two/three/four byte memmem specializations are not prepared to
handle haystacks shorter than the needle; they unconditionally read at
least up to the needle length and subtract from the haystack length.
if the haystack is shorter, the remaining haystack length underflows
and produces an unbounded search which will eventually either crash or
find a spurious match.

the top-level memmem function attempted to avoid this case already by
checking for haystack shorter than needle, but it failed to re-check
after using memchr to remove the maximal prefix not containing the
first byte of the needle.
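
as a rough illustration of the missing re-check, here is a minimal sketch, not the committed code. the names memmem_sketch and find2 are hypothetical, and the naive loop at the end only stands in for the real three-byte, four-byte and general search paths.

```c
#include <stddef.h>
#include <string.h>

/* hypothetical two-byte search: assumes k >= 2 on entry. if called
 * with k < 2, k - 2 wraps around and the loop runs past the buffer. */
static const unsigned char *find2(const unsigned char *h, size_t k,
                                  const unsigned char *n)
{
	size_t last = k - 2;
	for (size_t i = 0; i <= last; i++)
		if (h[i] == n[0] && h[i+1] == n[1]) return h + i;
	return 0;
}

const void *memmem_sketch(const void *h0, size_t k, const void *n0, size_t l)
{
	const unsigned char *h = h0, *n = n0;

	if (!l) return h0;     /* empty needle matches at the start */
	if (k < l) return 0;   /* first check: haystack shorter than needle */

	/* skip the maximal prefix not containing the first needle byte */
	const unsigned char *p = memchr(h, n[0], k);
	if (!p || l == 1) return p;
	k -= (size_t)(p - h);
	h = p;

	/* re-check: the remaining haystack may now be shorter than the needle */
	if (k < l) return 0;

	if (l == 2) return find2(h, k, n);

	/* naive fallback standing in for the other specializations */
	for (size_t i = 0; i + l <= k; i++)
		if (!memcmp(h + i, n, l)) return h + i;
	return 0;
}
```

with haystack "abcx" (length 4) and needle "xy", memchr leaves only one byte of haystack; without the second length check, find2 would be entered with k = 1.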

move arm-specific translation units out of arch/arm/src, to src/*/arm

2016-01-22T00:02:21+00:00

this is possible with the new build system that allows src/*/$(ARCH)/*
files which do not shadow a file in the parent directory, and yields a
more logical organization. eventually it will be possible to remove
arch/*/src from the build system.

adapt build of arm memcpy asm not to use .sub files

2016-01-20T00:35:05+00:00

this depends on commit 9f5eb77992b42d484d69e879d24ef86466f20f21, which
made it possible to use a .c file for arch-specific replacements, and on
commit 2f853dd6b9a95d5b13ee8f9df762125e0588df5d, the out-of-tree build
support, which made it so that src/*/$(ARCH)/* 'replacement' files get
used even if they don't match the base name of a .c file in the parent
directory.

remove non-working pre-armv4t support from arm asm

2015-11-10T03:36:38+00:00

the idea of the three-instruction sequence being removed was to be
able to return to thumb code when used on armv4t+ from a thumb caller,
but also to be able to run on armv4 without the bx instruction
available (in which case the low bit of lr would always be 0).
however, without compiler support for generating such a sequence from
C code, which does not exist and which there is unlikely to be
interest in implementing, there is little point in having it in the
asm, and it would likely be easier to add pre-armv4t support via
enhanced linker handling of R_ARM_V4BX than at the compiler level.

removing this code simplifies adding support for building libc in
thumb2-only form (for cortex-m).

convert arm memcpy asm to UAL, remove .word hacks

2015-11-05T22:21:33+00:00

contrary to commit 9367fe926196f407705bb07cd29c6e40eb1774dd, all
relevant gas versions actually do support .syntax unified.

reimplement strverscmp to fix corner cases

2015-06-23T00:29:57+00:00

this interface is non-standardized and is a GNU invention, and as
such, our implementation should match the behavior of the GNU
function. one peculiarity the old implementation got wrong was the
handling of all-zero digit sequences: they are supposed to compare
greater than digit sequences of which they are a proper prefix, as in
009 < 00.

in addition, high bytes were treated with char signedness rather than
as unsigned. this was wrong regardless of what the GNU function does
since the resulting order relation varied by arch.

the new strverscmp implementation makes explicit the cases where the
order differs from what strcmp would produce, of which there are only
two.
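
the following is a sketch of one way to structure such a comparison, written from the description above rather than copied from the tree; the name strverscmp_sketch is hypothetical. bytes are read as unsigned char throughout, and the two branches mark the cases where the result can differ from strcmp.

```c
#include <ctype.h>
#include <stddef.h>

int strverscmp_sketch(const char *l0, const char *r0)
{
	const unsigned char *l = (const void *)l0;
	const unsigned char *r = (const void *)r0;
	size_t i, dp, j;
	int z = 1;

	/* find the common prefix; dp is where its trailing digit run
	 * starts, z records whether those digits are all zeros */
	for (dp = i = 0; l[i] == r[i]; i++) {
		int c = l[i];
		if (!c) return 0;
		if (!isdigit(c)) dp = i + 1, z = 1;
		else if (c != '0') z = 0;
	}

	if (l[dp] != '0' && r[dp] != '0') {
		/* case 1: no leading zero, so the longer digit string
		 * is the greater one */
		for (j = i; isdigit(l[j]); j++)
			if (!isdigit(r[j])) return 1;
		if (isdigit(r[j])) return -1;
	} else if (z && dp < i && (isdigit(l[i]) || isdigit(r[i]))) {
		/* case 2: common digit prefix is all zeros, so a digit
		 * orders below a non-digit (including the terminator) */
		return (unsigned char)(l[i] - '0') - (unsigned char)(r[i] - '0');
	}

	/* everything else orders as strcmp would, on unsigned bytes */
	return l[i] - r[i];
}
```

under this sketch, "009" compares less than "00", "a9" compares less than "a10", and a string starting with byte 0x80 compares greater than "a" regardless of whether plain char is signed.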

remove potentially PIC-incompatible relocations from x86_64 and x32 asm

2015-04-19T01:18:23+00:00

analogous to commit 8ed66ecbcba1dd0f899f22b534aac92a282f42d5 for i386.

remove the last of possible-textrels from i386 asm

2015-04-19T00:45:39+00:00

none of these are actual textrels because of ld-time binding performed
by -Bsymbolic-functions, but I'm changing them with the goal of making
ld-time binding purely an optimization rather than relying on it for
semantic purposes.

in the case of memmove's call to memcpy, it is desirable anyway to
make explicit that the memmove asm assumes the forward-copying
behavior of the memcpy asm; if memcpy is ever changed, the semantic
mismatch will be apparent while editing memcpy.s.
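
to illustrate the dependency in C rather than asm, a minimal sketch follows. forward_copy and memmove_sketch are hypothetical names; the point is only that the overlapping dest-below-src case is correct solely because the helper copies from low to high addresses.

```c
#include <stddef.h>
#include <stdint.h>

/* hypothetical stand-in for the memcpy asm: copies strictly forward */
static void *forward_copy(void *d, const void *s, size_t n)
{
	unsigned char *dp = d;
	const unsigned char *sp = s;
	for (size_t i = 0; i < n; i++) dp[i] = sp[i];
	return d;
}

void *memmove_sketch(void *d, const void *s, size_t n)
{
	/* unsigned difference is >= n when d is below s or d is at or
	 * past s+n; in both cases a forward copy never overwrites
	 * source bytes it has yet to read */
	if ((uintptr_t)d - (uintptr_t)s >= n)
		return forward_copy(d, s, n);

	/* otherwise d lies inside [s, s+n): copy backward */
	unsigned char *dp = d;
	const unsigned char *sp = s;
	while (n) n--, dp[n] = sp[n];
	return d;
}
```

if forward_copy were ever changed to copy in another order, the first branch would silently corrupt overlapping moves, which is the mismatch the message refers to.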

overhaul optimized x86_64 memset asm

2015-02-26T07:07:08+00:00

on most cpu models, "rep stosq" has high overhead that makes it
undesirable for small memset sizes. the new code extends the
minimal-branch fast path for short memsets from size 15 up to size
126, and shrink-wraps this code path. in addition, "rep stosq" is
sensitive to misalignment. the cost varies with size and with cpu
model, but it has been observed performing 1.5 times slower when the
destination address is not aligned mod 16. the new code thus ensures
alignment mod 16, but also preserves any existing additional
alignment, in case there are cpu models where it is beneficial.
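
a C sketch of the overall shape, under stated assumptions: memset_sketch and bytes_to_align16 are hypothetical names, the byte loop on the short path stands in for the branch-minimal stores, and the 8-byte store loop stands in for "rep stosq". it is not a translation of the asm.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* bytes from p up to the next 16-byte boundary (0 if already aligned) */
static size_t bytes_to_align16(const void *p)
{
	return (size_t)(-(uintptr_t)p & 15);
}

void *memset_sketch(void *dest, int c, size_t n)
{
	unsigned char *d = dest;
	unsigned char b = (unsigned char)c;
	uint64_t w = 0x0101010101010101ULL * b;  /* byte replicated 8 times */

	/* short path: sizes up to 126 never reach the bulk store */
	if (n <= 126) {
		for (size_t i = 0; i < n; i++) d[i] = b;
		return dest;
	}

	/* fill the unaligned head so the bulk store starts aligned mod 16 */
	size_t head = bytes_to_align16(d);
	for (size_t i = 0; i < head; i++) d[i] = b;
	d += head;
	n -= head;

	/* bulk fill in 8-byte units, then the remaining tail bytes */
	for (; n >= 8; n -= 8, d += 8) memcpy(d, &w, 8);
	for (size_t i = 0; i < n; i++) d[i] = b;
	return dest;
}
```

because the head fill advances by at most 15 bytes to the next 16-byte boundary, a destination that is already aligned more strictly is left where it is.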

this version is based in part on changes proposed by Denys Vlasenko.