musl/src/malloc, branch v0.9.0

ditch the priority inheritance locks; use malloc's version of lock

2012-04-24T20:32:23+00:00

i did some testing trying to switch malloc to use the new internal
lock with priority inheritance, and my malloc contention test got
20-100 times slower. if priority inheritance futexes are this slow,
it's simply too high a price to pay for avoiding priority inversion.
maybe we can consider them somewhere down the road once the kernel
folks get their act together on this (and perferably don't link it to
glibc's inefficient lock API)...

as such, i've switch __lock to use malloc's implementation of
lightweight locks, and updated all the users of the code to use an
array with a waiter count for their locks. this should give optimal
performance in the vast majority of cases, and it's simple.

malloc is still using its own internal copy of the lock code because
it seems to yield measurably better performance with -O3 when it's
inlined (20% or more difference in the contention stress test).

fix issue with excessive mremap syscalls on realloc

2011-11-17T04:59:28+00:00

CHUNK_SIZE macro was defined incorrectly and shaving off at least one
significant bit in the size of mmapped chunks, resulting in the test
for oldlen==newlen always failing and incurring a syscall. fortunately
i don't think this issue caused any other observable behavior; the
definition worked correctly for all non-mmapped chunks where its
correctness matters more, since their lengths are always multiples of
the alignment.

use new a_crash() asm to optimize double-free handler.

2011-08-23T13:43:45+00:00

gcc generates extremely bad code (7 byte immediate mov) for the old
null pointer write approach. it should be generating something like
"xor %eax,%eax ; mov %al,(%eax)". in any case, using a dedicated
crashing opcode accomplishes the same thing in one byte.

simplify and improve double-free check

2011-08-15T05:59:15+00:00

a valid mmapped block will have an even (actually aligned) "extra"
field, whereas a freed chunk on the heap will always have an in-use
neighbor.

this fixes a potential bug if mmap ever allocated memory below the
main program/brk (in which case it would be wrongly-detected as a
double-free by the old code) and allows the double-free check to work
for donated memory outside of the brk area (or, in the future,
secondary heap zones if support for their creation is added).

posix_memalign should fail if size is not a multiple of sizeof(void *)

2011-06-29T23:26:30+00:00

eliminate OOB array hacks in malloc

2011-06-26T20:12:43+00:00

malloc: cast size down to int in bin_index functions

2011-06-12T14:53:42+00:00

even if size_t was 32-bit already, the fact that the value was
unsigned and that gcc is too stupid to figure out it would be positive
as a signed quantity (due to the immediately-prior arithmetic and
conditionals) results in gcc compiling the integer-to-float conversion
as zero extension to 64 bits followed by an "fildll" (64 bit)
instruction rather than a simple "fildl" (32 bit) instruction on x86.
reportedly fildll is very slow on certain p4-class machines; even if
not, the new code is slightly smaller.

use volatile pointers for intentional-crash code.

2011-06-06T22:10:43+00:00

namespace fixes for sys/mman.h

2011-04-20T19:55:58+00:00

fix rare but nasty under-allocation bug in malloc with large requests

2011-04-04T21:26:41+00:00

the bug appeared only with requests roughly 2*sizeof(size_t) to
4*sizeof(size_t) bytes smaller than a multiple of the page size, and
only for requests large enough to be serviced by mmap instead of the
normal heap. it was only ever observed on 64-bit machines but
presumably could also affect 32-bit (albeit with a smaller window of
opportunity).