musl - an implementation of the standard library for Linux-based systems

Age	Commit message	Author	Lines
2012-04-19	fix really bad breakage in strtol, etc.: failure to accept leading spaces	Rich Felker	-4/+5
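
For context on the behavior this fix restores: ISO C requires `strtol` to skip leading white space (as classified by `isspace`) before the optional sign and digits. A minimal sketch of a caller that relies on this; the helper name `parse_long` is hypothetical and not part of musl:

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 and stores the value if at least one
 * digit was converted without overflow, 0 otherwise. */
int parse_long(const char *s, long *out)
{
    assert(s && out);
    char *end;
    errno = 0;
    long v = strtol(s, &end, 10);   /* skips leading isspace() characters */
    if (end == s || errno == ERANGE)
        return 0;                   /* nothing parsed, or out of range */
    *out = v;
    return 1;
}
```

With a conforming `strtol`, `parse_long(" \t\n-42", &v)` succeeds and stores -42, while an input consisting only of white space is rejected because no digits were consumed.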

2012-04-18	fix typo in exponent reading code for floats	Rich Felker	-1/+1
	this was basically harmless, but could have resulted in misreading inputs with more than a few gigabytes worth of digits.
2012-04-17	fix failure to read infinity in scanf	Rich Felker	-3/+4
	this code worked in strtod, but not in scanf. more evidence that i should design a better interface for discarding multiple tail characters than just calling unget repeatedly...
2012-04-17	fix failure of int parser to unget an initial mismatching character	Rich Felker	-0/+1

2012-04-16	use the new integer parser (FILE/shgetc based) for strtol, wcstol, etc.	Rich Felker	-127/+0

2012-04-16	new scanf implementation and corresponding integer parser/converter	Rich Felker	-0/+107
	advantages over the old code:
	- correct results for floating point (old code was bogus)
	- wide/regular scanf separated so scanf does not pull in wide code
	- well-defined behavior on integers that overflow dest type
	- support for %[a-b] ranges with %[ (impl-defined but widely used)
	- no intermediate conversion of fmt string to wide string
	- cleaner, easier to share code with strto* functions
	- better standards conformance for corner cases
	the old code remains in the source tree, as the wide versions of the scanf-family functions are still using it. it will be removed when no longer needed.
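
To illustrate the `%[a-b]` range feature mentioned above, here is a small sketch using `sscanf`. The helper name `split_word_number` is hypothetical. Range syntax inside `%[` is implementation-defined in ISO C, so the example assumes a libc that supports it, as the commit message says the new code does.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: split input such as "abc123" into a lowercase
 * word and a trailing decimal number. Returns the number of fields
 * converted (2 on full success). */
int split_word_number(const char *s, char word[8], int *num)
{
    assert(s && word && num);
    /* %7[a-z]: at most 7 characters from the range a..z, leaving room
     * for the terminating NUL in an 8-byte buffer. */
    return sscanf(s, "%7[a-z]%d", word, num);
}
```

The field width on `%[` matters: without it, a long run of matching characters would overflow the destination buffer.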
2012-04-16	fix buggy limiter handling in shgetc	Rich Felker	-4/+3
	this is needed for upcoming new scanf
2012-04-16	fix broken shgetc limiter logic (wasn't working)	Rich Felker	-2/+5

2012-04-16	floatscan: fix incorrect count of leading nonzero digits	Rich Felker	-1/+1
	this off-by-one error was causing values with just one digit past the decimal point to be treated by the integer case. in many cases it would yield the correct result, but if expressions are evaluated in excess precision, double rounding may occur.
2012-04-13	use fast version of the int reading code for the high-order digits too	Rich Felker	-3/+13
	this increases code size slightly, but it's considerably faster, especially for power-of-2 bases.
2012-04-13	use macros instead of inline functions in shgetc.h	Rich Felker	-20/+4
	at -Os optimization level, gcc refuses to inline these functions even though the inlined code would be roughly the same size as the function call, and much faster. the easy solution is to make them into macros.
2012-04-13	fix spurious overflows in strtoull with small bases	Rich Felker	-7/+3
	whenever the base was small enough that more than one digit could still fit after UINTMAX_MAX/36-1 was reached, only the first would be allowed; subsequent digits would trigger spurious overflow, making it impossible to read the largest values in low bases.
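
The class of bug described above can be avoided with an exact per-digit overflow test. The following is a sketch of one such loop, not musl's actual code; the function name `parse_umax` is hypothetical, and it assumes an ASCII-compatible character set where letters are contiguous.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical digit-accumulation loop. Parses the digits of s in the
 * given base (2..36) into *out. Returns 0 on success, -1 on overflow
 * or if no valid digit is found. */
int parse_umax(const char *s, unsigned base, uintmax_t *out)
{
    assert(s && out && base >= 2 && base <= 36);
    uintmax_t x = 0;
    int ndigits = 0;
    for (;; s++, ndigits++) {
        unsigned d;
        if (*s >= '0' && *s <= '9') d = (unsigned)(*s - '0');
        else if (*s >= 'a' && *s <= 'z') d = (unsigned)(*s - 'a') + 10;
        else if (*s >= 'A' && *s <= 'Z') d = (unsigned)(*s - 'A') + 10;
        else break;
        if (d >= base) break;
        /* x*base + d fits exactly when x <= (UINTMAX_MAX - d) / base,
         * so every representable value is accepted in every base. */
        if (x > (UINTMAX_MAX - d) / base) return -1;
        x = x * base + d;
    }
    if (!ndigits) return -1;
    *out = x;
    return 0;
}
```

Because the test is evaluated for each digit rather than against a single fixed threshold, the largest value remains readable even in base 2, where many digits follow the point at which a coarse threshold would have triggered.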
2012-04-12	remove magic numbers from floatscan	Rich Felker	-5/+5

2012-04-12	optimize more integer cases in floatscan; comment the whole procedure	Rich Felker	-8/+27

2012-04-11	revert invalid optimization in floatscan	Rich Felker	-2/+2

2012-04-11	fix stupid typo in floatscan that caused excess rounding of some values	Rich Felker	-1/+1

2012-04-11	optimize floatscan downscaler to skip results that won't be needed	Rich Felker	-2/+3
	when upscaling, even the very last digit is needed in cases where the input is exact; no digits can be discarded. but when downscaling, any digits less significant than the mantissa bits are destined for the great bitbucket; the only influence they can have is their presence (being nonzero). thus, we simply throw them away early. the result is nearly a 4x performance improvement for processing huge values. the particular threshold LD_B1B_DIG+3 is not chosen sharply; it's simply a "safe" distance past the significant bits. it would be nice to replace it with a sharp bound, but i suspect performance will be comparable (within a few percent) anyway.
2012-04-11	simplify/debloat radix point alignment code in floatscan	Rich Felker	-9/+4
	now that this is the first operation, it can rely on the circular buffer contents not being wrapped when it begins. we limit the number of digits read slightly in the initial parsing loops too so that this code does not have to consider the case where it might cause the circular buffer to wrap; this is perfectly fine because KMAX is chosen as a power of two for circular-buffer purposes and is much larger than it otherwise needs to be, anyway. these changes should not affect performance at all.
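
The reason a power-of-two KMAX suits a circular buffer is that wrapping reduces to a bit mask. A minimal sketch follows; the value 128 and the helper names are assumptions for illustration only.

```c
#include <assert.h>

#define KMAX 128u            /* assumed value; any power of two works */
#define MASK (KMAX - 1u)

/* Step forward or backward through a KMAX-entry circular buffer.
 * Unsigned arithmetic makes the wrap at both ends well defined. */
static inline unsigned ring_next(unsigned i)
{
    assert(i < KMAX);
    return (i + 1u) & MASK;
}

static inline unsigned ring_prev(unsigned i)
{
    assert(i < KMAX);
    return (i - 1u) & MASK;
}
```

If KMAX were not a power of two, each step would need a modulo or a compare-and-reset instead of a single AND.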
2012-04-11	optimize floatscan: avoid excessive upscaling	Rich Felker	-27/+27
	upscaling by even one step too much creates 3-29 extra iterations for the next loop. this is still suboptimal since it always goes by 2^29 rather than using a smaller upscale factor when nearing the target, but performance on common, small-magnitude, few-digit values has already more than doubled with this change. more optimizations on the way...
2012-04-11	fix incorrect initial count in shgetc when data is already buffered	Rich Felker	-1/+1

2012-04-11	fix bug parsing lone zero followed by junk, and hex float over-reading	Rich Felker	-6/+5

2012-04-10	fix float scanning of certain values ending in zeros	Rich Felker	-1/+3
	for example, "1000000000" was being read as "1" due to this loop exiting early. it's necessary to actually update z and zero the entries so that the subsequent rounding code does not get confused; before i did that, spurious inexact exceptions were being raised.
2012-04-10	fix potential overflow in exponent reading	Rich Felker	-1/+1
	note that there's no need for a precise cutoff, because exponents this large will always result in overflow or underflow (it's impossible to read enough digits to compensate for the exponent magnitude; even at a few nanoseconds per digit it would take hundreds of years).
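
The idea of an imprecise cutoff can be sketched as a saturating accumulator: once the exponent is already far beyond anything representable, further digits are consumed but ignored. The function name and the particular cap below are assumptions, not a copy of the musl source.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical sketch: read an optionally signed decimal exponent,
 * saturating instead of overflowing. */
long long read_exponent(const char *s)
{
    assert(s);
    int neg = 0;
    long long e = 0;
    if (*s == '+' || *s == '-')
        neg = (*s++ == '-');
    for (; *s >= '0' && *s <= '9'; s++) {
        /* Stop accumulating well before overflow. The cap need not be
         * sharp: any exponent this large yields overflow or underflow
         * regardless of its exact value. */
        if (e < LLONG_MAX / 100)
            e = 10 * e + (*s - '0');
    }
    return neg ? -e : e;
}
```

Since `e < LLONG_MAX / 100` holds before each update, `10 * e + 9` cannot exceed `LLONG_MAX`, so the arithmetic never overflows.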
2012-04-10	set errno properly when parsing floating point	Rich Felker	-4/+21

2012-04-10	add "scan helper getc" and rework strtod, etc. to use it	Rich Felker	-73/+111
	the immediate benefit is a significant debloating of the float parsing code by moving the responsibility for keeping track of the number of characters read to a different module. by linking shgetc with the stdio buffer logic, counting logic is deferred to buffer refill time, keeping the calls to shgetc fast and light. in the future, shgetc will also be useful for integrating the new float code with scanf, which needs to not only count the characters consumed, but also limit the number of characters read based on field width specifiers. shgetc may also become a useful tool for simplifying the integer parsing code.
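
The two responsibilities described above, counting consumed characters and enforcing a read limit, can be sketched with a string-backed reader. This is a simplified illustration: the type and function names are hypothetical, and unlike the real shgetc it counts on every call rather than deferring the bookkeeping to buffer refill time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>   /* EOF */

/* Hypothetical string-backed source; the real shgetc works on a FILE. */
struct scan_src {
    const char *pos;   /* next byte to read */
    size_t count;      /* bytes consumed so far */
    size_t limit;      /* maximum bytes to consume, 0 = unlimited */
};

/* Return the next byte, or EOF at end of input or once the limit
 * (for example a scanf field width) has been reached. */
int src_getc(struct scan_src *f)
{
    assert(f && f->pos);
    if ((f->limit && f->count >= f->limit) || !*f->pos)
        return EOF;
    f->count++;
    return (unsigned char)*f->pos++;
}

/* Push back the most recently read byte. */
void src_unget(struct scan_src *f)
{
    assert(f && f->count > 0);
    f->count--;
    f->pos--;
}
```

A parser written against this interface never tracks positions itself; after it returns, `count` says how much input was consumed, which is what strtod needs for its end pointer and what scanf needs for `%n`.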
2012-04-10	new floating point parser/converter	Rich Felker	-0/+446
	this version is intended to be fully conformant to the ISO C, POSIX, and IEEE standards for conversion of decimal/hex floating point strings to float, double, and long double (ld64 or ld80 only at present) values. in particular, all results are intended to be rounded correctly according to the current rounding mode. further, this implementation aims to set the floating point underflow, overflow, and inexact flags to reflect the conversion performed. a moderate amount of testing has been performed (by nsz and myself) prior to integration of the code in musl, but it still may have bugs. so far, only strto(d|ld|f) use the new code. scanf integration will be done as a separate commit, and i will add implementations of the wide character functions later.
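
From the caller's side, the standard behavior this parser targets can be exercised as follows. The wrapper name `parse_double` and its return convention are hypothetical; the `strtod` semantics it relies on (hex float input, `ERANGE` on overflow) come from ISO C.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical wrapper: returns 0 on success, -1 if nothing was
 * parsed, and 1 if the value was out of range (errno == ERANGE). */
int parse_double(const char *s, double *out)
{
    assert(s && out);
    char *end;
    errno = 0;
    double v = strtod(s, &end);   /* accepts decimal and 0x... hex floats */
    if (end == s)
        return -1;
    *out = v;                     /* on overflow this is +/-HUGE_VAL */
    return errno == ERANGE ? 1 : 0;
}
```

For example, `"0x1.8p1"` should parse to exactly 3.0, and `"1e99999"` should report a range error while storing an infinite or maximal value.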
2012-03-22	add creal/cimag macros in complex.h (and use them in the functions defs)	Rich Felker	-8/+0

2012-03-19	don't inline __rem_pio2l so the code size is smaller	nsz	-0/+1

2012-03-18	fix loads of missing const in new libm, and some global vars (?!) in powl	Rich Felker	-2/+2

2012-03-16	fix namespace issues for lgamma, etc.	Rich Felker	-0/+2
	standard functions cannot depend on nonstandard symbols
2012-03-13	first commit of the new libm!	Rich Felker	-0/+323
	thanks to the hard work of Szabolcs Nagy (nsz), identifying the best (from correctness and license standpoint) implementations from freebsd and openbsd and cleaning them up! musl should now fully support c99 float and long double math functions, and has near-complete complex math support. tgmath should also work (fully on gcc-compatible compilers, and mostly on any c99 compiler). based largely on commit 0376d44a890fea261506f1fc63833e7a686dca19 from nsz's libm git repo, with some additions (dummy versions of a few missing long double complex functions, etc.) by me. various cleanups still need to be made, including re-adding (if they're correct) some asm functions that were dropped.
2012-03-02	fix obscure bug in strtoull reading the highest 16 possible values	Rich Felker	-1/+1

2012-02-24	new attempt at working around the gcc 3 visibility bug	Rich Felker	-0/+7
	since gcc is failing to generate the necessary ".hidden" directive in the output asm, generate it explicitly with an __asm__ statement...
2012-02-24	remove useless attribute visibility from definitions	Rich Felker	-1/+1
	this was a failed attempt at working around the gcc 3 visibility bug affecting x86_64. subsequent patch will address it with an ugly but working hack.
2012-02-23	cleanup and work around visibility bug in gcc 3 that affects x86_64	Rich Felker	-6/+11
	in gcc 3, the visibility attribute must be placed on both the declaration and on the definition. if it's omitted from the definition, the compiler fails to emit the ".hidden" directive in the assembly, and the linker will either generate textrels (if supported, such as on i386) or refuse to link (on targets where certain types of textrels are forbidden or impossible without further assumptions about memory layout, such as on x86_64). this patch also unifies the decision about when to use visibility into libc.h and makes the visibility in the utf-8 state machine tables based on libc.h rather than a duplicate test.
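
The pattern described above, with the attribute on both the declaration and the definition, looks roughly like this. The macro name `HIDDEN` and the function names are hypothetical, and the compiler guard is a simplification of whatever test libc.h actually performs.

```c
#include <assert.h>

/* Hypothetical macro: expands to the visibility attribute on
 * GCC-compatible compilers and to nothing elsewhere. */
#if defined(__GNUC__)
#define HIDDEN __attribute__((visibility("hidden")))
#else
#define HIDDEN
#endif

/* Declaration, as it would appear in an internal header. */
HIDDEN int internal_helper(int x);

/* Definition: the attribute is repeated here because, per the commit
 * message, gcc 3 otherwise fails to emit the .hidden directive. */
HIDDEN int internal_helper(int x)
{
    return x + 1;
}

/* An exported entry point that uses the hidden helper. */
int public_entry(int x)
{
    assert(x >= 0);
    return internal_helper(x);
}
```

A hidden symbol is resolved within the shared object at link time, so calls to it need no dynamic relocation, which is why a missing `.hidden` directive showed up as textrels or link failures.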
2011-10-02	synchronize cond var destruction with exiting waits	Rich Felker	-0/+1

2011-09-28	improve pshared barriers	Rich Felker	-1/+1
	eliminate the sequence number field and instead use the counter as the futex.
	because of the way the lock is held, sequence numbers are completely useless, and this frees up a field in the barrier structure to be used as a waiter count for the count futex, which lets us avoid some syscalls in the best case.
	as of now, self-synchronized destruction and unmapping should be fully safe. before any thread can return from the barrier, all threads in the barrier have obtained the vm lock, and each holds a shared lock on the barrier. the barrier memory is not inspected after the shared lock count reaches 0, nor after the vm lock is released.
2011-09-27	process-shared barrier support, based on discussion with bdonlan	Rich Felker	-3/+5
	this implementation is rather heavy-weight, but it's the first solution i've found that's actually correct. all waiters actually wait twice at the barrier so that they can synchronize exit, and they hold a "vm lock" that prevents changes to virtual memory mappings (and blocks pthread_barrier_destroy) until all waiters are finished inspecting the barrier. thus, it is safe for any thread to destroy and/or unmap the barrier's memory as soon as pthread_barrier_wait returns, without further synchronization.
2011-09-26	fix lost signals in cond vars	Rich Felker	-0/+1
	due to moving waiters from the cond var to the mutex in bcast, these waiters upon wakeup would steal slots in the count from newer waiters that had not yet been signaled, preventing the signal function from taking any action. to solve the problem, we simply use two separate waiter counts, so that the original "total" waiters count is undisturbed by broadcast and still available for signal.
2011-09-26	cleanup various minor issues reported by nsz	Rich Felker	-3/+3
	the changes to syscall_ret are mostly no-ops in the generated code, just cleanup of type issues and removal of some implementation-defined behavior. the one exception is the change in the comparison value, which is fixed so that 0xf...f000 (which in principle could be a valid return value for mmap, although probably never in reality) is not treated as an error return.
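
The comparison fix can be illustrated with a sketch of the Linux convention, in which raw syscall return values from -4095 to -1 encode errors. The function name is hypothetical and this is not the musl source.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical sketch: translate a raw kernel return value into the
 * libc convention of -1 plus errno. */
long syscall_ret_sketch(unsigned long r)
{
    /* -4096UL is 0xf...f000. Only values strictly above it, that is
     * -4095..-1 when viewed as signed, are error codes; 0xf...f000
     * itself is passed through as an ordinary return value. */
    if (r > -4096UL) {
        errno = (int)-r;
        return -1;
    }
    return (long)r;
}
```

Using `>` rather than `>=` is the whole point: a page-aligned address of 0xf...f000 returned by mmap would otherwise be misreported as an error.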
2011-09-26	redo cond vars again, use sequence numbers	Rich Felker	-3/+3
	testing revealed that the old implementation, while correct, was giving way too many spurious wakeups due to races changing the value of the condition futex. in a test program with 5 threads receiving broadcast signals, the number of returns from pthread_cond_wait was roughly 3 times what it should have been (2 spurious wakeups for every legitimate wakeup). moreover, the magnitude of this effect seems to grow with the number of threads. the old implementation may also have had some nasty race conditions with reuse of the cond var with a new mutex. the new implementation is based on incrementing a sequence number with each signal event. this sequence number has nothing to do with the number of threads intended to be woken; it's only used to provide a value for the futex wait to avoid deadlock. in theory there is a danger of race conditions due to the value wrapping around after 2^32 signals. it would be nice to eliminate that, if there's a way. testing showed no spurious wakeups (though they are of course possible) with the new implementation, as well as slightly improved performance.
2011-09-25	new futex-requeue-based pthread_cond_broadcast implementation	Rich Felker	-3/+6
	this avoids the "stampede effect" where pthread_cond_broadcast would result in all waiters waking up simultaneously, only to immediately contend for the mutex and go back to sleep.
2011-09-22	fix deadlock in condition wait whenever there are multiple waiters	Rich Felker	-0/+1
	it's amazing none of the conformance tests i've run even bothered to check whether something so basic works...
2011-09-18	initial commit of the arm port	Rich Felker	-0/+15
	this port assumes eabi calling conventions, eabi linux syscall convention, and presence of the kernel helpers at 0xffff0f?0 needed for threads support. otherwise it makes very few assumptions, and the code should work even on armv4 without thumb support, as well as on systems with thumb interworking. the bits headers declare this a little endian system, but as far as i can tell the code should work equally well on big endian. some small details are probably broken; so far, testing has been limited to qemu/aboriginal linux.
2011-09-18	overhaul clone syscall wrapping	Rich Felker	-2/+1
	several things are changed. first, i have removed the old __uniclone function signature and replaced it with the "standard" linux __clone/clone signature. this was necessary to expose clone to applications anyway, and it makes it easier to port __clone to new archs, since it's now testable independently of pthread_create.
	secondly, i have removed all references to the ugly ldt descriptor structure (i386 only) from the c code and pthread structure. in places where it is needed, it is now created on the stack just when it's needed, in assembly code. thus, the i386 __clone function takes the desired thread pointer as its argument, rather than an ldt descriptor pointer, just like on all other sane archs.
	this should not affect applications since there is really no way an application can use clone with threads/tls in a way that doesn't horribly conflict with and clobber the underlying implementation's use. applications are expected to use clone only for creating actual processes, possibly with new namespace features and whatnot.
2011-08-23	security hardening: ensure suid programs have valid stdin/out/err	Rich Felker	-2/+4
	this behavior (opening fds 0-2 for a suid program) is explicitly allowed (but not required) by POSIX to protect badly-written suid programs from clobbering files they later open. this commit does add some cost in startup code, but the availability of auxv and the security flag will be useful elsewhere in the future. in particular auxv is needed for static-linked vdso support, which is still waiting to be committed (sorry nik!)
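
One way to implement the hardening described above is to probe each of fds 0-2 and open /dev/null on any that is closed. This sketch uses `fcntl(fd, F_GETFD)` as the probe, which is an assumption about approach rather than a description of what musl does; the function name is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical sketch: make sure fds 0, 1 and 2 refer to something.
 * Returns 0 on success, -1 if /dev/null could not be opened. */
int ensure_std_fds(void)
{
    for (int fd = 0; fd <= 2; fd++) {
        /* F_GETFD fails with EBADF only when fd is not open. */
        if (fcntl(fd, F_GETFD) != -1 || errno != EBADF)
            continue;
        int nfd = open("/dev/null", O_RDWR);
        if (nfd < 0)
            return -1;
        /* open() normally returns the lowest free descriptor, which is
         * fd here; handle the unexpected case anyway. */
        if (nfd != fd) {
            int r = dup2(nfd, fd);
            close(nfd);
            if (r < 0)
                return -1;
        }
    }
    return 0;
}
```

Without this, a suid program started with fd 2 closed could open a sensitive file that lands on fd 2 and then corrupt it by writing an error message.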
2011-08-12	pthread and synccall cleanup, new __synccall_wait op	Rich Felker	-0/+2
	fix up clone signature to match the actual behavior. the new __synccall_wait function allows a __synccall callback to wait for other threads to continue without returning, so that it can resume action after the caller finishes. this interface could be made significantly more general/powerful with minimal effort, but i'll wait to do that until it's actually useful for something.
2011-08-06	simplify multi-threaded errno, eliminate useless function pointer	Rich Felker	-2/+1

2011-08-06	use weak aliases rather than function pointers to simplify some code	Rich Felker	-2/+0

2011-08-03	overhaul rwlocks to address several issues	Rich Felker	-4/+2
	like mutexes and semaphores, rwlocks suffered from a race condition where the unlock operation could access the lock memory after another thread successfully obtained the lock (and possibly destroyed or unmapped the object). this has been fixed in the same way it was fixed for other lock types.
	in addition, the previous implementation favored writers over readers. in the absence of other considerations, that is the best behavior for rwlocks, and posix explicitly allows it. however posix also requires read locks to be recursive. if writers are favored, any attempt to obtain a read lock while a writer is waiting for the lock will fail, causing "recursive" read locks to deadlock. this can be avoided by keeping track of which threads already hold read locks, but doing so requires unbounded memory usage, and there must be a fallback case that favors readers in case memory allocation failed. and all of this must be synchronized. the cost, complexity, and risk of errors in getting it right is too great, so we simply favor readers.
	tracking of the owner of write locks has been removed, as it was not useful for anything. it could allow deadlock detection, but it's not clear to me that returning EDEADLK (which a buggy program is likely to ignore) is better than deadlocking; at least the latter behavior prevents further data corruption. a correct program cannot invoke this situation anyway.
	the reader count and write lock state, as well as the "last minute" waiter flag have all been combined into a single atomic lock. this means all state transitions for the lock are atomic compare-and-swap operations. this makes establishing correctness much easier and may improve performance.
	finally, some code duplication has been cleaned up. more is called for, especially the standard __timedwait idiom repeated in all locks.
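
The idea of keeping the whole lock state in one word, so that every transition is a single compare-and-swap, can be sketched with C11 atomics. The encoding below (0 unlocked, -1 write-locked, n > 0 meaning n readers) and all names are assumptions for illustration. The sketch covers only the non-blocking try operations and omits the waiter flag, futex waits, and reader-count overflow handling that a real implementation needs.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical single-word rwlock state:
 * 0 = unlocked, -1 = write-locked, n > 0 = held by n readers. */
typedef struct { atomic_int state; } rw_sketch;

/* Readers are favored: a read lock succeeds whenever no writer
 * currently holds the lock, so recursive read locks cannot deadlock. */
int rw_tryrdlock(rw_sketch *l)
{
    int v = atomic_load(&l->state);
    do {
        if (v < 0)
            return -1;                 /* a writer holds the lock */
    } while (!atomic_compare_exchange_weak(&l->state, &v, v + 1));
    return 0;
}

/* A write lock succeeds only from the fully unlocked state. */
int rw_trywrlock(rw_sketch *l)
{
    int expected = 0;
    return atomic_compare_exchange_strong(&l->state, &expected, -1) ? 0 : -1;
}

/* A writer resets the state to 0; a reader decrements the count. */
void rw_unlock(rw_sketch *l)
{
    int v = atomic_load(&l->state);
    assert(v != 0);                    /* unlocking an unlocked lock is a bug */
    while (!atomic_compare_exchange_weak(&l->state, &v, v < 0 ? 0 : v - 1))
        ;                              /* v is refreshed on each failure */
}
```

Because the reader count and the writer flag live in the same word, there is no window in which one has been updated and the other has not, which is what makes reasoning about correctness easier.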