| Age | Commit message (Collapse) | Author | Lines | 
|---|
|  |  | 
|  |  | 
|  | at present the i386 code does not support sse floating point, which is
not part of the standard i386 abi. while it may be desirable to
support it later, doing so will reduce performance and require some
tricks to probe if sse support is present.
this first commit is i386-only, but it should be trivial to port the
asm to x86_64. | 
|  | even if size_t was 32-bit already, the fact that the value was
unsigned and that gcc is too stupid to figure out it would be positive
as a signed quantity (due to the immediately-prior arithmetic and
conditionals) results in gcc compiling the integer-to-float conversion
as zero extension to 64 bits followed by an "fildll" (64 bit)
instruction rather than a simple "fildl" (32 bit) instruction on x86.
reportedly fildll is very slow on certain p4-class machines; even if
not, the new code is slightly smaller. | 
|  |  | 
|  |  | 
|  |  | 
|  |  | 
|  |  | 
|  |  | 
|  |  | 
|  |  | 
|  |  | 
|  | basically there are 3 choices for how to implement this variable-size
string member:
1. C99 flexible array member: breaks using dirent.h with pre-C99 compiler.
2. old way: length-1 string: generates array bounds warnings in caller.
3. new way: length-NAME_MAX string. no problems, simplifies all code.
of course the usable part in the pointer returned by readdir might be
shorter than NAME_MAX+1 bytes, but that is allowed by the standard and
doesn't hurt anything. | 
|  | this actually inadvertently disallows some valid patterns with
redundant / or * characters, but it's better than allowing unbounded
vla allocation.
eventually i'll write code to move the pattern to the stack and
eliminate redundancy to ensure that it fits in PATH_MAX at the
beginning of glob. this would also allow it to be modified in place
for passing to fnmatch rather than copied at each level of recursion. | 
|  |  | 
|  | there is a resource limit of 0 bits to store the concurrency level
requested. thus any positive level exceeds a resource limit, resulting
in EAGAIN. :-) | 
|  |  | 
|  |  | 
|  |  | 
|  |  | 
|  |  | 
|  |  | 
|  | file actions are not yet implemented, but everything else should be
mostly complete and roughly correct. | 
|  |  | 
|  | also modify wcsncpy to use the same loop logic | 
|  |  | 
|  |  | 
|  | the observed symptom was that the code was incorrectly rounding up
1.0625 to 1.063 despite the rounding mode being round-to-nearest with
ties broken by rounding to even last place. however, the code was just
not right in many respects, and i'm surprised it worked as well as it
did. this time i tested the values that end up in the variables round,
small, and the expression round+small, and all look good. | 
|  |  | 
|  |  | 
|  |  | 
|  | the new approach relies on the fact that the only ways to create
sigset_t objects without invoking UB are to use the sig*set()
functions, or from the masks returned by sigprocmask, sigaction, etc.
or in the ucontext_t argument to a signal handler. thus, as long as
sigfillset and sigaddset avoid adding the "protected" signals, there
is no way the application will ever obtain a sigset_t including these
bits, and thus no need to add the overhead of checking/clearing them
when sigprocmask or sigaction is called.
note that the old code actually *failed* to remove the bits from
sa_mask when sigaction was called.
the new implementations are also significantly smaller, simpler, and
faster due to ignoring the useless "GNU HURD signals" 65-1024, which
are not used and, if there's any sanity in the world, never will be
used. | 
|  | these should be tweaked according to testing. offhand i know 1000 is
too low and 5000 is likely to be sufficiently high. consider trying to
add futexes to file locking, too... | 
|  |  | 
|  | the previous implementation had at least 2 problems:
1. the case where additional threads reached the barrier before the
first wave was finished leaving the barrier was untested and seemed
not to be working.
2. threads leaving the barrier continued to access memory within the
barrier object after other threads had successfully returned from
pthread_barrier_wait. this could lead to memory corruption or crashes
if the barrier object had automatic storage in one of the waiting
threads and went out of scope before all threads finished returning,
or if one thread unmapped the memory in which the barrier object
lived.
the new implementation avoids both problems by making the barrier
state essentially local to the first thread which enters the barrier
wait, and forces that thread to be the last to return. | 
|  | the previous fix was incorrect, as it would prevent f->close(f) from
being called if fflush(f) failed. i believe this was the original
motivation for using | rather than ||. so now let's just use a second
statement to constrain the order of function calls, and to back to
using |. | 
|  | pcc turned up this bug by calling f->close(f) before fflush(f),
resulting in lost output and error on flush. | 
|  | with this patch, musl compiles and mostly works with pcc 1.0.0. a few
tests are still failing and i'm uncertain whether they are due to
portability problems in musl, or bugs in pcc, but i suspect the
latter. | 
|  |  | 
|  |  | 
|  | the old versions worked, but conflicted with programs which declared
their own prototypes and generated warnings with some versions of gcc. | 
|  | Smoothsort is an adaptive variant of heapsort. This version was
written by Valentin Ochs (apo) specifically for inclusion in musl. I
worked with him to get it working in O(1) memory usage even with giant
array element widths, and to optimize it heavily for size and speed.
It's still roughly 4 times as large as the old heap sort
implementation, but roughly 20 times faster given an almost-sorted
array of 1M elements (20 being the base-2 log of 1M), i.e. it really
does reduce O(n log n) to O(n) in the mostly-sorted case. It's still
somewhat slower than glibc's Introsort for random input, but now
considerably faster than glibc when the input is already sorted, or
mostly sorted. | 
|  |  | 
|  |  | 
|  |  | 
|  | 1. failed match of literal chars from the format string would always
return matching failure rather than input failure at eof, leading to
infinite loops in some programs.
2. unread of eof would wrongly adjust the character counts reported by
%n, yielding an off-by-one error. | 
|  |  | 
|  |  | 
|  |  |