POSIX requires setvbuf to return non-zero if `mode` is not one of _IONBF,
_IOLBF, or _IOFBF.
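
as an illustration only, the mode check can be sketched as below. the helper name sketch_check_mode is hypothetical and this is not the actual setvbuf code, just the rule stated above in isolation.

```c
#include <stdio.h>

/* Hypothetical helper: returns 0 if mode is one of the three valid
 * setvbuf buffering modes, and non-zero otherwise. */
static int sketch_check_mode(int mode)
{
	switch (mode) {
	case _IONBF:
	case _IOLBF:
	case _IOFBF:
		return 0;
	default:
		return -1;
	}
}
```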
the way gets was implemented in terms of fgets, it used the location
of the null termination to determine where to find and remove the
newline, if any. an embedded null byte prevented this from working.
this also fixes a one-byte buffer overflow, whereby when gets read an
N-byte line (not counting newline), it would store two null
terminators for a total of N+2 bytes. it's unlikely that anyone would
care that a function whose use is pretty much inherently a buffer
overflow writes too much, but it could break the only possible correct
uses of this function, in conjunction with input of known format from
a trusted/same-privilege-domain source, where the buffer length may
have been selected to exactly match a line length contract.
there seems to be no correct way to implement gets in terms of a
single call to fgets or scanf, and using multiple calls would require
explicit locking, so we might as well just write the logic out
explicitly character-at-a-time. this isn't fast, but nobody cares if a
catastrophically unsafe function that's so bad it was removed from the
C language is fast.
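
a minimal sketch of the character-at-a-time approach follows. sketch_gets is a hypothetical stand-in that takes the stream as a parameter and relies on getc locking per byte; a real implementation would presumably hold the stream lock across the whole loop.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical stand-in for gets reading from an explicit stream.
 * Like gets, it cannot know the buffer size and is unsafe by design. */
static char *sketch_gets(char *s, FILE *f)
{
	size_t i = 0;
	int c;

	/* Copy bytes verbatim, including embedded null bytes, until
	 * newline or EOF. The newline itself is not stored. */
	while ((c = getc(f)) != EOF && c != '\n')
		s[i++] = (char)c;

	/* Exactly one terminator: an N-byte line occupies N+1 bytes. */
	s[i] = 0;

	/* Fail on a read error, or on EOF before any byte was read. */
	if (c != '\n' && (!feof(f) || i == 0))
		return NULL;
	return s;
}
```

since the end of the line is tracked by the index rather than by searching for a null byte, an embedded null in the input no longer confuses newline removal.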
commit ddc947eda311331959c73dbc4491afcfe2326346 fixed the
corresponding bug for exit which was introduced when commit
0b80a7b0404b6e49b0b724e3e3fe0ed5af3b08ef added support for
caller-provided buffers, making it possible for stderr to be a
fflush(NULL) and __stdio_exit lock individual FILEs while holding the
open file list lock to walk the list. since fclose first locked the
FILE to be closed, then the ofl lock, it could deadlock with these
also, because fclose removed the FILE to be closed from the open file
list before flushing and closing it, a concurrent fclose or exit could
complete successfully before fclose flushed the FILE it was closing,
resulting in data loss.
reorder the body of fclose to first flush and close the file, then
remove it from the open file list only after unlocking it. this
creates a window where consumers of the open file list can see dead
FILE objects, but in the absence of undefined behavior on the part of
the application, such objects will be in an inactive-buffer state and
processing them will have no side effects.
__unlist_locked_file is also moved so that it's performed only for
non-permanent files. this change is not necessary, but preserves
consistency (and thereby provides safety/hardening) in the case where
an application uses one of the standard streams after closing it while
holding an explicit lock on it. such usage is of course undefined
check whether the lock is free before loading the calling thread's
tid. if so, just use a dummy tid value that cannot compare equal to
any actual thread id (because it's one bit wider). this also avoids
the need to save the tid and pass it to locking_getc or locking_putc,
reducing register pressure.
this change might slightly hurt the case where the caller already
holds the lock, but it does not affect the single-threaded case, and
may significantly improve the multi-threaded case, especially on archs
where loading the thread pointer is disproportionately expensive like
early mips and arm ISA levels. but even on i386 it helps, at least on
some machines; I measured roughly a 10-15% improvement.
commit d664061adb4d7f6647ab2059bc351daa394bf5da inadvertently omitted
the new file putc.h.
by ABI, the public stdin/out/err macros use extern pointer objects,
and this is necessary to avoid copy relocations that would be
expensive and make the size of the FILE structure part of the ABI.
however, internally it makes sense to access the underlying FILE
objects directly. this avoids both an indirection through the GOT to
find the address of the stdin/out/err pointer objects (which can't be
computed PC-relative because they may have been moved to the main
program by copy relocations) and an indirection through the resulting
in most places this is just a minor optimization, but in the case of
getchar and putchar (and the unlocked versions thereof), ipa constant
propagation makes all accesses to members of stdin/out PC-relative or
GOT-relative, possibly reducing register pressure as well.
this is the analog of commit dd8f02b7dce53d6b1c4282439f1636a2d63bee01,
but for putc.
with these changes, in a program that has not created any threads
besides the main thread and that has not called f[try]lockfile, getc
performs indistinguishably from getc_unlocked. this was measured on
several i386 and x86_64 models, and should hold on other archs too
simply by the properties of the code generation.
the case where the caller already holds the lock (via flockfile) is
improved significantly as well (40-60% reduction in time on machines
tested) and the case where locking is needed is improved somewhat
the key technique used here is forcing the non-hot path out-of-line
and enabling it to be a tail call. a static noinline function
(conditional on __GNUC__) is used rather than the extern hiddens used
elsewhere for this purpose, so that the compiler can choose
non-default calling conventions, making it possible to tail-call to a
callee that takes more arguments than the caller on archs where
arguments are passed on the stack or must have space reserved on the
stack for spilling them. the tid could just be reloaded via the thread
pointer in locking_getc, but that would be ridiculously expensive on
some archs where thread pointer load requires a trap or syscall.
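
the shape of this technique can be sketched with a toy stream type. everything here (struct toy_stream, the toy_ functions, SKETCH_NOINLINE) is hypothetical and only models the control flow; the lock handling in the cold path is a placeholder, not real locking.

```c
#include <stddef.h>

#if defined(__GNUC__)
#define SKETCH_NOINLINE __attribute__((__noinline__))
#else
#define SKETCH_NOINLINE
#endif

/* Toy stream: lock < 0 means locking is disabled, otherwise it holds
 * the owner's tid, or 0 when the lock is free. */
struct toy_stream {
	int lock;
	const unsigned char *rpos, *rend;
	int slow_calls;   /* counts entries into the cold path */
};

static int toy_getc_unlocked(struct toy_stream *f)
{
	return f->rpos != f->rend ? *f->rpos++ : -1;
}

/* Cold path, forced out of line. It takes one more argument than a
 * real getc would, which a static function's calling convention can
 * accommodate. Assigning to lock merely stands in for acquire/release. */
static SKETCH_NOINLINE int toy_locking_getc(struct toy_stream *f, int tid)
{
	int c;
	f->slow_calls++;
	f->lock = tid;
	c = toy_getc_unlocked(f);
	f->lock = 0;
	return c;
}

/* Hot path: no locking needed if locking is disabled or the caller
 * already owns the lock; otherwise tail-call the cold path. */
static inline int toy_getc(struct toy_stream *f, int tid)
{
	int l = f->lock;
	if (l < 0 || (l && l == tid))
		return toy_getc_unlocked(f);
	return toy_locking_getc(f, tid);
}
```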
don't repeat definition in two places.
The condition occurs when
- thread #1 is holding the lock
- thread #2 is waiting for it on __futexwait
- thread #1 is about to release the lock and performs a_swap
- thread #3 enters the __lockfile function and manages to grab the lock
before thread #1 calls __wake, resetting the MAYBE_WAITERS flag
- thread #1 calls __wake
- thread #2 wakes up but goes again to __futexwait as the lock is
held by thread #3
- thread #3 releases the lock but does not call __wake as the
MAYBE_WAITERS flag is not set
This condition results in thread #2 not being woken up. This patch fixes
the problem by making the woken up thread ensure that the flag is
properly set before going to sleep again.
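
The sequence above can be replayed in a single-threaded model of the lock word. This is only a sketch: the model_ functions are hypothetical, the flag value is an assumption, and plain assignments stand in for the atomic operations and futex calls.

```c
#include <assert.h>

#define MAYBE_WAITERS 0x40000000   /* assumed flag bit for this model */

/* Models an atomic compare-and-swap from 0 to tid; 1 on success. */
static int model_trylock(int *l, int tid)
{
	if (*l != 0) return 0;
	*l = tid;
	return 1;
}

/* A waiter flags the held lock before going to sleep. */
static void model_mark_waiters(int *l)
{
	if (*l != 0) *l |= MAYBE_WAITERS;
}

/* Models the atomic swap to 0; returns 1 if the releasing thread
 * sees the flag and therefore must call wake. */
static int model_unlock(int *l)
{
	int old = *l;
	*l = 0;
	return (old & MAYBE_WAITERS) != 0;
}

/* Replays the interleaving; returns whether thread #3's unlock
 * results in a wake for the still-waiting thread #2. */
static int model_replay(int with_fix)
{
	int l = 0, ok, woke;

	ok = model_trylock(&l, 1);     /* thread #1 holds the lock */
	assert(ok);
	model_mark_waiters(&l);        /* thread #2 flags it and sleeps */
	woke = model_unlock(&l);       /* thread #1 swaps the lock to 0 */
	assert(woke);
	ok = model_trylock(&l, 3);     /* thread #3 wins; the flag is gone */
	assert(ok);
	ok = model_trylock(&l, 2);     /* thread #2 wakes, fails to lock */
	assert(!ok);
	if (with_fix)
		model_mark_waiters(&l);    /* re-set the flag before sleeping */
	return model_unlock(&l);       /* thread #3 releases */
}
```

In this model, model_replay(0) returns 0 (thread #2 is never woken) and model_replay(1) returns 1.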
Maintainer's note: This fixes a regression introduced in commit
commit b114190b29417fff6f701eea3a3b3b6030338280 introduced spurious
realloc of the output buffer in cases where the result would exactly
fit in the caller-provided buffer. this is contrary to a strict
reading of the spec, which only allows realloc when the provided
buffer is "of insufficient size".
revert the adjustment of the realloc threshold, and instead push the
byte read by getc_unlocked (for which the adjustment was made) back
into the stdio buffer if it does not fit in the output buffer, to be
read in the next loop iteration.
in order not to leave a pushed-back byte in the stdio buffer if
realloc fails (which would violate the invariant that logical FILE
position and underlying open file description offset match for
unbuffered FILEs), the OOM code path must be changed. it would suffice
to move just one byte in this case, but from a QoI perspective, in the
event of ENOMEM the entire output buffer (up to the allocated length
reported via *n) should contain bytes read from the FILE stream.
otherwise the caller has no way to distinguish truncated data from
uninitialized buffer space.
the SIZE_MAX/2 check is removed since the sum of disjoint object sizes
is assumed not to be able to overflow, leaving just one OOM code path.
morally, for null pointers a and b, a-b, a<b, and a>b should all be
defined as 0; however, C does not define any of them.
the stdio implementation makes heavy use of such pointer comparison
and subtraction for buffer logic, and also uses null pos/base/end
pointers to indicate that the FILE is not in the corresponding (read
or write) mode ready for accesses through the buffer.
all of the comparisons are fixed trivially by using != in place of the
relational operators, since the opposite relation (e.g. pos>end) is
logically impossible. the subtractions have been reviewed to check
that they are conditional on the stream being in the appropriate reading-
or writing-through-buffer mode, with checks added where needed.
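
a small sketch of the pattern, using a hypothetical toy_rbuf type rather than the real FILE structure: equality comparison of two null pointers is defined, and the subtraction is only reached when the buffer is active.

```c
#include <stddef.h>

/* Toy read buffer: both pointers are null when the stream is not in
 * read mode. */
struct toy_rbuf {
	const unsigned char *rpos, *rend;
};

/* Number of buffered bytes available for reading. */
static size_t toy_avail(const struct toy_rbuf *b)
{
	/* rpos > rend cannot happen, so != is equivalent to < here, and
	 * unlike < it is defined when both pointers are null. */
	if (b->rpos != b->rend)
		return (size_t)(b->rend - b->rpos);
	return 0;
}
```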
in fgets and getdelim, the checks added should improve performance for
unbuffered streams by avoiding a do-nothing call to memchr, and should
be negligible for buffered streams.
if EINVAL or ENOMEM happened before the first getc_unlocked, it was
possible that the stream orientation had not yet been set.
this further reduces the number of source files which need to include
libc.h and thereby be potentially exposed to libc global state and
this will also facilitate further improvements like adding an inline
fast-path, if we want to do so later.
the LFS64 macro was not self-documenting and barely saved any
characters. simply use weak_alias directly so that it's clear what's
being done, and doesn't depend on a header to provide a strange macro.
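
for illustration, a direct use of weak_alias might look like the sketch below. it assumes a GCC-compatible compiler targeting ELF, the macro definition shown is one common way to write it, and toy_tell/toy_tell64 are hypothetical names.

```c
/* Assumes GCC-compatible attributes on an ELF target; alias
 * attributes are not portable C. */
#define weak_alias(old, new) \
	extern __typeof(old) new __attribute__((__weak__, __alias__(#old)))

/* Hypothetical function standing in for an interface that has a
 * 64-suffixed twin. */
long long toy_tell(long long base, long long off)
{
	return base + off;
}

/* The alias is spelled out at the point of definition, so a reader
 * sees both symbol names without consulting a macro elsewhere. */
weak_alias(toy_tell, toy_tell64);
```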
libc.h was intended to be a header for access to global libc state and
related interfaces, but ended up included all over the place because
it was the way to get the weak_alias macro. most of the inclusions
removed here are places where weak_alias was needed. a few were
recently introduced for hidden. some go all the way back to when
libc.h defined CANCELPT_BEGIN and _END, and all (wrongly implemented)
cancellation points had to include it.
remaining spurious users are mostly callers of the LOCK/UNLOCK macros
and files that use the LFS64 macro to define the awful *64 aliases.
in a few places, new inclusion of libc.h is added because several
internal headers no longer implicitly include libc.h.
declarations for __lockfile and __unlockfile are moved from libc.h to
stdio_impl.h so that the latter does not need libc.h. putting them in
libc.h made no sense at all, since the macros in stdio_impl.h are
needed to use them correctly anyway.
commits leading up to this one have moved the vast majority of
libc-internal interface declarations to appropriate internal headers,
allowing them to be type-checked and setting the stage to limit their
visibility. the ones that have not yet been moved are mostly
namespace-protected aliases for standard/public interfaces, which
exist to facilitate implementing plain C functions in terms of POSIX
functionality, or C or POSIX functionality in terms of extensions that
are not standardized. some don't quite fit this description, but are
"internally public" interfaces between subsystems of libc.
rather than create a number of newly-named headers to declare these
functions, and having to add explicit include directives for them to
every source file where they're needed, I have introduced a method of
wrapping the corresponding public headers.
parallel to the public headers in $(srcdir)/include, we now have
wrappers in $(srcdir)/src/include that come earlier in the include
path order. they include the public header they're wrapping, then add
declarations for namespace-protected versions of the same interfaces
and any "internally public" interfaces for the subsystem they
along these lines, the wrapper for features.h is now responsible for
the definition of the hidden, weak, and weak_alias macros. this means
source files will no longer need to include any special headers to
access these features.
over time, it is my expectation that the scope of what is "internally
public" will expand, reducing the number of source files which need to
include *_impl.h and related headers down to those which are actually
implementing the corresponding subsystems, not just using them.
this function is glue for linking dependency logic.
logically these belong to the intersection of the stdio and pthread
subsystems, and either place the declarations could go (stdio_impl.h
or pthread_impl.h) requires a forward declaration for one of the
policy is that all public functions which have a public declaration
should be defined in a context where that public declaration is
visible, to avoid preventable type mismatches.
an audit performed using GCC's -Wmissing-declarations turned up the
violations corrected here. in some cases the public header had not
been included; in others, a feature test macro needed to make the
declaration visible had been omitted.
in the case of gethostent and getnetent, the omission seems to have
been intentional, as a hack to admit a single stub definition for both
functions. this kind of hack is no longer acceptable; it's UB and
would not fly with LTO or advanced toolchains. the hack is undone to
make exposure of the declarations possible.
this requirement is specified by POSIX.
if no output is produced, no underlying fwrite will ever be called,
but byte-oriented printf functions are still required to set the
orientation of the stream to byte-oriented. call __towrite explicitly
if the FILE is not already in write mode.
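
the requirement is observable from application code. the check below is a sketch with a hypothetical helper name; it assumes a conforming libc and a stream with no orientation yet.

```c
#include <stdio.h>
#include <wchar.h>

/* Returns 1 if f is byte-oriented after a printf call that emits no
 * bytes, and 0 otherwise. */
static int sketch_empty_printf_orients(FILE *f)
{
	/* The format expands to zero bytes of output. */
	if (fprintf(f, "%s", "") != 0)
		return 0;
	/* fwide with mode 0 only queries the orientation; a negative
	 * result means byte-oriented. */
	return fwide(f, 0) < 0;
}
```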
commit b5a8b28915aad17b6f49ccacd6d3fef3890844d1 set up the write buffer
bound pointers for the temporary buffer manually to fix a buffer
overflow issue, but in doing so, caused vfprintf on unbuffered files
never to call __towrite, thereby failing to set the stream orientation
to byte-oriented, failing to clear any prior read mode, and failing to
produce an error when the stream is not writable.
revert the inline setup of the bounds pointers and instead zero them,
so that the underlying fwrite code will call __towrite to set them up.
commit 0b80a7b0404b6e49b0b724e3e3fe0ed5af3b08ef added the ability to
set application-provided stdio FILE buffers, adding the possibility
that stderr might be buffered at exit time, but __stdio_exit did not
have code to flush it.
this regression was not present in any release.
fundamentally there is no good reason these functions need to set an
orientation (morally it should be possible to write a wchar_t memory
stream using byte functions, or a char memory stream using wide
functions), but it's a part of the specification that they do. aside
from being able to inspect the orientation with fwide, failure to set
the orientation in open_wmemstream is observable if the locale changes
between open_wmemstream and the first operation on the stream; this is
because the encoding rule (locale) for the stream is required to be
bound at the time the stream becomes wide-oriented.
for open_wmemstream, call fwide to avoid duplicating the logic for
binding the encoding rule. for open_memstream it suffices just to set
the mode field in the FILE struct.
the w+ mode is specified to "truncate the buffer contents". like most
of fmemopen, exactly what this means is underspecified. mode w and w+
of course implicitly 'truncate' the buffer if a write from the initial
position is flushed, so in order for this part of the text about w+
not to be spurious, it should be interpreted as requiring something
else, and the obvious reasonable interpretation is that the truncation
is immediately visible if you attempt to read from the stream or the
buffer before writing/flushing.
this interpretation agrees with reported conformance test failures.
this is a POSIX requirement.
also remove the gratuitous locking shenanigans and simply access f->fd
under control of the lock. there is no advantage to not doing so, and
it made the correctness non-obvious at best.
the code to perform rounding to the desired precision wrongly assumed
the long double mantissa was an integral number of nibbles (hex
digits) in length. this is true for 80-bit extended precision (64-bit
mantissa) but not for double (53) or quad (113).
scale the rounding value by 1<<(LDBL_MANT_DIG%4) to compensate.
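
the scale factor is easy to check by hand for the three formats named above. the helper below is a hypothetical sketch that takes the mantissa width as a parameter instead of using LDBL_MANT_DIG.

```c
/* Factor that aligns the rounding value with a mantissa whose bit
 * count is not a whole number of hex digits. */
static int sketch_round_scale(int mant_dig)
{
	return 1 << (mant_dig % 4);
}
```

for a 64-bit mantissa the remainder is 0 and the factor is 1, so nothing changes; for 53 and 113 bits the remainder is 1 and the factor is 2.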
commit 0b80a7b0404b6e49b0b724e3e3fe0ed5af3b08ef, which added non-stub
setvbuf, applied the UNGET pushback adjustment to the size of the
buffer passed in, but inadvertently omitted offsetting the start by
the same amount, thereby allowing unget to clobber up to 8 bytes
before the start of the buffer. this bug was introduced in the present
release cycle; no releases are affected.
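
a sketch of the corrected adjustment, using a hypothetical toy_stream type and helper. the UNGET value of 8 is taken from the byte count mentioned above.

```c
#include <stddef.h>

#define UNGET 8   /* pushback reserve, per the 8 bytes noted above */

struct toy_stream {
	unsigned char *buf;
	size_t buf_size;
};

/* Hypothetical helper: install a caller-provided buffer, reserving
 * its first UNGET bytes for pushback. Returns 0 on success, or -1 if
 * the buffer is missing or too small. */
static int toy_set_buffer(struct toy_stream *f, unsigned char *user,
                          size_t size)
{
	if (!user || size < UNGET)
		return -1;
	f->buf = user + UNGET;        /* offset the start ... */
	f->buf_size = size - UNGET;   /* ... and shrink by the same amount */
	return 0;
}
```

with both adjustments applied, the UNGET bytes below f->buf still lie inside the caller's buffer.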
bring these functions up to date with the current idioms we use/prefer
in fmemopen and fopencookie.
rather than manually performing pointer arithmetic to carve multiple
objects out of one allocation, use a containing struct that
encompasses them all.
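
a minimal sketch of the containing-struct idiom. all type and function names are hypothetical, and the members are far simpler than a real FILE.

```c
#include <stdlib.h>
#include <stddef.h>

struct toy_file {
	unsigned char *buf;
	size_t buf_size;
	void *cookie;
};

struct toy_cookie {
	size_t pos, len;
};

/* One allocation, laid out by the compiler instead of by hand. */
struct toy_mem_file {
	struct toy_file f;
	struct toy_cookie c;
	unsigned char buf[256];
};

static struct toy_file *toy_open(void)
{
	struct toy_mem_file *m = calloc(1, sizeof *m);
	if (!m)
		return NULL;
	m->f.cookie = &m->c;
	m->f.buf = m->buf;
	m->f.buf_size = sizeof m->buf;
	/* f is the first member, so this is also the pointer to free. */
	return &m->f;
}
```

alignment and offsets of the cookie and buffer follow from the struct definition, so there is no arithmetic to get wrong.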
assign entire struct rather than member-at-a-time. don't repeat buffer
sizes; always use sizeof to ensure consistency.
instead of using a waiters count, add a bit to the lock field
indicating that the lock may have waiters. threads which obtain the
lock after contending for it will perform a potentially-spurious wake
when they release the lock.
add a member of appropriate type to the fpos_t union so that accesses
are well-defined. use long long instead of off_t since off_t is not
always exposed in stdio.h and there's no namespace-clean alias for it.
access is still performed using pointer casts rather than by naming
the union member as a matter of style; to the extent possible, the
naming of fields in opaque types defined in the public headers is not
treated as an API contract with the implementation. access via the
pointer cast is valid as long as the union has a member of matching
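
a sketch of the access pattern, with a hypothetical union toy_fpos whose layout is only assumed to resemble the public type:

```c
union toy_fpos {
	char opaque[16];
	long long lldata;   /* member of the type actually stored */
	double align;
};

static void toy_setpos(union toy_fpos *pos, long long off)
{
	/* Access by pointer cast rather than by member name. */
	*(long long *)pos = off;
}

static long long toy_getpos(const union toy_fpos *pos)
{
	return *(const long long *)pos;
}
```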
this is the idiom that's used elsewhere and should be more efficient
or at least no worse.
they seem to be relics of e3cd6c5c265cd481db6e0c5b529855d99f0bda30
where this code was refactored from a check that previously masked
against (F_ERR|F_NOWR) instead of just F_NOWR.
formally, calling readv with a zero-length first iov component should
behave identically to calling read on just the second component, but
presence of a zero-length iov component has triggered bugs in some
kernels and performs significantly worse than a simple read on some
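
a sketch of the workaround using the POSIX read and readv interfaces. sketch_fill and its parameters are hypothetical; the real backend works on FILE internals.

```c
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Fill dst first, then the internal buffer. When dst is empty, fall
 * back to a plain read so the kernel never sees a zero-length iov. */
static ssize_t sketch_fill(int fd, void *dst, size_t dst_len,
                           void *buf, size_t buf_len)
{
	struct iovec iov[2] = {
		{ .iov_base = dst, .iov_len = dst_len },
		{ .iov_base = buf, .iov_len = buf_len },
	};

	if (dst_len == 0)
		return read(fd, buf, buf_len);
	return readv(fd, iov, 2);
}
```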
the stdio FILE read backend's return type is size_t, not ssize_t, and
all of the special (non-fd-backed) FILE types already return the
number of bytes read (zero) on error or eof. only __stdio_read leaked
a syscall error return into its return value.
fread had a workaround for this behavior going all the way back to the
original check-in. remove the workaround since it's no longer needed.
replace with simple conditional that doesn't rely on assumption that
cnt is either 0 or -1.
when a null buffer pointer is passed to fmemopen, requesting it
allocate its own memory buffer, extremely large size arguments near
SIZE_MAX could overflow and result in underallocation. this results
from omission of the size of the cookie structure in the overflow
check but inclusion of it in the calloc call.
instead of accounting for individual small contributions to the total
allocation size needed, simply reject sizes larger than PTRDIFF_MAX,
which will necessarily fail anyway. then adding arbitrary fixed-size
structures is safe without matching up the expressions in the
comparison and the allocation.
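
the approach can be sketched as follows. toy_cookie and toy_alloc_stream are hypothetical names, and the real function allocates more than a cookie.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

struct toy_cookie {
	size_t pos, len, size;
};

/* Allocate a cookie followed by size bytes of buffer. */
static void *toy_alloc_stream(size_t size)
{
	/* Such a request could never succeed, so reject it up front. */
	if (size > PTRDIFF_MAX) {
		errno = ENOMEM;
		return NULL;
	}
	/* size <= PTRDIFF_MAX, so adding a small fixed-size structure
	 * cannot wrap around SIZE_MAX. */
	return calloc(1, sizeof(struct toy_cookie) + size);
}
```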
commit 78897b0dc00b7cd5c29af5e0b7eebf2396d8dce0 wrongly simplified
Dmitry Levin's original submitted patch fixing alt-form octal with the
zero flag and field width present, omitting the special case where the
value is zero. as a result, printf("%#o",0) wrongly prints "00" rather than "0".
the logic prior to this commit was actually better, in that it was
aligned with how the alt-form flag (#) for printf is specified ("it
shall increase the precision"). at the time there was no good way to
avoid the zero flag issue with the old logic, but commit
167dfe9672c116b315e72e57a55c7769f180dffa added tracking of whether an
explicit precision was provided.
revert commit 78897b0dc00b7cd5c29af5e0b7eebf2396d8dce0 and switch to
using the explicit precision indicator for suppressing the zero flag.
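
the expected results can be checked against any conforming printf. the helper below is a hypothetical test aid, not part of the fix.

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 if formatting v with fmt yields exactly expect. */
static int sketch_octal_is(const char *fmt, unsigned v, const char *expect)
{
	char out[32];
	snprintf(out, sizeof out, fmt, v);
	return strcmp(out, expect) == 0;
}
```

with a correct implementation, "%#o" yields "0" for the value 0 and "010" for the value 8, and "%#05o" yields "00010" for the value 8.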