the FIXME comment here was overlooked at the time locale support was
these changes are motivated by a functionally similar patch by Hauke
Mehrtens to address the needs of the new mips vdso clock_gettime,
which wrongly fails with ENOSYS rather than falling back to making a
syscall for clock ids it cannot handle from userspace. in the process
of preparing to handle that case, it was noticed that the old
clock_gettime use of the vdso was actually wrong with respect to error
handling -- the tail call to the vdso function failed to set errno and
instead returned an error code.
since tail calls to vdso are no longer possible and since the plain
syscall code is now needed as a fallback path anyway, it does not make
sense to use a function pointer to call the plain syscall code path.
instead, it's inlined at the end of the main clock_gettime function.
the new code also avoids the need to test for initialization of the
vdso function pointer by statically initializing it to a self-init
function, and eliminates redundant loads from the volatile pointer
finally, the use of a_cas_p on an object of type other than void *,
which is not permitted aliasing, is replaced by using an object with
the correct type and casting the value.
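the self-initializing pointer and the inlined fallback described above can be sketched roughly as follows. this is a minimal illustration, not the actual code: the clock ids, vdso_like, syscall_like and sketch_clock_gettime are hypothetical stand-ins, and no real vdso lookup or syscall is performed.

```c
#include <errno.h>

typedef int (*clock_fn)(int clk, long *out);

/* hypothetical stand-in for the vdso function: it handles only clock
 * id 0 and reports -ENOSYS for every other id */
static int vdso_like(int clk, long *out)
{
    if (clk != 0) return -ENOSYS;
    *out = 1;
    return 0;
}

/* hypothetical stand-in for the plain syscall path: it knows clock
 * ids 0 and 1 and returns a negated error code otherwise */
static int syscall_like(int clk, long *out)
{
    if (clk != 0 && clk != 1) return -EINVAL;
    *out = 2;
    return 0;
}

static int self_init(int clk, long *out);

/* statically initialized to the self-init function, so callers never
 * have to test whether initialization has happened yet */
static volatile clock_fn impl = self_init;

static int self_init(int clk, long *out)
{
    clock_fn f = vdso_like; /* a real lookup would resolve the symbol here */
    impl = f;
    return f(clk, out);
}

int sketch_clock_gettime(int clk, long *out)
{
    clock_fn f = impl;      /* a single load from the volatile pointer */
    int r = f(clk, out);
    if (r == -ENOSYS) r = syscall_like(clk, out); /* inlined fallback */
    if (r < 0) {            /* convert the error code into errno */
        errno = -r;
        return -1;
    }
    return 0;
}
```

because the call is no longer a tail call, the wrapper gets the chance to translate a returned error code into errno and -1.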
strftime results are unspecified in this case, but should not invoke
tm_wday, tm_yday, tm_mon and tm_year fields were used in signed int
arithmetic that could overflow.
based on patch by Szabolcs Nagy.
as found and reported by Brian Mastenbrook, the expressions
400*qc_cycles and years+100 in __secs_to_tm were both subject to
integer overflow for extreme values of the input t.
this patch by Szabolcs Nagy fixes the code by switching to larger
types, and matches the original intent I had in mind when writing this
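a minimal sketch of the "larger types" approach, assuming the year count is relative to the year 2000 so that years+100 yields tm_year. the function name and all parameters other than qc_cycles are hypothetical.

```c
#include <limits.h>

/* combine cycle counts into tm_year (years since 1900).
 * returns 0 on success, -1 if the result does not fit in int. */
int sketch_tm_year(long long qc_cycles, int c_cycles, int q_cycles,
                   int remyears, int *tm_year)
{
    /* do the arithmetic in long long so 400*qc_cycles cannot overflow
     * int for extreme inputs */
    long long years = remyears + 4LL*q_cycles + 100LL*c_cycles
                    + 400LL*qc_cycles;

    /* range-check before narrowing, instead of overflowing in years+100 */
    if (years + 100 > INT_MAX || years + 100 < INT_MIN) return -1;

    *tm_year = (int)(years + 100);
    return 0;
}
```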
The value of *size is not relevant in case of failure, but it's
better not to copy garbage from the stack into it.
(The compiler cannot see through the syscall, so optimization
was not affected by the unspecified value).
tm_gmtoff is a nonstandard field, but on historical systems which have
this field, it stores the offset of the local time zone from GMT or
UTC. this is the opposite of the POSIX extern long timezone object and
the offsets used in POSIX-form TZ strings, which represent the offset
from local time to UTC. previously we were storing these negated
offsets in tm_gmtoff too.
programs which only used this field indirectly via strftime were not
affected since strftime performed the negation for presentation.
however, some programs and libraries access tm_gmtoff directly and
were obtaining negated time zone offsets.
this improves compatibility with the behavior of other systems and
with some applications which set an empty TZ var to disable use of
local time by mktime, etc.
the memory model we use internally for atomics permits plain loads of
values which may be subject to concurrent modification without
requiring that a special load function be used. since a compiler is
free to make transformations that alter the number of loads or the way
in which loads are performed, the compiler is theoretically free to
break this usage. the most obvious concern is with atomic cas
constructs: something of the form tmp=*p;a_cas(p,tmp,f(tmp)); could be
transformed to a_cas(p,*p,f(*p)); where the latter is intended to show
multiple loads of *p whose resulting values might fail to be equal;
this would break the atomicity of the whole operation. but even more
fundamental breakage is possible.
with the changes being made now, objects that may be modified by
atomics are modeled as volatile, and the atomic operations performed
on them by other threads are modeled as asynchronous stores by
hardware which happens to be acting on the request of another thread.
such modeling of course does not itself address memory synchronization
between cores/cpus, but that aspect was already handled. this all
seems less than ideal, but it's the best we can do without mandating a
C11 compiler and using the C11 model for atomics.
in the case of pthread_once_t, the ABI type of the underlying object
is not volatile-qualified. so we are assuming that accessing the
object through a volatile-qualified lvalue via casts yields volatile
access semantics. the language of the C standard is somewhat unclear
on this matter, but this is an assumption the linux kernel also makes,
and seems to be the correct interpretation of the standard.
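the cas pattern discussed above can be sketched as follows. this assumes a GCC- or Clang-compatible compiler and uses the __sync_val_compare_and_swap builtin in place of a_cas; all function names are hypothetical.

```c
/* assumption: the __sync builtin stands in for a_cas here */
static int cas(volatile int *p, int expected, int desired)
{
    return __sync_val_compare_and_swap(p, expected, desired);
}

/* atomically replace *p with f(*p); returns the previous value */
int sketch_atomic_update(volatile int *p, int (*f)(int))
{
    int old;
    do {
        /* volatile access: the compiler must perform exactly one load
         * here and cannot re-read *p when evaluating f(old) */
        old = *p;
    } while (cas(p, old, f(old)) != old);
    return old;
}

/* the object itself is not volatile-qualified (as with pthread_once_t);
 * it is accessed through a volatile-qualified lvalue via a cast */
int sketch_atomic_update_plain(int *obj, int (*f)(int))
{
    return sketch_atomic_update((volatile int *)obj, f);
}

/* example update function */
int sketch_increment(int x)
{
    return x + 1;
}
```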
previously, the hours were considered as a signed quantity while
minutes and seconds were always treated as positive offsets. however,
semantically the '-' sign should negate the whole hh:mm:ss offset.
this bug only affected timezones east of GMT with non-whole-hours
offsets, such as those used in India and Nepal.
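a small sketch of the intended parsing rule, where the sign applies to the whole hh:mm:ss value rather than to the hours alone. the function names are hypothetical and input validation is omitted.

```c
/* read a run of decimal digits and advance the cursor past them */
static int get_num(const char **s)
{
    int x = 0;
    while (**s >= '0' && **s <= '9')
        x = 10*x + (*(*s)++ - '0');
    return x;
}

/* parse "[+|-]hh[:mm[:ss]]" into a count of seconds */
int sketch_parse_offset(const char *s)
{
    int neg = 0;
    if (*s == '-') { neg = 1; s++; }
    else if (*s == '+') s++;

    int off = 3600 * get_num(&s);
    if (*s == ':') {
        s++;
        off += 60 * get_num(&s);
        if (*s == ':') {
            s++;
            off += get_num(&s);
        }
    }
    /* negate the full offset, not just the hours component */
    return neg ? -off : off;
}
```

with this rule, an offset such as -5:30 yields -19800 seconds; treating only the hours as signed would have produced -16200.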
based on patch by Jens Gustedt for inclusion with C11 threads
implementation, but committed separately since it's independent of
this change is presently non-functional since the callees do not yet
use their locale argument for anything.
the way this is implemented, it also allows explicit setting of
TZ=/etc/localtime even for suid programs. this is not a problem
because /etc/localtime is a trusted path, much like the trusted
zoneinfo search path.
such archs are expected to omit definitions of the SYS_* macros for
syscalls their kernels lack from arch/$ARCH/bits/syscall.h. the
preprocessor is then able to select an appropriate implementation
for affected functions. two basic strategies are used on a
where the old syscalls correspond to deprecated library-level
functions, the deprecated functions have been converted to wrappers
for the modern function, and the modern function has fallback code
(omitted at the preprocessor level on new archs) to make use of the
old syscalls if the new syscall fails with ENOSYS. this also improves
functionality on older kernels and eliminates the incentive to program
with deprecated library-level functions for the sake of compatibility
with older kernels.
in other situations where the old syscalls correspond to library-level
functions which are not deprecated but merely lack some new features,
such as the *at functions, the old syscalls are still used on archs
which support them. this may change at some point in the future if or
when fallback code is added to the new functions to make them usable
(possibly with reduced functionality) on old kernels.
open is handled specially because it is used from so many places, in
so many variants (2 or 3 arguments, setting errno or not, and
cancellable or not). trying to do it as a function would not only
increase bloat, but would also risk subtle breakage.
this is the first step towards supporting "new" archs where linux
lacks "old" syscalls.
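the first strategy (deprecated function as a wrapper, modern function with preprocessor-selected fallback) can be sketched as below. everything here is a simulation: HAVE_OLD_CALL stands in for the presence of a SYS_* macro, and new_call and old_call are hypothetical stand-ins for raw syscalls that return negated error codes.

```c
#include <errno.h>

/* hypothetical: a "new" arch would leave this undefined, and the
 * fallback code below would then be omitted by the preprocessor */
#define HAVE_OLD_CALL 1

/* pretend the running kernel lacks the new syscall */
static long new_call(int x)
{
    (void)x;
    return -ENOSYS;
}

#ifdef HAVE_OLD_CALL
static long old_call(int x)
{
    return x + 1;
}
#endif

long sketch_modern_function(int x)
{
    long r = new_call(x);
#ifdef HAVE_OLD_CALL
    /* fall back only when the new syscall does not exist */
    if (r == -ENOSYS) r = old_call(x);
#endif
    if (r < 0) {
        errno = (int)-r;
        return -1;
    }
    return r;
}

/* the deprecated interface is just a wrapper for the modern one */
long sketch_deprecated_function(int x)
{
    return sketch_modern_function(x);
}
```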
%C, %U, %W, and %y handling were completely missing; %C wrongly
fell through to unrelated cases, and the rest returned failure. for
now, they all parse numbers in the proper forms and range-check the
values, but they do not store the value anywhere.
it's not clear to me whether, as "derived" fields, %U and %W should
produce any result. they certainly cannot produce a result unless the
year and weekday are also converted, but in this case it might be
desirable for them to do so. clarification is needed on the intended
behavior of strptime in cases like this.
%C and %y have well-defined behavior as long as they are used together
(and %y is defined by itself but may change in the future).
implementing them (including their correct interaction) is left as a
later change to be made.
finally, strptime now rejects unknown/invalid format characters
instead of ignoring them.
previously, setting TZ to the pathname of a file which was not a valid
zoneinfo file would usually cause programs using local time zone based
operations to crash. the new code checks the file size and magic at
the beginning of the file, which seems sufficient to prevent
accidental misconfiguration from causing crashes. attempting to make
fully-robust validation would be futile unless we wanted to drop use
of mmap (shared zoneinfo) and instead read it into a local buffer,
since such validation would be subject to race conditions with
modification of the file.
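a minimal sketch of a size-and-magic check of this kind, assuming the file has already been mapped. the function name is hypothetical; the 44-byte minimum reflects the size of a TZif header, and the exact threshold used by the real code is not stated here.

```c
#include <stddef.h>
#include <string.h>

/* returns 1 if the mapped bytes plausibly begin a zoneinfo file,
 * 0 otherwise. this guards against accidental misconfiguration only;
 * it is not a full validation of the file contents. */
int sketch_looks_like_tzif(const unsigned char *map, size_t size)
{
    if (!map || size < 44) return 0;   /* too short to hold a header */
    return memcmp(map, "TZif", 4) == 0;
}
```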
since the form TZ=name is reserved for POSIX-form time zone strings,
TZ=:name needs to be used when the zoneinfo filename is in the
top-level zoneinfo directory and therefore does not contain a slash.
previously the leading colon was merely dropped, making it impossible
to access such zones without a full absolute pathname.
changes based on patch by Timo Teräs.
the vdso symbol lookup code is based on the original 2011 patch by
Nicholas J. Kain, with some streamlining, pointer arithmetic fixes,
and one symbol version matching fix.
on the consumer side (clock_gettime), per-arch macros for the
particular symbol name and version to look up are added in
syscall_arch.h, and no vdso code is pulled in on archs which do not
define these macros. at this time, vdso is enabled only on x86_64.
the vdso support at the dynamic linker level is no longer useful to
libc, but is left in place for the sake of debuggers (which may need
the vdso in the link map to find its functions) and possibly use with
this practice came from very early, before internal/syscall.h defined
macros that could accept pointer arguments directly and handle them
correctly. aside from being ugly and unnecessary, it looks like it
will be problematic when we add support for 32-bit ABIs on archs where
registers (and syscall arguments) are 64-bit, e.g. x32 and mips n32.
these functions were spuriously failing in the case where the buffer
size was exactly the number of bytes/characters to be written,
including null termination. since these functions do not have defined
error conditions other than buffer size, a reasonable application may
fail to check the return value when the format string and buffer size
are known to be valid; such an application could then attempt to use a
in addition to fixing the bug, I have changed the error handling
behavior so that these functions always null-terminate the output
except in the case where the buffer size is zero, and so that they
always write as many characters as possible before failing, rather
than dropping whole fields that do not fit. this actually simplifies
the logic somewhat anyway.
it's not clear why I originally wrote O_NOFOLLOW into this; I suspect
the reason was with an aim of making the function more general for
mapping partially or fully untrusted files provided by the user.
however, the timezone code already precludes use of absolute or
relative pathnames in suid/sgid programs, and disallows .. in
pathnames which are relative to one of the system timezone locations,
so there is no threat of opening a symlink which is not trusted by
the appropriate user. since some users may wish to put symbolic links in
the zoneinfo directories to alias timezones, it seems preferable to
the rest of the code is not prepared to handle an empty TZ string, so
falling back to __gmt ("GMT"), just as if TZ had been blank or unset,
is the preferable action.
try+l points to \0, so only one iteration was ever tried.
we need to skip to the second TZif header, which starts at
skip+44, and then skip another header (20 bytes) plus the following
6 32-bit values.
if sizeof(time_t) == 8, this code path was missing the correct
offset into the zoneinfo file, using the header magic to do
the 6 32-bit fields to be read start at offset 20.
despite being marked legacy, this was specified by SUSv3 as part of
the XSI option; only the most recent version of the standard dropped
it. reportedly there's actual code using it.
this is a nonstandard extension but will be required in the next
version of POSIX, and it's widely used/useful in shell scripts
utilizing the date utility.
%e pads with spaces instead of zeros.
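a quick way to observe the padding, using only standard strftime. the helper name is hypothetical.

```c
#include <stddef.h>
#include <time.h>

/* format a day of the month with %e; single-digit days are expected
 * to be preceded by a space rather than a zero */
size_t sketch_format_mday(char *buf, size_t n, int mday)
{
    struct tm tm = {0};
    tm.tm_mday = mday;
    return strftime(buf, n, "%e", &tm);
}
```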
in this case, the first standard-time and first daylight-time rules
should be taken as the "default" ones to expose.
if a zoneinfo file is not (or is no longer) in use, don't check the
abbrevs pointers, which may be invalid.
this may need further revision in the future, since POSIX is rather
unclear on the requirements, and is designed around the assumption of
POSIX TZ specifiers which are not sufficiently powerful to represent
real-world timezones (this is why zoneinfo support was added).
the basic issue is that strftime gets the string and numeric offset
for the timezone from the extra fields in struct tm, which are
initialized when calling localtime/gmtime/etc. however, a conforming
application might have created its own struct tm without initializing
these fields, in which case using __tm_zone (a pointer) could crash.
other zoneinfo-based implementations simply check for a null pointer,
but otherwise can still crash if the field contains junk.
simply ignoring __tm_zone and using tzname would "work" but would
give incorrect results in time zones with more complex rules. I feel
like this would lower the quality of implementation.
instead, simply validate __tm_zone: unless it points to one of the
zone name strings managed by the timezone system, assume it's invalid.
this commit also fixes several other minor bugs with formatting:
tm_isdst being negative is required to suppress printing of the zone
formats, and %z was using the wrong format specifiers since the type
of val was changed, resulting in bogus output.
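the validation idea can be sketched as a pointer identity check against the strings the timezone code owns. this is a simplified illustration with hypothetical names and only two known strings; the pointer is compared but never dereferenced unless it is recognized.

```c
/* hypothetical zone name strings managed by the timezone system */
const char sketch_std_name[] = "EST";
const char sketch_dst_name[] = "EDT";

/* return a zone name that is safe to print, or "" if the pointer is
 * not one we handed out or if tm_isdst is negative */
const char *sketch_zone_name(const char *p, int isdst)
{
    if (isdst < 0) return "";   /* negative tm_isdst suppresses output */
    if (p == sketch_std_name || p == sketch_dst_name) return p;
    return "";                  /* unknown pointer: assume it is junk */
}
```

a string with the same contents but a different address is rejected, which is the point: only pointers that originated from the timezone system are trusted.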
the empty TZ string was matching equal to the initial value of the
cached TZ name, thus causing do_tzset never to run and never to
initialize the time zone data.
this bug was masked by local experimental CFLAGS in my config.mak.
at present, since POSIX requires %F to behave as %+4Y-%m-%d and ISO C
requires %F to behave as %Y-%m-%d, the default behavior for %Y has
been changed to match %+4Y. this seems to be the only way to conform
to the requirements of both standards, and it does not affect years
prior to the year 10000. depending on the outcome of interpretations
from the standards bodies, this may be adjusted at some point.
use a long long value so that even with offsets, values cannot
overflow. instead of using different format strings for different
numeric formats, simply use a per-format width and %0*lld for all of
this width specifier is not for use with strftime field widths; that
will be a separate step in the caller.
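a minimal sketch of formatting every numeric field through one format string with a per-format width. the helper name is hypothetical; the width here is the field's natural digit count, not a strftime field width.

```c
#include <stddef.h>
#include <stdio.h>

/* zero-pad val to at least `width` digits using a single format;
 * long long leaves headroom for values with offsets added */
int sketch_format_number(char *buf, size_t n, long long val, int width)
{
    return snprintf(buf, n, "%0*lld", width, val);
}
```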
make __strftime_fmt_1 return a string (possibly in the caller-provided
temp buffer) rather than writing into the output buffer. this approach
makes more sense when padding to a minimum field width might be
required, and it's also closer to what wcsftime wants.
unblocking it in the pthread_once init function is not sufficient,
since multiple threads, some of them with the signal blocked, could
already exist before this is called; timers started from such threads
would be non-functional.
this is needed for reused threads in the SIGEV_THREAD timer
notification system, and could be reused elsewhere in the future if
needed, though it should be refactored for such use.
for static linking, __init_tls.c is simply modified to export the TLS
info in a structure with external linkage, rather than using statics.
this perhaps makes the code more clear, since the statics were poorly
named for statics. the new __reset_tls.c is only linked if it is used.
for dynamic linking, the code is in dynlink.c. sharing code with
__copy_tls is not practical since __reset_tls must also re-zero
1. the thread result field was reused for storing a kernel timer id,
but would be overwritten if the application code exited or cancelled
2. low pointer values were used as the indicator that the timer id is
a kernel timer id rather than a thread id. this is not portable, as
mmap may return low pointers under some conditions. instead, use the fact
that pointers must be aligned and kernel timer ids must be
non-negative to map pointers into the negative integer space.
3. signals were not blocked until after the timer thread started, so a
race condition could allow a signal handler to run in the timer thread
when it's not supposed to exist. this is mainly problematic if the
calling thread was the only thread where the signal was unblocked and
the signal handler assumes it runs in that thread.
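the encoding in item 2 can be sketched as follows. this is one possible mapping, not necessarily the one committed: it assumes pointers are at least 2-byte aligned and that converting an out-of-range uintptr_t to intptr_t wraps as on common two's complement targets. all names are hypothetical.

```c
#include <limits.h>
#include <stdint.h>

#define SKETCH_TOP_BIT ((uintptr_t)1 << (sizeof(uintptr_t) * CHAR_BIT - 1))

/* map an aligned pointer into the negative integer space: the low bit
 * is known to be zero, so shifting right loses nothing and frees the
 * top bit to act as a "this is a pointer" marker */
intptr_t sketch_encode_thread(void *td)
{
    return (intptr_t)(SKETCH_TOP_BIT | ((uintptr_t)td >> 1));
}

/* kernel timer ids are non-negative, so the sign tells the two apart */
int sketch_is_thread(intptr_t id)
{
    return id < 0;
}

/* shifting left drops the marker bit and restores the pointer */
void *sketch_decode_thread(intptr_t id)
{
    return (void *)((uintptr_t)id << 1);
}
```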
this is a nonstandard extension.
LC_GLOBAL_LOCALE refers to the global locale, controlled by setlocale,
not the thread-local locale in effect which these functions should be
using. neither LC_GLOBAL_LOCALE nor 0 as an argument to the *_l
functions has behavior defined by the standard, but 0 is a more
logical choice for requesting the callee to look up the current locale.
in the future I may move the current locale lookup to the caller (the
at this point, all of the locale logic is dummied out, so no harm was
done, but it should at least avoid misleading usage.