summaryrefslogtreecommitdiff
path: root/src/env
AgeCommit message (Collapse)AuthorLines
2015-05-06fix stack protector crashes on x32 & powerpc due to misplaced TLS canaryRich Felker-1/+1
i386, x86_64, x32, and powerpc all use TLS for stack protector canary values in the default stack protector ABI, but the location only matched the ABI on i386 and x86_64. on x32, the expected location for the canary contained the tid, thus producing spurious mismatches (resulting in process termination) upon fork. on powerpc, the expected location contained the stdio_locks list head, so returning from a function after calling flockfile produced spurious mismatches. in both cases, the random canary was not present, and a predictable value was used instead, making the stack protector hardening much less effective than it should be. in the current fix, the thread structure has been expanded to have canary fields at all three possible locations, and archs that use a non-default location must define a macro in pthread_arch.h to choose which location is used. for most archs (which lack TLS canary ABI) the choice does not matter.
2015-04-23fix misalignment of dtv in static-linked programs with odd-sized TLSRich Felker-1/+2
both static and dynamic linked versions of the __copy_tls function have a hidden assumption that the alignment of the beginning or end of the memory passed is suitable for storing an array of pointers for the dtv. pthread_create satisfies this requirement except when libc.tls_size is misaligned, which cannot happen with dynamic linking due to way update_tls_size computes the total size, but could happen with static linking and odd-sized TLS.
2015-04-23remove dead store from static __init_tlsRich Felker-2/+0
commit dab441aea240f3b7c18a26d2ef51979ea36c301c, which made thread pointer init mandatory for all programs, rendered this store obsolete by removing the early-return path for static programs with no TLS.
2015-04-23make __init_tp function static when static linkingRich Felker-0/+3
this slightly reduces the code size cost of TLS/thread-pointer for static linking since __init_tp can be inlined into its only caller and removed. this is analogous to the handling of __init_libc in __libc_start_main, where the function only has external linkage when it needs to be called from the dynamic linker.
2015-04-22fix inconsistent visibility for __hwcap and __sysinfo symbolsRich Felker-3/+0
these are used as hidden by asm files (and such use is the whole reason they exist), but their actual definitions were not hidden.
2015-04-22remove useless visibility application from static-linking-only codeRich Felker-3/+2
part of the goal here is to eliminate use of the ATTR_LIBC_VISIBILITY macro outside of libc.h, since it was never intended to be 'public'.
2015-04-13allow libc itself to be built with stack protector enabledRich Felker-0/+10
this was already essentially possible as a result of the previous commits changing the dynamic linker/thread pointer bootstrap process. this commit mainly adds build system infrastructure: configure no longer attempts to disable stack protector. instead it simply determines how so the makefile can disable stack protector for a few translation units used during early startup. stack protector is also disabled for memcpy and memset since compilers (incorrectly) generate calls to them on some archs to implement struct initialization and assignment, and such calls may creep into early initialization. no explicit attempt to enable stack protector is made by configure at this time; any stack protector option supported by the compiler can be passed to configure in CFLAGS, and if the compiler uses stack protector by default, this default is respected.
2015-04-13remove remnants of support for running in no-thread-pointer modeRich Felker-5/+3
since 1.1.0, musl has nominally required a thread pointer to be setup. most of the remaining code that was checking for its availability was doing so for the sake of being usable by the dynamic linker. as of commit 71f099cb7db821c51d8f39dfac622c61e54d794c, this is no longer necessary; the thread pointer is now valid before any libc code (outside of dynamic linker bootstrap functions) runs. this commit essentially concludes "phase 3" of the "transition path for removing lazy init of thread pointer" project that began during the 1.1.0 release cycle.
2015-04-10optimize out setting up robust list with kernel when not neededRich Felker-0/+1
as a result of commit 12e1e324683a1d381b7f15dd36c99b37dd44d940, kernel processing of the robust list is only needed for process-shared mutexes. previously the first attempt to lock any owner-tracked mutex resulted in robust list initialization and a set_robust_list syscall. this is no longer necessary, and since the kernel's record of the robust list must now be cleared at thread exit time for detached threads, optimizing it out is more worthwhile than before too.
2015-03-11copy the dtv pointer to the end of the pthread struct for TLS_ABOVE_TP archsSzabolcs Nagy-1/+1
There are two main abi variants for thread local storage layout: (1) TLS is above the thread pointer at a fixed offset and the pthread struct is below that. So the end of the struct is at known offset. (2) the thread pointer points to the pthread struct and TLS starts below it. So the start of the struct is at known (zero) offset. Assembly code for the dynamic TLSDESC callback needs to access the dynamic thread vector (dtv) pointer which is currently at the front of the pthread struct. So in case of (1) the asm code needs to hard code the offset from the end of the struct which can easily break if the struct changes. This commit adds a copy of the dtv at the end of the struct. New members must not be added after dtv_copy, only before it. The size of the struct is increased a bit, but there is opportunity for size optimizations.
2015-03-06fix over-alignment of TLS, insufficient builtin TLS on 64-bit archsRich Felker-2/+8
a conservative estimate of 4*sizeof(size_t) was used as the minimum alignment for thread-local storage, despite the only requirements being alignment suitable for struct pthread and void* (which struct pthread already contains). additional alignment required by the application or libraries is encoded in their headers and is already applied. over-alignment prevented the builtin_tls array from ever being used in dynamic-linked programs on 64-bit archs, thereby requiring allocation at startup even in programs with no TLS of their own.
2014-08-13fix #ifdef inside a macro argument list in __init_tls.cSzabolcs Nagy-4/+3
C99 6.10.3p11 disallows such constructs so use an #ifdef outside of the argument list of __syscall
2014-07-05eliminate use of cached pid from thread structureRich Felker-1/+1
the main motivation for this change is to remove the assumption that the tid of the main thread is also the pid of the process. (the value returned by the set_tid_address syscall was used to fill both fields despite it semantically being the tid.) this is historically and presently true on linux and unlikely to change, but it conceivably could be false on other systems that otherwise reproduce the linux syscall api/abi. only a few parts of the code were actually still using the cached pid. in a couple places (aio and synccall) it was a minor optimization to avoid a syscall. caching could be reintroduced, but lazily as part of the public getpid function rather than at program startup, if it's deemed important for performance later. in other places (cancellation and pthread_kill) the pid was completely unnecessary; the tkill syscall can be used instead of tgkill. this is actually a rather subtle issue, since tgkill is supposedly a solution to race conditions that can affect use of tkill. however, as documented in the commit message for commit 7779dbd2663269b465951189b4f43e70839bc073, tgkill does not actually solve this race; it just limits it to happening within one process rather than between processes. we use a lock that avoids the race in pthread_kill, and the use in the cancellation signal handler is self-targeted and thus not subject to tid reuse races, so both are safe regardless of which syscall (tgkill or tkill) is used.
2014-07-02add locale frameworkRich Felker-0/+1
this commit adds non-stub implementations of setlocale, duplocale, newlocale, and uselocale, along with the data structures and minimal code needed for representing the active locale on a per-thread basis and optimizing the common case where thread-local locale settings are not in use. at this point, the data structures only contain what is necessary to represent LC_CTYPE (a single flag) and LC_MESSAGES (a name for use in finding message translation files). representation for the other categories will be added later; the expectation is that a single pointer will suffice for each. for LC_CTYPE, the strings "C" and "POSIX" are treated as special; any other string is accepted and treated as "C.UTF-8". for other categories, any string is accepted after being truncated to a maximum supported length (currently 15 bytes). for LC_MESSAGES, the name is kept regardless of whether libc itself can use such a message translation locale, since applications using catgets or gettext should be able to use message locales libc is not aware of. for other categories, names which are not successfully loaded as locales (which, at present, means all names) are treated as aliases for "C". setlocale never fails. locale settings are not yet used anywhere, so this commit should have no visible effects except for the contents of the string returned by setlocale.
2014-07-01fix typo in a comment in __libc_start_mainRich Felker-1/+1
2014-06-19separate __tls_get_addr implementation from dynamic linker/init_tlsRich Felker-5/+0
such separation serves multiple purposes: - by having the common path for __tls_get_addr alone in its own function with a tail call to the slow case, code generation is greatly improved. - by having __tls_get_addr in it own file, it can be replaced on a per-arch basis as needed, for optimization or ABI-specific purposes. - by removing __tls_get_addr from __init_tls.c, a few bytes of code are shaved off of static binaries (which are unlikely to use this function unless the linker messed up).
2014-06-10simplify errno implementationRich Felker-1/+0
the motivation for the errno_ptr field in the thread structure, which this commit removes, was to allow the main thread's errno to keep its address when lazy thread pointer initialization was used. &errno was evaluated prior to setting up the thread pointer and stored in errno_ptr for the main thread; subsequently created threads would have errno_ptr pointing to their own errno_val in the thread structure. since lazy initialization was removed, there is no need for this extra level of indirection; __errno_location can simply return the address of the thread's errno_val directly. this does cause &errno to change, but the change happens before entry to application code, and thus is not observable.
2014-06-10add thread-pointer support for pre-2.6 kernels on i386Rich Felker-9/+4
such kernels cannot support threads, but the thread pointer is also important for other purposes, most notably stack protector. without a valid thread pointer, all code compiled with stack protector will crash. the same applies to any use of thread-local storage by applications or libraries. the concept of this patch is to fall back to using the modify_ldt syscall, which has been around since linux 1.0, to setup the gs segment register. since the kernel does not have a way to automatically assign ldt entries, use of slot zero is hard-coded. if this fallback path is used, __set_thread_area returns a positive value (rather than the usual zero for success, or negative for error) indicating to the caller that the thread pointer was successfully set, but only for the main thread, and that thread creation will not work properly. the code in __init_tp has been changed accordingly to record this result for later use by pthread_create.
2014-05-29support linux kernel apis (new archs) with old syscalls removedRich Felker-0/+5
such archs are expected to omit definitions of the SYS_* macros for syscalls their kernels lack from arch/$ARCH/bits/syscall.h. the preprocessor is then able to select the an appropriate implementation for affected functions. two basic strategies are used on a case-by-case basis: where the old syscalls correspond to deprecated library-level functions, the deprecated functions have been converted to wrappers for the modern function, and the modern function has fallback code (omitted at the preprocessor level on new archs) to make use of the old syscalls if the new syscall fails with ENOSYS. this also improves functionality on older kernels and eliminates the incentive to program with deprecated library-level functions for the sake of compatibility with older kernels. in other situations where the old syscalls correspond to library-level functions which are not deprecated but merely lack some new features, such as the *at functions, the old syscalls are still used on archs which support them. this may change at some point in the future if or when fallback code is added to the new functions to make them usable (possibly with reduced functionality) on old kernels.
2014-05-24support kernels with no SYS_open syscall, only SYS_openatRich Felker-1/+1
open is handled specially because it is used from so many places, in so many variants (2 or 3 arguments, setting errno or not, and cancellable or not). trying to do it as a function would not only increase bloat, but would also risk subtle breakage. this is the first step towards supporting "new" archs where linux lacks "old" syscalls.
2014-04-21make __init_libc static for non-shared libcRich Felker-0/+3
being static allows it to be inlined in __libc_start_main; inlining should take place at all levels since the function is called exactly once. this further reduces mandatory startup code size for static binaries.
2014-04-21further micro-optimize startup code for sizeRich Felker-23/+14
there is no reason (and seemingly there never was any) for __init_security to be its own function. it's linked unconditionally so it can just be placed inline in __init_libc.
2014-04-21micro-optimize some startup code for sizeRich Felker-7/+4
moving the call to __init_ssp from __init_security to __init_libc makes __init_security a leaf function, which allows the compiler to make it smaller. __init_libc is already non-leaf, and the additional call makes no difference to the amount of register spillage. in addition, it really made no sense for the call to __init_ssp to be buried inside __init_security rather than parallel with other init functions.
2014-04-07remove some cruft from libc/tls init codeRich Felker-3/+0
2014-04-04remove cruft left behind when lazy thread pointer init was removedRich Felker-8/+0
the function itself was static, but the weak alias provided an externally visible reference and thus prevented the dead code from being omitted from the output. so this change actually reduces bloat in mandatory static-linked code.
2014-03-25remove lazy ssp initializationTimo Teräs-15/+5
now that thread pointer is initialized always, ssp canary initialization can be done unconditionally. this simplifies the ldso as it does not try to detect ssp usage, and the init function itself as it is always called exactly once. this also merges ssp init path for shared and static linking.
2014-03-24always initialize thread pointer at program startRich Felker-13/+50
this is the first step in an overhaul aimed at greatly simplifying and optimizing everything dealing with thread-local state. previously, the thread pointer was initialized lazily on first access, or at program startup if stack protector was in use, or at certain random places where inconsistent state could be reached if it were not initialized early. while believed to be fully correct, the logic was fragile and non-obvious. in the first phase of the thread pointer overhaul, support is retained (and in some cases improved) for systems/situation where loading the thread pointer fails, e.g. old kernels. some notes on specific changes: - the confusing use of libc.main_thread as an indicator that the thread pointer is initialized is eliminated in favor of an explicit has_thread_pointer predicate. - sigaction no longer needs to ensure that the thread pointer is initialized before installing a signal handler (this was needed to prevent a situation where the signal handler caused the thread pointer to be initialized and the subsequent sigreturn cleared it again) but it still needs to ensure that implementation-internal thread-related signals are not blocked. - pthread tsd initialization for the main thread is deferred in a new manner to minimize bloat in the static-linked __init_tp code. - pthread_setcancelstate no longer needs special handling for the situation before the thread pointer is initialized. it simply fails on systems that cannot support a thread pointer, which are non-conforming anyway. - pthread_cleanup_push/pop now check for missing thread pointer and nop themselves out in this case, so stdio no longer needs to avoid the cancellable path when the thread pointer is not available. a number of cases remain where certain interfaces may crash if the system does not support a thread pointer. at this point, these should be limited to pthread interfaces, and the number of such cases should be fewer than before.
2014-03-23reduce static linking overhead from TLS support by inlining mmap syscallRich Felker-1/+9
the external mmap function is heavy because it has to handle error reporting that the kernel cannot do, and has to do some locking for arcane race-condition-avoidance purposes. for allocating initial TLS, we do not need any of that; the raw syscall suffices. on i386, this change shaves off 13% of the size of .text for the empty program.
2013-12-12include cleanups: remove unused headers and add feature test macrosSzabolcs Nagy-6/+2
2013-10-07remove errno setting from setenv, malloc sets it correctly on oomSzabolcs Nagy-1/+0
2013-10-04fix failure to check malloc result in setenvRich Felker-9/+9
2013-09-15support configurable page size on mips, powerpc and microblazeSzabolcs Nagy-0/+1
PAGE_SIZE was hardcoded to 4096, which is historically what most systems use, but on several archs it is a kernel config parameter, user space can only know it at execution time from the aux vector. PAGE_SIZE and PAGESIZE are not defined on archs where page size is a runtime parameter, applications should use sysconf(_SC_PAGE_SIZE) to query it. Internally libc code defines PAGE_SIZE to libc.page_size, which is set to aux[AT_PAGESZ] in __init_libc and early in __dynlink as well. (Note that libc.page_size can be accessed without GOT, ie. before relocations are done) Some fpathconf settings are hardcoded to 4096, these should be actually queried from the filesystem using statfs.
2013-08-03add system for resetting TLS to initial valuesRich Felker-14/+40
this is needed for reused threads in the SIGEV_THREAD timer notification system, and could be reused elsewhere in the future if needed, though it should be refactored for such use. for static linking, __init_tls.c is simply modified to export the TLS info in a structure with external linkage, rather than using statics. this perhaps makes the code more clear, since the statics were poorly named for statics. the new __reset_tls.c is only linked if it is used. for dynamic linking, the code is in dynlink.c. sharing code with __copy_tls is not practical since __reset_tls must also re-zero thread-local bss.
2013-07-21remove __libc_csu_* cruftRich Felker-10/+0
these functions were mistakenly assumed to be needed to match glibc ABI, but glibc has them as part of the non-shared part of libc that's always statically linked into the main program. moreover, the only place they are referenced from is glibc's crt1.o.
2013-07-21add support for init/fini array in main program, and greatly simplifyRich Felker-13/+13
modern (4.7.x and later) gcc uses init/fini arrays, rather than the legacy _init/_fini function pasting and crtbegin/crtend ctors/dtors system, on most or all archs. some archs had already switched a long time ago. without following this change, global ctors/dtors will cease to work under musl when building with new gcc versions. the most surprising part of this patch is that it actually reduces the size of the init code, for both static and shared libc. this is achieved by (1) unifying the handling main program and shared libraries in the dynamic linker, and (2) eliminating the glibc-inspired rube goldberg machine for passing around init and fini function pointers. to clarify, some background: the function signature for __libc_start_main was based on glibc, as part of the original goal of being able to run some glibc-linked binaries. it worked by having the crt1 code, which is linked into every application, static or dynamic, obtain and pass pointers to the init and fini functions, which __libc_start_main is then responsible for using and recording for later use, as necessary. however, in neither the static-linked nor dynamic-linked case do we actually need crt1.o's help. with dynamic linking, all the pointers are available in the _DYNAMIC block. with static linking, it's safe to simply access the _init/_fini and __init_array_start, etc. symbols directly. obviously changing the __libc_start_main function signature in an incompatible way would break both old musl-linked programs and glibc-linked programs, so let's not do that. instead, the function can just ignore the information it doesn't need. new archs need not even provide the useless args in their versions of crt1.o. existing archs should continue to provide it as long as there is an interest in having newly-linked applications be able to run on old versions of musl; at some point in the future, this support can be removed.
2013-07-13fix omission of dtv setup in static linked programs on TLS variant I archsRich Felker-1/+1
apparently this was never noticed before because the linker normally optimizes dynamic TLS models to non-dynamic ones when static linking, thus eliminating the calls to __tls_get_addr which crash when the dtv is missing. however, some libsupc++ code on ARM was calling __tls_get_addr when static linked and crashing. the reason is unclear to me, but with this issue fixed it should work now anyway.
2013-04-06add support for program_invocation[_short]_nameRich Felker-2/+8
this is a bit ugly, and the motivation for supporting it is questionable. however the main factors were: 1. it will be useful to have this for certain internal purposes anyway -- things like syslog. 2. applications can just save argv[0] in main, but it's hard to fix non-portable library code that's depending on being able to get the invocation name without the main application's help.
2013-02-17remove unused #undef environ now that libc.h no longer #defines itRich Felker-1/+0
2012-12-25fix reference to libc struct in static tls init codeRich Felker-1/+1
libc is the macro, __libc is the internal symbol, but under some configurations on old/broken compilers, the symbol might not actually exist and the libc macro might instead use __libc_loc() to obtain access to the object.
2012-11-30fix ordering of shared library ctors with respect to libc initRich Felker-0/+5
previously, shared library constructors were being called before important internal things like the environment (extern char **environ) and hwcap flags (needed for sjlj to work right with float on arm) were initialized in __libc_start_main. rather than trying to have to dynamic linker make sure this stuff all gets initialized right, I've opted to just defer calling shared library constructors until after the main program's entry point is reached. this also fixes the order of ctors to be the exact reverse of dtors, which is a desirable property and possibly even mandated by some languages. the main practical effect of this change is that shared libraries calling getenv from ctors will no longer fail.
2012-11-08clean up sloppy nested inclusion from pthread_impl.hRich Felker-0/+2
this mirrors the stdio_impl.h cleanup. one header which is not strictly needed, errno.h, is left in pthread_impl.h, because since pthread functions return their error codes rather than using errno, nearly every single pthread function needs the errno constants. in a few places, rather than bringing in string.h to use memset, the memset was replaced by direct assignment. this seems to generate much better code anyway, and makes many functions which were previously non-leaf functions into leaf functions (possibly eliminating a great deal of bloat on some platforms where non-leaf functions require ugly prologue and/or epilogue).
2012-11-01fix unused variable warningsRich Felker-2/+1
2012-10-21as an extension, have putenv("VAR") behave as unsetenv("VAR")Rich Felker-5/+5
the behavior of putenv is left undefined if the argument does not contain an equal sign, but traditional implementations behave this way and gnulib replaces putenv if it doesn't do this.
2012-10-19fix crashes in static-linked multithreaded programs without TLSRich Felker-0/+2
2012-10-15add support for TLS variant I, presently needed for arm and mipsRich Felker-1/+8
despite documentation that makes it sound a lot different, the only ABI-constraint difference between TLS variants II and I seems to be that variant II stores the initial TLS segment immediately below the thread pointer (i.e. the thread pointer points to the end of it) and variant I stores the initial TLS segment above the thread pointer, requiring the thread descriptor to be stored below. the actual value stored in the thread pointer register also tends to have per-arch random offsets applied to it for silly micro-optimization purposes. with these changes applied, TLS should be basically working on all supported archs except microblaze. I'm still working on getting the necessary information and a working toolchain that can build TLS binaries for microblaze, but in theory, static-linked programs with TLS and dynamic-linked programs where only the main executable uses TLS should already work on microblaze. alignment constraints have not yet been heavily tested, so it's possible that this code does not always align TLS segments correctly on archs that need TLS variant I.
2012-10-11i386 vsyscall support (vdso-provided sysenter/syscall instruction based)Rich Felker-0/+3
this doubles the performance of the fastest syscalls on the atom I tested it on; improvement is reportedly much more dramatic on worst-case cpus. cannot be used for cancellable syscalls.
2012-10-08ensure that buffer for decoding auxv at startup is initially zeroRich Felker-1/+1
2012-10-07clean up and refactor program initializationRich Felker-31/+30
the code in __libc_start_main is now responsible for parsing auxv, rather than duplicating the parsing all over the place. this should shave off a few cycles and some code size. __init_libc is left as an external-linkage function despite the fact that it could be static, to prevent it from being inlined and permanently wasting stack space when main is called. a few other minor changes are included, like eliminating per-thread ssp canaries (they were likely broken when combined with certain dlopen usages, and completely unnecessary) and some other unnecessary checks. since this code gets linked into every program, it should be as small and simple as possible.
2012-10-06fix buggy TLS size/alignment computations in static-linked TLSRich Felker-5/+22
2012-10-05support for TLS in dynamic-loaded (dlopen) modulesRich Felker-2/+2
unlike other implementations, this one reserves memory for new TLS in all pre-existing threads at dlopen-time, and dlopen will fail with no resources consumed and no new libraries loaded if memory is not available. memory is not immediately distributed to running threads; that would be too complex and too costly. instead, assurances are made that threads needing the new TLS can obtain it in an async-signal-safe way from a buffer belonging to the dynamic linker/new module (via atomic fetch-and-add based allocator). I've re-appropriated the lock that was previously used for __synccall (synchronizing set*id() syscalls between threads) as a general pthread_create lock. it's a "backwards" rwlock where the "read" operation is safe atomic modification of the live thread count, which multiple threads can perform at the same time, and the "write" operation is making sure the count does not increase during an operation that depends on it remaining bounded (__synccall or dlopen). in static-linked programs that don't use __synccall, this lock is a no-op and has no cost.