path: root/src/locale/__setlocalecat.c
Age | Commit message | Author | Lines
2015-05-27 | rename internal locale file handling locale maps | Rich Felker | -124/+0
since the __setlocalecat function was removed, the filename __setlocalecat.c no longer made sense.
2015-05-27 | overhaul locale internals to treat categories roughly uniformly | Rich Felker | -53/+63
previously, LC_MESSAGES was treated specially as the only category which could be set to a locale name without a definition file, in order to facilitate gettext message translations when no libc locale was available. LC_NUMERIC was completely un-settable, and LC_CTYPE stored a flag intended to be used for a possible future byte-based C locale, instead of storing a __locale_map pointer like the other categories use. this patch changes all categories to be represented by pointers to __locale_map structures, and allows locale names without definition files to be treated as valid locales with trivial definition when used in any category. outwardly visible functional changes should be minor, limited mainly to the strings read back from setlocale and the way gettext handles translations in categories other than LC_MESSAGES. various internal refactoring has also been performed, and improvements in const correctness have been made.
2015-05-27 | replace atomics with locks in locale-setting code | Rich Felker | -16/+19
this is part of a general program of removing direct use of atomics where they are not necessary to meet correctness or performance needs, but in this case it's also an optimization. only the global locale needs synchronization; allocated locales referenced with locale_t handles are immutable during their lifetimes, and using atomics to initialize them increases their cost of setup.
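The split described above (a lock for the shared global locale, no synchronization for immutable per-handle locales) can be illustrated roughly as follows. This is only a sketch: `struct sketch_locale`, `sketch_set_global`, `sketch_get_global`, and `sketch_make_locale` are hypothetical names, and a POSIX mutex stands in for whatever lock primitive the libc uses internally.

```c
#include <pthread.h>
#include <string.h>

/* Hypothetical layout for illustration; not the libc's internal structure. */
struct sketch_locale {
    char name[16];
};

/* The one shared, mutable object: the global locale. */
static struct sketch_locale sketch_global = { "C" };
static pthread_mutex_t sketch_global_lock = PTHREAD_MUTEX_INITIALIZER;

/* Writers of the global locale take the lock. */
void sketch_set_global(const char *name)
{
    pthread_mutex_lock(&sketch_global_lock);
    strncpy(sketch_global.name, name, sizeof sketch_global.name - 1);
    sketch_global.name[sizeof sketch_global.name - 1] = '\0';
    pthread_mutex_unlock(&sketch_global_lock);
}

/* Readers copy the name out under the same lock. */
void sketch_get_global(char *buf, size_t n)
{
    if (n == 0) return;
    pthread_mutex_lock(&sketch_global_lock);
    strncpy(buf, sketch_global.name, n - 1);
    buf[n - 1] = '\0';
    pthread_mutex_unlock(&sketch_global_lock);
}

/* A locale built for a private handle is filled in once and never
 * modified afterwards, so no lock or atomic is involved here. */
struct sketch_locale sketch_make_locale(const char *name)
{
    struct sketch_locale loc = { "" };   /* zero-filled, so always terminated */
    strncpy(loc.name, name, sizeof loc.name - 1);
    return loc;
}
```

Only the two functions touching `sketch_global` pay for synchronization; constructing a private locale is plain initialization.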
2015-03-03 | make all objects used with atomic operations volatile | Rich Felker | -1/+1
the memory model we use internally for atomics permits plain loads of values which may be subject to concurrent modification without requiring that a special load function be used. since a compiler is free to make transformations that alter the number of loads or the way in which loads are performed, the compiler is theoretically free to break this usage. the most obvious concern is with atomic cas constructs: something of the form tmp=*p;a_cas(p,tmp,f(tmp)); could be transformed to a_cas(p,*p,f(*p)); where the latter is intended to show multiple loads of *p whose resulting values might fail to be equal; this would break the atomicity of the whole operation. but even more fundamental breakage is possible. with the changes being made now, objects that may be modified by atomics are modeled as volatile, and the atomic operations performed on them by other threads are modeled as asynchronous stores by hardware which happens to be acting on the request of another thread. such modeling of course does not itself address memory synchronization between cores/cpus, but that aspect was already handled. this all seems less than ideal, but it's the best we can do without mandating a C11 compiler and using the C11 model for atomics. in the case of pthread_once_t, the ABI type of the underlying object is not volatile-qualified. so we are assuming that accessing the object through a volatile-qualified lvalue via casts yields volatile access semantics. the language of the C standard is somewhat unclear on this matter, but this is an assumption the linux kernel also makes, and seems to be the correct interpretation of the standard.
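The `tmp=*p;a_cas(p,tmp,f(tmp));` pattern from the message above can be sketched as a retry loop. The names `sketch_a_cas`, `sketch_atomic_apply`, and `sketch_double` are hypothetical, and the GCC/Clang builtin `__sync_val_compare_and_swap` is assumed as a stand-in for the libc's own compare-and-swap primitive.

```c
/* Stand-in for an internal compare-and-swap: returns the value that was
 * in *p before the operation. Assumes a GCC/Clang-compatible compiler. */
static int sketch_a_cas(volatile int *p, int expected, int desired)
{
    return __sync_val_compare_and_swap(p, expected, desired);
}

/* Atomically replace *p with f(*p); returns the value that was replaced. */
int sketch_atomic_apply(volatile int *p, int (*f)(int))
{
    int old;
    do {
        /* Because *p is volatile-qualified, this is one load per
         * iteration; the compiler may not re-read *p when evaluating
         * the arguments to the CAS below. */
        old = *p;
    } while (sketch_a_cas(p, old, f(old)) != old);
    return old;
}

/* Example transformation used to exercise the loop. */
int sketch_double(int x)
{
    return 2 * x;
}
```

Without the `volatile` qualifier, a compiler would in principle be free to substitute `*p` for `old` in both CAS arguments, which is the breakage the message describes.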
2014-07-31 | harden locale name handling and prevent slashes in LC_MESSAGES | Rich Felker | -3/+3
the code which loads locale files was already rejecting locale names containing slashes. however, LC_MESSAGES records a locale name even if libc does not have a matching locale file, so that gettext or application code can use the recorded locale name for message translations to languages that libc does not support. this recorded name was not being checked for slashes, meaning that such code could potentially be tricked into directory traversal. in addition, since the value of a locale category is sometimes used as a pathname component by callers, the improved code rejects any value beginning with a dot. this prevents traversal to the parent directory via "..", use of the top-level locale directory via ".", and also avoids "hidden" directories as a side effect. finally, overly long locale names are now rejected (treated as an unrecognized name and thus as an alias for C.UTF-8) rather than being truncated.
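The three checks named in this message (no slashes, no leading dot, no overly long names) might look roughly like the sketch below. `sketch_locale_name_ok` and `SKETCH_LOCALE_NAME_MAX` are hypothetical names; the limit of 15 bytes is an assumption borrowed from the 2014-07-02 entry in this log.

```c
#include <string.h>

/* Assumed maximum name length, taken from the 2014-07-02 log entry. */
#define SKETCH_LOCALE_NAME_MAX 15

/* Returns 1 if the name passes the three checks, 0 otherwise. A caller
 * would treat a rejected name as unrecognized rather than truncating it. */
int sketch_locale_name_ok(const char *name)
{
    if (strlen(name) > SKETCH_LOCALE_NAME_MAX)
        return 0;               /* overly long: reject, do not truncate */
    if (name[0] == '.')
        return 0;               /* blocks ".", "..", and hidden entries */
    if (strchr(name, '/'))
        return 0;               /* blocks traversal via path separators */
    return 1;
}
```

For instance, `sketch_locale_name_ok("../etc")` and `sketch_locale_name_ok("a/b")` both return 0, while `sketch_locale_name_ok("en_US.UTF-8")` returns 1.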
2014-07-26 | implement mo file string lookup for translations | Rich Felker | -0/+7
the core is based on a binary search; hash table is not used. both native and reverse-endian mo files are supported. all offsets read from the mapped mo file are checked against the mapping size to prevent the possibility of reads outside the mapping. this commit has no observable effects since there are not yet any callers to the message translation code.
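A bounds-checked binary search over a mapped mo file can be sketched as below. This is an independent illustration, not the code from the commit: all `sketch_` names are hypothetical, and the header layout (magic `0x950412de`, string count in word 2, offsets of the original and translation tables in words 3 and 4) is assumed from the GNU gettext mo format.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MO_MAGIC 0x950412deu

static uint32_t sketch_swap32(uint32_t x)
{
    return x >> 24 | (x >> 8 & 0xff00) | (x << 8 & 0xff0000) | x << 24;
}

/* Read the i-th 32-bit word of the mapping, byte-swapping if requested. */
static uint32_t sketch_word(const unsigned char *p, size_t i, int swap)
{
    uint32_t x;
    memcpy(&x, p + 4 * i, 4);
    return swap ? sketch_swap32(x) : x;
}

/* Return the string described by entry k of a (length, offset) table, or
 * 0 if it would extend past the mapping or lacks its terminating NUL. */
static const char *sketch_entry(const unsigned char *p, size_t size,
                                uint32_t table, uint32_t k, int swap)
{
    uint32_t len = sketch_word(p, table / 4 + 2 * (size_t)k, swap);
    uint32_t off = sketch_word(p, table / 4 + 2 * (size_t)k + 1, swap);
    if (off >= size || len >= size - off || p[(size_t)off + len] != 0)
        return 0;
    return (const char *)p + off;
}

/* Look up msgid; returns the translation or 0 if absent or malformed. */
const char *sketch_mo_lookup(const void *map, size_t size, const char *msgid)
{
    const unsigned char *p = map;
    int swap;
    if (size < 20) return 0;
    if (sketch_word(p, 0, 0) == SKETCH_MO_MAGIC) swap = 0;
    else if (sketch_word(p, 0, 1) == SKETCH_MO_MAGIC) swap = 1;
    else return 0;

    uint32_t n = sketch_word(p, 2, swap);   /* number of strings */
    uint32_t o = sketch_word(p, 3, swap);   /* original-string table */
    uint32_t t = sketch_word(p, 4, swap);   /* translation table */
    if ((o | t) % 4) return 0;
    if (n > size / 8 || o > size - 8 * (size_t)n || t > size - 8 * (size_t)n)
        return 0;                           /* tables must fit in the mapping */

    uint32_t lo = 0, hi = n;                /* originals are sorted */
    while (lo < hi) {
        uint32_t mid = lo + (hi - lo) / 2;
        const char *orig = sketch_entry(p, size, o, mid, swap);
        if (!orig) return 0;
        int c = strcmp(msgid, orig);
        if (c == 0) return sketch_entry(p, size, t, mid, swap);
        if (c < 0) hi = mid; else lo = mid + 1;
    }
    return 0;
}
```

Every offset and length read from the file is compared against `size` before being dereferenced, so a truncated or corrupt mapping yields a null result instead of an out-of-bounds read.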
2014-07-24 | implement locale file loading and state for remaining locale categories | Rich Felker | -0/+58
there is still no code which actually uses the loaded locale files, so the main observable effect of this commit is that calls to setlocale store and give back the names of the selected locales for the remaining categories (LC_TIME, LC_COLLATE, LC_MONETARY) if a locale file by the requested name could be loaded.
2014-07-24 | fix locale environment variable logic for empty strings | Rich Felker | -3/+3
per POSIX (XBD 8.2), LC_*/LANG environment variables set to the empty string are supposed to be treated as if they were not set at all.
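A minimal sketch of that rule, assuming the usual POSIX precedence of LC_ALL, then the category's own variable, then LANG. The helper names `sketch_getenv_nonempty` and `sketch_locale_from_env` are hypothetical, and the fallback when nothing is set is left to the caller.

```c
#include <stdlib.h>

/* Like getenv, but an empty value counts as "not set". */
static const char *sketch_getenv_nonempty(const char *var)
{
    const char *v = getenv(var);
    return (v && *v) ? v : 0;
}

/* Resolve the locale name for one category variable such as "LC_TIME".
 * Returns 0 if no relevant variable holds a non-empty value. */
const char *sketch_locale_from_env(const char *category_var)
{
    const char *v;
    if ((v = sketch_getenv_nonempty("LC_ALL"))) return v;
    if ((v = sketch_getenv_nonempty(category_var))) return v;
    if ((v = sketch_getenv_nonempty("LANG"))) return v;
    return 0;
}
```

With `LC_ALL=""` and `LANG=de_DE` in the environment, the empty `LC_ALL` is skipped and the lookup falls through to `LANG`.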
2014-07-02 | add locale framework | Rich Felker | -0/+46
this commit adds non-stub implementations of setlocale, duplocale, newlocale, and uselocale, along with the data structures and minimal code needed for representing the active locale on a per-thread basis and optimizing the common case where thread-local locale settings are not in use. at this point, the data structures only contain what is necessary to represent LC_CTYPE (a single flag) and LC_MESSAGES (a name for use in finding message translation files). representation for the other categories will be added later; the expectation is that a single pointer will suffice for each. for LC_CTYPE, the strings "C" and "POSIX" are treated as special; any other string is accepted and treated as "C.UTF-8". for other categories, any string is accepted after being truncated to a maximum supported length (currently 15 bytes). for LC_MESSAGES, the name is kept regardless of whether libc itself can use such a message translation locale, since applications using catgets or gettext should be able to use message locales libc is not aware of. for other categories, names which are not successfully loaded as locales (which, at present, means all names) are treated as aliases for "C". setlocale never fails. locale settings are not yet used anywhere, so this commit should have no visible effects except for the contents of the string returned by setlocale.
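The name handling described in this entry (a single flag for LC_CTYPE, and truncation to 15 bytes for other categories) can be sketched as follows. `sketch_ctype_is_utf8`, `sketch_store_name`, and `SKETCH_NAME_MAX` are hypothetical names chosen for illustration; the real data structures are not shown in this log.

```c
#include <string.h>

/* Maximum supported name length stated in the message above. */
#define SKETCH_NAME_MAX 15

/* LC_CTYPE: "C" and "POSIX" are special; any other string is accepted
 * and treated as "C.UTF-8". Returns 1 for the UTF-8 case, 0 otherwise. */
int sketch_ctype_is_utf8(const char *name)
{
    return strcmp(name, "C") != 0 && strcmp(name, "POSIX") != 0;
}

/* Other categories: keep at most SKETCH_NAME_MAX bytes of the name.
 * dst must have room for SKETCH_NAME_MAX + 1 bytes. */
void sketch_store_name(char dst[SKETCH_NAME_MAX + 1], const char *name)
{
    size_t n = strlen(name);
    if (n > SKETCH_NAME_MAX) n = SKETCH_NAME_MAX;
    memcpy(dst, name, n);
    dst[n] = '\0';
}
```

Note that the 2014-07-31 entry in this log later replaces the truncation with outright rejection of overly long names.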