path: root/src/regex
Age | Commit message | Author | Lines
2012-04-14 | fix signedness error handling invalid multibyte sequences in regexec | Rich Felker | -2/+2
the "< 0" test was always false due to use of an unsigned type. this resulted in infinite loops on 32-bit machines (adding -1U to a pointer is the same as adding -1) and crashes on 64-bit machines (offsetting the string pointer by 4gb-1b when an illegal sequence was hit).
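A minimal sketch of the bug class described above, not the actual regexec code. The decoder `decode_one` and the two wrapper functions are hypothetical names introduced for illustration. The only point is that a negative error return stored in an unsigned variable can never satisfy a `< 0` test.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical decoder (not from TRE or musl): returns the number of
 * bytes consumed, or -1 if the byte cannot start a valid sequence.
 * For simplicity it accepts only ASCII. */
static int decode_one(const unsigned char *s)
{
    assert(s != NULL);
    return *s < 0x80 ? 1 : -1;
}

/* Buggy pattern: the result lands in an unsigned type, so -1 becomes
 * SIZE_MAX and the error test below is always false. */
static int sees_error_unsigned(const unsigned char *s)
{
    size_t n = (size_t)decode_one(s);
    return n < 0;
}

/* Fixed pattern: a signed type preserves the -1. */
static int sees_error_signed(const unsigned char *s)
{
    int n = decode_one(s);
    return n < 0;
}
```

Compilers may warn that the comparison in `sees_error_unsigned` is always false, which is itself a useful hint that the variable has the wrong type.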
2012-04-13 | remove invalid code from TRE | Rich Felker | -14/+0
TRE wants to treat + and ? after a +, ?, or * as special; ? means ungreedy and + is reserved for future use. however, this is non-conformant. although redundant, these characters have a well-defined (no-op) meaning in POSIX ERE, and are actually _literal_ characters (which TRE is wrongly ignoring) in POSIX BRE mode. the simplest fix is to remove the unneeded nonstandard functionality. as a plus, this shaves off a small amount of bloat.
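The BRE half of the point above can be checked with the standard `regcomp`/`regexec` interface. This is a sketch under the assumption of a POSIX-conforming regex implementation; `bre_matches` is a hypothetical helper name, not part of any library.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Returns 1 if `text` matches `pattern` compiled as a POSIX BRE
 * (no REG_EXTENDED in cflags), 0 if it does not match, and -1 if the
 * pattern fails to compile. */
static int bre_matches(const char *pattern, const char *text)
{
    regex_t re;
    int rc;

    assert(pattern != NULL && text != NULL);
    if (regcomp(&re, pattern, REG_NOSUB) != 0)
        return -1;
    rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0;
}
```

In BRE mode the `+` in `^a*+$` is an ordinary character, so the pattern should match `aa+` but not `aa`. An engine that silently swallows the `+` would accept `aa` instead.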
2012-04-13 | fix broken regerror (typo) and missing message | Rich Felker | -2/+2
2012-03-20 | upgrade to latest upstream TRE regex code (0.8.0) | Rich Felker | -1168/+1037
the main practical results of this change are
1. the regex code is no longer subject to LGPL; it's now 2-clause BSD
2. most (all?) popular nonstandard regex extensions are supported

I hesitate to call this a "sync" since both the old and new code are heavily modified. in one sense, the old code was "more severely" modified, in that it was actively hostile to non-strictly-conforming expressions. on the other hand, the new code has eliminated the useless translation of the entire regex string to wchar_t prior to compiling, and now only converts multibyte character literals as needed. in the future i may use this modified TRE as a basis for writing the long-planned new regex engine that will avoid multibyte-to-wide character conversion entirely by compiling multibyte bracket expressions specific to UTF-8.
2012-01-23 | make glob mark symlinks-to-directories with the GLOB_MARK flag | Rich Felker | -1/+1
POSIX is unclear on whether it should, but all historical implementations seem to behave this way, and it seems more useful to applications.
2012-01-22 | support GLOB_PERIOD flag (GNU extension) to glob function | Rich Felker | -1/+2
patch by sh4rm4
2011-06-16 | duplicate re_nsub in LSB/glibc ABI compatible location | Rich Felker | -1/+1
2011-06-06 | fix handling of d_name in struct dirent | Rich Felker | -3/+2
basically there are 3 choices for how to implement this variable-size string member:
1. C99 flexible array member: breaks using dirent.h with pre-C99 compiler.
2. old way: length-1 string: generates array bounds warnings in caller.
3. new way: length-NAME_MAX string. no problems, simplifies all code.
of course the usable part in the pointer returned by readdir might be shorter than NAME_MAX+1 bytes, but that is allowed by the standard and doesn't hurt anything.
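The three choices can be sketched as struct declarations. These are simplified, hypothetical layouts (a single `d_ino` field plus the name), not the real `struct dirent`; the array size `NAME_MAX + 1` for the third variant is an assumption based on the "NAME_MAX+1 bytes" remark above, and the fallback value for `NAME_MAX` is also an assumption.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#ifndef NAME_MAX
#define NAME_MAX 255  /* fallback assumption if <limits.h> does not define it */
#endif

/* 1. C99 flexible array member: rejected by pre-C99 compilers. */
struct dirent_fam {
    long d_ino;
    char d_name[];
};

/* 2. Old way: length-1 array. Callers that index past d_name[0] can
 *    trigger array-bounds warnings. */
struct dirent_len1 {
    long d_ino;
    char d_name[1];
};

/* 3. New way: fixed array large enough for any name plus terminator. */
struct dirent_fixed {
    long d_ino;
    char d_name[NAME_MAX + 1];
};

/* Capacity of d_name in the fixed-size layout. */
static size_t fixed_name_capacity(void)
{
    size_t cap = sizeof ((struct dirent_fixed *)0)->d_name;
    assert(cap == (size_t)NAME_MAX + 1);
    return cap;
}
```

With the third layout, `sizeof` on `d_name` gives callers an honest upper bound, while the first two leave the real capacity implicit.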
2011-06-05 | safety fix for glob's vla usage: disallow patterns longer than PATH_MAX | Rich Felker | -0/+2
this actually inadvertently disallows some valid patterns with redundant / or * characters, but it's better than allowing unbounded vla allocation. eventually i'll write code to move the pattern to the stack and eliminate redundancy to ensure that it fits in PATH_MAX at the beginning of glob. this would also allow it to be modified in place for passing to fnmatch rather than copied at each level of recursion.
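A rough sketch of the kind of guard described above, assuming the check simply compares the pattern length against `PATH_MAX` before any variable-length array is created. `pattern_fits` and `bounded_vla_copy` are hypothetical names, and the `PATH_MAX` fallback is an assumption; this is not the code from glob.c.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

#ifndef PATH_MAX
#define PATH_MAX 4096  /* fallback assumption if <limits.h> does not define it */
#endif

/* Hypothetical guard: returns 1 if the pattern is short enough to be
 * copied into a PATH_MAX-bounded buffer, 0 if it must be rejected. */
static int pattern_fits(const char *pat)
{
    assert(pat != NULL);
    return strlen(pat) <= (size_t)PATH_MAX;
}

/* Copies the pattern into a VLA only after the guard passes, so the
 * stack allocation is bounded by PATH_MAX + 1 bytes.
 * Returns the pattern length, or -1 if the pattern is rejected. */
static long bounded_vla_copy(const char *pat)
{
    if (!pattern_fits(pat))
        return -1;
    size_t len = strlen(pat);
    char buf[len + 1];          /* VLA, at most PATH_MAX + 1 bytes */
    memcpy(buf, pat, len + 1);
    return (long)strlen(buf);
}
```

The cost of such a guard is the one noted above: a pattern padded with redundant characters is rejected even when every path it could match is short.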
2011-06-05 | eliminate (harmless in this case) vla usage in fnmatch.c | Rich Felker | -1/+1
2011-04-07 | fix bug in TRE found by clang (typo && instead of &) | Rich Felker | -1/+1
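An illustration of this class of typo, using hypothetical flag names rather than anything from TRE. With `&&` the expression is true for any nonzero `flags`, so the test no longer checks the intended bit.

```c
#include <assert.h>

/* Hypothetical flag bits, for illustration only. */
enum { FLAG_A = 1 << 0, FLAG_B = 1 << 1 };

/* Typo version: logical AND is true whenever both operands are
 * nonzero, so any nonzero `flags` appears to contain FLAG_A. */
static int has_flag_a_typo(unsigned flags)
{
    return flags && FLAG_A;
}

/* Intended version: bitwise AND tests the specific bit. */
static int has_flag_a(unsigned flags)
{
    /* only the two hypothetical flags above are defined */
    assert(flags <= (unsigned)(FLAG_A | FLAG_B));
    return (flags & FLAG_A) != 0;
}
```

Some compilers warn when one operand of `&&` is a constant, which is one way a slip like this can surface.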
2011-02-12 | initial check-in, version 0.5.0 (tag: v0.5.0) | Rich Felker | -0/+5364