|Age||Commit message (Collapse)||Author||Lines|
unsigned char promotes to int, which can overflow when shifted left by
24 bits or more. this has been reported multiple times but then
forgotten. it's expected to be benign UB, but can trap when built with
explicit overflow catching (ubsan or similar). fix it now.
note that promotion to uint32_t is safe and portable even outside of
the assumptions usually made in musl, since either uint32_t has rank
at least unsigned int, so that no further default promotions happen,
or int is wide enough that the shift can't overflow. this is a
desirable property to have in case someone wants to reuse the code
first, the condition (mem && k < p) is redundant, because mem being
nonzero implies the needle is periodic with period exactly p, in which
case any byte that appears in the needle must appear in the last p
bytes of the needle, bounding the shift (k) by p.
second, the whole point of replacing the shift k by mem (=l-p) is to
prevent shifting by less than mem when discarding the memory on shift,
in which case linear time could not be guaranteed. but as written, the
check also replaced shifts greater than mem by mem, reducing the
benefit of the shift. there is no possible benefit to this reduction of
the shift; since mem is being cleared, the full shift is valid and
more optimal. so only replace the shift by mem when it would be less
mem0 && mem && ... is redundant since mem can only be nonzero when
mem0 is nonzero.
Reported by Leah Neukirchen.
the two/three/four byte memmem specializations are not prepared to
handle haystacks shorter than the needle; they unconditionally read at
least up to the needle length and subtract from the haystack length.
if the haystack is shorter, the remaining haystack length underflows
and produces an unbounded search which will eventually either crash or
find a spurious match.
the top-level memmem function attempted to avoid this case already by
checking for haystack shorter than needle, but it failed to re-check
after using memchr to remove the maximal prefix not containing the
first byte of the needle.
the logic for this loop was copied from null-terminated-string logic
in strstr without properly adapting it to work with explicit lengths.
presumably this error could result in false negatives (wrongly
comparing past the end of the needle/haystack), false positives
(stopping comparison early when the needle contains null bytes), and
crashes (from runaway reads past the end of mapped memory).
in cases where the memorized match range from the right factor
exceeded the length of the left factor, it was wrongly treated as a
mismatch rather than a match.
issue reported by Yves Bastide.
to optimize the search, memchr is used to find the first occurrence of
the first character of the needle in the haystack before switching to
a search for the full needle. however, the number of characters
skipped by this first step were not subtracted from the haystack
length, causing memmem to search past the end of the haystack.
based on strstr. passes gnulib tests and a few quick checks of my own.