bsdgrep: fix -w -v matching improperly with certain patterns
-w and -v flag matching was mostly functional but had some minor
problems:

1. -w flag processing only allowed one iteration through pattern
   matching on a line. This was problematic if one pattern could match
   more than once, or if there were multiple patterns and the earliest/
   longest match was not the ideal one, and

2. Previous work "fixed" things to not further process a line if the
   first iteration through patterns produced no matches. This is clearly
   wrong if we're dealing with the more restrictive -w matching.

#2 breakage could have also occurred before recent broad rewrites, but
it would be more arbitrary based on input patterns as to whether or not
it actually affected things.

Fix both of these by forcing a retry of the patterns after advancing
just past the start of the first match if we're doing more restrictive
-w matching and we didn't get any hits to start with. Also move -v flag
processing outside of the loop so that we have a greater chance to match
in the more restrictive cases. This wasn't strictly wrong, but it could
be a little more error prone.

While here, introduce some regression tests for this behavior and fix
some excessive wrapping nearby that hindered readability. GNU grep
passes these new tests.

PR:		218467, 218811
Submitted by:	Kyle Evans <kevans91 at ksu.edu>
Reviewed by:	cem, ngie
Differential Revision:	https://reviews.freebsd.org/D10329
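
The retry strategy described in the message can be sketched in isolation. This is a minimal illustration built on POSIX `regcomp`/`regexec`, not the code from `util.c`: the helper names `is_word_char`, `line_has_word_match`, and `line_selected` are hypothetical, only a single pattern is handled, and empty matches are simply skipped. It assumes that a match counts as a whole word when the characters on either side are not alphanumerics or `_`.

```c
#include <assert.h>
#include <ctype.h>
#include <regex.h>
#include <stddef.h>
#include <string.h>

/* Assumption: word characters are alphanumerics and '_'. */
static int
is_word_char(char ch)
{
	return (isalnum((unsigned char)ch) || ch == '_');
}

/*
 * Hypothetical helper: return 1 if `pattern` (an ERE) matches a whole
 * word somewhere in `line`, 0 if it does not, -1 if it fails to compile.
 */
static int
line_has_word_match(const char *pattern, const char *line)
{
	regex_t re;
	regmatch_t pm;
	size_t len, st = 0;
	int found = 0;

	assert(pattern != NULL && line != NULL);
	len = strlen(line);
	if (regcomp(&re, pattern, REG_EXTENDED) != 0)
		return (-1);

	while (st <= len) {
		/* Past offset 0 we are no longer at the beginning of the line. */
		int flags = st > 0 ? REG_NOTBOL : 0;

		if (regexec(&re, line + st, 1, &pm, flags) != 0)
			break;		/* no match at all from here on */

		/* Offsets from regexec are relative to line + st. */
		size_t so = st + (size_t)pm.rm_so;
		size_t eo = st + (size_t)pm.rm_eo;
		int left_ok = (so == 0 || !is_word_char(line[so - 1]));
		int right_ok = (eo == len || !is_word_char(line[eo]));

		if (so != eo && left_ok && right_ok) {
			found = 1;
			break;
		}
		/*
		 * The match was not a whole word: retry from just past the
		 * start of this match, since a later match may still qualify.
		 */
		st = so + 1;
	}
	regfree(&re);
	return (found);
}

/*
 * Hypothetical helper: apply -v style inversion once, after the whole
 * scan, rather than after the first pass. Returns -1 on a bad pattern.
 */
static int
line_selected(const char *pattern, const char *line, int invert)
{
	int matched = line_has_word_match(pattern, line);

	if (matched < 0)
		return (-1);
	return (invert ? !matched : matched);
}
```

With this sketch, `line_has_word_match("x", "xx x")` reports a match because the scan continues past the two failed candidates inside `xx`, and `line_selected("x", "xx x", 1)` therefore rejects the line, which is the outcome the new `wv_combo_break` test expects from `grep -v -w`.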
emaste committed May 2, 2017
1 parent 11e7e39 commit 93b20c9
Showing 2 changed files with 48 additions and 3 deletions.
24 changes: 24 additions & 0 deletions contrib/netbsd-tests/usr.bin/grep/t_grep.sh
@@ -93,6 +93,12 @@ word_regexps_body()
{
atf_check -o file:"$(atf_get_srcdir)/d_word_regexps.out" \
grep -w separated $(atf_get_srcdir)/d_input

# Begin FreeBSD
printf "xmatch pmatch\n" > test1

atf_check -o inline:"pmatch\n" grep -Eow "(match )?pmatch" test1
# End FreeBSD
}

atf_test_case begin_end
@@ -439,6 +445,23 @@ grep_sanity_body()

atf_check -o inline:"M\n" grep -o -e "M\{1\}" test2
}

atf_test_case wv_combo_break
wv_combo_break_head()
{
atf_set "descr" "Check for incorrectly matching lines with both -w and -v flags (PR 218467)"
}
wv_combo_break_body()
{
printf "x xx\n" > test1
printf "xx x\n" > test2

atf_check -o file:test1 grep -w "x" test1
atf_check -o file:test2 grep -w "x" test2

atf_check -s exit:1 grep -v -w "x" test1
atf_check -s exit:1 grep -v -w "x" test2
}
# End FreeBSD

atf_init_test_cases()
@@ -467,6 +490,7 @@ atf_init_test_cases()
atf_add_test_case escmap
atf_add_test_case egrep_empty_invalid
atf_add_test_case zerolen
atf_add_test_case wv_combo_break
atf_add_test_case fgrep_sanity
atf_add_test_case egrep_sanity
atf_add_test_case grep_sanity
27 changes: 24 additions & 3 deletions usr.bin/grep/util.c
@@ -305,6 +305,7 @@ procline(struct str *l, int nottext)
unsigned int i;
int c = 0, m = 0, r = 0, lastmatches = 0, leflags = eflags;
int startm = 0;
int retry;

/* Initialize to avoid a false positive warning from GCC. */
lastmatch.rm_so = lastmatch.rm_eo = 0;
@@ -313,6 +314,7 @@ procline(struct str *l, int nottext)
while (st <= l->len) {
lastmatches = 0;
startm = m;
retry = 0;
if (st > 0)
leflags |= REG_NOTBOL;
/* Loop to compare with all the patterns */
@@ -356,6 +358,17 @@ procline(struct str *l, int nottext)
else if (iswword(wbegin) ||
iswword(wend))
r = REG_NOMATCH;
/*
* If we're doing whole word matching and we
* matched once, then we should try the pattern
* again after advancing just past the start of
* the earliest match. This allows the pattern
* to match later on in the line and possibly
* still match a whole word.
*/
if (r == REG_NOMATCH &&
(retry == 0 || pmatch.rm_so + 1 < retry))
retry = pmatch.rm_so + 1;
}
if (r == 0) {
lastmatches++;
@@ -385,9 +398,14 @@ procline(struct str *l, int nottext)
}
}

-if (vflag) {
-c = !c;
-break;
+/*
+ * Advance to just past the start of the earliest match, try
+ * again just in case we still have a chance to match later in
+ * the string.
+ */
+if (lastmatches == 0 && retry > 0) {
+st = retry;
+continue;
 }

/* One pass if we are not recording matches */
@@ -410,6 +428,9 @@ procline(struct str *l, int nottext)
}


if (vflag)
c = !c;

/* Count the matches if we have a match limit */
if (mflag)
mcount -= c;