From nobody Thu Apr 17 01:05:55 2025 X-Original-To: dev-commits-src-branches@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4ZdKTJ14TLz5tVRh; Thu, 17 Apr 2025 01:05:56 +0000 (UTC) (envelope-from git@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (4096 bits) client-digest SHA256) (Client CN "mxrelay.nyi.freebsd.org", Issuer "R10" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4ZdKTH6fKmz3XpK; Thu, 17 Apr 2025 01:05:55 +0000 (UTC) (envelope-from git@FreeBSD.org) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org; s=dkim; t=1744851956; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding; bh=3IGcRJbE+ayr3aAaEbs9ltMM+WhBOj+xUcfE7sa5txw=; b=gSzum/Wn4GRvWTEUaeL9tMNkpN3UgGEmBUxSMdS/iHUCbdi4T2Q7tPzT7nKXXBRoKA9FYm dD9HpWKqVNOqKpvNSoGAxOtMWhsqTgpFCsyRW4mFTvF0UV4I8D95LEBbXUiETX1wg5FbL4 gLTr0kOvVCnXatoHT67y0sTSh9rUQawR2leRjnUEHjuS56SCT8GRmlSpPx8faAn6aMx53m mrW1cF3TzL+qs+3CTKkDGiG9pKu/SIIj7CoY4SyYxpAZ5aDaqUTOOkoLx0IcQdknzxot9M HWuO28SWne4FuzVbpoVL/th3YDsqrmdnFq/Ni5+8R0Q466a/QMzTuqiJkJK0ng== ARC-Seal: i=1; s=dkim; d=freebsd.org; t=1744851956; a=rsa-sha256; cv=none; b=Is4kytD+aWObQaXhXLLn8Nyd31wfRmPBqZKwOa1rgEImerGkSy9A0sBvYymdEyTeRDPGcY 7WGxSqFgMIxE+SkYeXE8AK/gDJLZqQbcnWS59VKiMoAKBwZ6cLbuY2T6F+g2+ecwRzyfbH mX/Zmx4mjA92y7CofNFQxI1m9HQEso2SAstjCgMDnu7fi3R0HewVu3yBzusXZUMlomntec q6LZf7CRcjwCRA8cDsL343/Xxnep+UxcXXbnecsk691lfcPBoZVCn2k36rINN6qJaHSjAC DyrBzzemv1+3glr0keCc8oZZPD+lWhjajQYl3pXJsaCjuChFnQ/bgmLnUZryaQ== ARC-Authentication-Results: i=1; mx1.freebsd.org; none ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org; s=dkim; t=1744851956; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding; bh=3IGcRJbE+ayr3aAaEbs9ltMM+WhBOj+xUcfE7sa5txw=; b=JFeY6dfrxr1fkxUnZiYKe11BfgPMLC+JXbsM6EVJXNVG9x8K9zD1qV5Jk151yIVkY4NEJZ 0L2qNlBNF9LETy6Ty/foy2u3lufxmveLPT3x0Gnd1OLbN7HY5HVYdAg96kwA/5lUbh6AJ7 /kj/M1Xll8ouYV9qt4wFXAruaohXkoRvCjoXd+QSUVjdIrMmTPT9iuHQQLoIU3V59bGkNQ z1YbCzfEj0G4weVlHqq+4PmojqgMF+ueSdaz9eFyyggfUzMKySE7HZghJI5qpUE9aNqwbq UHgh4t98MDY6hjo6VNWGjuu7mhwfHRHZzMMcDpJGcs3pd10aIzHGmoZB1R15Bw== Received: from gitrepo.freebsd.org (gitrepo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:5]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 4ZdKTH6Fl1zBJF; Thu, 17 Apr 2025 01:05:55 +0000 (UTC) (envelope-from git@FreeBSD.org) Received: from gitrepo.freebsd.org ([127.0.1.44]) by gitrepo.freebsd.org (8.18.1/8.18.1) with ESMTP id 53H15t70062411; Thu, 17 Apr 2025 01:05:55 GMT (envelope-from git@gitrepo.freebsd.org) Received: (from git@localhost) by gitrepo.freebsd.org (8.18.1/8.18.1/Submit) id 53H15toI062408; Thu, 17 Apr 2025 01:05:55 GMT (envelope-from git) Date: Thu, 17 Apr 2025 01:05:55 GMT Message-Id: <202504170105.53H15toI062408@gitrepo.freebsd.org> To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-branches@FreeBSD.org From: Kyle Evans Subject: git: a420e1b1cee6 - stable/14 - grep: avoid duplicated lines when we're coloring output List-Id: Commits to the stable branches of the FreeBSD src repository List-Archive: https://lists.freebsd.org/archives/dev-commits-src-branches List-Help: List-Post: List-Subscribe: List-Unsubscribe: X-BeenThere: dev-commits-src-branches@freebsd.org Sender: owner-dev-commits-src-branches@FreeBSD.org MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-Git-Committer: kevans X-Git-Repository: src X-Git-Refname: refs/heads/stable/14 X-Git-Reftype: branch X-Git-Commit: a420e1b1cee696f6f0fdeacdc04fd4f1e992234b Auto-Submitted: auto-generated The branch stable/14 has been updated by kevans: URL: https://cgit.FreeBSD.org/src/commit/?id=a420e1b1cee696f6f0fdeacdc04fd4f1e992234b commit a420e1b1cee696f6f0fdeacdc04fd4f1e992234b Author: Kyle Evans AuthorDate: 2025-03-20 04:34:13 +0000 Commit: Kyle Evans CommitDate: 2025-04-17 01:01:29 +0000 grep: avoid duplicated lines when we're coloring output For the default uncolored output, we'll just output a line once and then move on. For colored output, we'll output multiple matches per line with context from the line interspersed and may end up writing out some match context multiple times as we don't persist which part of the lines have already been printed. Fix it by tracking the length of line printed thus far in printline() and retaining it across successive calls to printline() in the same line. printline() should indicate whether it terminated the line or not to avoid tracking the logic for that in multiple places: -o lines are always terminated, so it's generally only some --color contexts where we wouldn't have terminated. Add a test to make sure that we're only printing one line going forward. Reported and tested by: Jamie Landeg-Jones Reviewed by: emaste (cherry picked from commit 4c9ffb13dd74159bd3ed7e1c4c706dbd15a70df2) --- usr.bin/grep/tests/grep_freebsd_test.sh | 15 +++++++ usr.bin/grep/util.c | 72 +++++++++++++++++++++++++++------ 2 files changed, 74 insertions(+), 13 deletions(-) diff --git a/usr.bin/grep/tests/grep_freebsd_test.sh b/usr.bin/grep/tests/grep_freebsd_test.sh index 77017529843e..906b70645151 100755 --- a/usr.bin/grep/tests/grep_freebsd_test.sh +++ b/usr.bin/grep/tests/grep_freebsd_test.sh @@ -103,10 +103,25 @@ zflag_body() atf_check grep -qz "foo.*bar" in } +atf_test_case color_dupe +color_dupe_body() +{ + + # This assumes a MAX_MATCHES of exactly 32. Previously buggy procline() + # calls would terminate the line premature every MAX_MATCHES matches, + # meaning we'd see the line be output again for the next MAX_MATCHES + # number of matches. + jot -nb 'A' -s '' 33 > in + + atf_check -o save:color.out grep --color=always . in + atf_check -o match:"^ +1 color.out" wc -l color.out +} + atf_init_test_cases() { atf_add_test_case grep_r_implied atf_add_test_case rgrep atf_add_test_case gnuext atf_add_test_case zflag + atf_add_test_case color_dupe } diff --git a/usr.bin/grep/util.c b/usr.bin/grep/util.c index 4e1c44b442f2..ed87e56956f6 100644 --- a/usr.bin/grep/util.c +++ b/usr.bin/grep/util.c @@ -72,7 +72,7 @@ static int litexec(const struct pat *pat, const char *string, size_t nmatch, regmatch_t pmatch[]); #endif static bool procline(struct parsec *pc); -static void printline(struct parsec *pc, int sep); +static bool printline(struct parsec *pc, int sep, size_t *last_out); static void printline_metadata(struct str *line, int sep); bool @@ -214,15 +214,29 @@ procmatch_match(struct mprintc *mc, struct parsec *pc) /* Print the matching line, but only if not quiet/binary */ if (mc->printmatch) { - printline(pc, ':'); + size_t last_out; + bool terminated; + + last_out = 0; + terminated = printline(pc, ':', &last_out); while (pc->matchidx >= MAX_MATCHES) { /* Reset matchidx and try again */ pc->matchidx = 0; if (procline(pc) == !vflag) - printline(pc, ':'); + terminated = printline(pc, ':', &last_out); else break; } + + /* + * The above loop processes the entire line as long as we keep + * hitting the maximum match count. At this point, we know + * that there's nothing left to be printed and can terminate the + * line. + */ + if (!terminated) + printline(pc, ':', &last_out); + first_match = false; mc->same_file = true; mc->last_outed = 0; @@ -748,26 +762,39 @@ printline_metadata(struct str *line, int sep) } /* - * Prints a matching line according to the command line options. + * Prints a matching line according to the command line options. We need + * *last_out to be populated on entry in case this is just a continuation of + * matches within the same line. + * + * Returns true if the line was terminated, false if it was not. */ -static void -printline(struct parsec *pc, int sep) +static bool +printline(struct parsec *pc, int sep, size_t *last_out) { - size_t a = 0; + size_t a = *last_out; size_t i, matchidx; regmatch_t match; + bool terminated; + + /* + * Nearly all paths below will terminate the line by default, but it is + * avoided in some circumstances in case we don't have the full context + * available here. + */ + terminated = true; /* If matchall, everything matches but don't actually print for -o */ if (oflag && matchall) - return; + return (terminated); matchidx = pc->matchidx; /* --color and -o */ - if ((oflag || color) && matchidx > 0) { + if ((oflag || color) && (pc->printed > 0 || matchidx > 0)) { /* Only print metadata once per line if --color */ - if (!oflag && pc->printed == 0) + if (!oflag && pc->printed == 0) { printline_metadata(&pc->ln, sep); + } for (i = 0; i < matchidx; i++) { match = pc->matches[i]; /* Don't output zero length matches */ @@ -780,9 +807,10 @@ printline(struct parsec *pc, int sep) if (oflag) { pc->ln.boff = match.rm_so; printline_metadata(&pc->ln, sep); - } else + } else { fwrite(pc->ln.dat + a, match.rm_so - a, 1, stdout); + } if (color) fprintf(stdout, "\33[%sm\33[K", color); fwrite(pc->ln.dat + match.rm_so, @@ -793,13 +821,31 @@ printline(struct parsec *pc, int sep) if (oflag) putchar('\n'); } - if (!oflag) { - if (pc->ln.len - a > 0) + + /* + * Don't terminate if we reached the match limit; we may have + * other matches on this line to process. + */ + *last_out = a; + if (!oflag && matchidx != MAX_MATCHES) { + if (pc->ln.len - a > 0) { fwrite(pc->ln.dat + a, pc->ln.len - a, 1, stdout); + *last_out = pc->ln.len; + } putchar('\n'); + } else if (!oflag) { + /* + * -o is terminated on every match output, so this + * branch is only designed to capture MAX_MATCHES in a + * line which may be a signal to us for a lack of + * context. The caller will know more and call us again + * to terminate if it needs to. + */ + terminated = false; } } else grep_printline(&pc->ln, sep); pc->printed++; + return (terminated); }