[Bug 224160] [patch] wc -c is slow
bugzilla-noreply at freebsd.org
Fri Dec 8 14:34:53 UTC 2017
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=224160
Conrad Meyer <cem at freebsd.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |patch
             Status|New                         |In Progress
            Summary|wc -c is slow               |[patch] wc -c is slow
           Assignee|freebsd-bugs at FreeBSD.org |cem at freebsd.org
--- Comment #2 from Conrad Meyer <cem at freebsd.org> ---
wc(1) reads into a stack buffer of size MAXBSIZE, or 64 kB. Increasing this may
help, but a larger buffer should move to the heap.
Secondly, there is an optimization for counting lines, and that same
optimization counts characters, but it is not used if wc is asked to count only
characters! Silly. It's also not used if wc is reading from stdin! Stupid.
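
To make the idea concrete, here is a minimal sketch of a single read loop that
serves both cases: it always counts bytes, scans for newlines only when lines
were requested, and works on any descriptor, stdin included. The function name
count_fd and its signature are hypothetical and not taken from wc.c; the sketch
also ignores EINTR and multibyte handling.

```c
#include <stdint.h>
#include <unistd.h>

/*
 * Hypothetical helper (not the wc.c code): count bytes, and optionally
 * newlines, on any descriptor, including STDIN_FILENO.
 * Returns 0 on success, -1 if read(2) fails.
 */
int
count_fd(int fd, int doline, uintmax_t *charct, uintmax_t *linect)
{
	static unsigned char buf[64 * 1024];
	ssize_t len;

	*charct = *linect = 0;
	while ((len = read(fd, buf, sizeof(buf))) != 0) {
		if (len == -1)
			return (-1);
		*charct += (uintmax_t)len;
		if (!doline)
			continue;	/* bytes only: never look at the data */
		for (ssize_t i = 0; i < len; i++)
			if (buf[i] == '\n')
				(*linect)++;
	}
	return (0);
}
```

Calling count_fd(STDIN_FILENO, 0, &c, &l) would take the cheap path for
"wc -c" on a pipe, since the byte total needs nothing beyond the return value
of read(2).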
Just enabling the optimization for stdin and for character counts gives much
better results, comparable to GNU wc:
2097152000
~/obj/usr/home/conrad/src/freebsd/amd64.amd64/usr.bin/wc/wc -c 0.01s user 0.43s system 45% cpu 0.964 total
Bumping the buffer size to 4 MB yields a big improvement in system time. (Note
that the dd size was increased 10x.)
Before:
20971520000
~/obj/usr/home/conrad/src/freebsd/amd64.amd64/usr.bin/wc/wc -c 0.14s user 3.99s system 42% cpu 9.653 total
After:
20971520000
~/obj/usr/home/conrad/src/freebsd/amd64.amd64/usr.bin/wc/wc -c 0.12s user 1.90s system 40% cpu 4.954 total
GNU wc is actually worse:
20971520000
gwc -c 0.21s user 2.91s system 48% cpu 6.490 total
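
One plausible reading of the system-time drop is simply fewer read(2) calls.
The sketch below estimates the number of calls for the 20971520000-byte input
at both buffer sizes. It assumes every read fills the buffer completely, which
is an upper bound on the savings: on a pipe the kernel may return less per
call. The helper name reads_needed is made up for illustration.

```c
#include <stdint.h>

/*
 * Estimate only: number of read(2) calls needed to consume `total`
 * bytes if every call returns a full `bufsize` bytes (rounded up).
 * The final read that returns 0 at EOF is not counted.
 */
uint64_t
reads_needed(uint64_t total, uint64_t bufsize)
{
	return ((total + bufsize - 1) / bufsize);
}
```

Under that assumption, reads_needed(20971520000, 64 * 1024) is 320000 and
reads_needed(20971520000, 4 * 1024 * 1024) is 5000, a 64x reduction in
syscalls for the same data.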
Here is the PoC patch (whitespace changes elided (-w) for legibility). Note
that it leaks memory: cnt() allocates the buffer on every call and the patch
adds no matching free(). 4 MB may be totally inappropriate for small devices,
too.
--- a/usr.bin/wc/wc.c
+++ b/usr.bin/wc/wc.c
@@ -199,15 +199,17 @@ cnt(const char *file)
size_t clen;
short gotsp;
u_char *p;
- u_char buf[MAXBSIZE];
+ u_char *buf;
wchar_t wch;
mbstate_t mbs;
+#define MY_BUF_SIZE (4 * 1024 * 1024)
+ buf = malloc(MY_BUF_SIZE);
+
linect = wordct = charct = llct = tmpll = 0;
if (file == NULL)
fd = STDIN_FILENO;
- else {
- if ((fd = open(file, O_RDONLY, 0)) < 0) {
+ else if ((fd = open(file, O_RDONLY, 0)) < 0) {
xo_warn("%s: open", file);
return (1);
}
@@ -218,8 +220,8 @@ cnt(const char *file)
* lines than to get words, since the word count requires some
* logic.
*/
- if (doline) {
- while ((len = read(fd, buf, MAXBSIZE))) {
+ if (doline || dochar) {
+ while ((len = read(fd, buf, MY_BUF_SIZE))) {
if (len == -1) {
xo_warn("%s: read", file);
(void)close(fd);
@@ -230,6 +232,7 @@ cnt(const char *file)
llct);
}
charct += len;
+ if (doline) {
for (p = buf; len--; ++p)
if (*p == '\n') {
if (tmpll > llct)
@@ -239,7 +242,9 @@ cnt(const char *file)
} else
tmpll++;
}
+ }
reset_siginfo();
+ if (doline)
tlinect += linect;
if (dochar)
tcharct += charct;
@@ -270,13 +275,12 @@ cnt(const char *file)
return (0);
}
}
- }
/* Do it the hard way... */
word: gotsp = 1;
warned = 0;
memset(&mbs, 0, sizeof(mbs));
- while ((len = read(fd, buf, MAXBSIZE)) != 0) {
+ while ((len = read(fd, buf, MY_BUF_SIZE)) != 0) {
if (len == -1) {
xo_warn("%s: read", file != NULL ? file : "stdin");
(void)close(fd);
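
The PoC above neither checks the malloc() result nor frees the buffer. As a
rough sketch of one way to close that gap (not the committed fix), the loop
below keeps the buffer on the heap, takes the size as a parameter so small
devices could pass something smaller, and releases it on a single exit path.
The name count_bytes and its signature are hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical variant of the read loop: heap buffer of caller-chosen
 * size, freed on every return path after allocation.
 * Returns 0 on success, -1 on allocation or read failure.
 */
int
count_bytes(int fd, size_t bufsize, uintmax_t *charct)
{
	unsigned char *buf;
	ssize_t len;
	int error = 0;

	if ((buf = malloc(bufsize)) == NULL)
		return (-1);
	*charct = 0;
	while ((len = read(fd, buf, bufsize)) != 0) {
		if (len == -1) {
			error = -1;
			break;
		}
		*charct += (uintmax_t)len;
	}
	free(buf);	/* single exit point, so nothing leaks */
	return (error);
}
```

An alternative with the same effect would be to allocate the buffer once
before the per-file loop and reuse it across calls, which avoids repeated
malloc()/free() pairs when many files are named on the command line.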
--
You are receiving this mail because:
You are the assignee for the bug.