svn commit: r210578 - head/usr.bin/grep

Xin LI delphij at delphij.net
Thu Jul 29 22:25:12 UTC 2010


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

Hi, Jilles,

Thanks for your comments.

On 2010/07/29 05:20, Jilles Tjoelker wrote:
> On Thu, Jul 29, 2010 at 12:11:14AM +0000, Gabor Kovesdan wrote:
>> Author: gabor
>> Date: Thu Jul 29 00:11:14 2010
>> New Revision: 210578
>> URL: http://svn.freebsd.org/changeset/base/210578
> 
>> Log:
>>   - Some improvements on the exiting code, like replacing memcpy with
>>     strlcpy/strcpy
> 
> Hmm, I don't think this is an improvement :(
> 
> If you know the length of the string, then memcpy() is usually the right
> function to use. I think this is clearer, and it is definitely more
> efficient.

Agreed.

> The strlcpy() function is for cases where the code does not know the
> length. The receiving buffer typically has a fixed length and overflow
> either leads to an error condition or silent truncation. (In the latter
> case, a security problem might still result.)
>
> Note that this is the reasoning of the glibc people why they do not want
> strlcat/strlcpy. I think these functions are acceptable but should be
> used with care, not as a panacea against all buffer overflows.
>
>> Modified: head/usr.bin/grep/fastgrep.c
>> ==============================================================================
>> --- head/usr.bin/grep/fastgrep.c	Wed Jul 28 21:52:09 2010	(r210577)
>> +++ head/usr.bin/grep/fastgrep.c	Thu Jul 29 00:11:14 2010	(r210578)
>> @@ -119,8 +119,7 @@ fastcomp(fastgrep_t *fg, const char *pat
>>  	 * string respectively.
>>  	 */
>>  	fg->pattern = grep_malloc(fg->len + 1);
>> -	memcpy(fg->pattern, pat + (bol ? 1 : 0) + wflag, fg->len);
>> -	fg->pattern[fg->len] = '\0';
>> +	strlcpy(fg->pattern, pat + (bol ? 1 : 0) + wflag, fg->len + 1);
>>  
>>  	/* Look for ways to cheat...er...avoid the full regex engine. */
>>  	for (i = 0; i < fg->len; i++) {
> 
> :(
> 
> Note that this code may not be safe if fg->len comes from an untrusted
> user, as fg->len + 1 is 0 if fg->len == SIZE_MAX. This is not the case
> if fg->len is an actual length from strlen() or similar.

Speaking for this piece of code, I have to say that the modified version
is actually safer (an improvement, as the attacker could not overwrite
arbitrary memory).

If fg->len + 1 == 0, fg->pattern would point to a small area (assuming
normal malloc.conf setting without V) where, for memcpy, it would
overwrite fg->len bytes, while strlcpy() will do nothing.

By the way how can fg->len come from an untrusted party?  It's
strlen(pat) which I don't think can ever reach SIZE_MAX without crashing
the program.

I'll dig further for this piece of code anyways.

>> Modified: head/usr.bin/grep/grep.c
>> ==============================================================================
>> --- head/usr.bin/grep/grep.c	Wed Jul 28 21:52:09 2010	(r210577)
>> +++ head/usr.bin/grep/grep.c	Thu Jul 29 00:11:14 2010	(r210578)
> [snip]
>> @@ -234,32 +237,44 @@ add_pattern(char *pat, size_t len)
>>  		--len;
>>  	/* pat may not be NUL-terminated */
>>  	pattern[patterns] = grep_malloc(len + 1);
>> -	memcpy(pattern[patterns], pat, len);
>> -	pattern[patterns][len] = '\0';
>> +	strlcpy(pattern[patterns], pat, len + 1);
>>  	++patterns;
>>  }
> 
> :(
> 
> Alternatively, consider strndup() here.

I think we should probably revert the code back to its previous state.
Using memcpy and assigning nul seems to be a better approach.

strndup() needs to re-evaluate strlen() which doesn't sound like a good
idea here.

>> Modified: head/usr.bin/grep/queue.c
>> ==============================================================================
>> --- head/usr.bin/grep/queue.c	Wed Jul 28 21:52:09 2010	(r210577)
>> +++ head/usr.bin/grep/queue.c	Thu Jul 29 00:11:14 2010	(r210578)
>> @@ -60,7 +60,7 @@ enqueue(struct str *x)
>>  	item->data.len = x->len;
>>  	item->data.line_no = x->line_no;
>>  	item->data.off = x->off;
>> -	memcpy(item->data.dat, x->dat, x->len);
>> +	strcpy(item->data.dat, x->dat);
>>  	item->data.file = x->file;
>>  
>>  	STAILQ_INSERT_TAIL(&queue, item, list);
> 
> :(

Oops, I didn't caught this :-/  Sorry.

>> Modified: head/usr.bin/grep/util.c
>> ==============================================================================
>> --- head/usr.bin/grep/util.c	Wed Jul 28 21:52:09 2010	(r210577)
>> +++ head/usr.bin/grep/util.c	Thu Jul 29 00:11:14 2010	(r210578)
[snip]
>> @@ -102,30 +141,19 @@ grep_tree(char **argv)
>>  		default:
>>  			/* Check for file exclusion/inclusion */
>>  			ok = true;
>> -			if (exclflag) {
>> +			if (dexclude || dinclude) {
>>  				if ((d = strrchr(p->fts_path, '/')) != NULL) {
>>  					dir = grep_malloc(sizeof(char) *
>>  					    (d - p->fts_path + 2));
>>  					strlcpy(dir, p->fts_path,
>>  					    (d - p->fts_path + 1));
> 
> Why do the buffer sizes differ here? Also, this can be a memcpy plus a
> '\0' write instead of a strlcpy.

I think in this case the malloc part should be d - p->fts_path + 1.

Speaking for strlcpy vs memcpy for this case, I don't see much
difference as grep_tree() is being called during fts operation, these
string can never be long enough, nor the code path be "hot" enough
without I/O.

I'll take care for the rest of changes.

Cheers,
- -- 
Xin LI <delphij at delphij.net>	http://www.delphij.net/
FreeBSD - The Power to Serve!	       Live free or die
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.16 (FreeBSD)

iQEcBAEBCAAGBQJMUf+2AAoJEATO+BI/yjfBPuoH/0zahV2GYxBJIyuxw+WYMvjT
WnPI4I9piIuGtREYS2kUYY5HGswPwOiRGxEoNQ4m1wWYkpSf3gtjnR7B+lCkzYeJ
WloTK8jfuYg7stGozJvOVGd4SIQBjchI3OepR86YbKCqGTDMo3iL+Fou/RQruJ5G
HyAvZv00HNrGVagdBT/0KHrRiZ/W5bT7+lwuRxR8VsSdxBXjDBS0L87oiBGfvr5u
bKmEWNUWNELJb+j2yvIh7M7M5CJIrlqx7enpGhyGciaEaWZzWuRF2aVmqGhlcBVt
m5XcChcd7jpGr/HOQq0o3hAvfG684BKG+UPm7yBSs8bAZwrXVdECXnbYS9MxcAc=
=8dGx
-----END PGP SIGNATURE-----


More information about the svn-src-all mailing list