standards/177742: conflict of dd's bs= option with use of conv=sparse

Matthew Rezny mrezny at hexaneinc.com
Wed Apr 10 00:50:01 UTC 2013


>Number:         177742
>Category:       standards
>Synopsis:       conflict of dd's bs= option with use of conv=sparse
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    freebsd-standards
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Wed Apr 10 00:50:00 UTC 2013
>Closed-Date:
>Last-Modified:
>Originator:     Matthew Rezny
>Release:        9.1-RELEASE
>Organization:
RezTek, s.r.o.
>Environment:
FreeBSD 9.1-RELEASE FreeBSD 9.1-RELEASE #0 r243826: Tue Dec 4 06:55:39 UTC 2012 root at obrian.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC i386

>Description:
Invoking dd with a blocksize specified disables sparse conversion. 

The dd utility has numerous options which affect the behavior of other options. One of the most far-reaching is the block size, since changing the size of blocks changes the meaning of every option expressed in blocks (e.g. seek, skip, count). That change is obvious, but the impact on conversion options is non-obvious.

Excerpts from dd's man page:
bs=n     Set both input and output block size to n bytes, superseding the
	 ibs and obs operands.  If no conversion values other than
	 noerror, notrunc or sync are specified, then each input block is
	 copied to the output as a single block without any aggregation
	 of short blocks.
conv=value[,value ...]
	 Where value is one of the symbols from the following list.
sparse   If one or more output blocks would consist solely of
	 NUL bytes, try to seek the output file by the required
	 space instead of filling them with NULs, resulting in a
	 sparse file.

The documentation of the bs= option mentions only an effect on aggregation of blocks with some conversion operations; it does not state anything about disabling conversion. The sparse conversion is not explicitly documented as being affected by the bs option. As it states that the sparse conversion works on blocks of output, it's implied that the effect of bs on sparse would be to set the size of the run of NULs that must be found to trigger the lseek.

Looking at the source code of dd, the function dd_out(int force) works as expected: loop through the output buffer; if the sparse option is set, set the sparse flag; if the output block has any non-NUL bytes, clear the sparse flag; if sparse && !force, seek, otherwise write the output block; continue through the loop.

Moving on to the function dd_in() reveals the smoking gun along with an interesting comment.
/*
 * POSIX states that if bs is set and no other conversions
 * than noerror, notrunc or sync are specified, the block
 * is output without buffering as it is read.
 */
if (ddflags & C_BS) {
    out.dbcnt = in.dbcnt;
    dd_out(1);
    in.dbcnt = 0;
    continue;
}

The comment says that if we aren't doing conversions then we output without buffering, but that is not quite what the code does. If the bs= option was set, then regardless of any conversions that may be specified, dd_out() is called with force=1. When force is set, dd_out() skips all the sparse processing.
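
The interaction between the sparse option and the force argument can be illustrated with a small standalone sketch. It is not the dd source: sketch_dd_out(), struct out_stats and all_nul() are hypothetical names, and the function only counts which blocks would be written and which would become a seek, assuming a fixed output block size.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tally of what a dd_out()-like loop would do. */
struct out_stats {
    size_t written;  /* blocks that would be passed to write() */
    size_t skipped;  /* blocks that would become a seek instead */
};

/* Return 1 if the n bytes at p are all NUL, else 0. */
static int all_nul(const unsigned char *p, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (p[i] != 0)
            return 0;
    return 1;
}

/*
 * Walk the buffer one output block at a time. A block is skipped only
 * when the sparse option is on, the block is all NUL, and force is 0.
 */
static struct out_stats
sketch_dd_out(const unsigned char *buf, size_t len, size_t blksz,
    int sparse_opt, int force)
{
    struct out_stats st = {0, 0};

    assert(blksz > 0);
    for (size_t off = 0; off < len; off += blksz) {
        size_t n = (len - off < blksz) ? len - off : blksz;
        int sparse = sparse_opt && all_nul(buf + off, n);

        if (sparse && !force)
            st.skipped++;
        else
            st.written++;
    }
    return st;
}
```

With a 2048-byte buffer holding one non-NUL byte and 512-byte blocks, force=0 leaves three blocks skipped and one written, while force=1 writes all four, which matches the full-size output seen when bs= is given.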

If I want to efficiently image a disk that is largely free, I would expect to be able to do so with something like dd if=/dev/da0 of=disk.img bs=64k conv=sparse. Unfortunately, doing so ends up with a full-size image of the disk. If I want to actually get sparse conversion, I must drop the bs= option and take the speed hit of single-sector I/O requests. The difference in total throughput between 1 sector and 128 sectors per I/O request is more than 10x. Worse than that performance hit, if there is some reason I want to see multiple sectors of NULs before doing an lseek on the output (perhaps the same size as file system clusters), there is just no way to do so.

>How-To-Repeat:
# dd if=/dev/zero of=test1 count=1024 conv=sparse
1024+0 records in
1024+0 records out
524288 bytes transferred in 0.000758 secs (691734274 bytes/sec)

# dd if=/dev/zero of=test2 bs=1k count=512 conv=sparse
512+0 records in
512+0 records out
524288 bytes transferred in 0.001709 secs (306783378 bytes/sec)

# ls -l
total 1224
drwxrwxr-x  2 root  operator     512 Apr 10 02:10 .snap
-rw-r--r--  1 root  wheel     524288 Apr 10 02:14 test1
-rw-r--r--  1 root  wheel     524288 Apr 10 02:15 test2

# du -a
8       ./.snap
128     ./test1
1088    ./test2
1232    .

With /dev/zero as the input and conv=sparse, no data blocks should need to be written and the space on disk should be minimal. When bs= is part of the invocation, we can see that the output is fully allocated.
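
The du figures can be checked against the file length with a little arithmetic. The sketch below assumes that du and st_blocks both count 512-byte units (du's default when BLOCKSIZE is unset); looks_sparse() and file_looks_sparse() are hypothetical helpers, not part of dd.

```c
#include <assert.h>
#include <sys/stat.h>

/* Assumed unit for block counts: du's default and the usual st_blocks unit. */
#define BLOCK_UNIT 512LL

/* Return 1 if fewer bytes are allocated than the file's length, else 0. */
static int looks_sparse(long long size, long long blocks)
{
    assert(size >= 0 && blocks >= 0);
    return blocks * BLOCK_UNIT < size;
}

/* Same check for a file on disk; returns -1 if stat() fails. */
int file_looks_sparse(const char *path)
{
    struct stat sb;

    if (stat(path, &sb) != 0)
        return -1;
    return looks_sparse((long long)sb.st_size, (long long)sb.st_blocks);
}
```

Under that assumption, the 128 blocks of test1 come to 65536 bytes, well under its 524288-byte length, while the 1088 blocks of test2 come to 557056 bytes, slightly more than its length.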


>Fix:
Make the code do what the comment says it should. Call dd_out with force iff no conversion flags are set other than noerror, notrunc or sync.

Suggested patch:

 if (ddflags & C_BS) {
     out.dbcnt = in.dbcnt;
-    dd_out(1);
+    dd_out((ddflags & ~(C_BS | C_NOERROR | C_NOTRUNC | C_SYNC)) == 0);
     in.dbcnt = 0;
     continue;
 }
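
The intent of the force test can be tried out in isolation. The sketch below is only an illustration: the X_* bit values and force_for() are hypothetical stand-ins for dd's real flags, which are not reproduced here. It uses bitwise NOT (~) to build the mask, since logical NOT (!) would collapse the mask to 0 or 1, and it parenthesizes the & expression because == binds more tightly than & in C. The bs bit itself is masked out because the enclosing test guarantees it is set. Any other non-conversion bits that ddflags may carry would need the same treatment; that is an assumption not verified here.

```c
#include <assert.h>

/* Hypothetical bit values, chosen only for this sketch. */
enum {
    X_BS      = 1 << 0,
    X_NOERROR = 1 << 1,
    X_NOTRUNC = 1 << 2,
    X_SYNC    = 1 << 3,
    X_SPARSE  = 1 << 4
};

/*
 * Return 1 (force the write) only when no flag other than bs,
 * noerror, notrunc or sync is set; return 0 otherwise.
 */
static int force_for(unsigned flags)
{
    const unsigned harmless = X_BS | X_NOERROR | X_NOTRUNC | X_SYNC;

    assert(flags & X_BS);  /* mirrors the enclosing if (ddflags & C_BS) */
    return (flags & ~harmless) == 0;
}
```

Here force_for(X_BS | X_SYNC) yields 1 and force_for(X_BS | X_SPARSE) yields 0, so a sparse request would no longer be overridden.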

This fix only addresses the sparse conversion, as that happens in the dd_out() function. All other conversions are normally done as the last call made by dd_in(). The bad section of code in dd_in() will call dd_out() whenever the bs= option is specified. Without additional checks, it appears all other conversions will also be skipped. This block of code looks like an attempt at optimization gone bad.


>Release-Note:
>Audit-Trail:
>Unformatted:

