"tar tfv /dev/cd0" speedup patch

Juergen Lock nox at jelal.kn-bremen.de
Sat Feb 20 10:22:09 UTC 2010


On Fri, Feb 19, 2010 at 09:20:30PM -0800, Tim Kientzle wrote:
> Juergen,
Hi!
> 
> I was looking at your Linux code here and thought
> the technique of trying lseek(SEEK_END) might work.
> Unfortunately, it doesn't: lseek(fd, 0, SEEK_END) gives
> zero for both /dev/sa0 (a tape drive) and /dev/cd0
> (an optical drive).  Are you sure it works on Linux?
> 
 Yeah that code is Linux-specific, I know it doesn't work on BSDs. :)
Here's some output on Linux after changing O_RDWR to O_RDONLY:

$ ./a.out /dev/sr0
fd=3
lseek(fd, 0, SEEK_SET)=0
lseek(fd, 10240, SEEK_SET)=10240
lseek(fd, 10240, SEEK_CUR)=20480
lseek(fd, 0, SEEK_END)=-2057306112
lseek(fd, 0, SEEK_SET)=0
$ ./a.out /dev/fd0
fd=3
lseek(fd, 0, SEEK_SET)=0
lseek(fd, 10240, SEEK_SET)=10240
lseek(fd, 10240, SEEK_CUR)=20480
lseek(fd, 0, SEEK_END)=1474560
lseek(fd, 0, SEEK_SET)=0
$

 Ok, /dev/sr0 was a DVD iso and you cast the lseek return value to int,
so the size got truncated.  (And this was on amd64; on i386 your version
gets an overflow error for SEEK_END there too, i.e. -1, because on
Linux/i386 off_t defaults to a 32-bit long.)

 If I fix that I get:

$ ./a.out /dev/sr0
fd=3
lseek(fd, 0, SEEK_SET)=0
lseek(fd, 10240, SEEK_SET)=10240
lseek(fd, 10240, SEEK_CUR)=20480
lseek(fd, 0, SEEK_END)=2237661184
lseek(fd, 0, SEEK_SET)=0
$

 ...which matches the size of the iso.  (And bsdtar with the patch
is also much faster on /dev/sr0 there than without it, which was how I
originally confirmed the patch is working.  There are two 850 MB files
on that iso...)
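
 To make the connection between the two SEEK_END numbers explicit, here
is a small sketch of what squeezing the 64-bit result into a 32-bit int
does.  The helper names wrap_to_int32 and show_truncation are made up for
illustration, and the wrap is done with explicit arithmetic because
converting an out-of-range value to a signed type is
implementation-defined in C:

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Reduce a 64-bit offset to the value a 32-bit two's complement int
 * would hold.  Hypothetical helper, for illustration only.
 */
int64_t
wrap_to_int32(int64_t value)
{
        uint32_t low = (uint32_t)value;         /* keep the low 32 bits */

        if (low >= UINT32_C(0x80000000))        /* sign bit set: negative */
                return ((int64_t)low - INT64_C(0x100000000));
        return ((int64_t)low);
}

/* Print the real size next to its truncated form. */
void
show_truncation(int64_t size)
{
        printf("real=%jd truncated=%jd\n",
            (intmax_t)size, (intmax_t)wrap_to_int32(size));
}
```

 Calling show_truncation(2237661184) reproduces the -2057306112 from the
first run, while the floppy size 1474560 fits in 32 bits and comes out
unchanged.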

 I'll append my version of the test program below.

 Cheers,
	Juergen

PS: Seeking on tape is a whole other can of worms, I don't even think
lseek on Linux works as you might expect there...  If you really want to
implement this you'd have to try the MTFSR ioctl (at least that's what
it's usually called; look for `fsr' in the mt(1) manpage and source),
but since that counts in blocks not bytes you'd have to know the tape's
blocksize too (which can also be variable i.e. depend on how that
particular tape was written, tho I think in case of a tar archive
you can get away with just using the blocksize arg passed to bsdtar
there.)  And also I'm not sure how some drives may behave if you use
lots of MTFSR ioctls with small block counts so maybe you should
only use them when the amount of data to skip is big enough so that
switching to `fast forward' is actually worth it.  (and continue to
skip over small amounts by doing regular reads.)
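
 A rough, untested sketch of that idea, assuming the archive was written
in fixed-size records, the skip starts on a record boundary, and
<sys/mtio.h> provides MTIOCTOP, MTFSR and struct mtop (it does on Linux
and FreeBSD as far as I know).  The function names and the
SKIP_MIN_RECORDS cutoff are invented for illustration; a sensible cutoff
would have to be found by testing on real drives:

```c
#include <sys/types.h>
#include <sys/ioctl.h>
#include <sys/mtio.h>
#include <limits.h>
#include <unistd.h>

/* Hypothetical cutoff: below this many records, just read them. */
#define SKIP_MIN_RECORDS 64

/*
 * How many whole records of a `bytes' long skip are worth spacing over
 * with MTFSR.  Returns 0 when the skip is too small (or the arguments
 * make no sense), meaning: read everything normally.
 */
int
tape_records_to_space(off_t bytes, size_t blocksize)
{
        off_t n;

        if (bytes <= 0 || blocksize == 0)
                return (0);
        n = bytes / (off_t)blocksize;
        if (n < SKIP_MIN_RECORDS)
                return (0);
        if (n > INT_MAX)                /* mt_count is an int */
                n = INT_MAX;
        return ((int)n);
}

/* Space forward over `records' records; returns the ioctl result. */
int
tape_space_forward(int fd, int records)
{
        struct mtop op;

        op.mt_op = MTFSR;
        op.mt_count = records;
        return (ioctl(fd, MTIOCTOP, &op));
}
```

 The caller would space over tape_records_to_space() records and then
consume the remaining bytes - records * blocksize with ordinary reads,
falling back to reads for everything if the ioctl fails.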

-------snip-----------
/* make sure we don't get 32 bit off_t */
#define _FILE_OFFSET_BITS 64

#include <fcntl.h>
#include <inttypes.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int
main(int argc, char **argv)
{
         int fd;

         if (argv[1] == NULL) {
                 fprintf(stderr, "Need to specify a pathname.\n");
                 exit(1);
         }

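         fd = open(argv[1], O_RDONLY);
         if (fd == -1) {
                 /* Bail out early instead of calling lseek on a bad fd. */
                 perror(argv[1]);
                 exit(1);
         }
         printf("fd=%d\n", fd);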
         printf("lseek(fd, 0, SEEK_SET)=%jd\n",
             (intmax_t)lseek(fd, 0, SEEK_SET));
         printf("lseek(fd, 10240, SEEK_SET)=%jd\n",
             (intmax_t)lseek(fd, 10240, SEEK_SET));
         printf("lseek(fd, 10240, SEEK_CUR)=%jd\n",
             (intmax_t)lseek(fd, 10240, SEEK_CUR));
         printf("lseek(fd, 0, SEEK_END)=%jd\n",
             (intmax_t)lseek(fd, 0, SEEK_END));
         printf("lseek(fd, 0, SEEK_SET)=%jd\n",
             (intmax_t)lseek(fd, 0, SEEK_SET));

         return (0);
}
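
 For code that wants to use the SEEK_END result rather than just print
it, the probe could be wrapped roughly like this.  This is only a sketch
and not what the bsdtar patch does; probe_size is a made-up name.  Since
the BSDs report 0 for /dev/cd0 and /dev/sa0, a caller would have to treat
0 on a device as "size unknown":

```c
/* make sure we don't get 32 bit off_t */
#define _FILE_OFFSET_BITS 64

#include <sys/types.h>
#include <unistd.h>

/*
 * Return the size lseek(SEEK_END) reports for fd, or -1 if the
 * descriptor cannot be seeked.  The current position is restored.
 * The result stays an off_t the whole way, never an int.
 */
off_t
probe_size(int fd)
{
        off_t here, end;

        here = lseek(fd, 0, SEEK_CUR);          /* remember where we are */
        if (here == (off_t)-1)
                return ((off_t)-1);
        end = lseek(fd, 0, SEEK_END);           /* offset of the end */
        if (end == (off_t)-1)
                return ((off_t)-1);
        if (lseek(fd, here, SEEK_SET) == (off_t)-1)
                return ((off_t)-1);
        return (end);
}
```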

