bin/111081: [libarchive] problem with big file

Patrick Lamaiziere patpr at
Sat Mar 31 23:00:12 UTC 2007

>Number:         111081
>Category:       bin
>Synopsis:       [libarchive] problem with big file
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    freebsd-bugs
>State:          open
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sat Mar 31 23:00:11 GMT 2007
>Originator:     Patrick Lamaiziere
>Release:        6.2-RELEASE
FreeBSD 6.2-RELEASE-p3 FreeBSD 6.2-RELEASE-p3 #0: Sat Mar 24 14:08:07 CET 2007     patrick at  i386

With a .tar file that includes a big file (6 GB), bsdtar fails to extract or list the files in the archive because of a bug in libarchive:
$ tar tf samba.tar
tar: (Empty error message)

$ truss tar tf samba.tar
write(1,"\n",1)                                  = 1 (0x1)
samba/bigfile.rarwrite(1,"samba/bigfile.rar"...,79) = 79 (0x4f)
lseek(3,0x0,SEEK_CUR)                            = 843239424 (0x3242d000)
lseek(3,0x846ca000,SEEK_CUR)                     = -1230016512 (0xb6af7000)

write(1,"\n",1)                                  = 1 (0x1)
tar: write(2,"tar: ",5)                          = 5 (0x5)
(Empty error message)write(2,"(Empty error message)",21) = 21 (0x15)


I think the problem is in the "file_skip" functions, because this is the only place where lseek() is called twice in a row. I don't know which one is involved: there is one such function in "archive_read_open_fd.c" and another in "archive_read_open_file.c". In any case they are similar:

File archive_read_open_fd.c:

static ssize_t
file_skip(struct archive *a, void *client_data, size_t request)
{
        struct read_fd_data *mine = client_data;
        off_t old_offset, new_offset;

        /* Reduce request to the next smallest multiple of block_size */
        request = (request / mine->block_size) * mine->block_size;
        /*
         * Hurray for lazy evaluation: if the first lseek fails, the second
         * one will not be executed.
         */
        if (((old_offset = lseek(mine->fd, 0, SEEK_CUR)) < 0) ||
            ((new_offset = lseek(mine->fd, request, SEEK_CUR)) < 0))
        {
                /* [error handling omitted in this excerpt] */
        }
        return (new_offset - old_offset);
}

The result is an ssize_t (32 bits on i386), while new_offset and old_offset are off_t (64 bits).
There is an overflow here.
From truss:
lseek(3,0x0,SEEK_CUR)                            = 843239424 (0x3242d000)
lseek(3,0x846ca000,SEEK_CUR)                     = -1230016512 (0xb6af7000)

So:
new_offset - old_offset
0xb6af7000 - 0x3242d000 = 0x846ca000 => this is a negative value when stored in an int32.
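
The arithmetic above can be checked with a small model of the return statement. This is only a sketch: it assumes a 64-bit off_t and a 32-bit ssize_t as on i386, and the function name skip_result_as_ssize32 is made up for the illustration, it is not part of libarchive.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Model of "return (new_offset - old_offset);" when the offsets are
 * 64 bits wide and the return type is only 32 bits wide.
 * Illustrative name, not a libarchive function.
 */
static int32_t
skip_result_as_ssize32(int64_t old_offset, int64_t new_offset)
{
	int64_t diff;
	uint32_t low;

	assert(new_offset >= old_offset);
	diff = new_offset - old_offset;

	/* Keep only the low 32 bits of the difference. */
	low = (uint32_t)diff;

	/* Reinterpret those bits as a two's complement signed value. */
	if (low <= (uint32_t)INT32_MAX)
		return ((int32_t)low);
	return ((int32_t)((int64_t)low - 4294967296LL));
}
```

With the two offsets from the truss output, skip_result_as_ssize32(0x3242d000, 0xb6af7000) gives -2073255936, so the caller sees a negative result even though both lseek() calls succeeded.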

I don't know how to solve this problem, sorry.
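
One possible direction, offered only as an untested sketch: limit the request before seeking so that the difference always fits in the return type. This assumes that the caller accepts a skip shorter than requested and reads the remainder normally. The helper name clamp_skip_request and the max_result parameter are hypothetical, they do not exist in libarchive.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of libarchive.
 * Limits a skip request to max_result (the largest value the return
 * type can hold) and then rounds it down to a multiple of block_size,
 * as file_skip() already does.
 */
static uint64_t
clamp_skip_request(uint64_t request, uint64_t block_size, uint64_t max_result)
{
	assert(block_size > 0);

	/* Never ask for more than the return type can report. */
	if (request > max_result)
		request = max_result;

	/* Reduce request to the next smallest multiple of block_size */
	return ((request / block_size) * block_size);
}
```

On i386 max_result would be the maximum ssize_t value (2^31 - 1), so the 0x846ca000 byte request from the truss output would be cut down to a value that can be returned without changing sign.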

Also, maybe it would be better to compare lseek() == -1 instead of lseek() < 0?
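
For illustration, a check written that way could look like the sketch below. POSIX defines (off_t)-1 as the error return of lseek(). The wrapper name current_offset is hypothetical and is not taken from libarchive.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical wrapper, not part of libarchive.
 * Stores the current file offset in *offset.
 * Returns 0 on success, -1 on failure with errno set by lseek().
 */
static int
current_offset(int fd, off_t *offset)
{
	assert(offset != NULL);

	errno = 0;
	*offset = lseek(fd, 0, SEEK_CUR);

	/* Compare against the documented error value only. */
	if (*offset == (off_t)-1)
		return (-1);
	return (0);
}
```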

Best regards.
How to repeat: list or extract a .tar archive that contains big files (several GB) with bsdtar.

$ tar tf tar_with_big_file.tar

