[Bug 187315] unzip(1): base unzip does not recognize *.zip archives from dropbox.com

Thu Sep 18 18:22:37 UTC 2014

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=187315

oliver at beefrankly.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |oliver at beefrankly.org

--- Comment #2 from oliver at beefrankly.org ---
Some time ago I looked into this bug and tried to contact the author, but
did not get a response. Maybe the mail never arrived, so here is a copy:

---

Hello des,

I am contacting you because, if I understand correctly, you are the main
author of the /usr/src/usr.bin/unzip utility.

I took a look at PR bin/187315 and could use some advice.

unzip(1) uses libarchive(3) for working with the archives. 

To determine the file type of an entry, libarchive provides a function
called "archive_entry_filetype()". As this function uses the entry's
acl.mode as input, it returns a file type of 0x0 if the entry has no
file mode.

The implementation of unzip expects every entry to be either a regular
file or a directory and checks for that. With a file type of 0x0, the
sanity check for S_ISREG and S_ISDIR fails and the program skips the
entry.

unzip.c:

        /* I don't think this can happen in a zipfile.. */
        if (!S_ISDIR(filetype) && !S_ISREG(filetype)) {
                warningx("skipping non-regular entry '%s'", pathname);
                ac(archive_read_data_skip(a));
                free(pathname);
                return;
        }
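
The effect of this check on a file type of 0x0 can be illustrated with a
small Python sketch. It only mirrors the condition quoted from unzip.c;
would_skip is a hypothetical helper name and not part of unzip or
libarchive.

```python
import stat


def would_skip(filetype):
    """Mirror the quoted check: skip anything that is neither
    a directory nor a regular file."""
    return not stat.S_ISDIR(filetype) and not stat.S_ISREG(filetype)


# File type values as the check would see them.
print(would_skip(stat.S_IFREG))  # regular file
print(would_skip(stat.S_IFDIR))  # directory
print(would_skip(0))             # no file type recorded
```

Only the last call returns True, which corresponds to the "skipping
non-regular entry" path.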

The cause may be that Dropbox creates the zip file for you on the fly,
that is, it streams the data out of a database directly into the
archive. In this special case, where there is no file on disk and the
input comes from stdin, the ZIP file format specification allows the
external file attributes to stay 0x0 (see [1], section 4.4.15, external
file attributes). As I understand it, the libarchive code uses this
field to determine the file type.
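
As a rough illustration of where that information lives, the following
Python sketch writes two entries into an in-memory archive and reads the
external attributes back. It assumes the common Unix convention that the
upper 16 bits of the external file attributes carry the mode; the entry
names and mode values are made up for the example.

```python
import io
import stat
import zipfile

buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as z:
    # Entry whose creator recorded a full Unix mode (type + permissions).
    typed = zipfile.ZipInfo("typed.txt")
    typed.external_attr = (stat.S_IFREG | 0o644) << 16
    z.writestr(typed, "a\n")

    # Entry with permission bits only and no file type bits.
    untyped = zipfile.ZipInfo("untyped.txt")
    untyped.external_attr = 0o600 << 16
    z.writestr(untyped, "b\n")

# Read the attributes back from the central directory.
with zipfile.ZipFile(buf) as z:
    modes = {info.filename: info.external_attr >> 16 for info in z.infolist()}

for name, mode in modes.items():
    print(name, oct(mode), "file type bits:", oct(stat.S_IFMT(mode)))
```

For the second entry the file type bits are zero, so a reader that
derives the file type from this field alone has nothing to go on.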

I think that is what happens here (at least in the Dropbox archive, the
file type is returned as zero for all files and directories). I can
reproduce the error like this:

$ echo "testtext" | python -c "import sys
import zipfile
z = zipfile.ZipFile(sys.argv[1],'w')
z.writestr(sys.argv[2],sys.stdin.read())
z.close()
" test.zip testfile1
$ unzip -l test.zip
Archive:  test.zip
  Length     Date   Time    Name
 --------    ----   ----    ----
        9  03-16-14 00:47   testfile1
$ unzip test.zip
Archive:  test.zip
unzip: skipping non-regular entry 'testfile1'
$ /usr/local/bin/unzip test.zip
Archive:  test.zip
 extracting: testfile1               
$ cat testfile1
testtext
$ 

For a correct archive, zipinfo shows (example):
  Unix file attributes (100744 octal):            -rwxr--r--
  Unix file attributes (040744 octal):            drwxr--r--

For the Dropbox archive or the example above:
  Unix file attributes (000600 octal):            ?rw-------

Note the question mark where the file type should be (= 0x0).
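
The octal values above can be decoded with Python's stat module. This is
only a sketch; describe is a hypothetical helper written for this
example.

```python
import stat

# The three "Unix file attributes" values quoted from zipinfo.
modes = [0o100744, 0o040744, 0o000600]


def describe(mode):
    """Split a Unix mode into a type name and its permission bits."""
    fmt = stat.S_IFMT(mode)
    if fmt == stat.S_IFREG:
        kind = "regular file"
    elif fmt == stat.S_IFDIR:
        kind = "directory"
    elif fmt == 0:
        kind = "no file type"
    else:
        kind = "other"
    return kind, stat.S_IMODE(mode)


for mode in modes:
    kind, perms = describe(mode)
    print(f"{mode:06o}: {kind}, permissions {perms:03o}")
```

The first two values carry S_IFREG and S_IFDIR respectively, while
000600 has permission bits only.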

The extraction seems to work correctly if we remove that sanity check
for S_ISDIR and S_ISREG. However, the program also uses the file type to
control its flow, so removing the check may be a problem.
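
One conceivable middle ground, sketched here in Python purely as an
assumption and not as what unzip or libarchive actually do, would be to
keep the check but guess a type only when none was recorded. The helper
name effective_filetype is hypothetical, and the trailing-slash rule
relies on the usual convention that directory entries in a zip file end
in '/'.

```python
import stat


def effective_filetype(filetype, pathname):
    """Hypothetical fallback: guess a type only when none was recorded."""
    fmt = stat.S_IFMT(filetype)
    if fmt != 0:
        return fmt               # trust the recorded type
    if pathname.endswith("/"):
        return stat.S_IFDIR      # directory entries conventionally end in '/'
    return stat.S_IFREG          # otherwise assume a regular file


print(oct(effective_filetype(0, "testfile1")))
print(oct(effective_filetype(0, "somedir/")))
print(oct(effective_filetype(stat.S_IFLNK, "link")))
```

With such a fallback, entries that do carry another type (for example a
symlink) would still be rejected by the existing check.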

As more and more archives are generated on the fly, this issue may
become more serious.

Maybe you can give me a hint whether it is okay to remove that sanity
check or whether you want to keep it.

[1] https://www.pkware.com/documents/casestudies/APPNOTE.TXT
