limit to number of files seen by ls?

Sun Jul 26 07:35:08 UTC 2009

Karl Vogel wrote:

>    That arbitrary number has worked very nicely for me for 20 years under
>    Solaris, Linux, and several BSD variants.  The main reason I stick
>    with 1000 is because directories are read linearly unless you're using
>    something like ReiserFS, and I get impatient waiting for more than that
>    number of filenames to be sorted when using ls.

Um... you mean filesystems like FreeBSD UFS2 with DIRHASH?  The problem
of linear-time scanning of directory contents has been solved for a while,
and directories containing of the order of 10^5 files are usable nowadays.

You are entirely right that with ls(1) one of the biggest causes of delay
when returning a long list of files is actually sorting the output, but
that would happen whatever filesystem you're using.  If this is a problem,
most of the time you can just avoid the sorting altogether by using 'ls -f'.
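
To make the difference concrete, here is a minimal Python sketch, assuming a
POSIX-ish system.  The function names list_unsorted and list_sorted are made
up for illustration.  The first hands back names in whatever order the
directory yields them, roughly what 'ls -f' does (except that os.scandir()
leaves out "." and ".."); the second has to collect every name before it can
return anything, which is where the wait comes from on a huge directory.

```python
import os


def list_unsorted(path="."):
    """Yield names in directory order, with no sorting (roughly 'ls -f')."""
    with os.scandir(path) as entries:
        for entry in entries:
            yield entry.name


def list_sorted(path="."):
    """Collect every name first, then sort (roughly what plain 'ls' does)."""
    return sorted(list_unsorted(path))


if __name__ == "__main__":
    # Names start appearing immediately, however large the directory is.
    for name in list_unsorted("."):
        print(name)
```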

>    If your application is trying to create hundreds of thousands or
>    millions of files in any one directory, or you're creating lots of
>    200-character filenames from hell, then your design is a poor match for
>    most varieties of Unix; small directories perform better than enormous
>    ones, and lots of commonly-used scripts and programs will fall over
>    when handed zillion-file argument lists.

Yep.  You are correct here, but I think your concept of 'small' could grow
by an order of magnitude when using a reasonably up to date OS -- anything
produced in the last 3 -- 5 years should be able to cope with directories
containing tens of thousands of files without slowing down disastrously.

Most of the limits on the number of arguments a command will accept are
not imposed by the shell as such: they come from the limit (ARG_MAX) on the
combined size of the argv[] array and environment that execve(2) will
accept.  This is a general limitation and applies to anything listed on a
command line, not just file names.  It's fairly rare to run into this as a
practical limitation during most day to day use, and there are various
tricks like using xargs(1) to extend the usable range.  Even so, for really
big applications that need to process long lists of data, you'd have to
code the whole thing to input the list via a file or pipe.
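
As a rough sketch of that last approach, assuming Python on a POSIX system:
the helper names arg_max and process_names are hypothetical, and the
"work" done per file is only a counter.  os.sysconf("SC_ARG_MAX") reports
the limit in bytes on the current system, and reading names from a stream
sidesteps it entirely because nothing goes through argv[].

```python
import io
import os


def arg_max():
    """Return the system's limit, in bytes, on execve(2) args plus environment."""
    return os.sysconf("SC_ARG_MAX")


def process_names(stream):
    """Read one filename per line from a stream instead of from argv."""
    count = 0
    for line in stream:
        name = line.rstrip("\n")
        if name:
            count += 1  # placeholder for the real per-file work
    return count


if __name__ == "__main__":
    print("ARG_MAX bytes:", arg_max())
    # Stand-in for a pipe; in real use you would pass sys.stdin here.
    demo = io.StringIO("a.txt\nb.txt\n")
    print("names read:", process_names(demo))
```

One name per line breaks on filenames that contain newlines; a
NUL-separated list (as produced by 'find -print0') would be the safer
input format if that matters.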

Long filenames per se aren't a bad thing -- a descriptive filename is
quite beneficial to human users. But long names aren't necessary: after all,
just 12 alphanumeric characters (a-z, A-Z, 0-9) will give you 62^12, or:

  3226266762397899821056

possible different filenames, which should be enough for anyone, and all
the computer cares about is that the name is distinct (and that doesn't even
include punctuation characters).  So what if you can't remember whether
Jx64rQWundkS contains "The Beatles: White Album" or "Annual Report of the
Departmental Sub-committee in Charge of Producing Really Boring Reports"?
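
The number of possible names is easy to check: with an alphabet of 62
characters (upper case, lower case and digits) and names of exactly 12
characters, it is 62 raised to the 12th power.  A small Python sketch,
where ALPHABET and name_count are illustrative names only:

```python
import string

# 26 lower case + 26 upper case + 10 digits = 62 characters
ALPHABET = string.ascii_letters + string.digits


def name_count(length, alphabet=ALPHABET):
    """Number of distinct names of exactly `length` characters."""
    return len(alphabet) ** length


if __name__ == "__main__":
    print(name_count(12))
```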

I've seen long filenames taken to extremes: people saving the text of a letter
using a filename that consists of the names of the sender and recipients, the
date sent and a precis of the contents.  I'm pretty sure that some of those
filenames were longer than the actual letters...

>    I'm sure the latest version of <insert-cool-OS-or-filesystem-here>
>    fixes all these objections, but not everyone gets to run the latest
>    and greatest.  Don't fight your filesystem, and it won't fight you.

So, FreeBSD is a *cool OS*?   But we knew that already...

	Cheers,

	Matthew

-- 
Dr Matthew J Seaman MA, D.Phil.                   7 Priory Courtyard
                                                  Flat 3
PGP: http://www.infracaninophile.co.uk/pgpkey     Ramsgate
                                                  Kent, CT11 9PW
