sort(1) memory usage

Erik Trulsson ertr1013 at student.uu.se
Sun Feb 3 09:16:50 PST 2008


On Sun, Feb 03, 2008 at 04:31:34PM +0100, Dag-Erling Smørgrav wrote:
> Dag-Erling Smørgrav <des at des.no> writes:
> > Erik Trulsson <ertr1013 at student.uu.se> writes:
> > > Yep, it seems that GNU sort allocates a quite large buffer by default when
> > > the size of the input is unknown (such as when it reads input from stdin.)
> > > A quick check in the source code indicates that it tries to size this buffer
> > > according to how much memory the system has (and according to any limits set
> > > on how much memory the process is allowed to use.)
> >
> > Uh, OK.  This scaling doesn't seem to work correctly.  It seems to
> > allocate 27 MB on 32-bit machines and 54 MB on 64-bit machines,
> > regardless of memory size.

I said it *tries* to size the buffer according to the amount of memory
available.  I didn't say it succeeded in doing so, or that it even made a
good attempt at it.

Those 27MB/54MB figures are probably the result of it hitting some kind of limit.
On a machine with only 64MB RAM, sort(1) "only" allocated 21MB of address space.

I suspect the scaling algorithm was designed for older machines which
rarely, if ever, had more than maybe 64MB RAM (and usually less than that),
and that little thought was given to multi-gigabyte machines like those
common today.


> 
> Looking at the code, it seems to go to extreme lengths to get it
> absolutely wrong.  For instance, if hw.physmem / 8 > hw.usermem, it will
> pick the former, which means it's pretty much guaranteed to either fail
> or hose your system (or both).
> 
> In the immortal words of Blazing Star: YOU FAIL IT
> 
> Count this as a vote for ditching GNU sort in favor of a BSD-licensed
> implementation (from {Net,Open}BSD for instance).
> 
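
To make the rule described above concrete, here is a rough sketch of it in
Python.  It only models the single comparison mentioned in the quoted text
(pick hw.physmem / 8 when that is larger than hw.usermem); it is not the
actual GNU sort code, which does more than this.  The function name and the
example numbers are made up for illustration.

```python
MIB = 1024 * 1024


def pick_buffer_size(physmem: int, usermem: int) -> int:
    """Hypothetical helper modelling the rule as described in the thread:
    use usermem, unless physmem / 8 is larger, in which case use that."""
    return max(usermem, physmem // 8)


# Made-up machine: 4096 MiB of physical memory, of which only 100 MiB
# is currently reported as usable by user processes.
physmem = 4096 * MIB
usermem = 100 * MIB

size = pick_buffer_size(physmem, usermem)
print(size // MIB, "MiB requested;", usermem // MIB, "MiB reported as usable")
# prints "512 MiB requested; 100 MiB reported as usable"
```

With those numbers the rule asks for about five times more memory than is
reported as usable, which is the failure mode complained about above.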

If any such implementation were a true drop-in replacement for GNU sort
(supporting all the same options etc.) and did not have noticeably worse
performance, then I certainly would not raise any objections to that.



-- 
<Insert your favourite quote here.>
Erik Trulsson
ertr1013 at student.uu.se


More information about the freebsd-hackers mailing list