sort(1) memory usage

Erik Trulsson ertr1013 at student.uu.se
Sun Feb 3 07:31:03 PST 2008


On Sun, Feb 03, 2008 at 02:13:22PM +0100, Ed Schouten wrote:
> * Dag-Erling Smørgrav <des at des.no> wrote:
> > I've been trying to figure out why some periodic scripts consume so much
> > memory.  I've narrowed it down to sort(1).
> > 
> > At first, I thought the scripts were using it inefficiently, feeding it
> > more data than was really needed.  Then I discovered this:
> > 
> > des at ds4 ~% (sleep 10 | sort) & (sleep 5 ; top -o res | grep sort)
> > [1] 66024
> > 66024 des          1  -8    5 54796K 52680K piperd 1   0:00  0.88% sort
> > 
> > That's right - sort(1) consumes 50+ MB of memory doing *nothing*.
> > 
> > (roughly half that on a 32-bit box)
> > 
> > Something is rotten in the state of GNU...
> 
> On my i386 box it spends 27M, but when I replace `sort' with `sed',
> without any arguments, it's only 1.4 MB. I tried this on RELENG_6. I can
> also reproduce this on Linux.
> 

Yep, it seems that GNU sort allocates quite a large buffer by default when
the size of the input is unknown (such as when it reads input from stdin).
A quick check of the source code indicates that it tries to size this buffer
according to how much memory the system has (and according to any limits set
on how much memory the process is allowed to use).
The size of this buffer can be controlled with the -S / --buffer-size option
to sort(1).
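
As a rough sketch of that option in use, assuming GNU sort from coreutils:
the 1M cap, the sample input and the variable name `sorted` are only
illustrative choices, not values taken from the periodic scripts. The
commented-out line repeats the measurement quoted above with the cap applied,
so the resident size can be compared by hand.

```shell
# Cap sort's main-memory buffer at 1 MB instead of letting it size the
# buffer from system memory (-S 1M is the short form of --buffer-size=1M).
sorted=$(printf 'pear\napple\nfig\n' | sort --buffer-size=1M)
printf '%s\n' "$sorted"

# To repeat the measurement from the thread with the cap in place:
# (sleep 10 | sort --buffer-size=1M) & (sleep 5 ; top -o res | grep sort)
```

The sorted output is unaffected by the smaller buffer; only the initial
allocation should change.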



-- 
<Insert your favourite quote here.>
Erik Trulsson
ertr1013 at student.uu.se


More information about the freebsd-hackers mailing list