uniq truncates lines > 2048 bytes

Kris Kennaway kris at obsecurity.org
Tue Jan 25 14:38:35 PST 2005


On Wed, Jan 26, 2005 at 09:10:47AM +1100, Tim Robbins wrote:
> On Tue, Jan 25, 2005 at 11:51:51AM -0600, Scot Hetzel wrote:
> > I noticed that if a file has lines longer than 2048 bytes, uniq truncates
> > each such line to LINE_MAX (2048 bytes). An easy way to test this is to do
> > the following:
> > 
> > cd /usr/ports/accessibility/gnomemag
> > make fetch-list > test.list
> > make fetch-list >> test.list
> > uniq test.list > test2.list
> > 
> > test2.list should be half the size of test.list, but it is 2048 bytes.
> > 
> > I have come up with a patch to uniq that fixes this problem.
> > 
> > http://www.freebsd.org/cgi/query-pr.cgi?pr=76578
> 
> This looks good except that it fails to check for realloc() returning NULL
> and has a few minor style problems. It may be possible to use fgetwln()
> to read lines instead of getwc() + realloc() etc., but that function is
> new and specific to FreeBSD.
> 
> I was planning on going through all text-processing utilities in the base
> system some time and either fixing line length problems or documenting them,
> similar to what I did with multibyte character support. I may make a start
> at that today.
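
For reference, the getwc() + realloc() approach discussed above, with the
missing NULL check added, might look roughly like the sketch below. This is
not the code from the PR: the helper name read_long_line() and the initial
capacity of 128 are made up for illustration, and a real fix would also need
to follow uniq's own error handling and style.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <wchar.h>

/*
 * Hypothetical helper (not taken from the PR): read one line of any
 * length from fp into a heap buffer that grows on demand.  The trailing
 * newline is dropped.  Returns a NUL-terminated wide string that the
 * caller must free(), or NULL at end of input or on allocation failure.
 */
static wchar_t *
read_long_line(FILE *fp)
{
	wchar_t *buf, *nbuf;
	size_t cap = 128;	/* arbitrary starting size, an assumption */
	size_t len = 0;
	wint_t wc;

	assert(fp != NULL);
	buf = malloc(cap * sizeof(*buf));
	if (buf == NULL)
		return (NULL);

	while ((wc = getwc(fp)) != WEOF && wc != L'\n') {
		/* Keep room for this character plus the terminator. */
		if (len + 1 >= cap) {
			cap *= 2;
			nbuf = realloc(buf, cap * sizeof(*buf));
			if (nbuf == NULL) {
				/* On failure the old block is still valid. */
				free(buf);
				return (NULL);
			}
			buf = nbuf;
		}
		buf[len++] = (wchar_t)wc;
	}

	/* Nothing read before end of input (or a read error). */
	if (wc == WEOF && len == 0) {
		free(buf);
		return (NULL);
	}
	buf[len] = L'\0';
	return (buf);
}
```

Assigning the result of realloc() to a temporary pointer matters here:
overwriting buf directly would leak the original block when realloc() fails.
A caller would loop on read_long_line() until it returns NULL and free()
each line once it has been compared and written out.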

If someone could fix comm(1) as well, that would be a big help to me:
at the moment I have to carry a local hack around in all of my local
package source trees.

Kris
